Polit Behav
DOI 10.1007/s11109-007-9035-8

ORIGINAL PAPER

Measuring Exposure to Political Advertising in Surveys

Daniel Stevens

D. Stevens, Department of Politics, University of Exeter, Cornwall Campus, Penryn, Cornwall TR10 9EZ, England
e-mail: [email protected]

© Springer Science+Business Media, LLC 2007

Abstract Research on the influence of negative political advertising in America is characterized by fundamentally conflicting findings. In recent years, however, survey research using estimates of exposure based on a combination of self-reported television viewing habits and Campaign Media Analysis Group data (a database of all advertisements broadcast on national and cable television in the top 75 media markets) has argued that exposure to negative political advertising boosts interest in the campaign and turnout. This paper examines the measurement properties of self-reports of television viewing. I argue that the errors from common survey formats may be both nonrandom and larger than previously acknowledged. The nonrandom error is due to the tendency of politically knowledgeable individuals to be more sensitive to question format. Thus the inferences drawn about the relationship between political knowledge, exposure to negative ads, and political behavior are also sensitive to the measures used to estimate exposure. I demonstrate, however, that one commonly used measure of exposure—the log of estimated exposure—is not only more theoretically defensible but also alleviates some of the more serious problems due to measurement error.

Keywords Political advertising · Measurement error · Self-reported television viewing · Survey research

The influence of political advertising campaigns in American elections, especially the negativity of ads, remains a subject of great interest and controversy. Geer (2006, 10) notes that in the last two months of the 2000 election, the Associated Press alone filed 95 stories on negative campaigning, while Brooks' (2006) more comprehensive search uncovered 410 articles on the relationship between campaign tone and turnout from 1994 to 2005. This flurry of attention has not produced consistent findings, however. Lau, Sigelman, Heldman, and Babbitt's (1999) meta-analysis, which examined findings from studies of the effects of negative ads between 1984 and 1999, uncovered a panoply of conflicting results and arguments. While scholars agree that negativity is increasingly prevalent in campaigns, the flow of research findings since Lau et al. remains contradictory (Clinton & Lapinski, 2004; Goldstein & Freedman, 2002a, 2002b; Kahn & Kenney, 2004; Stevens, 2005).

Some argue that the conflict is due to different research designs. As Martin (2004, 545–546) puts it, "Evidence supporting the idea that negative campaigning discourages [voter turnout] comes primarily through experimental research, whereas evidence supporting the idea that negative campaigning encourages voter turnout comes from survey research." This assumes that the disagreement stems from differences in the set-up of experiments and survey research; it ignores problems within particular research designs.

This paper argues that there are sizable problems with survey research estimates of the influence of political advertising that prompt doubt about the rosy conclusions Martin characterizes. It examines survey estimates of ad exposure, in particular state-of-the-art measures using Campaign Media Analysis Group (CMAG) data, which capture the frequency of all political ads in the top 75 media markets in the United States.1 Researchers using these measures claim that they have fewer of the disadvantages of survey estimates of ad exposure while possessing several advantages over experiments. I examine the nature of errors in self-reports of television viewing habits, the foundation of the state-of-the-art measures, and discuss their impact on the inferences drawn about the relationship between exposure to advertising and political behavior via a two-part, multi-method research design.

The first part is a within-subjects experiment that allows comparison of actual exposure to television with survey estimates of exposure. The experiment permits a more direct assessment of the properties of survey measures of exposure than alternatives, such as exploring their construct validity or "predictive power" (Ridout, Shah, Goldstein, & Franz, 2004). In the second part of the study, I examine American National Election Study (ANES) data in which respondents were asked about their viewing habits in three different question formats, to see whether the variation in individuals' estimates parallels the findings from the diaries. Finding that it does, I then demonstrate that there are systematic individual-level characteristics, such as political knowledge, that drive discrepancies in estimates of television watching based on different question formats, and that such sensitivity to question wording has an important impact on estimates of the influence of political advertising.

To be sure, survey researchers are aware of the inevitability of measurement error in estimating exposure to advertising. However, this paper shows that the error in survey estimates may be both larger than assumed and, more importantly, correlated with individual-level characteristics associated with key dependent variables such as the propensity to vote and interest in the campaign. These systematic errors in measurement bias estimates of the influence of ads, affecting observed relationships and the inferences thereby drawn. Fortunately, I am also able to demonstrate that one of the more theoretically defensible, though not always used, measures of ad exposure can alleviate many of these problems.

1 More recent CMAG data cover the top 100 media markets.

Survey Measures of Ad Exposure

There are three main approaches to estimating exposure to advertising in surveys. One does not attempt to measure individual exposure, instead examining the association between aggregate-level variation in negative advertising and survey measures (Finkel & Geer, 1998; Lau & Pomper, 2001). A second approach relies on some form of self-reported exposure to advertising (Wattenberg & Brians, 1999; West, 1994). The problem with these measures, as Ansolabehere, Iyengar, and Simon (1999) demonstrate, is endogeneity with key dependent variables: individuals who recall being exposed to advertising are also more likely to vote, for example.

The current state-of-the-art measure, employing CMAG data, avoids the endogeneity problem because respondents are simply asked about their television viewing habits. It also lacks the weaknesses of aggregate estimates of exposure or of measures derived from ads made rather than ads aired (Freedman & Goldstein, 1999; Ridout et al., 2004). CMAG data tell us where and when ads were aired in the largest 75 television markets (covering roughly three-quarters of the United States population). These data are then combined with reports of television viewing habits, generally taken from American National Election Studies (ANES). If we know when and how often an individual watches television, from the ANES data, and the ads that were aired at that time, from the CMAG data, we can estimate the ads an individual is likely to have seen by multiplying one by the other. For example, an individual in Market A who watches television for three of the four hours between 4 pm and 8 pm (i.e., 75 percent of those hours), during which 100 political ads were aired, would be estimated to have been exposed to 75 ads. An individual in Market B with the same viewing habits but where 20 political ads were aired at that time would be estimated to have been exposed to 15 ads.2 Using these individuals' reported viewing habits from other times of day and from weekends then allows us to build a database of individual exposure to ads.

Resulting estimates are sensitive both to whether individuals were likely to be watching television when ads were aired and to variation by television market. For example, regardless of how many ads are aired, if an individual does not watch television she will not have been exposed to any ads, and the estimate will be that she was exposed to no advertising.3 Similarly, an avid television watcher in a market without competitive races may see less political advertising than an infrequent television watcher in a market with competitive races.
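To make the arithmetic concrete, the daypart calculation can be sketched as follows. This is a minimal illustration of the multiplication principle just described, not the published coding; the daypart boundaries, function names, and data structures are assumptions.

```python
# Sketch of the daypart exposure estimate: the share of a daypart a respondent
# reports watching, multiplied by the ads aired during that daypart in his or
# her market, summed over dayparts. Daypart definitions here are illustrative.

DAYPART_LENGTH_HOURS = {
    "weekday_early_evening": 4,  # e.g., 4 pm to 8 pm
}

def estimated_exposure(hours_watched, ads_aired):
    """Both arguments are dicts keyed by daypart name."""
    total = 0.0
    for daypart, length in DAYPART_LENGTH_HOURS.items():
        share = min(hours_watched.get(daypart, 0.0) / length, 1.0)
        total += share * ads_aired.get(daypart, 0)
    return total

# The Market A / Market B example from the text: 3 of 4 hours watched
# (75 percent) with 100 ads aired yields 75; the same habits with 20 ads yields 15.
print(estimated_exposure({"weekday_early_evening": 3}, {"weekday_early_evening": 100}))  # 75.0
print(estimated_exposure({"weekday_early_evening": 3}, {"weekday_early_evening": 20}))   # 15.0
```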

2 This calculation is based on self-reported viewing habits at particular times of day, or "dayparts." As discussed below, estimates of exposure using CMAG data have also based television viewing habits on how often individuals claim to watch particular shows. All these methods, however, are based on the same principle of multiplying self-reported viewing by ads aired (see Ridout et al., 2004).
3 Of course, she may still be indirectly affected by advertising: through media coverage, or if the candidates' ads become an issue of discussion in the campaign. However, that is beyond what can be explored here and beyond the range of most work using CMAG data.

These estimates thus address the main deficiency of survey measures of ad exposure, the lack of clear knowledge of exposure, while easily trumping experiments when it comes to external validity. Freedman and Goldstein (and their collaborators), in particular, have written several carefully argued articles using CMAG data (Freedman, Franz, & Goldstein, 2004; Freedman & Goldstein, 1999; Freedman, Goldstein, & Granato, 2000; Goldstein & Freedman, 2002a, 2002b; Ridout et al., 2004). Other authors interested in questions of political advertising also now employ these data (e.g., Kahn & Kenney, 2004; Martin, 2004). Importantly, Freedman and Goldstein acknowledge possible weaknesses, pointing out that their estimates are probably the upper bound of true exposure because individuals are unlikely to have seen or paid attention to all the ads aired at a certain time. Overall, however, they argue that the estimates should get the relative volume of exposure among individuals about right.

To be sure, measurement error is endemic to almost all survey measures, particularly those that rely on self-report and recall. There is an extensive literature that documents exactly this, including reports of behavior with reference to television (Chang & Krosnick, 2003; Price & Zaller, 1993; Tourangeau, Rips, & Rasinski, 2000). Chang and Krosnick, for example, show that answers about media use in the "typical" week differ from answers pertaining to the "past" week, with the former having greater predictive validity. Intriguingly, they also find that the differences are largest among the most educated respondents. They theorize that for these respondents small changes in question wording affect their memory search, whereas less educated respondents are more likely to draw on the same information regardless.

The motivation for this paper is that too little is known about the measurement properties of the key variable of exposure used in estimates that combine CMAG data and self-reported television viewing habits. The objective differs in critical ways from Ridout et al.'s (2004) examination of CMAG-based measures. They compare the construct validity of estimates of exposure using CMAG data to other methods of estimation and conclude that CMAG measures have greater validity because they are more reliably tied to when and to what individuals were exposed. However, Ridout et al. do not examine measurement error, and they do not provide the rationale discussed here for using logged measures of exposure. In other words, this paper asks a more fundamental question about CMAG measures. Its purpose is not to argue that CMAG data should not be used, but to urge more care in choosing appropriate measures of ad exposure.

Data

The data I use to examine these issues come from two sources: an experiment and the 1998 ANES pilot survey, in which respondents were asked about their television viewing habits. The experiment was conducted at a southern university in two classes of undergraduates, in the spring of 2004 and the spring of 2005. There were 95 respondents in total, of whom 91 completed both stages of the experiment; the stated purpose was to gather material for discussion in a later class, and participants were assured of their anonymity.

In the first stage subjects were asked to keep a diary of the television they watched over a four-week period. The task was not onerous; participants simply noted the times they watched television each day. In the 2005 study, the diaries also included boxes for subjects to check each day if they watched network news or any edition of the local news. Subjects received credit for maintaining the diaries and were consistently reminded about them. Two weeks after they handed in their diaries, the same students were given an ostensibly unrelated survey that included questions about television viewing habits in the different formats used by the ANES. Participants were then debriefed about the true purpose of the two stages of research; none indicated that they had been aware of a connection between the two.

The object of the experiment was to examine the discrepancies between the amount of television subjects typically watched, according to their diaries, and the amount they claimed to watch when answering survey questions. The survey items also allowed me to examine responses to three different ANES measures of television viewing habits that have been combined with CMAG data to estimate ad exposure in the studies cited earlier (see Appendix for question wording). The three measures are:

The Daypart Method

The method divides the day into chunks or "dayparts" (Freedman & Goldstein, 1999) and asks, "Thinking about this past week, about how many hours did you personally watch television on a typical weekday morning, from 6am to 10am?", with parallel questions covering the other 20 hours of the day. Questions about weekend viewing are asked separately.

The Shows Method

The second method asks respondents how often they watch particular programs (e.g., from the ANES 2000 survey, "How many times in the last week have you watched Jeopardy?") or types of programs, such as daytime soap operas, and constructs a scale of the overall extent of television viewing (Freedman et al., 2000). Sometimes the frequency of watching specific shows is first aggregated into the frequency of watching shows in a particular genre, such as game shows, from which the overall extent of television viewing is then calculated (Goldstein & Freedman, 2002a). This method of calculating exposure from specific shows and types of shows is most similar to Ridout et al.'s (2004) "genre-based measure." To illustrate, if an individual watches Jeopardy and Wheel of Fortune almost every day and daytime talk shows regularly, but almost never watches morning news, evening news, or late evening news programs, she might be at .5 on a 0-to-1 scale of television viewing. If 1,000 ads aired in her market during the campaign, she would be estimated to have seen 500 of them. Another individual in the same market who watched news programs more often but never watched game shows or talk shows might be at .25 on the scale and therefore be estimated to have seen 250 ads.
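A minimal sketch of the shows method follows, under the assumption that show-frequency reports (days per week) are averaged with equal weights into a 0-to-1 scale; the function names and weighting are illustrative, not the exact ANES coding.

```python
# Shows method sketch: average reported frequencies (0-7 days per week) for
# the asked-about shows into a 0-to-1 viewing scale, then multiply by the
# total ads aired in the respondent's market. Equal weighting is an assumption.

def viewing_scale(days_per_week):
    return sum(d / 7 for d in days_per_week) / len(days_per_week)

def shows_exposure(days_per_week, total_ads_in_market):
    return viewing_scale(days_per_week) * total_ads_in_market

# The example from the text: a respondent at .5 on the scale in a market with
# 1,000 ads is estimated to have seen 500 of them.
print(shows_exposure([7, 7, 7, 7, 0, 0, 0, 0], 1000))  # 0.5 * 1000 = 500.0
```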

The Ads within Shows Method

Rather than building a scale of television watching and then multiplying it by the total number of ads aired in a market, the "ads within shows" method exploits the fact that candidates tend to concentrate their advertising during particular programs, such as news broadcasts. An avid watcher of news programs, during which 294,376 ads were aired in the 2000 election (Freedman et al., 2004), is likely to see a larger number of ads than a regular viewer of Judge Judy, during which 10,036 ads were aired. The "ads within shows" measure is based on the ads that were aired during particular programs and how often respondents claim to watch those shows. As with the shows method, these are a combination of specific programs, such as Judge Judy, and types of programs, such as daytime television talk shows. An individual who watched news programs seven days a week but never watched Judge Judy would be estimated to have seen 294,376 ads,4 whereas an individual who watched Judge Judy every day but never watched the news would be estimated to have seen 10,036 ads. According to Freedman et al. (2004), in 2000 roughly two-thirds of all ads were aired during the shows about which the ANES asked.5 They calculated likely exposure to the other third using the shows method (i.e., multiplying the total number of ads that were not aired during the specified shows by a measure of mean television viewing).6
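A sketch of the ads-within-shows logic just described; the data structure is an assumption, and footnote 4's division of pooled network news totals across the three networks is folded in via a per-show channel count.

```python
# Ads-within-shows sketch: weight the ads aired during each asked-about show
# by how often the respondent reports watching it; ads aired outside these
# shows (about one-third in 2000) are allocated with the shows-method scale.

def ads_within_shows(reports, other_ads, mean_viewing_scale):
    """reports: list of (days_per_week, ads_aired_during_show, n_channels)."""
    direct = sum((days / 7) * (ads / channels)
                 for days, ads, channels in reports)
    return direct + mean_viewing_scale * other_ads

# The example from the text: a daily network news watcher who never watches
# Judge Judy. News totals pool three networks, hence channels=3 (footnote 4),
# so the estimate is 294,376 / 3 rather than the full total.
print(ads_within_shows([(7, 294_376, 3), (0, 10_036, 1)], 0, 0.0))
```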

4 In fact, the calculation is slightly more complicated because the ads on news programs are totaled across the three networks. The estimate is therefore divided by three.
5 The shows were "Jeopardy"; "Wheel of Fortune"; "morning news programs such as 'Today,' 'Good Morning America,' or 'The Early Show'"; "daytime television talk shows such as 'Oprah Winfrey,' 'Rosie O'Donnell,' or 'Jerry Springer'"; "network news programs in the late afternoon or early evening such as 'World News Tonight' on ABC, 'NBC Nightly News,' 'The CBS Evening News,' or some other network news"; and "local TV news shows in the late afternoon or early evening, such as 'Eyewitness News' or 'Action News.'"
6 The ads within shows method is similar to Ridout et al.'s (2004) "five program measure."

I compare the diaries with the daypart and shows methods. The comparison of the diaries with the ads within shows method is less comprehensive because it is limited to how often subjects claimed to watch national and local news rather than all the shows the ANES asks about. Nevertheless, about 44 percent of ads are aired during news programs, making the accuracy of reports of news watching more consequential to ads within shows estimates than how accurately, for example, an individual recalls how often he or she watches Jeopardy. Discrepancies between the diary and survey measures of news watching thus have important implications for the ads within shows method.

The second data source is the 1998 ANES pilot study. This survey took place in three states: California, Georgia, and Illinois. All respondents were first asked how many hours of television they watch on a typical weekday morning, afternoon, and evening. Later, a random half of the sample was also asked how many hours of television they watched during five segments, or "dayparts," of the past week; the other random half of the sample was asked how often in the last week they had watched particular "shows" such as The Today Show and Wheel of Fortune. Comparing responses to these different question formats permits an analysis of discrepancies in reported television viewing that can then be judged against the patterns of discrepancies from the diary study; they turn out to be similar. In addition, the larger sample size of the ANES study, along with the greater variation in respondents' political knowledge, allows me to examine the factors associated with larger discrepancies in recall. Finding that political knowledge is a major influence, I am able to show how its moderating impact on the relationship between ad exposure and political behavior varies purely as an artifact of different methods of assessing exposure.

Before turning to the analysis, the assumption in the experiment that the diaries gauge actual exposure needs to be addressed (additional concerns about the validity of the diaries are discussed in the Appendix). The advantage of time-diaries over surveys is the greater internal validity of a more idiographic approach in which subjects record their behavior in real time. Responses to survey questions that ask about time spent on various activities are systematically affected by factors such as aspects of an individual's lifestyle and personal characteristics that affect recall. In a nutshell, time-diaries are "less dependent on respondents' calculation and augmentation of the time they spend on various activities" (Kan & Gershuny, 2006). As a result, time-diaries appear to provide reliable, valid, and generalizable measures of behavior (Robinson & Godbey, 1997, 77). In my study, I also used methods suggested by the time-diary literature to enhance accuracy, such as the consistent reminders subjects in the classes received. My claim is not that the diaries were perfectly accurate records of the television all subjects watched, nor that the television habits of students are representative of the entire population; it is simply that great effort was made to provide incentives to make the diaries as accurate as possible. Empirical examination and the testimony of some of the subjects themselves suggest the effort succeeded (see Appendix). In addition, the fact that the kinds of discrepancies in self-reported viewing habits between the diaries and survey data of the students are echoed in the ANES data for a mass sample of adults strengthens the external validity of the findings.

Analysis

The diary study indicated that subjects watched an average of 10.4 h of television a week. Figure 1 shows the total number of hours subjects watched each week, as recorded in the diaries. Respondents are ordered along the x-axis by the average number of hours of television watched over the four weeks, from fewest to most (i.e., Respondent 1 watched the fewest average hours and Respondent 91 the most); this average is the solid black line in Fig. 1. The amount of television they watched each week, represented by the other four lines, showed some variation around a central tendency: correlations between weeks range from .74 to .83, with slightly stronger correlations between adjacent weeks. Thus, from all appearances, the average over the four weeks was a valid measure of a typical week's viewing.

Fig. 1 Hours of television watched from diaries. Lines show hours watched per week in each of the four weeks and the overall average, by respondent (1–91). Data from: student diaries and surveys

Figure 2 compares the average number of hours watched according to the diaries with subjects' estimates of how much television they watch using the ANES daypart question format (the x-axis again represents respondents, in the same order as in Fig. 1).7 The fact that subjects had kept diaries for four weeks only two weeks prior to the survey should mean that, if anything, awareness of television viewing habits was heightened. Figure 2 makes it clear that subjects vastly overestimated how much television they watch each week: the average amount of television watched by these respondents according to the survey estimates was 27.9 h rather than 10.4. Because the daypart questions ask about the "past week," I cannot be certain that everyone in the sample did not watch much more television than usual, but it seems a remote possibility, and the timing of the study was chosen to avoid periods during which viewing habits were likely to change. Only three respondents estimated that they watched fewer hours of television than their diaries suggested. If one treats the average number of hours watched per week from the diaries as the "true score," the reliability of the daypart questions as measures of television viewing—the variance of the true score divided by the variance of the measure—is only .18.
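In classical test theory terms this calculation can be written out explicitly. Treating the four-week diary average as the true score $T$ and the daypart report as the observed score $X = T + e$ (an assumption: the diary is taken as error-free and the error as uncorrelated with $T$), the reliability just quoted is

$$\text{reliability} \;=\; \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)} \;=\; \frac{\operatorname{Var}(T)}{\operatorname{Var}(T) + \operatorname{Var}(e)} \;\approx\; .18,$$

implying that, under these assumptions, roughly four-fifths of the variance in the daypart reports is error variance.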

7 The daypart questions were phrased identically to the ANES 1998 pilot (see Appendix). In 2000 the ANES asked questions about specific programs; in 2002 and 2004 the ANES asked only about news programs. For the questions about programs, I used the phrasing of the 2000 ANES.

What explains these discrepancies? Chang and Krosnick (2003) argue that such overestimates of typical behavior are routine. It looks here as though, when asked the daypart questions, subjects think of the number of hours of television they are ever likely to watch at certain times rather than the hours they typically watch or are likely to have watched in the past week.

Fig. 2 Average hours of television watched according to diaries and daypart questions. Lines show the average from the diary and the daypart estimate for each respondent, in the same order as Fig. 1. Data from: student diaries and surveys

Alternatively, Allen (1965) showed that about 40 percent of the time during which the television was switched on either no one was watching (i.e., there was no audience) or the audience was inattentive. Perhaps the daypart questions better capture times during which the television is on than times when a respondent is actually watching. The discrepancies are also in keeping with Robinson and Godbey's (1997) finding that, relative to time-diary data, individuals are prone to overestimate the amount of time engaged in household work in a typical week. However, their comparison was with diaries of the past 24 hours' activity, whereas the comparison here is with behavior over a four-week period.

My experiment suggests that estimates of exposure to advertising based on daypart questions are likely to be much too high, and that this is not, as generally claimed, an "upper bound" resting on the assumption that individuals were viewing and paying attention to ads at the times they were watching television; it is, rather, an overestimate because the other part of the equation—the estimation of hours watched—is inflated.

Nevertheless, the argument has been that the relative frequency of viewing captured by survey questions such as those in the ANES is about right, so perhaps this does not matter. Indeed, the correlations between the diaries and the self-reports of television watching from daypart questions were .64 (Pearson's) and .62 (Spearman's). Table 1 presents Pearson's correlations for the daypart and the other two methods of estimating exposure discussed above. There were reasonably strong correlations between the diaries and the daypart questions—the daypart questions captured about half the variance in average television watching from the diaries—but they were not overwhelming. Even given typical levels of measurement error, if one accepts the accuracy of the diaries there was genuine error in the relative amounts of television watching elicited by the daypart questions.

The second method, based on total shows, does not look as valid. The correlation between this index and the average number of hours of television watched per week from the diaries was only .33. The correlation between typical television viewing according to the survey questions and the index of total shows was slightly higher at .38—there is a relationship—but if one imagines multiplying this index by CMAG data (e.g., Freedman et al., 2000) and using the product to estimate what are generally thought to be small influences of advertising, it is no wonder that there are conflicting findings in the literature. It could be argued that the low correlations arise because students are less likely to be viewers of Wheel of Fortune and Jeopardy, but then one could counter that they are more likely to be viewers of talk shows; indeed, the premise of these questions is that the balance of shows captures overall viewing habits at all ages. On the other hand, correlations between local and national news viewing in the diaries—so crucial to the ads within shows method—and the answers given in the surveys were at the higher level of the daypart questions, .69 and .61.
It should be remembered, though, that even if 44 percent of ads are aired during the news (Freedman et al., 2004), 56 percent are not, and one-third of the ads in 2000 were aired during programs the ANES did not ask about. Exposure to those ads is estimated from the total shows method. The diaries suggest that this will reduce the correlation between these estimates and true television watching, particularly for individuals who watch a lot of television other than the shows asked about.

Perhaps we can afford to be sanguine; after all, it could be argued, the measurement error in survey measures of exposure to advertising may be greater than acknowledged, but it is just random error in an independent variable. In the bivariate case, this implies that the estimated slope of the impact of exposure to advertising will be attenuated, particularly if the reliability is as low as .18, and we are never as concerned by error that makes our estimates more conservative. The multivariate case is more complicated, however: "In the most general case [however] all one knows is that estimated regression coefficients will be biased when measurement error is present. The direction and magnitude of the error is usually unpredictable" (Berry & Feldman, 1985, 30). In other words, in multivariate models of the type routinely used in the political advertising literature, any kind of measurement error may present a problem.
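For the bivariate case, this attenuation can be made precise with the classical errors-in-variables result (a textbook derivation, not an estimate from these data). If the true model is $y = \alpha + \beta T + u$ but $T$ is measured as $X = T + e$, with $e$ random and uncorrelated with $T$ and $u$, then

$$\operatorname{plim}\hat{\beta} \;=\; \beta \cdot \frac{\operatorname{Var}(T)}{\operatorname{Var}(T) + \operatorname{Var}(e)} \;=\; \beta \times \text{reliability},$$

so a reliability of .18 would shrink the expected slope estimate to less than one-fifth of its true value.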

Table 1 Correlations between average television watching in diaries and from ANES questions in student surveys (n = 91)

                         Overall diary average per week
Daypart method           .64
Total shows method       .33
Local news               .69
National news            .61

Data from: student diaries and surveys

Nonrandom error is particularly problematic, however, and the ANES pilot data will suggest that the error in self-reported television viewing habits may indeed be nonrandom. To recap, in the ANES 1998 pilot respondents were first asked how many hours of television they watched on a typical weekday morning and afternoon, on a typical weekday evening, and on a typical weekend morning and afternoon. Later in the same survey, half the sample was asked how many hours of television they watched during five weekday dayparts and between 6 am and 7 pm at the weekend during the past week. The other half of the sample was asked about specific shows.

The estimates of weekly television viewing that result from the ANES 1998 pilot study echo those from the diary study. First, the daypart questions yield higher estimates than the typical weekday and weekend questions; almost three-quarters of the sample claimed to watch more television when the questions were asked in daypart form. The 1998 ANES pilot, like the diary study, thus suggests that daypart questions lead to higher estimates of television viewing and therefore of exposure to advertising. Second, the correlation between the measures is nonetheless reasonably high at .67. And third, the estimate using the total shows method has a weaker correlation with the typical weekday and weekend questions, of .47.8

While this evidence reinforces the experimental results, the ANES pilot data provide an additional opportunity to examine the individual-level correlates of discrepancies in reported television watching. I created a dependent variable of the discrepancy, in hours, by subtracting the implied weekly hours of television viewing from the typical weekday and weekend questions from the implied weekly hours of television viewing under the daypart question format. Table 2 shows the results of regressing this discrepancy variable on key respondent characteristics from the advertising and voting behavior literature: strength of party identification (from 0, Independent, to 3, strong identifier), internal and external efficacy (1-to-5 scales where 5 represents the strongest sense of efficacy), political knowledge (a 0-to-4 scale based on factual questions), mobilization by a party or candidate (a 0-to-3 scale based on types of contact), sex (female = 1), and age. I also look in more detail at the properties of the daypart measure by including it as a control variable. It may be that the discrepancies between the daypart and typical weekday and weekend measures are constant (e.g., a respondent who estimates 10 h given the latter format says 20 with the former, a respondent who estimates 40 h given the latter format says 50 with the former, and so on), in which case the coefficient on the variable will not be statistically different from zero. It is also possible, however, that individuals who watch the least television according to the daypart measures have the largest discrepancies because they watch much less according to the typical day measures (the coefficient would be negative), or that the largest discrepancies are characteristic of those who watch the most television according to the daypart questions (the coefficient would be positive).9
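The construction of the discrepancy variable and the Table 2 regression can be sketched as follows. The variable names are hypothetical and the data frame is synthetic, standing in only for the recoded 1998 pilot data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 555  # respondents retained in Table 2

# Synthetic stand-in for the recoded ANES 1998 pilot; column names are assumed.
df = pd.DataFrame({
    "typical_weekly_hours": rng.gamma(2.0, 7.0, n),
    "knowledge": rng.integers(0, 5, n),       # 0-4 factual-question scale
    "mobilized": rng.integers(0, 4, n),       # 0-3 types of contact
    "pid_strength": rng.integers(0, 4, n),    # 0 Independent to 3 strong
    "internal_eff": rng.integers(1, 6, n),
    "external_eff": rng.integers(1, 6, n),
    "age": rng.integers(18, 90, n),
    "female": rng.integers(0, 2, n),
})
df["daypart_weekly_hours"] = df["typical_weekly_hours"] * 1.7 + rng.normal(0, 5, n)

# Discrepancy: implied weekly hours from the daypart items minus implied
# weekly hours from the typical weekday and weekend items.
df["discrepancy"] = df["daypart_weekly_hours"] - df["typical_weekly_hours"]

model = smf.ols(
    "discrepancy ~ daypart_weekly_hours + knowledge + mobilized"
    " + pid_strength + internal_eff + external_eff + age + female",
    data=df,
).fit()
print(model.summary())
```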

8 This echoes the diary study (i.e., the shows method has the lowest correlation), but I cannot calculate the correlation with the daypart questions because the daypart and shows questions were asked of different halves of the sample.

Table 2 Regression of discrepancies between ANES questions on individual-level characteristics

Variable                             Coefficient (standard error)
Total daypart estimate               .45 (.03)**
Political knowledge                  .96 (.47)*
Mobilized by a party/candidate       1.66 (.68)*
Strength of party identification     .47 (.61)
Internal efficacy                    .21 (.57)
External efficacy                    .39 (.44)
Age                                  −.05 (.04)
Sex                                  .83 (1.18)
Constant                             −10.91 (2.85)**
N                                    555
Adjusted R²                          .35

** p < .01, * p < .05, # p < .10 (two-tailed)
Data from: ANES 1998 pilot study

Table 2 illustrates that several individual characteristics are associated with greater sensitivity to question format, that is, with larger discrepancies in estimated television watching. In addition, the positive and statistically significant coefficient on the daypart estimate shows that the discrepancies with the "typical day" questions are not constant but grow larger as the daypart estimates grow larger. It is the relationships with political knowledge and mobilization by a party or candidate that are most interesting, however. Politically knowledgeable individuals and those subject to the most intense mobilization efforts, who we also know are likely to have the greatest resources of time and money and to be politically engaged, are the most sensitive to question format (i.e., the discrepancies in their answers tend to be greatest). This echoes Chang and Krosnick's (2003) finding for highly educated respondents, and the explanation may well be similar: differences in question wording prompt different memory searches for these individuals but do not for those who lack political knowledge or are disengaged.10

9 I excluded 12 respondents who, in answer to the typical weekday day, evening, or weekend questions, said they watched more than 10 h a day, because they were all coded as "11" in the ANES survey rather than by the exact number of hours. Because the hours they watch may exceed 11, the discrepancy with the daypart questions could be exaggerated. This is not a conventional case of censoring for which tobit estimation would be appropriate: the censoring affects a component of the dependent variable (the discrepancy), stopping us from knowing whether the two methods of self-report offer very similar answers for these 12 respondents, rather than censoring the dependent variable itself at its upper or lower levels.
10 Indeed, replacing political knowledge with level of education in Table 2 shows the same robust, positive relationship. With the inclusion of both political knowledge and education in the same model, however, the coefficients for each are reduced and political knowledge drifts to statistical insignificance; they share variance because educated individuals tend to be more politically informed. Each indicates that political sophistication is associated with sensitivity to question wording. In the remainder of the paper I continue to focus on political knowledge because it is the more common indicator of political sophistication in this literature (e.g., Freedman et al., 2004; Kahn & Kenney, 1999).

More importantly, however, these nonrandom discrepancies are greatest among precisely those individuals most interested in campaigns, most likely to vote, and so on. The daypart questions inflate estimates of television watching generally, but because the discrepancies with other measures are systematically more pronounced among these individuals, the relationships between, for example, political knowledge, exposure to negative advertising, and attitudes and behavior will be sensitive to the questions used to construct the exposure estimates, varying in sign and size (Berry & Feldman, 1985). As implied by this evidence, the literature using CMAG data has been inconsistent, suggesting both that the least politically knowledgeable are unaffected or confused by exposure to (negative) advertising (Stevens, 2005) and that they benefit the most from exposure to advertising (Freedman et al., 2004). In either case the normative implications are profound: either, all else equal, the modern campaign exacerbates cynicism and inequality in political participation, or campaign ads "represent the multivitamins of American politics" (Freedman et al., 2004, 725).

The 1998 ANES pilot data allow a more explicit analysis of how question formats may contribute to the confusion. I estimated identical models of the relationship between exposure to negative advertising, political knowledge, and four dependent variables typical of the literature: engagement with the campaign (frequency of discussing politics during the past week and awareness of the issues the candidates for governor had been discussing during the campaign), views of government (external efficacy), and the probability of voting (see Appendix for details).11 One set of models operationalized total exposure to negative advertising using the daypart method, another calculated total exposure from the "typical day" questions, and a third constructed total exposure using the shows method. In addition, I controlled for standard variables in the literature (e.g., Freedman et al., 2004): the total amount of negative advertising in the respondent's market—a measure of how intense the campaign was in that locale12—dummy variables for two of the three states (Georgia and Illinois), strength of party identification, the extent of mobilization from the parties, education, race, age, and income. The key variables of interest are exposure to negative advertising and the interaction between exposure to negative advertising and political knowledge. Table 3 presents the results using the daypart and typical day questions side by side13 (estimates using the shows method are available on request).
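In equation form, the specification just described is, for each dependent variable $y_i$ and control vector $\mathbf{X}_i$,

$$y_i = \beta_0 + \beta_1\,\text{Exposure}_i + \beta_2\,\text{Knowledge}_i + \beta_3\,(\text{Exposure}_i \times \text{Knowledge}_i) + \boldsymbol{\gamma}'\mathbf{X}_i + \varepsilon_i,$$

so that the marginal effect of exposure at a given level of knowledge is $\beta_1 + \beta_3\,\text{Knowledge}_i$. A positive $\beta_1$ combined with a negative $\beta_3$ produces the pattern discussed below, in which effects are concentrated among the least knowledgeable.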

11 The CMAG data for 1998 do not include information about gubernatorial advertising. However, Stevens (2005) argues that because both the gubernatorial and Senate elections in California, Georgia, and Illinois shared similar characteristics, such as competitiveness, and because candidates tend to air ads at the same time, it is a reasonable assumption that exposure to advertising in the gubernatorial races was highly correlated with exposure to advertising in the Senate races. In Tables 3 and 4 I include one dependent variable that is specific to the gubernatorial races in these states, the number of issues that respondents recognize the candidates have talked about: if exposure to negative advertising increases awareness of issues, and individuals who saw a lot of Senate ads also saw a lot of gubernatorial ads, we would expect exposure to negative advertising to have a positive relationship with recognition of issues.
12 Total negative advertising in a television market is arguably a better measure of campaign intensity than total advertising because we tend to see more advertising, and more negative advertising, in competitive races. I also estimated all the models in Tables 3 and 4 with total advertising as a proxy for campaign intensity. It made no difference to the results.

# 
Table 3 The impact of exposure to negative advertising and political knowledge using different methods of estimating exposure

                                         # days in past week           # issues recognize that
                                         talked about politics         candidates have talked about
Independent variable                     Daypart         Typical day   Daypart         Typical day
Political knowledge                      .359 (.137)*    .354 (.125)** .270 (.178)#    .106 (.162)
Exposure to negative advertising         .0003 (.0014)                 .0043 (.0019)*
  (daypart method)
Exposure (daypart) × Political           −.0011 (.0005)*               −.0006 (.0006)
  knowledge
Exposure to negative advertising                         .0004 (.0014)                 .0014 (.0016)
  (typical day)
Exposure (typical day) × Political                       −.0010 (.0004)**              .0003 (.0005)
  knowledge
Mobilized by parties                     .659 (.146)**   .655 (.147)** .708 (.193)**   .713 (.194)**
Strength of party identification         .229 (.128)#    .230 (.128)#  .240 (.161)#    .253 (.162)#
Total negative spots in market           .0004 (.0003)   .0004 (.0003) −.0004 (.0005)  −.0002 (.0005)
Georgia                                  −.246 (.324)    −.253 (.324)  −.389 (.417)    −.413 (.420)
Illinois                                 −.498 (.373)    −.489 (.375)  −.842 (.477)#   −.849 (.491)#
Education                                .120 (.245)     .119 (.245)   −.266 (.303)    .285 (.305)
African-American                         −.045 (.391)    −.046 (.391)  −.320 (.502)    −.273 (.505)
Income                                   −.013 (.149)    −.008 (.150)  .090 (.188)     .096 (.191)
Age                                      .024 (.009)**   .024 (.009)** .003 (.011)     .003 (.011)
Constant                                 .237 (.668)     .251 (.642)   4.162 (.789)**  4.573 (.772)**
N                                        320             320           377             377
Adjusted R²                              .15             .15           .07             .06

Table 3 continued

                                         External efficacy             Intention to vote
Independent variable                     Daypart         Typical day   Daypart         Typical day
Political knowledge                      .171 (.088)#    .092 (.079)   .200 (.055)**   .140 (.050)**
Exposure to negative advertising         .0015 (.0009)#                .0011 (.0006)#
  (daypart method)
Exposure (daypart) × Political           −.0005 (.0003)#               −.0003 (.0002)#
  knowledge
Exposure to negative advertising                         −.0004 (.0008)                −.0004 (.0005)
  (typical day)
Exposure (typical day) × Political                       −.0000 (.0002)                .0001 (.0001)
  knowledge
Mobilized by parties                     .073 (.095)     .081 (.095)   .212 (.059)**   .218 (.060)**
Strength of party identification         .266 (.079)**   .266 (.079)** .234 (.050)**   .232 (.051)**
Total negative spots in market           .0001 (.0002)   .0003 (.0002)# −.0002 (.0001)# −.0001 (.0001)
Georgia                                  −.134 (.205)    −.133 (.206)  .194 (.129)#    .193 (.129)#
Illinois                                 .004 (.234)     −.018 (.236)  .171 (.147)     .149 (.148)
Education                                .302 (.149)*    .291 (.150)#  −.052 (.094)    −.061 (.094)
African-American                         −.241 (.248)    −.219 (.249)  .233 (.158)#    .247 (.159)#
Income                                   .151 (.093)#    .139 (.095)#  .064 (.058)     .052 (.049)
Age                                      −.007 (.005)    −.007 (.005)  .007 (.003)*    .007 (.003)*
Constant                                 1.368 (.391)**  1.592 (.381)** .946 (.247)**  1.121 (.241)**
N                                        373             373           372             372
Adjusted R²                              .08             .07           .18             .18

** p < .01, * p < .05, # p < .15 (two-tailed test)
Data from: ANES 1998 pilot study

Focusing first on the relationships from estimates based on the daypart method, we see some influence of exposure to negative advertising in all four models, and interaction coefficients between exposure to negative advertising and political knowledge that are statistically significant, or close to it at conventional levels, in three of the four models. In each of these models the sign on the main effect is positive while the interaction term is negative. The implication, echoing recent CMAG-based findings (Freedman et al., 2004), is that it is the least politically sophisticated who derive the greatest benefit from exposure to negative advertising. The daypart estimates in Table 3 suggest that, as a result of exposure to negative advertising, relative to political sophisticates the least politically sophisticated become more likely to talk about politics, gain an enhanced sense of governmental responsiveness to citizens, and become more certain that they will vote.

The estimates using the typical day measures of television viewing in Table 3 are, however, quite different in implication. While the result is the same for the relationship between exposure to negative advertising and discussion of politics, the other relationships are overwhelmingly insignificant, suggesting neither an influence of exposure to negative advertising nor any moderating impact of political knowledge. In addition, estimates using the shows method indicate no influence of exposure to negative advertising on any of the dependent variables. It therefore appears that the relationships, and the conclusions one would draw about the impact of advertising and about the moderating influence of political knowledge on the relationship between exposure to negative advertising and campaign learning, attitudes toward government, and voting behavior, are highly sensitive to question wording.

Perhaps it is not a startling claim that different operationalizations of independent variables produce different results. However, other literatures are more settled both theoretically and empirically. There is relatively little controversy about what party identification or trust in government is, or how to measure them, nor about key variables such as vote choice or turnout in the area of voting behavior. The field of political advertising is not so fortunate; there is no settled approach to the operationalization of exposure in survey research.

So how should survey researchers deal with the measurement problems I have outlined? As always, one should begin with theory. Fortunately, the theoretically most defensible specification of exposure also alleviates some of these problems of sensitivity to question wording. In Table 3 I adopted the approach of some research in the field (e.g., Goldstein & Freedman, 2002a) by specifying a relationship in which the marginal effects of exposure to advertising are constant; the impact of exposure to the first ad is assumed to be the same as the impact of exposure to the one hundred and first. This seems unrealistic, however. Much of the qualitative evidence about negative advertising indicates a growing weariness and weakened impact with increased exposure (e.g., Patterson & McClure, 1976, 150).

13 The relatively small sample sizes in Table 3, for an ANES survey, arise because, first, the daypart questions were asked of a half sample and, second, CMAG data cover only the top 75 television markets, containing about three-quarters of the U.S. population, meaning there is no information about advertising where many of the respondents lived (which is why there are roughly one-third fewer respondents in Table 3 than in Table 2).

More importantly from a theoretical perspective, a wealth of psychological research on message repetition (Cacioppo & Petty, 1989) and primacy effects (Holbrook, Krosnick, Visser, Gardner, & Cacioppo, 2001) indicates that the impact of communications is nonlinear; it tends to decline over time.14

There are two principal ways of operationalizing nonlinear relationships. The first is to add a quadratic term, which allows not only for a decline in the marginal effects of exposure but also for their possible reversal. However, one would not expect a reversal of marginal effects across the entire range of commonly analyzed dependent variables. With levels of information, for example, declining marginal effects seem likely, but a reversal of marginal effects—the notion that individuals start to lose information at higher levels of exposure—does not. Moreover, the empirical evidence for such a reversal is weak; it appears only at levels of negativity higher than those actually observed (Lau & Pomper, 2001). The more theoretically defensible operationalization of exposure is the second: taking the log of the estimate, which accounts for diminishing marginal effects of exposure. Some research has done this (Freedman et al., 2004; Freedman & Goldstein, 1999; Ridout et al., 2004), but what is new in this paper is the argument that this operationalization may also have a payoff in measurement terms. The reason is that, first, taking the log of estimated exposure compresses much of the variation that is an artifact of question wording and, second, overestimates of exposure that are a consequence of overestimates of television viewing, in the daypart format in particular, are rendered less consequential. Taking the log of the daypart and typical day estimates increases the correlation between the two measures to .96; to all intents and purposes the variation is the same. The correlation between logged estimates using the typical day and shows questions is .90, not as strong as Ridout et al. (2004) find but stronger than the correlations between the "raw" estimates of exposure.15

Table 4 shows the model results using the daypart, typical day, and shows-based operationalizations. Two aspects are noteworthy: there is greater consistency in the estimates across the different methods, with one important exception, and the results of the models using the more realistic logged measures of exposure are somewhat different from those assuming a linear impact of exposure. They suggest a more limited influence of exposure, with no impact on perceptions of external efficacy or likelihood of voting, and few differences that are a result of political knowledge; those differences are confined to the frequency of discussing the campaign. But there is still some inconsistency. The signs on the coefficients for the daypart and typical day questions indicate that exposure to negative advertising stimulates discussion of the campaign (the first column of results in Table 4), while the negative interaction with political knowledge implies that the effects are strongest on those with the least political knowledge.
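A toy illustration of why logging compresses question-format artifacts follows. The data are simulated (the multiplicative inflation factor and noise levels are assumptions), not the pilot data, so the exact correlations will differ from those reported above.

```python
import numpy as np

rng = np.random.default_rng(0)
true_hours = rng.gamma(2.0, 5.0, 1_000)                  # latent weekly viewing
daypart_est = true_hours * 2.7 * rng.lognormal(0, 0.4, 1_000)   # inflated daypart reports
typical_est = true_hours * rng.lognormal(0, 0.4, 1_000)         # "typical day" reports

raw_r = np.corrcoef(daypart_est, typical_est)[0, 1]
log_r = np.corrcoef(np.log(daypart_est + 1), np.log(typical_est + 1))[0, 1]
print(f"raw r = {raw_r:.2f}, logged r = {log_r:.2f}")    # logged r is typically higher
```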

14 On-line models of attitude formation and updating also imply that the capacity of new information to alter impressions diminishes.
15 Using the log of their estimates is likely the reason why Ridout et al. (2004) find high correlations between their three estimates of exposure using CMAG data. It is not, as they imply, because the daypart and shows methods provide essentially the same information about television viewing habits, but because the correlations are between logged estimates of exposure, meaning the variation due to discrepancies has been reduced.

Table 4 The impact of exposure to negative advertising and political knowledge using logged measures of exposure

                                         # days in past week talked about politics      # issues recognize that candidates have talked about
Independent variable                     Daypart         Typical day    Shows           Daypart         Typical day    Shows
Political knowledge                      .526 (.196)**   .499 (.193)**  .112 (.150)     .268 (.247)     .148 (.243)    .263 (.202)
Logged exposure (daypart method)         .206 (.139)#                                   .314 (.170)#
Logged exposure (daypart) ×              −.089 (.041)*                                  −.027 (.051)
  Political knowledge
Logged exposure (typical day)                            .246 (.146)#                                   .223 (.179)
Logged exposure (typical day) ×                          −.089 (.043)*                                  .006 (.054)
  Political knowledge
Logged exposure (shows)                                                 −.094 (.115)                                   .284 (.147)#
Logged exposure (shows) ×                                               .064 (.035)#                                   −.034 (.046)
  Political knowledge
Mobilized by parties                     .640 (.147)**   .635 (.148)**  .388 (.145)**   .722 (.193)**   .717 (.193)**  1.149 (.194)**
Strength of party identification         .226 (.130)#    .239 (.129)#   .116 (.120)     .245 (.162)#    .248 (.161)#   .222 (.165)
Total negative spots in market           −.0000 (.0004)  −.0001 (.0004) −.0004 (.0003)  −.0004 (.0005)  −.0004 (.0005) −.0004 (.0004)
Georgia                                  −.183 (.332)    −.205 (.335)   .433 (.341)     −.583 (.427)    −.602 (.430)   −1.013 (.466)*
Illinois                                 −.468 (.379)    −.485 (.377)   .191 (.383)     −1.010 (.485)*  −.969 (.482)*  −1.516 (.519)**
Education                                .203 (.245)     .206 (.245)    −.278 (.213)    −.290 (.302)    −.299 (.302)   .430 (.272)#
African-American                         −.204 (.389)    −.215 (.389)   −.340 (.333)    −.219 (.497)    −.206 (.497)   .532 (.427)
Income                                   −.024 (.151)    −.009 (.152)   −.018 (.138)    .090 (.188)     .103 (.189)    −.489 (.190)*
Age                                      .024 (.009)**   .023 (.009)**  .012 (.008)     .004 (.011)     .004 (.011)    .007 (.011)
Constant                                 −.270 (.787)    −.291 (.780)   1.985 (.637)**  3.787 (.920)**  4.173 (.918)** 3.133 (.804)**
N                                        320             320            340             377             377            416
Adjusted R²                              .14             .13            .08             .07             .07            .14

Table 4 continued

                                         External efficacy                              Intention to vote
Independent variable                     Daypart         Typical day    Shows           Daypart         Typical day    Shows
Political knowledge                      .036 (.123)     .038 (.120)    .131 (.097)     .184 (.077)*    .150 (.075)*   .128 (.060)*
Logged exposure (daypart method)         −.055 (.083)                                   .038 (.053)
Logged exposure (daypart) ×              .012 (.025)                                    −.009 (.016)
  Political knowledge
Logged exposure (typical day)                            −.106 (.087)                                   −.016 (.056)
Logged exposure (typical day) ×                          −.013 (.027)                                   −.000 (.017)
  Political knowledge
Logged exposure (shows)                                                 −.021 (.071)                                   −.013 (.044)
Logged exposure (shows) ×                                               −.008 (.022)                                   −.006 (.014)
  Political knowledge
Mobilized by parties                     .076 (.095)     .081 (.095)    .187 (.093)*    .215 (.060)**   .217 (.060)**  .313 (.058)**
Strength of party identification         .271 (.079)**   .268 (.079)**  .111 (.080)     .234 (.051)**   .233 (.051)**  .159 (.049)**
Total negative spots in market           .0003 (.0002)   .0004 (.0002)# −.0000 (.0002)  −.0002 (.0001)  −.0001 (.0001) −.0002 (.0001)#
Georgia                                  −.122 (.210)    −.080 (.211)   −.321 (.224)    .180 (.132)     .204 (.133)#   .133 (.140)
Illinois                                 −.006 (.238)    .008 (.236)    −.115 (.249)    .156 (.150)     .168 (.149)    .172 (.156)
Education                                .299 (.149)*    .293 (.148)*   −.010 (.131)    −.056 (.094)    −.062 (.094)   .134 (.082)#
African-American                         −.238 (.246)    −.233 (.245)   −.103 (.205)    .243 (.157)#    .248 (.157)#   .143 (.127)
Income                                   .145 (.094)#    .134 (.094)    −.000 (.092)    .061 (.059)     .057 (.059)    .139 (.057)*
Age                                      −.007 (.005)    −.007 (.005)   −.007 (.005)    .007 (.003)*    .007 (.003)*   .009 (.003)**
Constant                                 1.757 (.457)**  1.830 (.455)** 2.848 (.388)**  .965 (.289)**   1.111 (.288)** .934 (.244)**
N                                        373             373            415             372             372            412
Adjusted R²                              .07             .08            .02             .18             .17            .21

** p < .01, * p < .05, # p < .15 (two-tailed test)
Data from: ANES 1998 pilot study

Simulations based on these estimates (in which all control variables were set at their mean or mode, while knowledge was allowed to vary from its lowest to its highest value and exposure to negative advertising from one standard deviation below to one standard deviation above its mean) suggest that more exposure to negative advertising increases the frequency of discussing the campaign from about two days a week to three days a week among those lowest in political knowledge. The highly politically knowledgeable, meanwhile, are unaffected and continue to discuss the campaign roughly three days a week regardless of exposure to negative advertising.16 In other words, the implication would be that exposure to negative advertising benefits those who know the least about politics, making them more like those who know the most by prompting them to discuss the campaign more frequently. However, the shows-based estimates of exposure imply that negative advertising hinders discussion of the campaign, especially among the least politically knowledgeable, which is not only the reverse relationship but one with entirely different normative implications. Instead of exposure to negative advertising reducing the differences in frequency of discussion, similar simulations suggest that it exacerbates them. According to simulations from this model, at high levels of exposure those lowest in political knowledge discuss the campaign an average of one and a half days a week, compared to slightly over three days for the most politically knowledgeable (i.e., low sophisticates behave less and less like high sophisticates as they are exposed to more negative advertising).
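The simulation procedure just described can be sketched as follows, using the daypart coefficients for discussion of politics from Table 4; the intercept-plus-controls baseline and the moments of logged exposure are hypothetical placeholders, not values from the pilot data.

```python
# Predicted days of campaign discussion, varying knowledge (0-4) and logged
# exposure one standard deviation either side of its mean, with all control
# terms collapsed into a single hypothetical baseline constant.
b_know, b_exp, b_inter = 0.526, 0.206, -0.089    # daypart column, Table 4
baseline = 1.0                                    # assumed sum of control terms
exp_mean, exp_sd = 4.0, 1.5                       # assumed moments of log exposure

for know in (0, 4):
    for log_exp in (exp_mean - exp_sd, exp_mean + exp_sd):
        yhat = baseline + b_know * know + b_exp * log_exp + b_inter * know * log_exp
        print(f"knowledge={know}, log(exposure)={log_exp:.1f}: {yhat:.2f} days")
```

With a positive main effect and a negative interaction, predicted discussion rises with exposure for those at the bottom of the knowledge scale while remaining roughly flat for those at the top, which is the pattern the text reports for the daypart and typical day estimates.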

Discussion and Conclusion

The impact of exposure to political advertising has aroused great interest in academia and beyond; that interest has only increased as advertising campaigns have grown more negative (Geer, 2006). Researchers using survey estimates of ad exposure that draw on CMAG data have presented a rosy image of advertising effects. Exposure to advertising, they argue, especially negative advertising, informs, stimulates, and ultimately enhances political participation. They suggest that less politically sophisticated voters may even benefit the most from exposure, gaining information, growing more interested in the campaign, and voting in larger numbers. My findings indicate that we should be far less sanguine about advertising effects, because the measures of ad exposure on which these conclusions are based contain error that is both large and nonrandom.

I have demonstrated that while CMAG data offer a remarkably comprehensive picture of the ads that were aired in major television markets in the United States, the estimates of individual ad exposure derived from these data depend on self-reports of television viewing that are riddled with measurement problems.

16 The conditional effects of exposure for high sophisticates, the combination of main effect and interaction, are statistically insignificant.

An experiment in which students kept diaries of the television programming they watched and later answered standard ANES survey questions about their television viewing habits revealed not only a pervasive tendency of survey responses to overestimate the time spent watching television but also large discrepancies in estimates of television watching across different survey methods. I then showed a similar pattern of differences in a large random sample of adults, among estimates of television viewing of the kind commonly used in survey research incorporating CMAG data (the 1998 ANES pilot survey, which asked about television viewing habits with three different question wordings). These discrepancies are not random; indicators of political sophistication, such as political knowledge, are systematically associated with larger discrepancies. As a result, estimates of the relationship between exposure to ads, political sophistication, and political behavior are unstable and hinge on the questions used to gauge television viewing habits.

I have also offered two potential, partial solutions. First, by taking the log of estimated exposure (which has the twin advantages of accounting for the decreasing marginal effects of additional exposure to advertising and, by "compressing" estimated exposure, reducing the inflated estimates of exposure that appear endemic to these questions) we can diminish those differences in results that are primarily artifacts of question wording. This approach will not eliminate the problem, however. Even using logged estimates, I have shown that researchers can draw sharply contrasting normative inferences about how exposure to negative advertising influences the propensity to talk about politics. My analysis could be used to support one picture, in which exposure to negative ads makes low political sophisticates behave like high political sophisticates, but also the opposite view, in which exposure to negative ads exacerbates the differences between low and high political sophisticates. In either case, the interpretation is purely an artifact of the questions used to gauge television viewing habits.

A second potential solution uses multiple measures to gauge ad exposure. Bartels (1996, 2), who is often cited in support of the "shows" method of gauging viewing habits, is similarly circumspect about relying on a single set of measures; he suggests that we should weigh the net benefit of investing "entirely in specific exposure items" against the advantages of using "some combination of specific exposure items, general exposure items, and quiz items."

The implications for research on the impact of advertising are profound. The mixture of findings in experiments and surveys may be the result of much more than basic differences in research design: survey estimates of the effects of ad exposure are themselves highly unstable. Any attempt to estimate exposure to television should be wary of individual sensitivity to even the most subtle changes in question wording, which can have vast effects on inferences. It is no wonder that carefully conducted studies offer the conflicting interpretations that negative advertising is a boon or a burden to American democracy. Perhaps a combination of approaches, in which we return to multiple measures of ad exposure in order to be more certain of the stability of relationships, while also evaluating effects based on a single common operationalization of exposure, such as logged estimates, will point the way forward.
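To illustrate the "compression" argument, here is a minimal sketch (the exposure counts are hypothetical) of how a log transform shrinks the gap between an accurate self-report and an inflated one:

```python
import numpy as np

# Two hypothetical respondents with similar true exposure, where the second
# respondent's question format inflates the self-report fourfold.
raw = np.array([40.0, 160.0])      # estimated ads seen (hypothetical counts)

logged = np.log1p(raw)             # log(1 + x): zero stays zero, the top is compressed

print(raw[1] / raw[0])             # 4.00 -> fourfold gap on the raw scale
print(logged[1] / logged[0])       # ~1.37 -> the gap largely collapses once logged
```

The same compression is what builds in decreasing marginal effects: moving from 0 to 40 ads changes the logged measure far more than moving from 120 to 160 does.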

Acknowledgements Thanks to Barbara Allen, Andrew Seligsohn, and the editors for helpful comments and suggestions.


Appendix

Coding of Variables

Daypart Questions. Question Wording: Thinking about this past week, about how many hours did you personally watch television on a typical weekday morning/afternoon, from [6:00 to 10:00 AM/10:00 AM to 4:00 PM/4:00 PM to 8:00 PM/8:00 PM to 11:00 PM/11:00 PM to 1:00 AM]? Thinking about this past weekend, about how many hours did you personally watch television from 6:00 AM to 7:00 PM? Coding: The total number of weekday hours (multiplied by 5) was combined with the total number of weekend hours to estimate the total number of hours of TV watched per week.

Typical Week Questions (from ANES 1998 Pilot). Question Wording: On a typical weekday, about how many hours of television do you watch during the morning and afternoon? About how many hours of television do you watch on a typical weekday evening? On a typical weekend day, about how many hours of television do you watch during the morning and afternoon? Coding: The total number of weekday hours (multiplied by 5) was combined with the total number of weekend day hours (multiplied by 2).

Show Questions (ANES 1998 Pilot). Question Wording: How many days/times in the past week have you watched [The Today Show/The Rosie O'Donnell Show/daytime soap operas like General Hospital or Days of Our Lives/Jeopardy or Wheel of Fortune/a sports event/local news]? Coding: The sum of all six genres (each genre was rescaled from zero to one) divided by six.

Show Questions (Experiment). Question Wording: How many times in a typical week do you watch [Jeopardy/Wheel of Fortune/morning news programs such as Today, Good Morning America, or The Early Show/daytime television shows such as Oprah Winfrey or Jerry Springer/national network news/local TV news shows, either in the late afternoon or early evening]?

Efficacy. Question Wording: Please tell me how much you agree or disagree with these statements ... agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, disagree strongly, don't know? Public officials don't care what people like me think; Sometimes politics seems so complicated that a person like me can't really understand what's going on; People like me don't have any say about what the government does. Coding: The average response on the 1 to 5 scale.

Number of Days in the Past Week Talked About Politics. Question Wording: How many days in the past week did you talk about politics with family or friends?

Number of Issues Recognized that Candidates Have Talked About. Question Wording: For each issue we would like to know if you think either one of the candidates, both, or neither is talking about these issues (private school vouchers, abortion, gun-related crimes, campaign contributions from PACs, protecting the quality of the air and water, improving discipline in schools). Coding: The total number of issues each candidate is talking about.

Intention to Vote. Question Wording: (Half sample 1) So far as you know, do you expect to vote in the elections this coming November? Would you say that you are definitely going to vote, probably going to vote, or are you just leaning towards voting? (Half sample 2) Please rate the probability you will vote in the elections this coming November (on a 0 to 100 scale). Coding (Half sample 1): Not going to vote = 0, leaning = 1, probably = 2, definitely = 3. Coding (Half sample 2): 0–19 = 0, 20–50 = 1, 51–80 = 2, 81–100 = 3.

Contacted by a Party/Candidate.
Question Wording: Thus far in the campaign, have you received any mail from a candidate or political party about the election? How about door-to-door campaigning? Thus far in the campaign, have any candidates or party workers made any phone calls to you about the election? Coding: 1 for each contact, for a range of 0 to 3 (mean = .8).

Party Identification. Question Wording: Generally speaking, do you consider yourself to be a Republican, a Democrat, an Independent, or what? [If Republican or Democrat] Would you call yourself a strong [Republican or Democrat] or a not very strong [Republican or Democrat]? [If Independent] Do you think of yourself as closer to the Republican or Democratic party? Coding: Strong identifiers with either party were coded as 3, those saying they considered themselves a not very strong Republican or Democrat as 2, those claiming to be Independent but closer to one of the parties as 1, and those Independent and closer to neither party, or Other, as 0.

Political Knowledge. Question Wording: Who has the final responsibility to decide if a law is constitutional or not ... is it the President, Congress, or the Supreme Court? Whose responsibility is it to nominate judges to the Federal Courts ... the President, Congress, or the Supreme Court? Do you happen to know which party has the most members in the House of Representatives in Washington? Do you happen to know which party has the most members in the U.S. Senate? Coding: Each correct answer was coded 1, and answers to the four questions were combined to create a 0–4 scale.

Education. Question Wording: What is the highest grade of school or year of college you have completed? Did you get a high school diploma or pass a high school equivalency test (GED)? What is the highest degree that you have earned? Coding: 0 for 12 years or less and no high school diploma, 1 for 12 years or less with high school diploma or GED, 2 for 13 or more years.
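As an illustration of how such coding rules translate into practice, here is a minimal sketch implementing two of them; the function names and input format are my own choices, not part of the original instrument:

```python
def typical_week_hours(weekday_daytime, weekday_evening, weekend_daytime):
    """Typical Week coding: weekday hours x 5, plus weekend-day hours x 2."""
    return (weekday_daytime + weekday_evening) * 5 + weekend_daytime * 2

def vote_intention_half_sample_2(probability):
    """Half sample 2 coding: a 0-100 probability binned onto the 0-3 scale."""
    if probability <= 19:
        return 0
    if probability <= 50:
        return 1
    if probability <= 80:
        return 2
    return 3

# 2 daytime + 3 evening hours per weekday, 4 hours per weekend day -> 33 hours/week
print(typical_week_hours(2, 3, 4))        # 33
print(vote_intention_half_sample_2(75))   # 2
```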

The Validity of the Diary Study

A student sample

A frequent objection to student samples is that college students are not ''real'' people. Indeed, Chang and Krosnick's (2003) research suggests that, as relatively educated individuals, students might be more sensitive to question wording about television viewing habits. However, there is no reason to believe that the pattern of recall differences across question formats should differ between student and adult samples. Moreover, sampling educated students who had been keeping diaries for four weeks, and who were therefore atypically attentive to their viewing habits, should, if anything, lessen the discrepancies between the diaries and surveys.


Student subjects may alter their television viewing habits to impress an instructor, or simply lie about them, claiming to watch less television or more serious programs

The initial instructions students were given strove to limit false reporting by stressing that they should not change their habits, that they would only be noting the times they watched television, not the programs they watched (with the exception of news in the second study), and that the instructor would form no judgments on the basis of how much or when they watched television. Empirically, the results do not suggest social desirability biases in student diary entries. According to Student Monitor, for example, college students watch an average of 11 h of television a week.17 The average amount of television subjects watched per week over the four weeks, according to their diaries, was 10.4 h, with a range of 9.6 h in Week 3 to 11.0 h in Week 4. According to their diaries, subjects watched national and local news an average of .8 times each per week (i.e., less than once a week), which would not impress many instructors. Finally, I asked members of the Spring 2005 class, after they had received their course credit for maintaining the diaries, to let me know whether or not they had kept the diaries accurately.18 Roughly 50 percent of the class responded and, without exception, said that their entries had been accurate; some subjects even went to some length to describe the methods by which they had ensured accuracy. I compared the discrepancies between diaries and surveys for this subsample of avowedly accurate diary keepers with those for the rest of the class. One might expect this subsample to show smaller discrepancies, but there was no statistically significant difference in the size of the discrepancies; if anything, they were larger for the subjects who testified to the accuracy of their diaries.
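The comparison just described amounts to a simple two-sample test of mean discrepancies. A sketch, assuming absolute diary-versus-survey discrepancies have already been computed per subject (the arrays below are hypothetical, not the study's data):

```python
import numpy as np
from scipy import stats

# Hypothetical absolute diary-versus-survey discrepancies (hours per week).
attested = np.array([2.1, 3.4, 1.8, 4.0, 2.9])   # subjects who vouched for accuracy
others = np.array([1.9, 2.5, 3.1, 2.2, 2.8])     # the rest of the class

# Welch's two-sample t-test (no equal-variance assumption) for a difference
# in mean discrepancy between the two groups.
t_stat, p_value = stats.ttest_ind(attested, others, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```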

Over a four-week period subjects may have grown increasingly weary of keeping the diary, implying growing rather than constant inaccuracy

Again, the consistent reminders subjects received were intended to guard against this, but it is a possibility that can also be tested empirically. If students became increasingly inaccurate in their diary entries, the correlation between the typical viewing habits they reported in the surveys and the earlier weeks of the diaries should be stronger than for the later weeks. However, the correlations were very consistent: .57, .59, .57, and .60 in weeks 1 through 4, respectively.
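This fatigue check is straightforward to express in code. A sketch with simulated stand-in data (nothing here reproduces the actual diary data; variable names are my own):

```python
import numpy as np

# survey_hours: each subject's survey estimate of typical weekly viewing.
# diary_hours: per-subject diary totals, one column per week (weeks 1-4).
rng = np.random.default_rng(0)
survey_hours = rng.normal(14, 4, size=60)
diary_hours = survey_hours[:, None] * 0.7 + rng.normal(0, 3, size=(60, 4))

# Growing diary fatigue would show up as correlations that weaken from week 1
# to week 4; a flat profile argues against it.
for week in range(4):
    r = np.corrcoef(survey_hours, diary_hours[:, week])[0, 1]
    print(f"week {week + 1}: r = {r:.2f}")
```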

References

Allen, C. (1965). Photographing the TV audience. Journal of Advertising Research, 5, 2–8.
Ansolabehere, S., Iyengar, S., & Simon, A. (1999). Replicating experiments using aggregate and survey data: The case of negative advertising and turnout. American Political Science Review, 93, 901–909.

17 See www.studentmonitor.com
18 There would not have been concerns about future classes with me because I was in the throes of leaving the university.

Bartels, L. (1996). Entertainment television items on 1995 pilot study. Report to the National Election Studies Board of Overseers.
Berry, W., & Feldman, S. (1985). Multiple regression in practice. Newbury Park: Sage.
Brooks, D. (2006). The resilient voter: Moving toward closure in the debate over negative campaigning and turnout. Journal of Politics, 68, 684–697.
Cacioppo, J., & Petty, R. (1989). Effects of message repetition and position on argument processing, recall, and persuasion. Journal of Personality and Social Psychology, 107, 3–12.
Chang, L., & Krosnick, J. (2003). Measuring the frequency of regular behaviors: Comparing the 'typical week' to the 'past week'. Sociological Methodology, 33, 55–80.
Clinton, J., & Lapinski, J. (2004). 'Targeted' advertising and voter turnout: An experimental study of the 2000 presidential election. Journal of Politics, 66, 69–96.
Finkel, S., & Geer, J. (1998). A spot check: Casting doubt on the demobilizing effect of attack advertising. American Journal of Political Science, 42, 573–595.
Freedman, P., Franz, M., & Goldstein, K. (2004). Campaign advertising and democratic citizenship. American Journal of Political Science, 48, 723–741.
Freedman, P., & Goldstein, K. (1999). Measuring media exposure and the effects of negative ads. American Journal of Political Science, 43, 1189–1208.
Freedman, P., Goldstein, K., & Granato, J. (2000). Learning, expectations, and the effect of political advertising. Paper presented at the annual meeting of the Midwest Political Science Association, Chicago.
Geer, J. (2006). In defense of negativity: Attack ads in presidential campaigns. Chicago: University of Chicago Press.
Goldstein, K., & Freedman, P. (2002a). Campaign advertising and voter turnout: New evidence for a stimulation effect. Journal of Politics, 64, 721–740.
Goldstein, K., & Freedman, P. (2002b). Lessons learned: Campaign advertising in the 2000 elections. Political Communication, 19, 5–28.
Holbrook, A., Krosnick, J., Visser, P., Gardner, W., & Cacioppo, J. (2001). Attitudes toward presidential candidates and political parties: Initial optimism, inertial first impressions, and a focus on flaws. American Journal of Political Science, 45, 930–950.
Kahn, K. F., & Kenney, P. (1999). Do negative campaigns mobilize or suppress turnout? Clarifying the relationship between negativity and participation. American Political Science Review, 93, 877–890.
Kahn, K. F., & Kenney, P. (2004). No holds barred: Negativity in U.S. Senate campaigns. Upper Saddle River: Prentice Hall.
Kan, M. Y., & Gershuny, J. (2006). Infusing time diary evidence into panel data: An exercise in calibrating time-use estimates for the BHPS. ISER Working Paper 2006-19. Colchester: University of Essex.
Lau, R., & Pomper, G. (2001). Effects of negative campaigning on turnout in U.S. Senate elections, 1988–1998. Journal of Politics, 63, 804–819.
Lau, R., Sigelman, L., Heldman, C., & Babbitt, P. (1999). The effects of negative political advertisements: A meta-analytic assessment. American Political Science Review, 93, 851–875.
Martin, P. (2004). Inside the black box of negative campaign effects: Three reasons why negative campaigns mobilize. Political Psychology, 25, 545–562.
Patterson, T., & McClure, R. (1976). Political advertising: Voter reaction to televised political commercials. Princeton: Citizen's Research Foundation.
Price, V., & Zaller, J. (1993). Who gets the news? Alternative measures of news reception and their implications for research. Public Opinion Quarterly, 57, 133–164.
Ridout, T., Shah, D., Goldstein, K., & Franz, M. (2004). Evaluating measures of campaign advertising exposure on political learning. Political Behavior, 26, 201–225.
Robinson, J., & Godbey, G. (1997). Time for life: The surprising ways Americans use their time. University Park: Pennsylvania State University Press.
Stevens, D. (2005). Separate and unequal effects: Information, political sophistication and negative advertising in American elections. Political Research Quarterly, 58, 413–426.
Tourangeau, R., Rips, L. J., & Rasinski, K. (2000). The psychology of survey response. Cambridge: Cambridge University Press.
Wattenberg, M., & Brians, C. (1999). Negative campaign advertising: Demobilizer or mobilizer? American Political Science Review, 93, 891–899.
West, D. (1994). Political advertising and news coverage in the 1992 California U.S. Senate campaigns. Journal of Politics, 56, 1053–1075.
