
COARSE THINKING AND PERSUASION*

SENDHIL MULLAINATHAN

JOSHUA SCHWARTZSTEIN Downloaded from https://academic.oup.com/qje/article/123/2/577/1930852 by Harvard School Library user on 15 June 2021

We present a model of uninformative persuasion in which individuals “think coarsely”: they group situations into categories and apply the same model of inference to all situations within a category. Coarse thinking exhibits two features that persuaders take advantage of: (i) transference, whereby individuals transfer the informational content of a given message from situations in a category where it is useful to those where it is not, and (ii) framing, whereby objectively useless information influences individuals’ choice of category. The model sheds light on uninformative advertising and product branding, as well as on some otherwise anomalous evidence on mutual fund advertising.

I. INTRODUCTION

Most societies devote huge resources to persuasion (McCloskey and Klamer 1995). Selling, advertising, political campaigns, organized religion, law, much of the media, and some education are devoted to changing beliefs in a way advantageous to the persuader. Persuasion is not simply an expenditure of resources: the content of the message crucially shapes its effectiveness.1 But what constitutes persuasive content?

Economists usually assume that only one type of persuasive content matters: objectively useful information. Stigler (1987, p. 243) defines advertising as “the provision of information about the availability and quality of a commodity.” Economists typically model persuasion, including advertising (Stigler 1961), political campaigns (Downs 1957), and legal argument (Milgrom

* This paper replaces an earlier draft with the same title, as well as “Persuasion in Finance” by Mullainathan and Shleifer. We are grateful to Nicholas Barberis, , Dan Benjamin, Daniel Bergstresser, , Lauren Cohen, Stefano DellaVigna, Daniel Gilbert, , Xavier Gabaix, , Simon Gervais, Robin Greenwood, Richard Holden, Emir Kamenica, Lawrence Katz, Elizabeth Kensinger, David Laibson, Owen Lamont, , Ulrike Malmendier, , Andrew Postlewaite, , Christina Romer, , Jeremy Stein, Rene Stulz, , Robert Waldmann, Glen Weyl, Gerald Zaltman, Eric Zitzewitz, three anonymous referees, and especially Nicola Gennaioli, Giacomo Ponzetto, and the fourth anonymous referee for helpful comments. We also thank Michael Gottfried, Tim Ganser, and Georgy Egorov for excellent research assistance. Schwartzstein acknowledges financial support from an NSF graduate fellowship. All errors remain our own.

1. A vast advertising literature makes this point (see, e.g., Zaltman 1997; Sutherland and Sylvester 2000). Bertrand et al. (2006) present some clear evidence that persuasive content matters in a field experiment using loan advertisements by a South African consumer lending institution.

© 2008 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, May 2008

and Roberts 1986b; Dewatripont and Tirole 1999), as provision of information. In some models, such as those of Grossman and Hart (1980), Grossman (1981), Milgrom (1981), Crawford and Sobel (1982), Okuno-Fujiwara, Postlewaite, and Suzumura (1990), and Glazer and Rubinstein (2001, 2004), the persuader uses information strategically, but conveys information nonetheless.

Psychologists and marketers understand persuasion quite differently. They argue that people evaluate various propositions or objects using representativeness, metaphors, analogies, and more generally associative strategies (Gilovich 1981; Edelman 1992; Kahneman and Tversky 1982; Lakoff 1987; Zaltman 1997). The strategy of persuasion is to take advantage of these mental strategies, which we refer to as coarse thinking, to improve the audience’s assessment of the persuader’s issue or product.

In this paper, we present a model of coarse thinking and persuasion. We distinguish two ways in which persuaders, such as product advertisers, can take advantage of coarse thinking. First, the audience might already have some analogy for the product in mind; it already thinks of the product in terms of something else. In this case, one way to persuade is to advertise attributes of the product that are positively related to quality in the analogous situation. The coarse thinker transfers the informational content of these attributes across analogous situations and so improves his view of the product. We call this form of coarse thinking “transference.”

Second, and more fundamentally, persuaders may themselves try to shape or create the relevant analogy by advertising attributes associated with that desired analogy. This method persuades successfully when it changes the lens through which a thinker views all features of the product.
Following Goffman (1974), we call this form of coarse thinking “framing.” In many instances, successful persuasion takes advantage of both transference and framing, but these forms of coarse thinking are conceptually distinct.2

To illustrate these ideas, consider several examples.

Alberto Culver Natural Silk Shampoo was advertised with a slogan “We put silk in the bottle.” The shampoo actually contained

2. Recently, the term framing has been used much more narrowly to describe the coding of gains and losses in prospect theory (Kahneman and Tversky 1979). Goffman’s original and broader meaning of framing as a lens or “model” through which the audience interprets data is what we try to capture below.

some silk. During the campaign, the company spokesman conceded that “silk doesn’t really do anything for hair” (Carpenter, Glazer, and Nakamoto 1994).3 This example is a relatively pure case of the persuader relying on transference. Culver takes advantage of the co-categorization of shampoo with hair, which leads consumers to value “silk” in shampoo (not very sensible) because they value “silky” in hair. By adding silk to the bottle, Culver effectively transfers a positive trait from hair to shampoo to make its product more attractive.

For about forty years, Avis Car Rental advertised itself with the slogan “We are number two. We try harder.” Besides this key message, most ads contain few data. We think of this campaign as a relatively pure case of framing. When Avis was getting started, many of its potential customers knew that it was smaller than Hertz. How the attribute “being second” is interpreted depends on categorization: either negatively as a loser or positively as an underdog. Because the underdog image favors Avis, the campaign primed this frame by stressing the attributes of underdogs (they try harder). The fact that Avis lagged behind Hertz in sales became a sign of higher rather than lower quality.

Most persuasive messages take advantage of both transference and framing. Take two examples, one from the economic and one from the political sphere.
Over the course of the Internet stock market bubble (1994–2003), the brokerage firm Merrill Lynch ran six advertising campaigns, respectively called “a tradition of trust,” “the difference is Merrill Lynch,” “human achievement,” “be bullish,” “ask Merrill,” and “total Merrill.” The motto of each campaign always appeared in the ad. Roughly speaking, the first two campaigns preceded the bubble, the third and the fourth appeared during it, and the last two ran after the sharp market decline.

One way to compare these campaigns is to look at a representative ad from each. “A tradition of trust” ads often portray a grandfather and a grandson fishing together. The ads talk about slow accumulation of wealth and Merrill’s expertise. The activities of fishing and, even more so, teaching to fish suggest slowness, tradition, skill, consistency, and patience. The ads advise on how to protect oneself and one’s financially. Ads from “the

3. A recent reincarnation of this marketing idea, discovered by inspecting products in a drugstore, is Pure Cashmere Softsoap, which contains “cashmere extract.” Cashmere adds quality to sweaters, not soap.

difference is Merrill Lynch” campaign likewise show grandfathers and grandsons with fishing rods, and they recommend .

This message changes in 1999. A “human achievement” ad from 1999 shows a twelve-year-old girl wearing a helmet and carrying a skateboard. The image is much hipper than those from the previous campaigns. The next campaign, from 2000–2001, simply intones: “be bullish.” One ad shows a Merrill Lynch bull wired as a semiconductor board (the word “wired” itself has two meanings, connected and hyperactive). The theme of protection is gone; growth and opportunity emerge.

After the market declines, Merrill switches to the “ask Merrill” campaign, with its emphasis on uncertainty in the world, and the company’s expertise in protecting and advising its customers. A representative ad is dominated by a page-sized question mark, invoking insecurity, uncertainty, and the need for answers. Finally, by the end of the decade, the firm moves to the “total Merrill” campaign, with its familiar emphasis on expertise and intergenerational fishing.

Merrill’s campaigns take advantage of both transference and framing. In quieter markets, Merrill seeks to frame or position itself as an expert and adviser, much like a doctor, or a grandfather teaching his grandson to fish. It then wants the audience to transfer the positive attributes of this expert to enhance the perceived value of its own advice. At the peak of the bubble, Merrill recognizes that many investors are seeking to get rich quickly, to grab the opportunities created by the technology boom.
It then frames itself as the agent of such opportunities and wants the audience to transfer the positive attributes of “technology” to its own services. After the crash of the bubble, the patient advisor again becomes an attractive frame.

As a final example, from the area of political persuasion, consider Arnold Schwarzenegger’s memorable speech at the 2004 Republican National Convention. In the best remembered part of his speech, Schwarzenegger defended free trade: “To those critics who are so pessimistic about our economy, I say: Don’t be economic girlie men! ... Now they say India and China are overtaking us. Don’t you believe it. We may hit a few bumps, but America always moves ahead. That’s what Americans do.”

Schwarzenegger’s speech takes advantage of both framing and transference. He frames international trade as war, with winners and losers, not as an in which everyone gains. With this frame in mind, he transfers America’s as well as his own (a former champion as well as The Terminator) propensity to win onto trade. This message of victory, although helpful for assessing U.S. military engagements and Schwarzenegger’s own accomplishments, is of limited value in evaluating globalization.

We will come back to these examples as we discuss our analytical results. In our model, individuals deviate from Bayesian rationality in two crucial ways: first, they group situations into categories based on the data they receive and, second, they fail to differentiate between co-categorized situations and use one model of inference for all situations in the same category. Such coarse thinking allows the persuader to create uninformative messages that frame the interpretation of public information (laggard in car rentals) through category choice (underdog).
Coarse thinking also allows the persuader to take advantage of transference by creating messages (we will win) that are uninformative in the relevant situation (free trade) but still induce a reaction because they are informative in the co-categorized one (war).

The model presented in this paper shares many elements with Crawford and Sobel’s (1982) model of strategic information transmission. Indeed, our model of transference is mathematically similar to an extension of Crawford and Sobel in which the audience does not know the exact situation (the underlying game being played), and so may react to messages that do not contain decision-relevant information in the relevant situation.4 On the other hand, framing through category choice does not have a natural counterpart in the cheap talk literature.

Even if we focus on transference, our interpretation of the underlying mathematics is very different. We suppose that the audience knows the situation it is in but uses a single model of inference to interpret information in multiple situations. We do not think that the audience reacts to the message “contains silk” because it is uncertain whether adding silk improves the shampoo. Rather, we suggest that the audience thinks coarsely and reacts to the message “contains

4. Crawford (2003) presents a cheap talk model in which uncertainty surrounding the Sender’s type (e.g., whether or not he is strategic) may enable a strategic Sender to “fool” a sophisticated Receiver into taking a suboptimal action in equilibrium. Kartik, Ottaviani, and Squintani (2007) present a related model that incorporates the possibility of equilibrium deception or misinterpretation of information. They extend the basic cheap talk model by assuming that a fraction of the Sender’s audience misinterprets equilibrium messages with some nonequilibrium-based rule (e.g., they always blindly believe the Sender’s recommendation). In the equilibria they identify, the Sender always sends an inflated message that deceives the “naïve” agents.

silk” because it interprets messages about shampoo and hair similarly.

Associational or analogical thinking is both extremely common and extremely useful in everyday life because it reduces the evaluation of new situations to comparison with familiar ones. Edelman (1992) thinks that, for this reason, our brains have evolved so as to make metaphor and analogy standard hard-wired forms of reasoning. Of course, the patterns of thought that are usually extremely helpful are not always so. Persuaders take advantage of people utilizing a strategy that, though generally useful, may not be useful in the situation of interest to the persuader.5

A large literature looking at persuasion deals with advertising. Nelson (1974) broadens the range of what might be seen as informative advertising.6 Stigler and Becker (1977) and Becker and Murphy (1993) put advertising into the utility function. Gabaix and Laibson (2006) and Shapiro (2006) offer behavioral models of advertising.
Recent research on persuasion has gone beyond advertising, and includes studies of hatred (Glaeser 2005), media (Mullainathan and Shleifer 2002, 2005; Gentzkow and Shapiro 2006; DellaVigna and Kaplan 2007), and political persuasion (Becker 2001; Murphy and Shleifer 2004; Glaeser, Ponzetto, and Shapiro 2005). As far as we know, our paper is the first to study persuasion in a model of associative thinking so central to psychological work.

The next section outlines our model of coarse thinking and compares Bayesian and coarse decision makers. Section III presents the results on persuasion, showing how the persuader takes advantage of transference and framing. Section IV uses the model to understand a crucial aspect of marketing, namely product branding. Section V applies the ideas of the model to the case of mutual fund advertising and presents some evidence on such advertising during the Internet bubble. Section VI concludes.

5. Economic theory has also considered analogical reasoning in thinking both about how individuals forecast the payoffs to different actions under uncertainty (Gilboa and Schmeidler 1995) and about how they forecast opponents’ strategies in game theoretic environments (Eyster and Rabin 2005; Jehiel 2005). Ettinger and Jehiel (2007) apply Jehiel’s equilibrium concept to a model similar to Crawford’s (2003) and show how a strategic Sender can exploit the fact that a Receiver who thinks coarsely about strategies may misinterpret the Sender’s actions in equilibrium.

6. Research following Nelson (1974) (e.g., Kihlstrom and Riordan 1984; Milgrom and Roberts 1986a) focuses on how, in equilibrium, the amount of advertising may signal quality. This literature does not make predictions regarding the equilibrium content of advertisements.

II. MODEL

II.A. Basic Setup

An individual must assess the quality of a given object, such as a shampoo, a mutual fund, or a political candidate. We denote this underlying quality of the object by q ∈ Q, where Q is some subset of ℝ. In assessing the quality of the object, the individual faces one of three similar, but not identical, observable situations s ∈ {0, 1, 2}. For instance, s = 0 could be “selecting a mutual fund,” s = 1 could be “selecting a professional ,” and s = 2 could be “grabbing an opportunity.” Our analysis pertains to the assessment of quality in situation s = 0.

The individual observes a piece of public information r ∈ {u, d} sent by nature that is potentially informative about q. The public information is meant to capture any piece of data available to the audience that cannot be controlled by a persuader. For instance, r could stand for past stock market performance, and u and d could stand for up and down, respectively. Later, the individual receives a potentially informative message m ∈ {a, b} about q. In situation s = 0, this message is sent by a persuader who privately observes signal x ∈ {a, b} prior to sending m, but in situations s = 1 and s = 2, nature sends signal m = x directly. The individual then uses (r, m) to form expectations about underlying quality. The timeline of the individual’s decision problem in s = 0 is given in Figure I.

FIGURE I Timeline of Decision Problem in s = 0

II.B. Bayesian Thinking

The underlying joint function over quality, public information, private signals, and situations is p(q, r, x, s) and is common knowledge. We assume that the induced joint probability mass function of public information and private signals conditional on situations, p(r, x | s), satisfies p(r, x | s) > 0 for each r, x, and s. We also assume that the marginal probability mass of situations, p(s), satisfies p(s = 0) + p(s = 1) + p(s = 2) = 1 and p(s) > 0 for each s.

The individual has beliefs about the probability mass of messages conditional on public information, private signals, and situations (i.e., beliefs about the strategy of the persuader and of nature), σ̂(m | r, x, s). These beliefs, combined with p, generate a joint probability distribution over quality, public information, private signals, messages, and situations, p̂(q, r, x, m, s).

Prior to making some decision in s = 0, the individual uses (r, m) to form an expectation of the underlying quality of the object. Because the situation is observable, the updated Bayesian expectation of quality is

\[
E[q \mid r, m, s = 0] = \sum_{x \in \{a, b\}} E[q \mid r, x, s = 0] \, \hat{p}(x \mid r, m, s = 0), \tag{1}
\]

where E[q | r, x, s = 0] is calculated using p and p̂(x | r, m, s = 0) is uniquely derived from p and σ̂ using Bayes’ rule whenever possible. (Otherwise it is an arbitrary distribution.)

To study the efficacy of persuasion, it is useful to examine the marginal effect of the message sent by the persuader on the individual’s assessment of quality conditional on public information r. We define the Bayesian’s reaction to m to be the difference between the Bayesian’s expectation of quality prior to receiving a message from the persuader and his expectation of quality after receiving the message. Hence, the Bayesian’s reaction, E[q | r, m, s = 0] − E[q | r, s = 0], is given by

\[
\bigl(\hat{p}(a \mid r, m, s = 0) - p(a \mid r, s = 0)\bigr) \times \bigl(E[q \mid r, a, s = 0] - E[q \mid r, b, s = 0]\bigr). \tag{2}
\]

As illustrated in equation (2), the Bayesian’s reaction to m is the product of two terms: (i) the revision in the probability placed on the persuader’s private signal being x = a as a result of the message and (ii) the extent to which the conditional expectation of quality is different under private signal x = a than under x = b. Thus, the Bayesian only reacts to a message when he believes both that it is informative about the persuader’s private signal and that the private signal is predictive of quality.

When E[q | r, m, s = 0] ≠ E[q | r, s = 0], we say that message m is informative in s = 0 given r; when E[q | r, m, s = 0] = E[q | r, s = 0], we say that message m is uninformative in s = 0 given r. The message m = “looks silky” is informative in situation s = “evaluating hair” when it comes from a trustworthy source but m = “contains silk” is always uninformative in situation s = “evaluating a shampoo” because silk does not affect the quality of shampoo. A Bayesian would react to m = “silk” in the former situation but not in the latter.
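As a rough numerical check of equation (2), the following Python sketch computes the Bayesian's reaction directly from equation (1) and again as the product of the two terms in (2), holding r fixed. The function names, the dictionary layout, and all numbers are illustrative assumptions and are not part of the model's specification.

```python
def posterior_a(p_a, sigma_hat, m):
    """p̂(x = a | r, m, s = 0) by Bayes' rule, holding r fixed.

    p_a       -- p(x = a | r, s = 0)
    sigma_hat -- sigma_hat[x][m] is the believed probability that the
                 persuader sends m after observing x (hypothetical layout)
    """
    num = sigma_hat["a"][m] * p_a
    den = num + sigma_hat["b"][m] * (1.0 - p_a)
    return num / den  # assumes m is sent with positive probability


def bayesian_reaction(p_a, sigma_hat, e_q, m):
    """Return (direct, product), where
    direct  = E[q | r, m, s=0] - E[q | r, s=0], built from equation (1);
    product = the two-term expression in equation (2).
    e_q[x] stands for E[q | r, x, s = 0]."""
    post_a = posterior_a(p_a, sigma_hat, m)
    prior_mean = e_q["a"] * p_a + e_q["b"] * (1.0 - p_a)
    post_mean = e_q["a"] * post_a + e_q["b"] * (1.0 - post_a)
    direct = post_mean - prior_mean
    product = (post_a - p_a) * (e_q["a"] - e_q["b"])
    return direct, product


# Illustrative numbers (assumptions): signal a predicts higher quality.
e_q = {"a": 1.0, "b": 0.0}
truthful = {"a": {"a": 1.0, "b": 0.0}, "b": {"a": 0.0, "b": 1.0}}
babbling = {"a": {"a": 1.0, "b": 0.0}, "b": {"a": 1.0, "b": 0.0}}

print(bayesian_reaction(0.5, truthful, e_q, "a"))  # message reveals x
print(bayesian_reaction(0.5, babbling, e_q, "a"))  # message reveals nothing
```

Under the truthful strategy the message moves the posterior on x and the reaction is nonzero; under the babbling strategy the first term of (2) vanishes and so does the reaction, which matches the definition of an uninformative message.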

II.C. Coarse Thinking

The first essential assumption in our model is that different pieces of data may prime different mental representations or categorizations. For example, the advertisement “be bullish” may lead individuals to think of investing with Merrill Lynch as grabbing an opportunity to get rich. They then interpret decision-relevant information about Merrill accordingly.

The second essential assumption is that individuals do not have separate mental representations for every situation. Instead, they have only one representation for all the situations in a category and are effectively unable to differentiate between these situations. If individuals perceive foreign trade as a kind of war, they interpret the message “Americans are likely to win” in the same way when assessing globalization and military conflict. Categorical thinking has been modeled by Mullainathan (2000), Fryer and Jackson (forthcoming), and Peski (2006);7 we rely most closely on Mullainathan (2000).

Specifically, we assume that coarse thinkers either group situation s = 0 together with similar situation s = 1, denoted by categorization C1 ≡ {0, 1}, or group situation s = 0 together with similar situation s = 2, denoted by categorization C2 ≡ {0, 2}. They can think of investing with Merrill Lynch as grabbing an opportunity, or as hiring professional advice. Crucially, we assume that coarse thinkers do not have a separate mental representation for s = 0 and that they cannot group s = 0 together with both s = 1 and s = 2 at the same time. The latter assumption could arise out of a richer model of associations where s = 1 and s = 2 are sufficiently dissonant.

The assumption that coarse thinkers have the same mental representation for distinct situations is motivated by evidence from psychology. Krueger and Clement (1994) asked experimental participants to estimate the average temperatures of

7. There are some related finance models. Barberis and Shleifer (2003) present a model in which some investors group risky assets into categories and do not distinguish among assets within a category when formulating their demand. Hong, Stein, and Yu (2007) present a model in which investors use simplified univariate theories to forecast dividends. The theories investors use change over time in response to data.

48 different days of the year. Participants tended to underestimate the difference between the average temperature of two days belonging to the same month (e.g., August 12 and August 20) and to overestimate the difference between the average temperature of two days belonging to neighboring months (e.g., August 25 and September 2). Because months are mental categories, coarse thinkers may estimate temperatures to be more similar within months than across months.

Carpenter, Glazer, and Nakamoto (1994) asked experimental participants to rate the quality of hypothetical products described by a list of attributes. Participants preferred products with irrelevant differentiating attributes (“alpine class fill” for down jackets) to products without such attributes even when told that such attributes were irrelevant. One interpretation is that participants responded positively because alpineness contains decision-relevant information in a similar situation, for example, buying skis.

The coarse thinker relies on categorization C1 or C2 to form beliefs about quality. The specific categorization depends on the data received. Denote the map between the data received and the chosen categorization by C : {u, d} × {Ø, a, b} → {C1, C2}. With a slight abuse of notation, (r, m) = (r, Ø) denotes the information available to the individual prior to receiving the message from the persuader. We assume that individuals choose the most likely category given the data received,8,9

\[
C(r, m) = \arg\max_{C \in \{C_1, C_2\}} \hat{p}(s \in C \mid r, m) \quad \text{for all } (r, m) \in \{u, d\} \times \{\emptyset, a, b\}, \tag{3}
\]

and ignore the alternative category (Mullainathan 2000). This map is meant to capture the idea that different mental representations or schema may be primed by different stimuli.10

8. Maximizing the expression in (3) is equivalent to solving

\[
\max_{s \in \{1, 2\}} \hat{p}(s \mid r, m)
\]

for all (r, m) ∈ {u, d} × {Ø, a, b} because s = 0 is in both C1 and C2, where

\[
\hat{p}(s \mid r, m) = \frac{\Bigl[\sum_{x \in \{a, b\}} \hat{\sigma}(m \mid r, x, s) \, p(r, x \mid s)\Bigr] p(s)}{\sum_{s' \in \{0, 1, 2\}} \Bigl[\sum_{x \in \{a, b\}} \hat{\sigma}(m \mid r, x, s') \, p(r, x \mid s')\Bigr] p(s')}.
\]

9. With multiple solutions to max_{C ∈ {C1, C2}} p̂(s ∈ C | r, m) for some (r, m) ∈ {u, d} × {Ø, a, b}, pick C1.

10. Smith (1998) reviews mental representation and psychological models of association. The idea that the coarse thinker chooses the most likely category given the observed data and ignores alternative categories is motivated by experimental evidence (Murphy and Ross 1994; Malt, Ross, and Murphy 1995).
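The categorization map in equation (3) can be sketched in Python using the simplification in footnote 8: only the relative likelihood of s = 1 versus s = 2 matters, so the common normalizing constant can be dropped. Everything below (function names, the use of `None` for Ø, the treatment of Ø as placing no restriction on x, and the probability tables) is an illustrative assumption rather than the paper's own specification.

```python
SIGNALS = ("a", "b")


def situation_weight(s, r, m, p_s, p_rx_given_s):
    """Unnormalized p̂(s | r, m) for s in {1, 2}, following footnote 8.

    In situations 1 and 2 nature sends m = x, so σ̂(m | r, x, s) is 1 when
    m == x and 0 otherwise. m=None stands for Ø (no message yet); we assume
    every signal x is then consistent with the data.
    """
    total = 0.0
    for x in SIGNALS:
        consistent = 1.0 if (m is None or m == x) else 0.0
        total += consistent * p_rx_given_s[s][(r, x)] * p_s[s]
    return total


def choose_category(r, m, p_s, p_rx_given_s):
    """Equation (3): pick the more likely category; ties go to C1 (footnote 9)."""
    w1 = situation_weight(1, r, m, p_s, p_rx_given_s)
    w2 = situation_weight(2, r, m, p_s, p_rx_given_s)
    return "C1" if w1 >= w2 else "C2"


# Hypothetical primitives: message a is much more common in situation 2.
p_s = {0: 0.2, 1: 0.4, 2: 0.4}
p_rx_given_s = {
    1: {("u", "a"): 0.1, ("u", "b"): 0.5, ("d", "a"): 0.1, ("d", "b"): 0.3},
    2: {("u", "a"): 0.4, ("u", "b"): 0.1, ("d", "a"): 0.4, ("d", "b"): 0.1},
}

for m in (None, "a", "b"):
    print(m, choose_category("u", m, p_s, p_rx_given_s))
```

With these made-up numbers, public information u alone leads to C1, message a switches the category to C2, and message b leaves it at C1.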

For instance, an individual might associate, and therefore co-categorize, investing with Merrill Lynch with grabbing another opportunity when looking at the ad “be bullish” or with seeking professional advice when looking at “a tradition of trust.”

The coarse thinker applies the same model of inference to all situations in the category C(r, m). One natural way to model coarse thinking is as Bayesian thinking with a coarser information set; that is, the coarse thinker forms his beliefs about quality as a Bayesian who observes the category C(r, m) he is in but not the specific situation:

\[
\hat{p}(q \mid r, m, C_i) = \hat{p}(q \mid r, m, s = 0) \, \hat{p}(s = 0 \mid r, m, C_i) + \hat{p}(q \mid r, m, s = i) \, \hat{p}(s = i \mid r, m, C_i), \tag{4}
\]

where Ci = C(r, m) for some i ∈ {1, 2}. This update rule (4) implies that, upon receiving (r, m), the coarse thinker’s expectation of quality is

\[
E[q \mid r, m, C_i] = E[q \mid r, m, s = 0] \, \hat{p}(s = 0 \mid r, m, C_i) + E[q \mid r, m, s = i] \, \hat{p}(s = i \mid r, m, C_i). \tag{5}
\]

An alternative way to model coarse thinking, which we actually follow, is to assume that the coarse thinker does not condition on the information received in weighing each situation, but instead uses constant weights p(s | Ci). Such a coarse thinker forms beliefs about quality according to the rule

\[
\hat{p}^{C_i}(q \mid r, m, s = 0) = \hat{p}(q \mid r, m, s = 0) \, p(s = 0 \mid C_i) + \hat{p}(q \mid r, m, s = i) \, p(s = i \mid C_i), \tag{6}
\]

where Ci = C(r, m) for some i ∈ {1, 2}. Update rule (6) implies that, upon receiving (r, m), a coarse thinker’s expectation of quality is

\[
E^{C_i}[q \mid r, m, s = 0] = E[q \mid r, m, s = 0] \, p(s = 0 \mid C_i) + E[q \mid r, m, s = i] \, p(s = i \mid C_i). \tag{7}
\]
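To make the difference between the two weighting schemes concrete, here is a small Python sketch of equations (5) and (7). It assumes the situation-level Bayesian expectations are already known; the function names, argument layout, and numbers are hypothetical.

```python
def coarse_expectation_constant(e0, ei, p0, pi):
    """Equation (7): weights p(s | C_i) built from the prior p(s) alone.

    e0, ei -- Bayesian expectations E[q | r, m, s = 0] and E[q | r, m, s = i]
    p0, pi -- prior probabilities p(s = 0) and p(s = i)
    """
    w0 = p0 / (p0 + pi)  # p(s = 0 | C_i)
    return e0 * w0 + ei * (1.0 - w0)


def coarse_expectation_bayesian(e0, ei, post0, posti):
    """Equation (5): weights p̂(s | r, m, C_i) built from p̂(s | r, m).

    post0, posti -- p̂(s = 0 | r, m) and p̂(s = i | r, m), possibly unnormalized
    """
    w0 = post0 / (post0 + posti)  # p̂(s = 0 | r, m, C_i)
    return e0 * w0 + ei * (1.0 - w0)


# Illustrative numbers (assumptions): the co-categorized situation is three
# times as likely a priori as s = 0, so it dominates the coarse expectation.
print(coarse_expectation_constant(0.5, 0.9, p0=0.2, pi=0.6))
# If the data make s = 0 relatively more likely, rule (5) shifts weight to it.
print(coarse_expectation_bayesian(0.5, 0.9, post0=0.3, posti=0.3))
```

The two rules coincide whenever the data (r, m) do not change the relative likelihood of the situations inside the category.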

These two models share many features. Both capture the idea that the coarse thinker updates as if he cannot distinguish the situations within a category: Under each specification, the coarse thinker’s expectation of quality in s = 0 is a weighted average of the Bayesian’s expectation across all situations co-categorized with s = 0 given (r, m). In the first model, a coarse thinker weighs the expectation of quality in situation s by the likelihood of s in the category given the information received: p̂(s | r, m, Ci). In the second, a coarse thinker weighs the expectation of quality in situation s by the likelihood of that situation in the category: p(s | Ci). In both models, the coarse thinker’s inference approaches the Bayesian’s as p(s = 0 | Ci) tends to 1.

We do not favor one model over the other based solely on psychological evidence. Because the model of coarseness with weights p(s | Ci) greatly simplifies the formulas and proofs, we focus on it in what follows. We demonstrate in Appendix II that our main results hold with inessential modifications under the alternative assumption that coarse thinkers update using more “Bayesian” weights p̂(s | r, m, Ci).

We are interested in how the coarse thinker reacts to the message sent by the persuader conditional on the public information r. Fixing public information r, assume (without loss of generality) that the coarse thinker co-categorizes situation s = 0 with situation s = 1 before receiving the persuader’s message. That is, let C(r) ≡ C(r, Ø) and assume that C(r) = C1. Then the coarse thinker’s expectation of quality is E^{C1}[q | r, s = 0] (as given by (7)) before he receives a message from the persuader.
To evaluate the coarse thinker’s reaction to m, we distinguish the cases where the coarse thinker either does or does not recategorize situation s = 0 after receiving the persuader’s message. We say that message m is pivotal given r if

\[
C(r, m) \neq C(r). \tag{8}
\]

Message m is pivotal given r if it leads the coarse thinker to recategorize situation s = 0.

First, consider the case where m is not pivotal given r. In this case, the model produces empirically plausible patterns of under- and overreaction to data relative to the Bayesian benchmark. Comparing the magnitude of the coarse thinker’s reaction to m, |E^{C1}[q | r, m, s = 0] − E^{C1}[q | r, s = 0]|, to the Bayesian’s, |E[q | r, m, s = 0] − E[q | r, s = 0]|, suggests two distortions.

First, because p(s = 0 | C1) < 1, the reaction of the coarse thinker only mutedly depends on his reaction to message m in situation s = 0 itself, as measured by E[q | r, m, s = 0] − E[q | r, s = 0]. This effect can lead to an underreaction to data. Take, for example, the case where E[q | r, m, s = 0] ≠ E[q | r, s = 0], so m is informative given r in s = 0, and E[q | r, m, s = 1] = E[q | r, s = 1], so m is uninformative given r in s = 1. We see here that

\[
\bigl| E^{C_1}[q \mid r, m, s = 0] - E^{C_1}[q \mid r, s = 0] \bigr| = \bigl| E[q \mid r, m, s = 0] - E[q \mid r, s = 0] \bigr| \, p(s = 0 \mid C_1) < \bigl| E[q \mid r, m, s = 0] - E[q \mid r, s = 0] \bigr|. \tag{9a}
\]

Coarse thinking leads to an underreaction to news relative to Bayesian thinking. The uninformativeness of the data in the co-categorized situation dilutes its impact in situation s = 0 precisely because the current situation is underweighted in the update rule.
However, continuing to consider the case where m is not pivotal given r, the coarse thinker’s response also depends on a term that the Bayesian’s does not depend on: the informativeness of the message in the other situation s = 1 in the same category. This implies that the coarse thinker could react to noninformation or overreact to information. Take, for example, the case where E[q | r, m, s = 0] = E[q | r, s = 0], but E[q | r, m, s = 1] ≠ E[q | r, s = 1]: the message m is uninformative in situation s = 0 given r, but is informative in co-categorized situation s = 1. Then

(9b) |E^{C1}[q | r, m, s = 0] − E^{C1}[q | r, s = 0]| = |E[q | r, m, s = 1] − E[q | r, s = 1]| p(s = 1 | C1) > |E[q | r, m, s = 0] − E[q | r, s = 0]| = 0.
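The two cases in (9a) and (9b) can be checked numerically. The following is a minimal sketch, assuming (as the right-hand sides of (9a) and (9b) suggest) that the coarse thinker's expectation is the probability-weighted average of the situation-level expectations within the category. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def category_expectation(e_by_situation, p_in_category):
    """Probability-weighted average of situation-level expectations
    over the situations grouped in one category (assumed update rule)."""
    return sum(e_by_situation[s] * p for s, p in p_in_category.items())


def reactions(prior, posterior, p_in_category, situation=0):
    """Return (Bayesian reaction, coarse reaction) to a message in `situation`."""
    bayesian = abs(posterior[situation] - prior[situation])
    coarse = abs(category_expectation(posterior, p_in_category)
                 - category_expectation(prior, p_in_category))
    return bayesian, coarse


# Hypothetical weights p(s | C1) for the two co-categorized situations.
p_c1 = {0: 0.4, 1: 0.6}

# Case (9a): m is informative in s = 0 and uninformative in s = 1.
bayes_9a, coarse_9a = reactions({0: 0.0, 1: 0.0}, {0: 1.0, 1: 0.0}, p_c1)

# Case (9b): m is uninformative in s = 0 and informative in s = 1.
bayes_9b, coarse_9b = reactions({0: 0.0, 1: 0.0}, {0: 0.0, 1: 1.0}, p_c1)

print("9a: Bayesian", bayes_9a, "coarse", coarse_9a)  # coarse reaction is smaller
print("9b: Bayesian", bayes_9b, "coarse", coarse_9b)  # coarse reaction is positive
```

In the first case the coarse reaction is the Bayesian reaction scaled by p(s = 0 | C1); in the second the Bayesian does not react at all while the coarse thinker does.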

The coarse thinker now reacts to an uninformative message in situation s = 0 because it is informative in the co-categorized situation. His use of the same model to interpret messages in all situations in the category may lead him to overreact to noninformative messages. This, we suggest, is part of what a person responding to “We put silk in the bottle” is doing. We call this process transference. This transference of the informational content of messages across situations within a category drives the first set of our results below. Indeed, if the persuader cannot affect how individuals categorize a situation, the strategy of uninformative persuasion is to trigger such transference: successful persuasion in that case takes advantage of overreaction.
Now consider the case where message m is pivotal given public information r, so m leads to the recategorization of s = 0 from C1 to C2. Comparing the magnitude of the coarse thinker’s reaction to m when categorization depends on the message received, |E^{C2}[q | r, m, s = 0] − E^{C1}[q | r, s = 0]|, to that when categorization is fixed, |E^{C1}[q | r, m, s = 0] − E^{C1}[q | r, s = 0]|, highlights additional effects from recategorization.
Because message m leads to the recategorization of s = 0 from C1 to C2, the coarse thinker’s reaction now depends on more than the informativeness of that message about quality in some situation. To take an extreme example, the coarse thinker may react to m even when it is uninformative about quality in all situations because it affects how he categorizes s = 0. To see this, take the case where m is uninformative about quality in all situations, but where the expectation of quality conditional on r in category C2 is different from that in category C1: E[q | r, m, s] = E[q | r, s] for all s, but E^{C2}[q | r, s = 0] ≠ E^{C1}[q | r, s = 0].11 When message m prompts the coarse thinker to recategorize s = 0 from C1 to C2, the magnitude of his reaction to m is

(10) |E^{C2}[q | r, m, s = 0] − E^{C1}[q | r, s = 0]| = |E^{C2}[q | r, s = 0] − E^{C1}[q | r, s = 0]| > |E^{C1}[q | r, m, s = 0] − E^{C1}[q | r, s = 0]| = 0.

The coarse thinker reacts to the message because it affects how he categorizes situation s = 0, not because it is informative about quality in any situation. We call this phenomenon framing. “We try harder” frames Avis as an underdog. The message itself is uninformative about quality in all situations, but it encourages the recategorization of Avis from loser to underdog, so public information (Avis lags behind Hertz in sales) becomes an indicator of higher quality. When the persuader can influence categorization, uninformative persuasion optimally frames that situation in the mind of the audience.
In summary, we have presented a simple model that naturally describes two manifestations of coarse thinking: framing and transference. Framing refers to how the audience thinks about the data; transference refers to what it thinks about it by analogy.
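Framing can be illustrated with a message that is uninformative in every situation yet still moves beliefs. This is a minimal sketch under the same assumed weighted-average rule for category-level expectations, with C1 grouping situations 0 and 1 and C2 grouping situations 0 and 2. All numbers and names are hypothetical.

```python
def category_expectation(e_by_situation, p_in_category):
    """Probability-weighted average of E[q | r, s] over a category (assumed rule)."""
    return sum(e_by_situation[s] * p for s, p in p_in_category.items())


# E[q | r, s]: the message m leaves these unchanged in every situation,
# so a Bayesian does not react to m at all.
e_q_given_r = {0: 0.0, 1: -0.5, 2: 0.5}

p_c1 = {0: 0.5, 1: 0.5}  # hypothetical p(s | C1)
p_c2 = {0: 0.5, 2: 0.5}  # hypothetical p(s | C2)

before = category_expectation(e_q_given_r, p_c1)  # s = 0 viewed through C1
after = category_expectation(e_q_given_r, p_c2)   # s = 0 viewed through C2

framing_reaction = abs(after - before)  # left-hand side of (10)
bayesian_reaction = 0.0                 # m is uninformative in all situations

print(before, after, framing_reaction)
```

The whole reaction comes from the change of category: the same public information r is read as bad news in C1 and as good news in C2.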

11. Applying the definition of E^{Ci}[q | r, s = 0] and rearranging terms, E^{C2}[q | r, s = 0] ≠ E^{C1}[q | r, s = 0] if and only if E[q | r, s = 0](p(s = 0 | C2) − p(s = 0 | C1)) ≠ E[q | r, s = 1]p(s = 1 | C1) − E[q | r, s = 2]p(s = 2 | C2). It is clear that this inequality holds for a wide range of parameter values. For instance, if E[q | r, s = 0] = 0 and p(s = 1 | C1) = p(s = 2 | C2), then the requirement is just that the public signal is interpreted differently in s = 2 than in s = 1: E[q | r, s = 2] ≠ E[q | r, s = 1].

We show next how a rational persuader takes advantage of these mental strategies.

III. PERSUASION

III.A. Setup

Fixing public information r ∈ {u, d}, the persuader in situation s = 0 maximizes the individual’s expectation of quality, net of persuasion costs. Presumably, the higher the expected quality, the greater the support the persuader receives, whether through sales, votes, or membership in his organization.
The persuader can alter messages in situation s = 0. Specifically, we assume that the persuader observes the signal x that the individual would see absent intervention (for example, absent an advertising campaign). He can then either intervene prior to the individual’s observing x and send altered message m ∈ {a, b} not equal to x at cost c ≥ 0, or simply report m = x at zero cost. The individual then observes m and never sees the original x.
As mentioned above, we make the simplifying assumption that nature directly sends the signal m = x in situations s = 1 and s = 2. This should be a reasonable assumption if s = 1 and s = 2 are situations in which persuaders are compelled (e.g., by law) to truthfully reveal all private information. Alternatively, s = 1 and s = 2 may be common, everyday situations in which there is no persuader to send an altered message (e.g., underdogs in many real-life situations observably try harder).
Denote the persuader’s strategy in s = 0 by m̄_0 : {u, d} × {a, b} → {a, b}, where m̄_0(r, x) = m represents the strategy of reporting message m whenever the public information is r and the private signal is x.12 Denote the corresponding “strategy” of nature in situations s = 1 and s = 2 by m̄_1(r, x) = m̄_2(r, x) = x for all r and x because nature always reveals the private signal in those situations.
An optimal strategy for the persuader in s = 0 selects an m to maximize his payoff conditional on (r, x) for each (r, x), where this conditional payoff is given by

(11) E[q | r, m, s = 0] if m = x
     E[q | r, m, s = 0] − c if m ≠ x

12. We restrict attention to pure strategies except in Appendix II.

if the audience consists of Bayesians and by

(12) E^{C(r,m)}[q | r, m, s = 0] if m = x
     E^{C(r,m)}[q | r, m, s = 0] − c if m ≠ x

if the audience consists of coarse thinkers.
Optimal strategies depend on how individuals respond to messages. The standard assumption is that individuals are sophisticated, meaning that their beliefs are consistent with Bayes’ rule applied to the reporting strategies of the persuader and nature. More formally, this implies the following restrictions on posterior beliefs p̂:

(S) p̂ is derived from p and
    σ̂(m | r, x, s) = 1 if m̄_s(r, x) = m
    σ̂(m | r, x, s) = 0 if m̄_s(r, x) ≠ m
    using Bayes’ rule whenever possible.13

An alternative to sophistication is that individuals take messages at face value: individuals, including both Bayesians and coarse thinkers, take the message m they see and update as if the persuader always nonstrategically reveals the private signal x. Formally, this implies the following restriction on posterior beliefs14:

(F) p̂ is derived from p and
    σ̂(m | r, x, s) = 1 if m = x
    σ̂(m | r, x, s) = 0 if m ≠ x
    using Bayes’ rule.

We will show that, given an additional assumption described below, it does not matter whether we assume that individuals are sophisticated or that they take messages at face value in considering the persuader’s optimal strategies.

DEFINITION. An equilibrium (m̄_0^∗, p̂∗) satisfies
(a) For each (r, x) the persuader chooses m̄_0^∗(r, x) to maximize the audience’s expectation of quality minus costs of persuasion, as given by (11), if the audience consists of Bayesians

13. When there exists a message m that is sent with zero probability under the persuader’s strategy in situation s = 0, relevant conditional beliefs about quality are derived from p, σ̂, and an arbitrary distribution µ(x | r, m, s = 0) over {a, b}.
14. Shin (1994) also refers to updating on ex ante probabilities as taking messages at face value. Other authors (e.g., Kartik, Ottaviani, and Squintani 2007) refer to such updating as “naïve.”

and by (12) if the audience consists of coarse thinkers. The persuader takes the audience’s beliefs p̂∗ as given.
(b) p̂∗ satisfies (S) given the reporting strategy of the persuader, m̄_0 = m̄_0^∗, if the audience is sophisticated and (F) if the audience takes messages at face value.

To simplify the remaining analysis as well as to focus on truly uninformative persuasion, we make the uninformative persuasion assumption: the signal the persuader privately observes is always uninformative about quality in s = 0:

(13) E[q | r, x, s = 0] = E[q | r, s = 0] for all r, x.

The uninformative persuasion assumption implies that all messages are uninformative in situation s = 0 by (2). Hence, Bayesians never react to messages sent by the persuader in s = 0.
Before characterizing the equilibria of this model, we establish the following useful result.

LEMMA. Under our assumptions:
(i) Categorization rule (3) is independent of the persuader’s strategy.
(ii) Sophisticated and face value audiences hold the same expectations of quality conditional on messages whether they consist of Bayesians (with expectations given by (1)) or of coarse thinkers (with expectations given by (7)).
(iii) From (i) and (ii) it follows that m̄_0^∗ is an equilibrium strategy for the persuader given some sophisticated beliefs if and only if it is an equilibrium strategy when the audience takes messages at face value. This is true whether the audience consists of Bayesians or of coarse thinkers.

Proof. In Appendix I. ∎

This lemma establishes that, in considering the equilibrium strategies of the persuader, we can assume without loss of generality that the audience takes messages at face value.15 The intuition is that face value and sophisticated individuals differ only in their beliefs regarding the equilibrium strategy of the persuader in s = 0. Because the persuader’s underlying private signal is uninformative about quality and categorization rule (3) is independent

15. This lemma does not hold when coarse thinkers update according to the “more Bayesian” rule explored in Appendix II, where we focus on the sophisticated case.

of the persuader’s strategy, these differences do not generate disagreement between face value and sophisticated thinkers in what messages imply about quality and hence do not affect the persuader’s best response.
This lemma is useful since it implies that, to verify that m̄_0^∗ is an equilibrium strategy for the persuader, we only need to check whether it maximizes the persuader’s payoff conditional on (r, x) for each (r, x), where this conditional payoff is given by

(14) E[q | r, x = m, s = 0] if m = x
     E[q | r, x = m, s = 0] − c if m ≠ x

if the audience consists of Bayesians and by

(15) E^{Ci}[q | r, x = m, s = 0] if m = x
     E^{Ci}[q | r, x = m, s = 0] − c if m ≠ x,

where Ci = arg max_{C ∈ {C1, C2}} p(s ∈ C | r, x = m), if the audience consists of coarse thinkers. In other words, this lemma establishes that it is unnecessary to compute consistent beliefs p̂∗ and make sure that the persuader’s strategy is a best response to those beliefs. Rather, we can treat the audience’s beliefs as given by the ex ante probabilities in what follows.
Additionally, the lemma implies that our results extend to environments where messages are verifiable. The most literal interpretation of a message in situation s = 0 is as unverifiable and potentially costly talk (e.g., m = “We try harder” for s = Car Rental), in which case m ≠ x implies that the persuader paid a weakly positive cost to send an altered message or to engage in advertising. However, the lemma establishes that messages may also be viewed as reflecting the inclusion of observable and verifiable product attributes (e.g., m = “contains silk” for s = Shampoo), in which case m ≠ x implies that the persuader paid some weakly positive cost to change an objective product attribute and sophisticated individuals take messages at face value (as in the “persuasion games” of Milgrom and Roberts [1986b]).16

16. A somewhat subtle question is how coarse thinkers should compute p̂(s | r, m) when messages are verifiable. However, given categorization rule (3) together with update rule (6), it does not matter whether they calculate
p̂(s | r, m) = p̂(r, m | s)p(s) / Σ_{s′ ∈ {0,1,2}} p̂(r, m | s′)p(s′)

III.B. Bayesians

To simplify the remaining analysis and get clear results, we make several additional assumptions. First, we assume that public information is uninformative in s = 0 in the sense that

(16) E[q | r, s = 0] = E[q | s = 0] for all r. Second, we assume that the expected quality conditional on each situation equals zero:

(17) E[q | s] = 0.

These two additional assumptions are not important for our qualitative results and serve to simplify the algebra. Together with the uninformative persuasion assumption, these assumptions imply that the Bayesian expectation of quality in situation s = 0 satisfies

(18) E[q | r, m, s = 0] = E[q | s = 0] = 0 for all r and m.

Under these assumptions, we first characterize the persuader’s optimal (i.e., equilibrium) strategy when the audience consists of Bayesians:

PROPOSITION 1 (Bayesian baseline). Suppose individuals are Bayesians. Then an optimal strategy of the persuader in situation s = 0 is to always report m = x and to never pay the cost to send some other message m ≠ x. If c > 0, this strategy is unique.
Proof. Recall that all messages are uninformative in situation s = 0. As a result, fixing r, the persuader receives payoff E[q | r, m, s = 0] = 0 (from (18)) if he sends m = x and receives payoff E[q | r, m, s = 0] − c = −c if he sends m ≠ x. Clearly, 0 ≥ −c, so an optimal strategy of the persuader is to always report m = x. Since this inequality is strict whenever c > 0, this strategy is uniquely optimal whenever c > 0. ∎

or
p̂(s | r, m) = p(r, m | s)p(s) / Σ_{s′ ∈ {0,1,2}} p(r, m | s′)p(s′),
where p̂(r, m | s) is defined as in footnote 8.

According to Proposition 1, persuaders never fabricate uninformative messages (or change uninformative product attributes) for Bayesians when the cost of doing so is positive. Because private signals in situation s = 0 are assumed to be uninformative about quality, Bayesians do not update their beliefs from messages sent by the persuader. Thus the persuader receives no benefit from fabricating such messages and is unwilling to pay a positive cost to do so.

III.C. Coarse Thinkers

We now examine the persuader’s optimal strategy when the audience consists of coarse thinkers. In the next two propositions, we characterize the persuader’s optimal strategy conditional on public information. Fix public information r and, without loss of generality, assume that C(r) = C1. (Recall from before that C(r) ≡ C(r, ∅).) First, consider the case where messages are not pivotal given r. Also, without loss of generality, suppose that private signal x = a is more favorable than private signal x = b in s = 1 in the following sense:

(19) E[q | r, a, s = 1] ≥ E[q | r, b, s = 1].

We have the following proposition:

PROPOSITION 2 (Transference). Suppose that individuals are coarse thinkers, that messages are not pivotal given public information r, and that condition (19) holds. Then an optimal strategy of the persuader in situation s = 0 may involve the creation of a message. Specifically, so long as

(20) c < (E[q | r, a, s = 1] − E[q | r, b, s = 1])p(s = 1 | C1) ≡ c∗,

any optimal strategy of the persuader dictates reporting m = a whenever x = b.
Proof. Fix the public information r and recall that neither message is assumed to be pivotal. If the persuader reports m = x = b to a coarse thinker, his payoff is E^{C1}[q | r, b, s = 0] = E[q | r, b, s = 1]p(s = 1 | C1). If he replaces x = b with m = a, his payoff is E^{C1}[q | r, a, s = 0] − c = E[q | r, a, s = 1]p(s = 1 | C1) − c. Subtracting the first payoff from the second, the persuader optimally replaces x = b with m = a if this difference is positive or, equivalently, if c < c∗. ∎
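The decision rule in Proposition 2 can be written out directly. The following is a minimal sketch, assuming condition (19) holds and neither message is pivotal; the function and argument names are hypothetical, and the numbers are illustrative only.

```python
def transference_threshold(e_a_s1, e_b_s1, p_s1_given_c1):
    """c* from condition (20): the gain in the coarse thinker's expectation
    from hearing m = a rather than m = b."""
    return (e_a_s1 - e_b_s1) * p_s1_given_c1


def message_sent(x, c, e_a_s1, e_b_s1, p_s1_given_c1):
    """One optimal report for private signal x in {"a", "b"}.

    Assumes (19): a is weakly more favorable than b in s = 1, so the
    persuader never pays to replace x = a. At c == c* he is indifferent,
    and this sketch arbitrarily reports truthfully.
    """
    c_star = transference_threshold(e_a_s1, e_b_s1, p_s1_given_c1)
    if x == "b" and c < c_star:
        return "a"
    return x


# Hypothetical parameters: E[q | r, a, s = 1] = 1.0, E[q | r, b, s = 1] = 0.5,
# p(s = 1 | C1) = 0.5, so c* = 0.25.
params = dict(e_a_s1=1.0, e_b_s1=0.5, p_s1_given_c1=0.5)
print(transference_threshold(**params))
print(message_sent("b", c=0.1, **params))  # cheap enough: replace b with a
print(message_sent("b", c=0.4, **params))  # too costly: report b
```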

Condition (20) yields a corollary:

COROLLARY 1. Suppose the conditions of Proposition 2 hold and let c∗ be defined as above. Then c∗ is weakly increasing in (i) the probability of situation s = 1 in category C1, p(s = 1 | C1) = p(s = 1)/{p(s = 0) + p(s = 1)}, and (ii) the magnitude of reaction in the co-categorized situation, E[q | r, a, s = 1] − E[q | r, b, s = 1].

Proof. From (20), c∗ = (E[q | r, a, s = 1] − E[q | r, b, s = 1])p(s = 1 | C1). This expression is increasing in (i) p(s = 1 | C1) because E[q | r, a, s = 1] ≥ E[q | r, b, s = 1] by (19) and (ii) in E[q | r, a, s = 1] − E[q | r, b, s = 1] because p(s = 1 | C1) ≥ 0. ∎

Proposition 2 illustrates that persuaders may pay a cost to alter messages for coarse thinkers even when the message they send cannot be pivotal. Their decision depends on whether the gap (in terms of the improved assessment of quality) between the best possible message (or attribute) and the private signal (or original attribute) is large enough to offset the cost of persuasion. This provides a way of thinking about why Culver replaced x = “no silk” with m = “silk” by putting “silk in the bottle.”
Corollary 1 highlights the fact that, with two co-categorized situations, a persuader is more likely to manufacture a message in situation s = 0 if it has a lower probability within its category. Since the benefit for manufacturing messages is the transference of the informational content from other situations in a category to the current one, a higher probability of these other situations increases transference and this benefit.
This point may shed light on what advertisers refer to as consumer involvement, a notion closely related but not identical to that of “stakes.” A high-involvement product occupies a huge probability space in its category (p(s = 0) is high), so the transference from other situations is small, and hence so is the benefit of noninformative advertising. Our model predicts, as the marketing research recommends, that advertising in these instances should be informative (Sutherland and Sylvester 2000). In contrast, low-involvement products are mixed up in consumers’ minds with many similar situations and hence there is greater scope for persuasion, exactly as the marketing literature suggests.
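The involvement argument is a comparative static on c∗. A minimal sketch, assuming the two-situation category of Corollary 1 so that p(s = 1 | C1) = p(s = 1)/{p(s = 0) + p(s = 1)}; the function name and the probabilities below are hypothetical.

```python
def c_star(p_s0, p_s1, reaction_in_s1):
    """Threshold cost from (20), written in terms of unconditional
    probabilities of the two co-categorized situations."""
    p_s1_given_c1 = p_s1 / (p_s0 + p_s1)
    return reaction_in_s1 * p_s1_given_c1


# Same reaction in the co-categorized situation, different weight on s = 0.
low_involvement = c_star(p_s0=0.1, p_s1=0.4, reaction_in_s1=1.0)
high_involvement = c_star(p_s0=0.4, p_s1=0.1, reaction_in_s1=1.0)

# The persuader tolerates a higher cost when s = 0 is a small part of its category.
print(low_involvement, high_involvement)
```

Holding the reaction in s = 1 fixed, raising p(s = 0) lowers the highest cost the persuader is willing to pay for an uninformative message.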
Alberto Culver Shampoo and Schwarzenegger’s defense of free trade are both consistent with this point. Voting may be another low-involvement activity, which encourages noninformative advertising. Our point is not that people are incapable of rational high-involvement thinking, but rather that in many instances they do not engage in such thinking, perhaps because it is not worth it. It is precisely in those instances of coarse thinking that persuasion pays.
We now consider the case where categorization depends on the exact message the persuader sends; that is, one message is pivotal. To limit the number of cases, suppose that message m = a is pivotal in changing categorization from C1 to C2 and that the persuader weakly “prefers” sending pivotal message m = a over nonpivotal message m = b:

(21) E[q | r, a, s = 2]p(s = 2 | C2) ≥ E[q | r, b, s = 1]p(s = 1 | C1).

Because message m = a is pivotal, it does not matter whether (19) still holds. We then have

PROPOSITION 3 (Framing). Suppose that individuals are coarse thinkers, that message m = a is pivotal given public information r, and that condition (21) holds. Then an optimal strategy of the persuader in situation s = 0 may involve the creation of a message. Specifically, so long as

(22) c < (E[q | r, a, s = 2]p(s = 2 | C2) − E[q | r, b, s = 1]p(s = 1 | C1)),

any optimal strategy of the persuader dictates reporting m = a whenever x = b.
Proof.17 Fix some piece of public information r and recall the assumption that m = a is pivotal given this information. If the persuader reports m = x = b to a coarse thinker, his payoff is E^{C1}[q | r, b, s = 0] = E[q | r, b, s = 1]p(s = 1 | C1). If he replaces x = b with m = a, his payoff is E^{C2}[q | r, a, s = 0] − c = E[q | r, a, s = 2]p(s = 2 | C2) − c. Subtracting the first payoff from the second, the persuader optimally replaces x = b with m = a when this difference is positive or, equivalently, when condition (22) holds. ∎

17. The conditions of Proposition 3 imply that the persuader would never wish to pay a cost to send the nonpivotal message m = b. This result hinges on the assumption (21). With the alternative (and equally reasonable) assumption that the inequality in (21) is reversed, the persuader would pay a sufficiently low cost to replace x = a (and avoid sending the pivotal message m = a) with nonpivotal message m = b.
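Condition (22) compares payoffs across two different categories rather than within one. A minimal sketch of that comparison, assuming m = a is pivotal (moving s = 0 from C1 to C2) and that (18) holds so the s = 0 terms vanish; the function names and numbers are hypothetical.

```python
def framing_gain(e_a_s2, p_s2_given_c2, e_b_s1, p_s1_given_c1):
    """Right-hand side of (22): payoff from the pivotal message a (category C2)
    minus payoff from the nonpivotal message b (category C1)."""
    return e_a_s2 * p_s2_given_c2 - e_b_s1 * p_s1_given_c1


def replaces_b_with_a(c, e_a_s2, p_s2_given_c2, e_b_s1, p_s1_given_c1):
    """True when condition (22) holds, so every optimal strategy sends a when x = b."""
    return c < framing_gain(e_a_s2, p_s2_given_c2, e_b_s1, p_s1_given_c1)


# Hypothetical parameters: the signal reads as good news in s = 2
# and as bad news in s = 1.
params = dict(e_a_s2=0.5, p_s2_given_c2=0.5, e_b_s1=-0.5, p_s1_given_c1=0.5)
print(framing_gain(**params))
print(replaces_b_with_a(0.3, **params))
print(replaces_b_with_a(0.6, **params))
```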

COROLLARY 2. Suppose that the conditions of Proposition 3 hold and that any message the persuader sends is uninformative about quality within each categorization given the public information: E^{Ci}[q | r, m, s = 0] = E^{Ci}[q | r, s = 0] for all i ∈ {1, 2}. So long as (22) still holds, any optimal strategy of the persuader dictates replacing x = b with m = a: the optimal strategy of the persuader dictates paying a positive cost to send a message that is uninformative about quality within each categorization in order to desirably categorize situation s = 0 in light of public information r.

Examples from the Introduction illustrate Corollary 2. With its ad “Be bullish,” Merrill Lynch frames itself as a provider of opportunities during the Internet bubble, without conveying any useful information. With its ad, “We try harder,” Avis frames itself as an underdog, and encourages car renters to interpret its lagging status in this favorable light, again without conveying any useful information.

COROLLARY 3 (Withholding Good Messages). Suppose the conditions of Proposition 3 hold and that m = b is a favorable message within each categorization given the public information: E^{Ci}[q | r, b, s = 0] > E^{Ci}[q | r, a, s = 0] for all i ∈ {1, 2}. So long as (22) still holds, any optimal strategy of the persuader dictates replacing private signal x = b with message m = a: the optimal strategy of the persuader may dictate paying a positive cost to send a message that is universally less favorable in order to avoid undesirably categorizing situation s = 0 in light of public information r.

Corollary 3 presents a more subtle prediction of the model, which we return to in Section V, where we study mutual fund advertising. It says that firms avoid presenting good news when such news creates unattractive frames. In Section V, we show that mutual funds avoid presenting favorable information about their relative returns during periods of falling stock prices, perhaps because the mere mention of returns invites co-categorization of  as grabbing opportunities, a frame that is not compelling during declining markets.

The previous two propositions characterize the persuader’s optimal strategy conditional on public information. It is also interesting to examine the persuader’s full strategy to see how it might depend on the public information. From Proposition 1, we know that the persuader’s optimal strategy is independent of public information when the audience consists of Bayesians. The same is not necessarily true when the audience consists of coarse thinkers:

PROPOSITION 4. Suppose individuals are coarse thinkers. Then an optimal strategy of the persuader in situation s = 0 may depend on public information. That is, there may exist some x such that the persuader reports m = x when r = u but reports m ≠ x when r = d.

Proof (by example). Example A. Suppose C(u) = C(d) = C1 and message m = a is pivotal given each piece of public information; that is, C(u, a) = C(d, a) = C2. Further, suppose

E[q | u, a, s = 2]p(s = 2 | C2) > E[q | u, b, s = 1]p(s = 1 | C1) and

E[q | d, b, s = 1]p(s = 1 | C1) > E[q | d, a, s = 2]p(s = 2 | C2).

These two inequalities imply that E^{C2}[q | u, a, s = 0] > E^{C1}[q | u, b, s = 0] and E^{C1}[q | d, b, s = 0] > E^{C2}[q | d, a, s = 0]. So long as c < min{E^{C2}[q | u, a, s = 0] − E^{C1}[q | u, b, s = 0], E^{C1}[q | d, b, s = 0] − E^{C2}[q | d, a, s = 0]}, the persuader’s optimal strategy dictates replacing x = b with m = a when r = u and replacing x = a with m = b when r = d.

Example B. Suppose C(u) = C1, C(d) = C2, and messages are not pivotal given either possible piece of public information. Further suppose that E[q | u, a, s = 1] > E[q | u, b, s = 1] and E[q | d, b, s = 2] > E[q | d, a, s = 2]. These two inequalities imply that E^{C1}[q | u, a, s = 0] > E^{C1}[q | u, b, s = 0] and E^{C2}[q | d, b, s = 0] > E^{C2}[q | d, a, s = 0]. So long as c < min{E^{C1}[q | u, a, s = 0] − E^{C1}[q | u, b, s = 0], E^{C2}[q | d, b, s = 0] − E^{C2}[q | d, a, s = 0]}, the persuader’s optimal strategy dictates replacing x = b with m = a when r = u and replacing x = a with m = b when r = d. ∎
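Example A can be made concrete by tabulating the persuader's payoffs and computing his best report for every (r, x). This is a minimal sketch: the payoff table holds hypothetical category-level expectations of quality in s = 0 after each (r, m), chosen to satisfy the two inequalities of Example A, and the function name is an assumption of this illustration.

```python
def best_message(r, x, c, payoff):
    """Best report given public information r, private signal x, and cost c.

    Reporting m = x is free; sending the other message costs c.
    Ties are broken in favor of the truthful report.
    """
    other = "b" if x == "a" else "a"
    truthful = payoff[(r, x)]
    altered = payoff[(r, other)] - c
    return other if altered > truthful else x


# Hypothetical expectations of quality in s = 0 after each (r, m).
payoff = {
    ("u", "a"): 0.4, ("u", "b"): 0.1,   # after good news, the pivotal message a helps
    ("d", "a"): -0.2, ("d", "b"): 0.3,  # after bad news, the nonpivotal message b helps
}

c = 0.1  # below both payoff gaps, as Example A requires
strategy = {(r, x): best_message(r, x, c, payoff) for r in "ud" for x in "ab"}
print(strategy)
```

With these numbers the persuader replaces x = b with m = a when r = u and replaces x = a with m = b when r = d, so the report for a given x depends on r.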

We supply two examples in the proof of Proposition 4 (though only one is necessary). Merrill Lynch’s advertising campaigns over the course of the Internet bubble illustrate both examples but also show how specific advertisements take advantage of both framing and transference. In the first example, there is always a pivotal message, which allows the persuader to frame s = 0 differently depending on the information he wishes to frame. Merrill Lynch advertises with “be bullish” to frame itself as a provider of opportunities when market returns have been high, but with “a tradition of trust” to frame itself as a provider of services when market returns have been low.
The second example shows that the persuader may wish to send different, nonpivotal, messages depending on how s = 0 is categorized given the public information. Having framed public information about market returns, Merrill Lynch advertises its financial analysts upon framing itself as a provider of opportunities, but its financial consultants or advisors upon framing itself as a provider of services. These features of Merrill Lynch’s ads tap into, rather than alter, the prevailing mental model and take advantage of transference.

IV. PRODUCT BRANDING

A major challenge for the fields of economics and marketing is to understand product branding. Consumers buy many branded products, often repeatedly, at higher prices than identical or nearly identical “generic” products (Tirole 1988). According to Peter and Olson (2005, p. 97), 71% of cigarette buyers, 65% of mayonnaise buyers, 61% of toothpaste buyers, and 53% of bath soap buyers are loyal to their brands (i.e., claim in a survey of 2000 respondents to mostly buy the same brand). Although some brands are physically different from generic products, others, such as Clorox bleach, are identical. In this section, we show how firms may be able to differentiate their products and create brands through uninformative advertising.
To fix ideas, consider the case of California “Burgundy.” Burgundy is a French region that produces high-quality and expensive wines from the pinot noir grape. California also produces expensive wines from the pinot noir grape. In California, they are called pinot noir, not burgundy. About 40 years ago, California wine producers started making inexpensive red wines called California Burgundy. These wines contain no pinot noir grape, only cheaper varietals. Even so, it appears that merchants tend to charge more for a 5-liter package of Peter Vella California Burgundy than for the same-sized package of Peter Vella Delicious Red. In May 2007, out of the 48 stores listed on winesearcher.com as selling both wines, 27 charged more, 20 charged the same amount, and 2 charged less for California Burgundy. The average “premium” on California Burgundy is 73 cents ($12.60 for California Burgundy as opposed to $11.87 for Delicious Red). As a brand, California Burgundy appears to command a price premium even though its physical characteristics have nothing to do with Burgundy wine.
(It is still possible, however, that the brand is superior for some other reason we cannot verify.)
Our model provides a way to understand why California Burgundy sells for more than Delicious Red. Consider a consumer facing one of two similar situations: buying a bottle of American red wine (s = US) and buying a bottle of French red wine (s = FR). American bottles are all initially labeled Table (x = Table), but French bottles are either initially labeled Table or Burgundy (x = Burgundy). We assume that neither label is informative in evaluating the quality of American wine, but that the label Burgundy is a positive signal of quality for French wine. Specifically,18

(23) E[q | x = Burgundy, s = FR] > 0 > E[q | x = Table, s = FR]
     E[q | x = Burgundy, s = US] = 0 = E[q | x = Table, s = US].

The audience is populated by measure one of consumers. Fraction β of consumers are coarse thinkers who co-categorize buying a bottle of American wine with buying a bottle of French wine no matter how the bottle is labeled: for all m, C(m) = {US, FR} ≡ C.19 Fraction 1 − β are Bayesians who differentiate the two situations. From (23) it follows that for coarse thinkers

(24) E^C[q | m = Burgundy, s = US] > 0 > E^C[q | m = Table, s = US]

and for Bayesians

(25) E[q | m = Burgundy, s = US] = E[q | m = Table, s = US].

18. We ignore the possibility of public information in this section.
19. To simplify matters, we only allow situation s = US to be co-categorized with one other situation, s = FR, in this section. In other words, we abstract away from the possibility that the coarse thinker may group the situation “buying a bottle of American red wine” with another distinct situation (e.g., “buying a bottle of Italian red wine”) depending on the message sent by the producer.

A monopolist wine producer in the United States sells two homogeneous (or perfectly substitutable) wines, w = 1 and w = 2. The producer can at zero cost label the wines “Table” or pay a fixed advertising cost c ≥ 0 to label one or both wines “Burgundy.”20 For simplicity, the marginal cost of production is set to equal 0. A consumer buys at most one bottle of wine. Decision utility from purchasing a bottle of wine w is given by

(26) U(w) = ū + q_w − p_w,

where q_w denotes the quality and p_w the price of wine w and ū > 0 is some constant. To limit the number of cases considered, assume that βū < −E^C[q | Table, US].21 Utility from not buying wine is assumed to equal 0. Consumers maximize their expected (decision) utility.
Consider a game with two periods. In the first period, the monopolist simultaneously decides how to label and price its wines.22 In the second, consumers observe the label and price of each wine and decide which wine (if any) to buy.
An equilibrium in this context is defined to be a tuple of strategies satisfying (i) the producer’s choice of labels and prices (his strategy) maximizes profits given consumers’ strategies and (ii) for each set of labels and prices each consumer’s strategy dictates making a purchase decision that maximizes expected utility. We restrict attention to pure strategy equilibria in which the firm sells both wines to a positive fraction of the population on the equilibrium path. Defining

(27) c̄ ≡ β(ū + E^C[q | Burgundy, US]),

we have the following results:

PROPOSITION 5. Suppose all individuals are Bayesians (β = 0) and a monopolist wine producer sells two homogeneous wines in the United States. Then there exists an equilibrium such that the producer does not label either wine Burgundy and charges the same price $\bar{u}$ for each of its wines. When c > 0, any equilibrium has this property.

20. If it seems troubling to assume that it is costly to label wine “Burgundy,” note that we allow c = 0.
21. This eliminates the possibility that the monopolist could find it optimal to label both wines Table and charge $\bar{u} + E^C[q \mid \text{Table}, \text{US}] < \bar{u}$ (recall that $E^C[q \mid \text{Table}, \text{US}] < 0$) for each wine (selling to both Bayesians and coarse thinkers).
22. We assume that buyers do not take price to be a potential signal of quality in either s = US or s = FR.

Proof. In Appendix I.

PROPOSITION 6. Suppose fraction 0 < β < 1 of consumers are coarse thinkers who co-categorize buying U.S. and French wine and there is a monopolist wine producer selling two homogeneous wines in the United States. If $c < \bar{c}$, product differentiation through uninformative advertising emerges: in any equilibrium, the producer replaces x = Table with m = Burgundy for exactly one of its wines and charges $\bar{u} + E^C[q \mid m = \text{Burgundy}, s = \text{US}] > \bar{u}$ for that wine while charging $\bar{u}$ for the other. The higher-priced wine is sold to the coarse thinkers. If $c > \bar{c}$, then in any equilibrium the producer labels each wine m = x = Table and charges $\bar{u}$ for each wine. In any such equilibrium, coarse thinkers do not buy either wine.

Proof. In Appendix I.

Proposition 6 shows how product branding emerges in equilibrium for certain parameter values. In such an equilibrium, the branded good is sold for a higher price than the generic good. This is possible because the coarse-thinking audience (incorrectly) believes the branded good is superior: it is associated with an attribute that contains decision-relevant information in a co-categorized situation. This rendition of branding seems broadly consistent with the standard discussions in the marketing literature (Sutherland and Sylvester 2000; Peter and Olson 2005).
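The threshold in Proposition 6 can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function names and the parameter values in the usage note are hypothetical, and the profit expressions are the ones derived in Appendix I for the two candidate equilibria.

```python
def branding_profits(beta, u_bar, eq_burgundy, c):
    """Producer profits in the two candidate equilibria.

    beta        : fraction of coarse thinkers, 0 < beta < 1
    u_bar       : the constant u-bar in the utility function (26)
    eq_burgundy : E^C[q | Burgundy, US], the coarse thinker's expected quality
    c           : fixed cost of using the "Burgundy" label
    (All argument names are illustrative, not taken from the paper.)
    """
    # Both wines labeled "Table" and priced at u_bar: only Bayesians buy.
    no_diff = (1 - beta) * u_bar
    # One wine labeled "Burgundy": Bayesians pay u_bar, coarse thinkers pay
    # u_bar + eq_burgundy, and the label cost c is paid once.
    diff = u_bar + beta * eq_burgundy - c
    # Cost threshold from equation (27).
    c_bar = beta * (u_bar + eq_burgundy)
    return no_diff, diff, c_bar


def labels_one_wine_burgundy(beta, u_bar, eq_burgundy, c):
    """True when differentiation is strictly more profitable than no differentiation."""
    no_diff, diff, _ = branding_profits(beta, u_bar, eq_burgundy, c)
    return diff > no_diff
```

For example, with the made-up values beta = 0.5, u_bar = 1.0, and eq_burgundy = 0.5, the threshold is 0.75, so a label cost of 0.25 leads to branding while a cost of 1.0 does not.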

V. MUTUAL FUNDS

The mutual fund industry presents a major challenge to financial economics. It is enormous, supervising around $7 trillion of investor assets. It includes thousands of competitors, who nonetheless charge high fees and remain highly profitable. Perhaps most strikingly, it appears to provide no economic value to investors, with virtually all mutual funds underperforming by a significant margin passive strategies offered by low-fee funds (Swensen 2005). How can an industry be so successful while adding so little value and charging so much?

Perhaps part of the answer is successful persuasion. Below we present a simple model of advertising to coarse-thinking mutual fund investors and offer some evidence that it can help explain.

V.A. A Simple Model

Suppose that coarse-thinking individuals do not sufficiently differentiate between choosing a mutual fund and selecting other professional advisors, such as doctors or lawyers, or grabbing great opportunities, such as buying stocks or finding jobs on tips. Individuals face one of three similar situations: selecting a mutual fund (s = MF), selecting a professional service (s = PS), and grabbing an opportunity (s = GO).

The data an individual sees are as follows. First, he observes publicly available data about general past performance r ∈ {u, d}, where r = u stands for good past performance and r = d stands for bad past performance. For a mutual fund, this could be the past market return. For another professional service, this could be the history of success of a particular surgery or type of lawsuit. For grabbing an opportunity, this could be his own or his friends’ experience with chasing tips.

Second, he may receive a more specific hard message from a persuader about past performance $m_p \in \{a_p, b_p\}$, where $m_p = a_p$ stands for good past performance and $m_p = b_p$ stands for bad past performance. For a mutual fund, this could be some measure of its past relative or absolute return. For another professional service, this could be the history of success of a particular doctor or lawyer. For grabbing an opportunity, this could be a measure of past success, such as return, as well.

We depart from the formal model presented earlier in one way. We assume that the mutual fund cannot fabricate hard information about past performance. In other words, it cannot send $m_p \neq x_p$. On the other hand, the mutual fund can at zero cost choose not to report information about past returns and send message $m_p = \emptyset$ no matter the realization of $x_p$. This departure does not introduce any new conceptual issues.
To apply the analysis of previous sections we make five assumptions. First, we expand the set of possible private signals to include the empty message and assume that it is in general uninformative about quality:23

ASSUMPTION 1.

$$E[q \mid r, x_p = \emptyset, s] = E[q \mid r, s] \quad \text{for all } r \text{ and } s. \tag{28}$$

23. A realistic assumption is that mutual funds always have access to verifiable past performance data (so $p(\emptyset \mid s = MF) = 0$). However, we do not need to make any explicit assumptions on $p(x_p \mid s = MF)$ for the following analysis to hold.

Second, we assume that the availability of past performance data is associated with grabbing opportunities. Specifically, we assume that the joint distribution p is such that selecting a mutual fund is co-categorized with grabbing an opportunity if and only if the fund reports a non-null message about its past performance.24 Formally,

ASSUMPTION 2.

$$C(r, m_p) = \begin{cases} C_{GrabOpp} \equiv \{MF, GO\} & \text{if } m_p \neq \emptyset \\ C_{ProServ} \equiv \{MF, PS\} & \text{if } m_p = \emptyset \end{cases} \quad \text{for all } r. \tag{29}$$

Third, we assume that all data on past performance are uninformative in evaluating the quality of a mutual fund (Carhart 1997; Chevalier and Ellison 1997; Sirri and Tufano 1998; Swensen 2005)25:

ASSUMPTION 3.

$$E[q \mid r, x_p, s = MF] = 0 \quad \text{for all } (r, x_p). \tag{30}$$

Assumption 3 implies that it is also the case that

$$E[q \mid r, m_p, s = MF] = 0 \quad \text{for all } (r, m_p). \tag{31}$$

Fourth, we assume that in evaluating other professional services or opportunities, good past performance data constitute good news about quality:

ASSUMPTION 4.

$$\begin{aligned} E[q \mid u, x_p, s] &\geq E[q \mid d, x_p, s] \quad \text{for all } x_p \\ E[q \mid r, a_p, s] &\geq E[q \mid r, b_p, s] \quad \text{for all } r \end{aligned} \tag{32}$$

for each s ∈ {PS, GO}.

Finally, we assume that past aggregate performance data are “sufficiently” more informative relative to individual performance data in evaluating the quality of opportunities than in evaluating the quality of other professional services:26

24. An implicit assumption here is that, unlike mutual funds, other professional services and opportunities sometimes do not have access to past performance data (or such data are unverifiable) and access is uncorrelated with quality.
25. We are carrying over assumption (19) from the earlier sections that, for all s, E[q | s] = 0.
26. The conditions of Assumption 5 guarantee, respectively, that the mutual fund always reports performance when aggregate returns have been high and that

ASSUMPTION 5.

$$\begin{aligned} E[q \mid u, b_p, s = GO]\, p(s = GO \mid C_{GrabOpp}) &> E[q \mid u, s = PS]\, p(s = PS \mid C_{ProServ}) \\ E[q \mid d, a_p, s = GO]\, p(s = GO \mid C_{GrabOpp}) &< E[q \mid d, s = PS]\, p(s = PS \mid C_{ProServ}). \end{aligned} \tag{33}$$

The above assumptions yield the following proposition:

PROPOSITION 7. Suppose individuals are coarse thinkers and Assumptions 1–5 hold. When aggregate returns have been high (r = u), the optimal strategy of the mutual fund dictates always reporting message $m_p = x_p$ about past performance; that is, always reporting hard information about past returns. When aggregate returns have been low (r = d), the optimal strategy of the mutual fund dictates always reporting message $m_p = \emptyset$ about past performance; that is, never reporting hard information about past returns.
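The comparison behind Proposition 7 can be sketched in a few lines. This is an illustrative sketch only: the function name and the numbers in the usage note are hypothetical, and the two payoffs are the category-weighted expectations that the proof compares for a coarse-thinking audience.

```python
def reports_returns(eq_go, p_go, eq_ps, p_ps):
    """True if reporting past returns beats staying silent.

    eq_go : E[q | r, x_p, s = GO]
    p_go  : p(s = GO | C_GrabOpp)
    eq_ps : E[q | r, s = PS]
    p_ps  : p(s = PS | C_ProServ)
    (Argument names are illustrative, not taken from the paper.)
    """
    # Reporting triggers the "grabbing an opportunity" category.
    payoff_report = eq_go * p_go
    # Silence triggers the "professional service" category.
    payoff_silent = eq_ps * p_ps
    return payoff_report > payoff_silent
```

Under made-up values in the spirit of footnote 26 (aggregate data uninformative for professional services, so eq_ps = 0), a fund with bad own returns in a rising market (eq_go = 0.25) reports, while a fund with good own returns in a falling market (eq_go = -0.25) stays silent.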

Proof. Fixing r and $x_p$, the persuader receives payoff $E[q \mid r, x_p, s = GO]\, p(s = GO \mid C_{GrabOpp})$ if he reports $m_p = x_p$ (by Assumptions 2 and 3) and receives payoff $E[q \mid r, s = PS]\, p(s = PS \mid C_{ProServ})$ if he reports $m_p = \emptyset$ (by Assumptions 1, 2, and 3). By Assumptions 4 and 5, the former payoff is strictly higher than the latter whenever r = u, and strictly lower whenever r = d. Consequently, any optimal strategy of the persuader dictates always reporting $m_p = x_p$ when r = u and always reporting $m_p = \emptyset$ when r = d. Uniqueness is immediate.

Proposition 7 yields two implications. First, because past performance data are informative for evaluating the quality of opportunities, a mutual fund will advertise with data about its past performance when aggregate returns have been high to co-categorize selecting the fund with grabbing such opportunities and thereby to maximize the reaction to the good aggregate returns. Second, to minimize the reaction to bad aggregate returns, a mutual fund will not report even universally favorable information about past returns when aggregate returns have been low,

it never reports it when aggregate returns have been low. An example might clarify the link between the intuition and the mathematical conditions of Assumption 5. The mathematical conditions would be met, for instance, if past aggregate performance data were uninformative in selecting another professional service but were “more informative” than even individual performance data in grabbing an opportunity in the sense that $E[q \mid u, b_p, s = GO] > 0$ and $E[q \mid d, a_p, s = GO] < 0$.

to avoid co-categorizing selecting the fund with grabbing such opportunities.

Before turning to the evidence, three comments about these implications are in order. First, the prediction that mutual funds advertise their own performance in rising but not in falling markets is difficult to reconcile with any plausible theory of rational persuasion. One could imagine a rational signal-extraction theory in which own past performance was informative about managerial ability in rising, but not in falling, markets. However, this does not strike us as plausible (in fact, the reverse seems more plausible). Moreover, this rational signal-extraction theory is difficult to reconcile with other facts about mutual fund advertising, such as omission of data about fees.

Second, the model does not deal with one additional important fact about mutual fund advertising, namely that it rarely includes data on management fees (Cronqvist 2005), data that are arguably crucial to assessing future returns. An extension of our model, which takes advantage of the fact that consumers often do not know fees for other professional services, such as doctors, might account for this finding. Such an extension was explored in the previous draft of this paper.
Third, another extension of the model might deal with the closely related question of which products to advertise. Specifically, firms should advertise products that co-categorize selecting a mutual fund with grabbing opportunities after returns have been high, and products that co-categorize selecting the fund with choosing other professional services after returns have been low. Mullainathan and Shleifer (2006) present evidence that the advertising of growth funds is highly procyclical. Below, however, we focus on the predictions about the inclusion of past returns in mutual fund advertisements.

V.B. Evidence

We put together a data set of all financial advertisements from two magazines: Business Week and Money. Business Week is a weekly business newsmagazine. We examine all issues from January 1, 1994, to December 31, 2003. Money is monthly and more specifically directed at individual investors. We examine all the issues from January 1, 1995, to December 31, 2003. (The one-year difference in coverage is due to hard copy availability in Harvard libraries.) We copy and date all financial advertisements

in every issue, both to count them and to examine their content. We aggregate the information on both the number and the content of ads into quarterly series. In particular, we keep track of who the advertiser is, whether it is a mutual fund and, if so, of what kind, and whether the ads include information on the fund’s own past returns. Because we are interested in the persuasion of investors, we eliminate from the database business-to-business ads (principally investment banking ads, or other ads explicitly directed at companies). Our total sample includes 1,469 ads from Business Week and 4,971 ads from Money.

FIGURE II
Stock Mutual Fund Ads Returns / Total Stock Mutual Funds Ads

Figure II shows, for Business Week and Money separately, the share of stock mutual fund ads containing information on own past returns (absolute or relative) in all stock mutual fund ads over the sample period. It also shows, for the same period, the rolling one-quarter-lagged return on the S&P 500 index, the most common indicator of broad market returns. As Figure II shows, on average only about 60% of the stock mutual fund ads present any data on own past returns, and the correlation between the share of ads including these returns and the past market return is over .7 for both Money and Business Week.27 Indeed, Figure II makes it clear that, after the market crash, past returns data disappear from advertisements. This finding

27. These correlations fall by about .2 but remain highly statistically significant if we detrend all series using linear time trends.

is broadly consistent with the predictions of our model, namely that the inclusion of past returns data is used to frame mutual fund investing as grabbing an opportunity rather than as hiring advice.

One might object that this evidence is best explained by a simpler theory that funds only like to report good news, and the news is bad in down markets. If this were correct, then stock funds with good relative return news to report should report it in down markets as well because they can always do so. Corollary 3 predicts, in contrast, that even good news about relative returns should not be reported to avoid the grabbing opportunities categorization. Is this prediction borne out by the data? Do companies really avoid advertising good relative returns in down markets, as our model predicts?

Figure III addresses this hypothesis. It shows the relevant data for T. Rowe Price, a mutual fund complex that is the most frequent advertiser in our sample. We supplement our advertising data with a sample of T. Rowe Price stock mutual funds with assets over $300 million at the beginning of the sample period, so we can compute the number of large T. Rowe Price stock mutual funds that outperform the market. Figure III shows that T. Rowe Price places a lot of ads during this period and that it has many funds outperforming the S&P 500 after 1999. If anything, the number of stock funds with good relative performance rises sharply during 2001–2002. Nonetheless, both the number of stock mutual fund ads and the number of such ads reporting returns fall to near zero after the market declines. Even though T. Rowe Price has many funds with positive relative performance, it chooses not to advertise them.

FIGURE III
Stock Outperforming S&P 500 vs. Number of Ads (T Rowe Price Year-on-Year)
This finding is, again, broadly consistent with our model, in which advertising returns primes the “opportunities” frame, which is unattractive to investors in down markets.
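The correlation exercise reported in this section, including the linear detrending mentioned in footnote 27, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the two short series are invented for the example and are not the paper's data, and the paper does not specify its estimation code.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def detrend(ys):
    """Residuals from a least-squares fit of ys on the time index 0, 1, 2, ...

    Requires at least two observations.
    """
    n = len(ys)
    mt, my = (n - 1) / 2, sum(ys) / n
    slope = (sum((t - mt) * (y - my) for t, y in enumerate(ys))
             / sum((t - mt) ** 2 for t in range(n)))
    return [y - (my + slope * (t - mt)) for t, y in enumerate(ys)]


# Hypothetical quarterly series, invented purely for illustration.
share_with_returns = [0.70, 0.75, 0.80, 0.60, 0.35, 0.20]
lagged_market_return = [0.08, 0.10, 0.12, 0.02, -0.10, -0.15]

raw_corr = pearson(share_with_returns, lagged_market_return)
detrended_corr = pearson(detrend(share_with_returns),
                         detrend(lagged_market_return))
```

Comparing raw_corr with detrended_corr mirrors the robustness check in footnote 27: if the co-movement were driven only by a common time trend, the detrended correlation would be close to zero.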

VI. CONCLUSIONS

This paper has supplied a formal model of associative thinking reflecting ideas about inference and persuasion from such diverse fields as linguistics, psychology, politics, marketing, and advertising. The main idea is that individuals “think coarsely”: they group situations into categories and apply the same model of inference to all situations within a category. Coarse thinking exhibits two features that persuaders take advantage of: (i) transference, whereby individuals transfer the informational content of a given message from situations in a category where it is useful to those where it is not, and (ii) framing, whereby objectively useless information influences individuals’ choice of category. The model includes full Bayesian rationality as a limiting case, in which each situation is evaluated as if in its own category.

The model sheds light on several phenomena. It explains how “soft” messages with little informational content can be persuasive, especially in low-involvement situations such as choosing inexpensive or evaluating political candidates. It helps dissect the content of successful advertisements. It illuminates product branding. And it helps account for some features of mutual fund advertising, such as the procyclical inclusion of returns, that are hard to rationalize in any conventional model of informative communication.

Our paper is just a first step in the analysis of uninformative persuasion. Although we have allowed for category choice, we have not allowed for fluid categories, which can accommodate much more creativity on the part of the persuader (Lakoff 1987). We have also focused on associative thinking rather than on associative feeling; there are no automatic quick judgments in our model. In persuasion, such feelings are likely to play an important role as well.

APPENDIX I: PROOFS

A. Proof of Lemma

First we will show that part (i) holds so long as

$$\hat{\sigma}(m \mid r, x, s) = \begin{cases} 1 & \text{if } m = x \\ 0 & \text{if } m \neq x \end{cases} \tag{34}$$

and

$$\hat{p}(x \mid r, m, s) = \begin{cases} 1 & \text{if } x = m \\ 0 & \text{if } x \neq m \end{cases} \tag{35}$$

for s ∈ {1, 2}. Note that conditions (34) and (35) hold whether the audience is sophisticated or takes messages at face value. To establish part (i), we need to show that

$$C(r, m) = \arg\max_{C \in \{C_1, C_2\}} \hat{p}(s \in C \mid r, m) \tag{36}$$

does not depend on $\hat{\sigma}(m \mid r, x, s = 0)$ and $\hat{p}(x \mid r, m, s = 0)$. To this end,

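$$\begin{aligned}
C_1 = \arg\max_{C \in \{C_1, C_2\}} \hat{p}(s \in C \mid r, m)
&\Leftrightarrow \hat{p}(s \in C_1 \mid r, m) \geq \hat{p}(s \in C_2 \mid r, m) \\
&\Leftrightarrow \hat{p}(s = 1 \mid r, m) \geq \hat{p}(s = 2 \mid r, m) \\
&\Leftrightarrow \frac{\hat{p}(r, m \mid s = 1)\, p(s = 1)}{\sum_{s'} \hat{p}(r, m \mid s')\, p(s')} \geq \frac{\hat{p}(r, m \mid s = 2)\, p(s = 2)}{\sum_{s'} \hat{p}(r, m \mid s')\, p(s')} \\
&\Leftrightarrow \hat{p}(r, m \mid s = 1)\, p(s = 1) \geq \hat{p}(r, m \mid s = 2)\, p(s = 2) \\
&\Leftrightarrow p(r, m \mid s = 1)\, p(s = 1) \geq p(r, m \mid s = 2)\, p(s = 2).
\end{aligned} \tag{37}$$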

Note that this last condition is independent of $\hat{\sigma}(m \mid r, x, s = 0)$ and $\hat{p}(x \mid r, m, s = 0)$.

Now, to prove parts (ii) and (iii) of the lemma it is sufficient to show that $E[q \mid r, m, s = 0]$ and $E^{C(r,m)}[q \mid r, m, s = 0]$ do not depend on $\hat{\sigma}(m \mid r, x, s = 0)$ and $\hat{p}(x \mid r, m, s = 0)$. First consider $E[q \mid r, m, s = 0]$:

$$\begin{aligned}
E[q \mid r, m, s = 0] &= \sum_{x \in \{a, b\}} E[q \mid r, x, s = 0]\, \hat{p}(x \mid r, m, s = 0) \\
&= \sum_{x \in \{a, b\}} E[q \mid r, s = 0]\, \hat{p}(x \mid r, m, s = 0) \\
&= E[q \mid r, s = 0]
\end{aligned} \tag{38}$$

for any $\hat{\sigma}(m \mid r, x, s = 0)$ and $\hat{p}(x \mid r, m, s = 0)$, where the second equality follows from the uninformative persuasion assumption. Equation (38) establishes the lemma for the case where the audience consists of Bayesians.

Now consider $E^{C(r,m)}[q \mid r, m, s = 0]$. Fixing $C(r, m) = C_i$:

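$$\begin{aligned}
E^{C_i}[q \mid r, m, s = 0]
&= E[q \mid r, m, s = 0]\, p(s = 0 \mid C_i) + E[q \mid r, m, s = i]\, p(s = i \mid C_i) \\
&= E[q \mid r, s = 0]\, p(s = 0 \mid C_i) + \Bigg( \sum_{x' \in \{a, b\}} E[q \mid r, x', s = i]\, \hat{p}(x' \mid r, m, s = i) \Bigg) p(s = i \mid C_i) \\
&= E[q \mid r, s = 0]\, p(s = 0 \mid C_i) + E[q \mid r, x = m, s = i]\, p(s = i \mid C_i)
\end{aligned} \tag{39}$$

for any $\hat{\sigma}(m \mid r, x, s = 0)$ and $\hat{p}(x \mid r, m, s = 0)$.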
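The two objects used in this proof, the category rule in (37) and the coarse expectation in (39), can be written as small functions. This is a hedged sketch: the function and argument names are hypothetical, and breaking ties in favor of $C_1$ simply follows the weak inequality in (37).

```python
def choose_category(lik_1, prior_1, lik_2, prior_2):
    """Return 1 if category C1 is selected under the last line of (37), else 2.

    lik_i   : p(r, m | s = i)
    prior_i : p(s = i)
    Nothing about situation s = 0 enters this comparison.
    """
    return 1 if lik_1 * prior_1 >= lik_2 * prior_2 else 2


def coarse_expectation(eq_s0, p_s0, eq_si, p_si):
    """Last line of equation (39).

    eq_s0 : E[q | r, s = 0]
    p_s0  : p(s = 0 | C_i)
    eq_si : E[q | r, x = m, s = i]
    p_si  : p(s = i | C_i)
    """
    return eq_s0 * p_s0 + eq_si * p_si
```

Neither function takes the audience's beliefs about the persuader's strategy in situation s = 0 as an input, which is the content of the lemma.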

B. Proof of Propositions 5 and 6

First, note that a necessary and sufficient condition for any given strategy of a consumer (Bayesian or coarse) to be a best response to the producer’s strategy is that it specifies buying wine $w = i \neq j$ with positive probability for prices $(p_1, p_2)$ and labels $(m_1, m_2)$ only if

$$\begin{aligned}
\bar{u} + q_i^e - p_i &\geq \bar{u} + q_j^e - p_j && \text{(IC)} \\
\bar{u} + q_i^e - p_i &\geq 0 && \text{(IR)}
\end{aligned} \tag{40}$$

and places positive probability on not buying either wine only if

$$0 \geq \max_{k \in \{1, 2\}} \left( \bar{u} + q_k^e - p_k \right) \tag{41}$$

where $q_i^e = E[q \mid m_i, \text{US}]$ if the consumer is a Bayesian and $q_i^e = E^C[q \mid m_i, \text{US}]$ if the consumer is a coarse thinker.

The monopolist maximizes its profit, $\Pi = p_1 D_1(p_1, p_2, \delta) + p_2 D_2(p_1, p_2, \delta) - \delta_1 c - \delta_2 c$, given consumers’ strategies, where $D_i \in (0, 1)$ denotes the total demand for wine w = i given consumers’ strategies and $\delta_i$ is an indicator variable taking on the value of 1 if and only if the monopolist labels wine w = i “Burgundy.”

Consider a possible equilibrium where both wines are labeled “Table.” In any such equilibrium, it must be the case that $p_1 = p_2 \equiv p$, because we confine attention to equilibria in which both wines are sold with positive probability on the equilibrium path and wine w = i would face zero demand due to consumers’ incentive compatibility (IC) constraints if $p_i > p_j$. Also, in any such equilibrium, $p = \bar{u}$. To see this, suppose first that $p > \bar{u}$. Then both wines would face zero demand by consumers’ IR constraints and the producer would want to deviate and charge $p = \bar{u}$, thereby selling to the Bayesians and making a positive profit. The producer would also not want to set $\bar{u} + E^C[q \mid \text{Table}, \text{US}] < p < \bar{u}$ or $p < \bar{u} + E^C[q \mid \text{Table}, \text{US}]$ because it would wish to deviate by setting $p = \bar{u}$ and $p = \bar{u} + E^C[q \mid \text{Table}, \text{US}]$, respectively.
It is thus left to check that the producer also would not wish to set $p = \bar{u} + E^C[q \mid \text{Table}, \text{US}]$ and sell to both Bayesians and coarse thinkers. The producer’s payoff is $\bar{u} + E^C[q \mid \text{Table}, \text{US}]$ if it sets $p = \bar{u} + E^C[q \mid \text{Table}, \text{US}]$, whereas it is $(1 - \beta)\bar{u}$ if it sets $p = \bar{u}$. The second payoff is greater than the first because $\beta \bar{u} < -E^C[q \mid \text{Table}, \text{US}]$ by assumption.

We have established that in any equilibrium where both wines are labeled “Table” the producer charges $p = \bar{u}$ for each wine (selling only to Bayesians) and earns $(1 - \beta)\bar{u} \equiv \Pi(\text{NoDiff})$. (∗)

Now consider a possible equilibrium where the producer labels only one wine (say w = 2) “Burgundy.” In any such equilibrium, the producer clearly charges $p_1 = \bar{u}$ and $p_2 = \bar{u} + E^C[q \mid \text{Burgundy}, \text{US}]$ because these are the highest prices it can charge while still satisfying consumers’ IR constraints. In such an equilibrium, Bayesians buy w = 1 since $p_1 = \bar{u} < p_2 = \bar{u} + E^C[q \mid \text{Burgundy}, \text{US}]$ and coarse thinkers buy w = 2 because $\bar{u} + E^C[q \mid \text{Burgundy}, \text{US}] - p_2 = 0 > \bar{u} + E^C[q \mid \text{Table}, \text{US}] - p_1 = E^C[q \mid \text{Table}, \text{US}]$. The payoff to the producer in such an equilibrium is

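$$\begin{aligned}
(1 - \beta)\bar{u} + \beta\left(\bar{u} + E^C[q \mid \text{Burgundy}, \text{US}]\right) - c
&= \beta E^C[q \mid \text{Burgundy}, \text{US}] + \bar{u} - c \\
&\equiv \Pi(\text{Diff}).
\end{aligned} \tag{42}$$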

Thus, we have established that in any equilibrium where wine w = 2 is labeled “Burgundy” the producer charges $p_1 = \bar{u}$ for wine w = 1 (selling only to Bayesians), charges $p_2 = \bar{u} + E^C[q \mid \text{Burgundy}, \text{US}]$ for wine w = 2 (selling only to coarse thinkers), and earns $\Pi(\text{Diff})$. (∗∗)

By similar logic, the producer earns at most $\max\{\bar{u}, \beta(\bar{u} + E^C[q \mid \text{Burgundy}, \text{US}])\} - c < \Pi(\text{Diff})$ if it labels both wines “Burgundy,” so it will not do so in equilibrium.

Finally, comparing $\Pi(\text{NoDiff})$ with $\Pi(\text{Diff})$, we see that $\Pi(\text{Diff}) > \Pi(\text{NoDiff})$ iff $\beta E^C[q \mid \text{Burgundy}, \text{US}] + \bar{u} - c > (1 - \beta)\bar{u} \Leftrightarrow c < \beta(\bar{u} + E^C[q \mid \text{Burgundy}, \text{US}]) = \bar{c}$. It follows from this that (i) the producer labels exactly one wine “Burgundy” when $c < \bar{c}$, (ii) the producer labels both wines “Table” when $c > \bar{c}$, and (iii) there exists an equilibrium such that the producer labels both wines “Table” when $c = \bar{c}$. (∗∗∗)

The statements of Propositions 5 and 6 follow from (∗), (∗∗), and (∗∗∗).
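The consumer side of this proof, conditions (40) and (41), can be checked with a short sketch. The function name, the zero-based wine indices, the tie-breaking rule, and the parameter values in the usage note are all assumptions made for illustration.

```python
def consumer_choice(u_bar, expected_quality, prices):
    """Index of the wine bought (0 for w = 1, 1 for w = 2), or None.

    expected_quality[i] : q_i^e, the consumer's expected quality of wine i
    prices[i]           : p_i
    Assumed tie-breaking: an indifferent consumer buys, and picks the
    lower-indexed wine when both give the same surplus.
    """
    surplus = [u_bar + q - p for q, p in zip(expected_quality, prices)]
    # IC: pick the wine with the highest surplus.
    best = max(range(len(surplus)), key=lambda i: surplus[i])
    # IR: buy only if that surplus is nonnegative.
    return best if surplus[best] >= 0 else None
```

With the made-up values u_bar = 1.0, E^C[q | Table, US] = -0.75, and E^C[q | Burgundy, US] = 0.5, and the equilibrium prices (1.0, 1.5), a Bayesian with expected qualities (0, 0) buys the "Table" wine and a coarse thinker with expected qualities (-0.75, 0.5) buys the "Burgundy" wine, matching the separation described in the proof.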

APPENDIX II: “MORE BAYESIAN” UPDATING RULE

In this Appendix, we explore the alternative, “more Bayesian,” updating rule for coarse thinkers presented in equations (4)–(5). Recall that this update rule is different in one key way from that in our primary model: the coarse thinker conditions on the information received in weighing the expectation of quality in a situation. Under this update rule, conditional on choosing a category (which still depends on the data received), the coarse thinker can simply be thought of as a Bayesian whose information set contains only the knowledge of the category but not that of the situation. We state and prove natural analogs to Propositions 2 and 3 under these alternative assumptions and demonstrate that the results presented in Section II are largely robust.

Equilibrium is defined as in the main text for the case of a coarse, sophisticated audience except that the persuader is now assumed to maximize

$$\begin{cases} E[q \mid r, m, C(r, m)] & \text{if } m = x \\ E[q \mid r, m, C(r, m)] - c & \text{if } m \neq x \end{cases} \tag{43}$$

(rather than (12)), taking the audience’s beliefs $\hat{p}$ as given. The persuader is allowed to use mixed strategies.

We now characterize some optimal (equilibrium) strategies of the persuader when the audience consists of sophisticated coarse thinkers under this alternative updating rule. Fix public information r and, without loss of generality, assume that $C(r) = C_1$. First, consider the case where messages are not pivotal given r. Also, without loss of generality, suppose that private signal x = a is more favorable than private signal x = b in s = 1 in the following sense:

$$E[q \mid r, a, s = 1]\, p(s = 1 \mid r, a, C_1) \geq E[q \mid r, b, s = 1]\, p(s = 1 \mid r, b, C_1). \tag{44}$$

We have the following proposition:

PROPOSITION A.1 (Transference). Suppose individuals are sophisticated coarse thinkers under the modified updating rule, that messages are not pivotal given public information r, and that condition (44) holds. Then an optimal strategy of the persuader in situation s = 0 may involve the creation of a message. Specifically, so long as

$$c < E[q \mid r, a, s = 1]\, p(s = 1 \mid r, a, C_1) - E[q \mid r, b, s = 1]\, p(s = 1 \mid r, b, C_1), \tag{45}$$

it cannot be optimal for the persuader in situation s = 0 to always report the private signal. Further, if the inequality in (45) holds, there always exists an optimal strategy such that the persuader replaces x = b with m = a with positive probability given r.
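Condition (45) is a simple cost-benefit comparison, which the following sketch makes explicit. The function and argument names are hypothetical; the weights stand for the posterior probabilities $p(s = 1 \mid r, \cdot, C_1)$ that appear in (44) and (45).

```python
def fabrication_gain(eq_a, weight_a, eq_b, weight_b):
    """Right-hand side of (45): gain from sending m = a instead of m = b.

    eq_a, eq_b         : E[q | r, a, s = 1] and E[q | r, b, s = 1]
    weight_a, weight_b : p(s = 1 | r, a, C1) and p(s = 1 | r, b, C1)
    """
    return eq_a * weight_a - eq_b * weight_b


def truthful_reporting_can_be_optimal(c, eq_a, weight_a, eq_b, weight_b):
    """False when (45) holds, i.e. when the fabrication cost c is below the gain."""
    return not c < fabrication_gain(eq_a, weight_a, eq_b, weight_b)
```

For instance, with the made-up values eq_a = 1.0, eq_b = 0.0, and both weights equal to 0.5, the gain is 0.5, so any fabrication cost below 0.5 rules out always reporting the private signal.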

Proof. See online Appendix.

We now consider the case where categorization depends on the exact message the persuader sends; that is, one message is pivotal. To limit the number of cases, suppose that message m = a is pivotal and that the persuader weakly “prefers” private signal x = a over private signal x = b:

$$E[q \mid r, a, s = 2]\, p(s = 2 \mid r, a, C_2) \geq E[q \mid r, b, s = 1]\, p(s = 1 \mid r, b, C_1). \tag{46}$$

Because message m = a is pivotal, it does not matter whether or not (44) still holds. We then have

PROPOSITION A.2 (Framing). Suppose individuals are sophisticated coarse thinkers under the modified updating rule, that message m = a is pivotal given public information r, and that condition (46) holds. Then an optimal strategy of the persuader in situation s = 0 may involve the creation of a message. Specifically, so long as

$$c < E[q \mid r, a, s = 2]\, p(s = 2 \mid r, a, C_2) - E[q \mid r, b, s = 1]\, p(s = 1 \mid r, b, C_1), \tag{47}$$

it cannot be optimal for the persuader in situation s = 0 to always report the private signal. Further, if the inequality in (47) holds, there always exists an optimal strategy such that the persuader replaces x = b with m = a with positive probability given r.

Proof. See online Appendix. 

HARVARD UNIVERSITY HARVARD UNIVERSITY

REFERENCES

Barberis, Nicholas, and Andrei Shleifer, “Style Investing,” Journal of , 68 (2003), 161–199.
Becker, Gary, “Rational Indoctrination and Persuasion,” University of Mimeo, 2001.
Becker, Gary, and Kevin Murphy, “A Simple Theory of Advertising as a Good or Bad,” Quarterly Journal of Economics, 108 (1993), 941–964.
Bertrand, Marianne, Dean Karlin, Sendhil Mullainathan, Eldar Shafir, and Jonathan Zinman, “What’s Psychology Worth? A Field Experiment in Consumer Credit Markets,” NBER Working Paper No. 11892, 2006.
Carhart, Mark, “On Persistence in Mutual Fund Performance,” Journal of Finance, 52 (1997), 57–82.
Carpenter, Gregory, Rashi Glazer, and Kent Nakamoto, “Meaningful Brands from Meaningless Differentiation: The Dependence on Irrelevant Attributes,” Journal of Marketing Research, 31 (1994), 339–350.
Chevalier, Judith, and Glenn Ellison, “Risk Taking by Mutual Funds as a Response to Incentives,” Journal of , 105 (1997), 1167–1200.
Crawford, Vincent, “Lying for Strategic Advantage,” , 93 (2003), 133–149.
Crawford, Vincent, and Joel Sobel, “Strategic Information Transmission,” Econometrica, 50 (1982), 1431–1451.
Cronqvist, Henrik, “Advertising and Portfolio Choice,” Mimeo, 2005.
DellaVigna, Stefano, and Ethan Kaplan, “The Fox News Effect,” Quarterly Journal of Economics, 122 (2007), 1187–1234.
Dewatripont, Mathias, and , “Advocates,” Journal of Political Economy, 107 (1999), 1–39.
Downs, Anthony, An Economic Theory of Democracy (New York: Harper and Row, 1957).
Edelman, Gerald, Bright Air, Brilliant Fire: On the Matter of the Mind (New York: Basic Books, 1992).
Ettinger, David, and Philippe Jehiel, “Toward a Theory of Deception,” mimeo, University College London, UCL, 2007.
Eyster, Erik, and Matthew Rabin, “Cursed Equilibrium,” Econometrica, 73 (2005), 1623–1672.
Fryer, Roland, and Matthew Jackson, “A Categorical Model of Cognition and Biased Decision-Making,” The B.E. Journal of Theoretical Economics, 8 (2008), Article 6.
Gabaix, Xavier, and David Laibson, “Shrouded Attributes, Consumer Myopia, and Information Suppression in Competitive Markets,” Quarterly Journal of Economics, 121 (2006), 505–540.
Gentzkow, Matthew, and Jesse Shapiro, “Media Bias and Reputation,” Journal of Political Economy, 114 (2006), 280–316.

Gilboa, Itzhak, and David Schmeidler, “Case-Based ,” Quarterly Journal of Economics, 110 (1995), 605–639.
Gilovich, Thomas, “Seeking the Past in the Future: The Effect of Associations to Familiar Events on Judgments and Decisions,” Journal of Personality and Social Psychology, 40 (1981), 797–808.
Glaeser, Edward, “The Political Economy of Hatred,” Quarterly Journal of Economics, 120 (2005), 45–86.
Glaeser, Edward, Giacomo Ponzetto, and Jesse Shapiro, “Strategic Extremism: Why Republicans and Democrats Divide on Religious Values,” Quarterly Journal of Economics, 120 (2005), 1283–1330.
Glazer, Jacob, and Ariel Rubinstein, “Debates and Decisions: On a Rationale of Argument Rules,” Games and Economic Behavior, 36 (2001), 158–173.
——, “On Optimal Rules of Persuasion,” Econometrica, 72 (2004), 1715–1736.
Goffman, Erving, Frame Analysis (Cambridge, MA: Harvard University Press, 1974).
Grossman, Sanford, “The Informational Role of Warranties and Private Disclosure about Product Quality,” Journal of , 24 (1981), 461–483.
Grossman, Sanford, and , “Disclosure and Takeover Bids,” Journal of Finance, 35 (1980), 323–334.
Hong, Harrison, Jeremy Stein, and Jialin Yu, “Simple Forecasts and Paradigm Shifts,” Journal of Finance, 62 (2007), 1207–1242.
Jehiel, Philippe, “Analogy-Based Expectation Equilibrium,” Journal of Economic Theory, 123 (2005), 81–104.
Kahneman, Daniel, and Amos Tversky, “Prospect Theory: An Analysis of Decision under Risk,” Econometrica, 47 (1979), 263–292.
——, Judgment Under Uncertainty: Heuristics and Biases (New York: Cambridge University Press, 1982).
Kartik, Navin, Marco Ottaviani, and Francesco Squintani, “Credulity, Lies, and Costly Talk,” Journal of Economic Theory, 134 (2007), 93–116.
Kihlstrom, Richard, and Michael Riordan, “Advertising as a Signal,” Journal of Political Economy, 92 (1984), 427–450.
Krueger, Joachim, and Russell Clement, “Memory-Based Judgments About Multiple Categories,” Journal of Personality and Social Psychology, 67 (1994), 35–47.
Lakoff, George, Women, Fire, and Dangerous Things (Chicago: The University of Chicago Press, 1987).
Malt, Barbara, Brian Ross, and Gregory Murphy, “Predicting Features for Members of Natural Categories When Categorization Is Uncertain,” Journal of Experimental Psychology, 21 (1995), 646–661.
McCloskey, Donald, and Arjo Klamer, “One Quarter of GDP Is Persuasion,” American Economic Review Papers and Proceedings, 85 (1995), 191–195.
Milgrom, Paul, “Good News and Bad News: Representation Theorems and Applications,” Bell Journal of Economics, 12 (1981), 380–391.
Milgrom, Paul, and John Roberts, “Price and Advertising as Signals of Product Quality,” Journal of Political Economy, 94 (1986a), 796–821.
——, “Relying on the Information of Interested Parties,” Rand Journal of Economics, 17 (1986b), 18–32.
Mullainathan, Sendhil, “Thinking through Categories,” mimeo, MIT, 2000.
Mullainathan, Sendhil, and Andrei Shleifer, “Media Bias,” NBER Working Paper No. 9295, 2002.
——, “The Market for News,” American Economic Review, 95 (2005), 1031–1053.
——, “Persuasion in Finance,” NBER Working Paper No. 11838, 2006.
Murphy, Gregory, and Brian Ross, “Predictions from Uncertain Categorizations,” Cognitive Psychology, 27 (1994), 148–193.
Murphy, Kevin, and Andrei Shleifer, “Persuasion in Politics,” American Economic Review Papers and Proceedings, 94 (2004), 435–439.
Nelson, Philip, “Advertising as Information,” Journal of Political Economy, 82 (1974), 729–754.
Okuno-Fujiwara, Masahiro, Andrew Postlewaite, and Kotaro Suzumura, “Strategic Information Revelation,” Review of Economic Studies, 57 (1990), 25–47.
Peski, Marcin, “Categorization,” University of Chicago Working Paper, 2006.

Peter, Paul, and Jerry Olson, Consumer Behavior and Marketing Strategy (New York: McGraw–Hill, 2005).
Shapiro, Jesse, “A ‘Memory-Jamming’ Theory of Advertising,” mimeo, University of Chicago, 2006.
Shin, Hyun Song, “News Management and the Value of Firms,” Rand Journal of Economics, 25 (1994), 58–71.
Sirri, Erik, and Peter Tufano, “Costly Search and Mutual Fund Flows,” Journal of Finance, 53 (1998), 1589–1622.
Smith, Eliot, “Mental Representations and Memory,” in Handbook of Social Psychology, fourth edition, Daniel Gilbert, Susan Fiske, and Gardner Lindzey, eds. (New York: McGraw–Hill, 1998).
Stigler, George, “The Economics of Information,” Journal of Political Economy, 69 (1961), 213–225.
——, The Theory of Price, fourth edition (New York: Macmillan Publishing, 1987).
Stigler, George, and Gary Becker, “De Gustibus Non Est Disputandum,” American Economic Review, 67 (1977), 76–90.
Sutherland, Max, and Alice Sylvester, Advertising and the Mind of the Consumer, second edition (St. Leonards, Australia: Allen & Unwin, 2000).
Swensen, David, Unconventional Success: A Fundamental Approach to Personal Investment (New York: Free Press, 2005).
Tirole, Jean, The Theory of Industrial Organization (Cambridge, MA: MIT Press, 1988).
Zaltman, Gerald, “Rethinking Market Research: Putting People Back In,” Journal of Marketing Research, 34 (1997), 424–437.