DRINKING YOUR OWN KOOL-AID:

SELF-DECEPTION, DECEPTION CUES AND PERSUASION IN MEETINGS

JEREMIAH W. BENTLEY University of Massachusetts Amherst

ROBERT J. BLOOMFIELD† Cornell University

SHAI DAVIDAI* Princeton University

MELISSA J. FERGUSON* Cornell University

April 2016

We thank Abigail Allen, Mary Kate Dodgson, Michael Durney, Scott Emett, Steve Kachelmeier, Bob Libby, Eldar Maksymov, Patrick Martin (MAS Discussant), Mark Nelson, Ivo Tafkov (AAA Discussant), and workshop participants at the AAA Annual Meeting, the Brigham Young University Accounting Symposium, Cornell University, the LEMA group at Penn State, the Managerial Accounting Section Mid-Year Meeting, the University of Massachusetts Amherst, and the University of Michigan for their helpful comments and suggestions.

†Corresponding author: Johnson Graduate School of Management, Cornell University, Ithaca, NY 14850. Email: [email protected] *Joined the paper after the first two authors had written a manuscript based on experiment 1. All four authors contributed equally to the development of the paper from that point.

DRINKING YOUR OWN KOOL-AID: SELF-DECEPTION, DECEPTION CUES AND PERSUASION IN MEETINGS

Abstract: Two experiments show that face-to-face meetings help users discern reporters' true beliefs better than they can from a written report alone. Both experiments are based on a 'cheap talk' setting, modified to include two features common to accounting settings: reporters base reports on rich information and (in a meeting condition) have rich channels of communication to users. Experiment 1 shows that meetings improve users' ability to discern the beliefs reporters held before they had an incentive to deceive. Once reporters learned of their incentive to deceive users, they revised their beliefs toward what they wanted users to believe (they self-deceived); those who revised more were more successful in their deception. Experiment 2 shows that users discerned reporters' beliefs through linguistic tone: reporters who believed their reports used more positive words. The results highlight the importance of face-to-face meetings and provide experimental support for Trivers' self-deception theory.

Key words: Deception, persuasion, linguistics, cheap-talk, reporting, motivated reasoning, self-deception

I. INTRODUCTION

“Jerry, just remember. It’s not a lie… if you believe it.”

George Costanza, Seinfeld, Episode 102 (“The Beard”) (1995).1

The face-to-face meeting is a pervasive institution for evaluating the veracity of performance reports. We report the results of a laboratory experiment showing that the rich communication environment of a face-to-face meeting causes reporters to betray their true beliefs to the users of their reports. Reporters limit detection by deceiving themselves into believing what they prefer the users to believe (they “drink their own Kool-Aid”2). A second experiment shows that reporters betray their beliefs partly by using more positive linguistic tone when they are honest than when they are deceptive.

Both of our experiments examine ‘cheap talk’ settings in which a reporter communicates to a user without any auditing technologies or punishments for misreporting. Because the reporter and user have misaligned incentives, economic theory predicts that the reporter will lie and that the user will ignore the report (e.g. Crawford and Sobel 1982; Forsythe, Lundholm, and Rietz 1999). To capture key features of deception in accounting settings, we modify the usual laboratory cheap talk setting in two ways. First, we require reporters to form judgments about the meaning of rich data, much as they would when reporting reserves, allowances, useful lives, impairments, or fair values. Second, we allow some reporter-user pairs to meet face-to-face to discuss the report, as they would through meetings with shareholders, conference calls, or narrative reports to investors (such as Management Discussion and Analysis) or constituents (as in the narratives recommended by GASB, 2008).

1 From the script provided at http://www.seinology.com/scripts/script-102.shtml, accessed March 2, 2016.
2 See, for example, http://www.geekwire.com/2011/drink-koolaid/.

Our results are consistent with predictions drawn from Trivers’ self-deception theory (Trivers 1976/2006, 1985). Trivers and coauthors (e.g. von Hippel and Trivers 2011) argue that people who deceive themselves are better able to deceive others. We define other-deception (which we also refer to simply as “deception”) as someone expressing a judgment that is distorted in a directional fashion with the intent to persuade another party of something that the expressing party does not sincerely believe. Following Trivers (1976/2006), we define self-deception as distorting one’s sincere belief about a judgment (not merely the expression of the judgment) toward what one wishes someone else will believe. Self-deception theory argues that deception comes with psychological costs (e.g. cognitive dissonance, discomfort from violating other-regarding preferences) and cognitive costs (e.g. increased cognitive load from maintaining a consistent false story). These costs produce subtle deception cues that users may detect, such as fewer positive words relative to negative words, increased pauses, and uncontrollable microexpressions (e.g. Vrij 2008; DePaulo et al. 2003). Reporters suppress these cues by deceiving themselves.

The traditional cheap talk paradigm restricts communication between reporters and users, which prevents the transmission and detection of deception cues. We predict that relaxing restrictions on communication by allowing reporters and users to meet will increase the transmission and detection of deception cues, which in turn will increase the degree to which users’ beliefs are associated with reporters’ initial beliefs and self-deception.

To test our predictions, we construct a two-player cheap-talk game that presents reporters with rich and subjective information. One participant of each pair is assigned to be the reporter and enters the lab ostensibly to participate in an individual decision-making study. He sees two decks of cards.3 Each card shows the title and title screen of a YouTube video. The monetary value of the card is a function of the number of times the video has been viewed on YouTube. The number of views is not shown on the card, forcing reporters to make a subjective judgment that requires rich communication and is susceptible to self-deception. After examining both decks of cards, the reporter decides how many cards he would like to have drawn randomly from each deck, for a total of 10 cards, knowing he will be paid the cash value of each card drawn.

Next, the reporter is told that he will have the chance to interact with another participant (the user), and be paid a bonus of $0.50 for each card that he can persuade the user to draw from one deck (the ‘commission’ deck). The commission deck is selected by coin flip, and thus may or may not be the deck he sincerely believes to be the better option. The user knows the basic nature of the card decks, but has seen only three cards from each deck. The user therefore has an incentive to learn the reporter’s sincere judgment, but also knows that the reporter may have an incentive to express that judgment deceptively. After the interaction, the user decides how many cards she would like to draw from each deck, for a total of 20 cards, and is paid the value of the cards drawn. The user’s decision reflects the reporter’s success at other-deception. The reporter also draws another ten cards from the two decks under the same incentives as before. This decision reflects the reporter’s judgment after exposure to forces for self- and other-deception.

3 For fluency and clarity we refer to the reporter using masculine pronouns and the user using feminine pronouns throughout the paper. However, male and female participants are randomly assigned to conditions as either reporters or users.
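The incentive misalignment just described can be made concrete in a few lines. The sketch below is an illustration using the parameters above ($0.50 commission, 10 reporter cards, 20 user cards); the function and variable names are ours, not the experiment's.

```python
def reporter_payoff(own_card_values, user_commission_cards,
                    commission_per_card=0.50):
    # The reporter is paid the value of the cards drawn for himself, plus a
    # $0.50 commission for each of the user's cards drawn from the
    # commission deck -- regardless of what those cards are actually worth.
    return sum(own_card_values) + commission_per_card * user_commission_cards

def user_payoff(drawn_card_values):
    # The user is paid only the value of the 20 cards she draws, so she
    # cares about deck quality, while the reporter cares about steering
    # her toward the commission deck.
    return sum(drawn_card_values)
```

For example, a reporter whose ten cards average $0.25 earns $2.50 from his own draws plus up to $10.00 in commissions if the user draws all 20 cards from the commission deck.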

Our key manipulation allows half of the reporter-user pairs to meet. In the no meeting condition, the reporter gives the user a handwritten recommendation simply stating the number of cards the user should choose from each deck (e.g., “15 from deck 1 and 5 from deck 2”), but there is no other interaction. In the meeting condition we relax the restrictions on communication: the handwritten recommendation is followed by an opportunity for the participants to meet and talk face-to-face. Such rich communication allows for a variety of deception cues which may be detected by users, and which may be suppressed by self-deception.

Consistent with our predictions, we find that users’ choices are associated with reporters’ initial judgments (as measured from reporters’ initial card choices) when the parties meet face-to-face, but not when they communicate only through the report. Reporters revise their judgments to be more consistent with what they prefer users to believe, whether or not they meet with the user, but these revisions are more strongly associated with users’ beliefs when they meet than when they don’t. Taken together, the results of our first experiment provide evidence that, as predicted by self-deception theory, meetings provide cues that allow users to determine reporters’ judgments, and that reporters can make their recommendations more convincing by revising their judgments toward what they wish the user to believe.

Our second experiment examines the mechanism by which self-deception aids in other-deception. To do so, we devote our entire sample to the meeting condition, record all conversations between reporters and users, and measure the extent to which linguistic deception cues mediate the relationship between reporter and user beliefs. Consistent with prior research on deception (e.g., Newman et al. 2003; Larcker and Zakolyukina 2012), we find that reporters use fewer positive words and more negative words when they are attempting to deceive users without first deceiving themselves, and that such language mediates the relationship between reporter beliefs and user beliefs.
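Linguistic tone measures of this kind boil down to counting dictionary words per 100 words of transcript. The sketch below is purely illustrative: the tiny word lists are stand-ins for the validated dictionaries (e.g. LIWC-style categories) used in this literature, not the lists used in our analysis.

```python
import re

# Toy stand-ins for validated positive/negative emotion dictionaries.
POSITIVE = {"good", "great", "better", "best", "like", "love", "definitely"}
NEGATIVE = {"bad", "worse", "worst", "boring", "doubt", "hardly"}

def tone(transcript):
    """Return (positive rate, negative rate) per 100 words of transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0, 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 100 * pos / len(words), 100 * neg / len(words)

# A hypothetical honest recommendation scores high on positive tone.
honest = ("I really like deck one, the videos look great and I "
          "definitely think it is the better choice.")
pos_rate, neg_rate = tone(honest)
```

In the experiment itself, rates like these are computed per reporter and then entered into the mediation analysis relating reporter beliefs, language, and user beliefs.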

To the best of our knowledge, our results provide the first direct support for von Hippel and Trivers’ (2011) theory that people who ‘drink their own Kool-Aid’ are more persuasive. Our study also provides experimental support for the benefits of institutions that force people to explain and defend their reports. Narrative reporting allows users to distinguish between reporters’ beliefs and reporters’ cheap talk. Archival research (e.g. Larcker and Zakolyukina 2012; Hobson, Mayew, and Venkatachalam 2012) has shown that sophisticated technology and statistical analysis can allow researchers to detect deception in conference calls and similar settings. However, extensive research by Vrij and others (see Vrij 2008 for a summary) raises doubts about whether untrained people who lack such tools will be able to detect lies. Our results suggest that users can protect themselves from deception by meeting with reporters in ways that allow them to detect linguistic cues. However, this protection is partially offset by reporter self-deception.

Future research could extend our results in two directions. First, additional research could examine whether different types of post-report communication provide the same degree of belief-transmission. Some forms of communication (e.g. written statements) may give reporters more opportunity to censor their deception cues and to engage in self-deception. At the same time, some post-report communication (e.g. through text or audio) will give users less opportunity to detect these deception cues. Second, additional research could test the strongest claim of self-deception theory: that self-deception is caused by its benefits to other-deception.

Although our results are consistent with such a causal relationship, self-deception could be caused by other forces (e.g., a prosocial desire to see the user earn money) yet still aid in other-deception as a side-effect.

The rest of the paper proceeds as follows. Section 2 discusses background literature and theory to develop our hypotheses. Section 3 outlines our experimental methods. Section 4 presents the results of our first experiment. Section 5 presents the changes for and results of our second experiment. Section 6 concludes.

II. BACKGROUND, THEORY & PREDICTIONS

Detecting Deception in Cheap-Talk Games

In the cheap-talk paradigm typically used to study financial reporting issues, a reporter learns the true state of the world and then sends a simple report to a user. The incentives of the two parties are sufficiently at odds that the unique equilibrium outcome is for the reporter to lie, and the user to ignore the report. The cheap-talk paradigm is a powerful setting in which to examine deception and to test how accounting institutions affect reporters’ propensity to deceive.

Many studies have shown that reporters are more honest and users are more trusting than economic theory predicts (see e.g. Forsythe et al. 1999; Baiman and Lewis 1989; Evans, Hannan, Krishnan, and Moser 2001). Research has also shown that reporters are more honest when honesty concerns are made more salient (Rankin, Schwartz, and Young 2008), when they are paid more fairly (Evans et al. 2001), or when they have reputation concerns (Stevens 2002), and less honest when they have exerted more effort on the task (Haesebrouck 2015) or feel that they deserve more compensation (Brown, Chan, Choi, Evans, and Moser 2015).

The typical cheap-talk paradigm is less useful for examining users’ ability to detect deception, and for understanding how reporters might escape such detection through self-deception. The traditional paradigm prohibits all but the most simplistic communication and interaction, forcing users to judge the truthfulness of a report by relying entirely on their prior beliefs about likely human behavior in the institutional setting presented to them, tempered only by beliefs about the likelihood of the reported state of nature. In contrast, most accounting settings allow for rich communication about the report, either through real-time interactions (like conference calls or face-to-face meetings) or prepared textual narratives (like Management Discussion and Analysis). This rich communication is a ripe setting for the physical and linguistic behaviors that can indicate deception, because people experience psychological and cognitive stress when lying. These stresses result in systematic deception cues, including facial microexpressions such as the ‘liar’s smirk’ (Zuckerman, DePaulo, and Rosenthal 1981; Ekman 2009), more negative words and fewer positive words, more pauses, less hand/finger movement and fewer illustrators, a higher voice pitch, and more pupil dilation (see DePaulo et al. 2003 and Vrij 2008 for thorough reviews of the literature). Individually, these deception cues generally have small effect sizes (DePaulo et al. 2003) that may be difficult to detect without large samples or specialized software (Vrij 2008). However, users may still detect deception by attending to a large number of cues in aggregate.

The traditional cheap-talk paradigm also provides reporters with perfect objective information about the topic of their report. In contrast, most accounting settings require reporters to use rich information to form subjective judgments (like reserves, allowances, and fair value estimates). This subjectivity makes accounting settings ripe with opportunities for self-deception, because people typically hold distorted beliefs only when they can justify them as reasonable (Kunda 1990; Ng and Shankar 2010).

To study the ability of users to detect (and reporters to manage) deception cues, we modify a traditional cheap-talk game to manipulate whether or not reporters meet face-to-face and communicate freely with users after they issue their report. We predict that deceptive reporters will give off a wide variety of cues that users can detect, but only when they meet. Thus, users will be better able to discern a reporter’s true (rather than stated) beliefs when the two parties meet than when they do not meet.

H1: Meetings increase the association between reporters’ and users’ beliefs.

Support for this hypothesis is hardly a foregone conclusion. Although large-sample analyses of video, voice, and transcripts can reveal deception (e.g. Li 2008; Hobson, Mayew, and Venkatachalam 2012; Larcker and Zakolyukina 2012), even professionals trained to detect deception struggle to do so consistently (Ekman and O’Sullivan 1991; Bond and DePaulo 2006). For example, Bond and DePaulo (2006) find that people perform only slightly better than chance (54 percent vs. a chance level of 50 percent in their experiment) at detecting deception. Furthermore, Vrij, Granhag, and Porter (2010) document a large disconnect between the cues people think indicate deception and the cues that have real statistical validity for detecting deception. However, in most of the experiments and contexts analyzed by these prior researchers, reporters’ false statements were extremely straightforward (e.g. denying having taken something or having hit someone), limiting the extent to which subtle cues of confidence, fluency, and sincerity could be detected. Furthermore, in many of the prior studies, researchers explicitly tell participants to lie, potentially allowing participants to feel morally justified in their deception, reducing the psychological discomfort that can cause deception cues. To increase the likelihood that users can detect deception cues, we require reporters to convey a recommendation based on a subjective evaluation of rich evidence about two investment opportunities. In addition to capturing a form of communication common in accounting settings, this subjectivity forces reporters to justify both true and false beliefs by talking about their reasoning process, which we expect to amplify the cues that indicate honesty or deception.

Self-Deception: ‘Drinking Your Own Kool-Aid’

If our first hypothesis is correct, then a reporter will be more persuasive (meaning he will be more effective at influencing a user’s actions in the intended direction) when he sincerely believes his own report. This link between beliefs and persuasion effectiveness creates an incentive for reporters to self-deceive, or “drink their own Kool-Aid”, as argued by Trivers and coauthors (Trivers 1976/2006; von Hippel and Trivers 2011). Reporters who successfully revise their own beliefs will no longer experience the psychological stress and cognitive effort that induce deception cues, because they will feel less guilt, feel less fear of being caught, and experience lower cognitive load in trying to support their claims. This reduction in stress will reduce their propensity to give off deception cues and make users more likely to believe them.

We therefore make two predictions: that reporters will revise their beliefs in the direction of their persuasion goal, and that meetings will increase the degree to which users’ beliefs are influenced by reporters’ belief-revision.

H2: Reporters revise their beliefs in the direction of their persuasion goal.

H3: Meetings increase the association between reporters’ belief-revision and users’ beliefs.

Note that H2 is consistent with, but distinct from, traditional motivated reasoning theory (e.g. Kunda 1990). Motivated reasoning theory predicts that people believe what they personally want to be true. In the case of persuasion goals, reporters want their persuasive story to be true only if (1) they can be more persuasive if the story is true, (2) they have other-regarding preferences, or (3) they have a preference for being honest. Absent these conditions, reporters have no reason to care whether the story they present is true or false.4 Thus, H2 does not stem directly from motivated reasoning theory without the addition of one of the above conditions. Trivers’ self-deception theory argues that people are more persuasive when they believe their own argument is true, which would satisfy the first condition.

4 Furthermore, reporters often have incentives to hold accurate personal beliefs, which would reduce the likelihood that they would bias their beliefs in the direction of their persuasion goal (see e.g. Prior, Sood, and Khanna 2015). In our setting, we provide an explicit incentive to hold accurate beliefs, such that there is an assumed cost of self-deception.

We are aware of no prior research testing H3, despite the fact that Trivers has promoted the role of self-deception in supporting other-deception for more than three decades. In an article fittingly titled “Get Thee to a Laboratory”, Dunning (2011) points out that these hypotheses have not been empirically tested. Dunning then issues a call for research, asking researchers to test (1) whether “people engage in self-deception more eagerly when they must persuade another person of some proposition” (i.e. our H2) and (2) whether “people are more persuasive to others to the extent they have persuaded themselves of some untruth first” (i.e. our H3) (p. 18).

III. EXPERIMENT ONE METHOD

Overview

Participants

We solicited subjects through a pool of students at a large US university who had indicated an interest in participating in research studies.5 We revealed little of substance in the solicitation, stating only that it was a “Study on Decision Making” in which they would “Earn between $5 and $25 for the 30 minute study”. The use of student subjects is appropriate because our theory applies to anyone involved in a persuasion task (Libby, Bloomfield, and Nelson 2002). Our only participation requirements were that subjects be adults and either enrolled in our school’s pre-existing subject pool or members of the school community willing to become enrolled. A total of 135 subjects participated in the study. One participant was unable to finish the task because there was an odd number of participants and we needed participants to work in pairs. One pair of participants is excluded from analysis because the reporter misunderstood the instructions and gave a recommendation that did not add up to 20 cards. Of the remaining 132 participants, 43 (33 percent) were male and 89 (67 percent) were female.

5 Both experiments were approved by the IRB at the participating institution.

Design

Our experiment is a modified cheap-talk game, and follows the timeline shown in figure 1. Subjects were paired into dyads, with one subject playing the role of reporter and the other the role of user. (Note that we use the terms ‘reporter’ and ‘user’ throughout this paper, but simply used the term ‘participant’ throughout the experiment itself.) We scheduled participants to arrive in a staggered fashion in order to provide a constant flow of participants. Dyads were established as participants arrived.6 The reporter examined two decks of cards to determine the values of their cards, and sent a report to the user recommending how many cards to draw (and be paid the value of) from each deck. The user’s choice represents our primary dependent variable.

We manipulated two factors between dyads, Commission and Meeting. We used the flip of a coin to determine whether the reporter would earn a commission for every card the user chose from one deck (if heads) or the other (if tails). The computer randomly determined whether the reporter and the user would meet to discuss the report. Thus, our experiment has a 2 (Commission) x 2 (Meeting) between-subjects design.

The design also includes two measured independent variables that reflect reporters’ beliefs about the values of the two decks. To gather data for these variables, we asked reporters how many cards they would like to draw (and be paid the value of) from each deck, once before the commission deck was determined and once after giving their report to the user (and meeting with the user, if in the meeting condition). We interpret the first choice as the reporter’s judgment about the relative value of the decks in the absence of any motivation for self-deception, because reporters did not know which deck was the commission deck when they made that choice. We interpret the second choice as reflecting self-deception to the extent that the reporter increases the number of cards chosen from the commission deck after it has been determined.

6 Thus, a participant who arrived early could be assigned as the user matched with a reporter who signed up for an earlier time, or as the reporter for an on-time participant. Similarly, a participant who arrived late could be assigned as the user matched with a reporter who signed up for a later time, or as the reporter for another participant who signed up for a later time. There is no systematic punctuality difference between reporters and users.

Task Paradigm

Like most cheap-talk experiments, ours uses an abstract task that allows us to test our underlying theories about beliefs, goals, and communication in reporting without requiring subjects to have extensive training in accounting or financial statement analysis, which would not be germane to our theory. Our study differs from most cheap-talk experiments in that we ask reporters to make a subjective judgment, based on rich information, about the value of the two investment opportunities. This richness is intended to allow the parties to engage in a more substantive and natural discussion if they meet, and to allow reporters to justify any revision of their beliefs as reasonable.

To allow for subjectivity and rich information, we created ten decks of eighteen cards each. Each card shows the title and thumbnail image of a YouTube video. The value of the card is equal to the number of times the video has been viewed divided by 400,000, rounded to the nearest nickel. Participants are told that no card has more than 200,000 views.

Unknown to participants, the actual decks have identical dollar-value distributions. Each deck has two cards each worth $0.05, $0.10, …, $0.45, with the average card in both decks being worth $0.25.7 The identical distribution gives us the objective statement (unknown to participants) that there is no difference between the decks, while allowing participants to subjectively interpret the evidence to support either deck as being superior.

7 The decks were pseudo-randomly created. The YouTube API does not allow for truly random sampling. To generate our sample we created a list of 700 random words. We then fed each word into a YouTube API search string that excluded all “Racy” and non-English videos as defined by YouTube. Each search returned at most 100 videos (a limitation imposed by YouTube). We combined the results of each search and excluded videos with more than 189,999 views or fewer than 10,000 views. We then made 18 bins, each covering a range of 10,000 views. This process guaranteed that any two videos from the same bin would round to the same dollar value (e.g. 10,000 views and 19,000 views both round to $0.05) and would have close to the same number of views. We selected ten cards from each bin at random and randomly placed one in each deck. Thus, each deck has exactly one card from each bin, for a total value of $4.50. There is no statistically significant difference between the number of views in the decks.
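As a concrete check on this design, the card-value rule and the implied deck totals can be reproduced in a few lines. This is a sketch of our reading of the rule — views divided by 400,000, rounded to the nearest nickel, with ties at the half-nickel rounding up so that 10,000 views is worth $0.05:

```python
def card_value_cents(views):
    # views / 400,000 dollars, rounded to the nearest nickel. Working in
    # integer nickels (one nickel per 20,000 views) avoids float rounding;
    # ties at the half-nickel round up, so 10,000 views -> 5 cents.
    nickels = (views + 10_000) // 20_000
    return nickels * 5  # value in cents

# Each deck holds one card from each of the 18 bins (10,000-19,999 views,
# 20,000-29,999, ..., 180,000-189,999). Valuing each bin at its midpoint,
# every nickel value from $0.05 to $0.45 appears exactly twice and the
# deck totals $4.50, as described in the footnote.
deck = [card_value_cents(bin_start + 5_000)
        for bin_start in range(10_000, 190_000, 10_000)]
```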

The decks are designed to generate substantial variation in beliefs that is largely random, which increases our ability to detect belief-driven effects without concern for correlations between beliefs and individual characteristics (like intelligence or familiarity with YouTube videos). In pilot testing (N=24), we showed participants representative sets of two, four, and eight cards, and asked participants to identify the 50 percent of cards with the highest values. On average, participants performed no better than chance, correctly identifying 11 out of 21 possible high-value cards. However, participants believed they performed much better than chance (average response 4.6/7, with 1 being “no better than chance” and 7 being “I am confident in my choices”).

We also asked pilot participants to estimate the number of views for each of 10 cards. We told all participants that no card had more than 200,000 views. We compared participants’ estimates to two benchmarks: mid-point and random guesses. A strategy of guessing the scale mid-point for each card yielded an average error of 46,000 views. A strategy of making random guesses (Monte Carlo simulation with 2000 simulations) yielded an average error of 66,507 views. The average participant had an average error of 68,171 views (min=45,200, max=95,200 views), which is significantly worse than the midpoint strategy and no better than the random guessing strategy. We also asked participants to estimate 90 percent confidence intervals for the number of times each video had been viewed. The average participant was significantly overconfident, with only 3.2/10 confidence intervals being sufficiently wide. Overall, results of pilot testing confirm that variation in beliefs is both substantial and close to random.
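The two guessing benchmarks above can be approximated with a small simulation. This is a sketch only: the pilot cards' true view counts are not reported here, so we assume they are spread uniformly over the allowable 10,000–189,999 range, which is why the random-guess figure it produces differs somewhat from the reported 66,507.

```python
import random

def benchmark_errors(n_sims=200_000, lo=10_000, hi=190_000,
                     scale_max=200_000, seed=1):
    # Average absolute error of two guessing strategies against true view
    # counts assumed uniform on [lo, hi) -- an assumption, since the pilot
    # cards' actual values are not reported in the text.
    rng = random.Random(seed)
    mid = scale_max / 2
    mid_err = rand_err = 0.0
    for _ in range(n_sims):
        true_views = rng.uniform(lo, hi)
        mid_err += abs(mid - true_views)                         # always guess 100,000
        rand_err += abs(rng.uniform(0, scale_max) - true_views)  # random guess
    return mid_err / n_sims, rand_err / n_sims

mid_err, rand_err = benchmark_errors()
```

Under the uniform assumption, the midpoint strategy's error comes out near 45,000 views (close to the reported 46,000) and the random-guess error near 63,500 views (the same ballpark as the reported 66,507; the exact figure depends on the actual card values).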

Task Protocol

The reporter examines and compares two decks and chooses how many cards he would like to draw from each deck, drawing a total of 10 cards. We tell the reporter that at the end of the experiment he will be paid the value of the cards dealt to him (he never sees which cards are dealt). After making a choice, the reporter is told that he will be paid $0.50 for each card that another participant (the user) picks from an indicated deck. We manipulate which deck the reporter is paid from by flipping a coin in front of the participant, and do so after the reporter’s choice to guarantee that his initial beliefs are independent of his goal. The reporter then indicates, both on the computer and on a slip of paper, how many cards he recommends the user draw from each deck.

When the user enters the lab she sees an example YouTube card and three cards from each deck. She is told that she will be able to draw 20 cards from a combination of the two decks (e.g. 14 cards from one deck and 6 cards from the other). She is then told that another participant (the reporter) has seen all of the cards in both decks and will provide her with a recommendation of how many cards she should take from each deck. She is also told that the other participant is being paid $0.50 for each card that she (the user) draws from one of the two decks. However, the user is not told from which deck the reporter is being paid.

In all treatments, the reporter then delivers his written recommendation in person. We manipulate the restrictions placed on subsequent communication between the reporter and the user. In the no meeting condition, the reporter is told “You will not be able to talk with the other participant. If you talk to the other participant you will forfeit your entire pay for the experiment” (mirror instructions are given to the user). In the meeting condition, we relax these restrictions and tell reporters “You will be able to talk with the other participant as long as you wish and they (you) may ask you any questions they (you) wish” (with mirror instructions to the user).

The no meeting condition is similar to the traditional cheap talk setting and provides very little opportunity for a reporter to give off deception cues. The absence of verbal communication precludes language cues (e.g. positive/negative language, length of communication) and vocal cues (e.g. pauses, vocal pitch). Furthermore, the recommendation is delivered so quickly that there is very little time for physical cues (e.g. body movement, microexpressions). In contrast, the meeting condition provides ample opportunity for language, vocal, and physical deception cues. We intentionally use a powerful manipulation that allows for multiple types of cues. Weaker manipulations could include allowing phone conversation (which allows language and vocal cues but precludes physical cues) or internet chat (which allows language cues and pauses but precludes vocal and physical cues). Another setting would allow prepared written remarks without Q&A (analogous to a press release or a written performance report). Written remarks would allow for language deception cues, albeit in reduced form, as participants may be better able to censor their language cues in prepared remarks than in spontaneous communication (von Hippel and Trivers 2011). We believe that any of these weaker manipulations would produce the same direction of effect, but with reduced power. We leave it to future research to examine the nuances of these forms of communication.

After either interaction, participants return to their computers. The user chooses how many cards she would like from each deck (20 cards total), and the reporter chooses how many additional cards he would like from each deck (10 additional cards). The participants then answer debriefing questions and are compensated, thanked, and dismissed. The Appendix contains excerpts of the reporter’s instructions. The user’s instructions are mirrored and are available upon request.

Control Variables

Prior research in communication has documented a number of important variables relating to persuasion effectiveness. In debriefing questions we ask for participants' gender, native language, country of origin, self-assessed YouTube familiarity, and the degree to which they knew the other participant before coming to the lab.8 Finally, we give users a simplified version of the Mayer and Davis (1999) trust scale. None of our control variables, including the trust scale, are significantly associated with our manipulations, suggesting that our randomization was successful. Because none of these variables mediate or moderate the effects of interest to us, we do not discuss them further.

8 None of our participants reported having come to the lab together or being "good friends." One dyad reported having each other's phone number. Additionally, one dyad reported being in a class together, one dyad reported knowing each other's names before coming, and one user reported that she thought she recognized the reporter. Excluding the first dyad (the only one that suggests a substantive personal relationship) does not alter inference from any of our hypothesis tests. Excluding all four raises p-values for some Meeting x Consistency interactions to slightly above p < 0.05, but given that three of the four excluded observations are dropped from the No Meeting condition, we believe that the change reflects the decreased sample size rather than any substantive difference in results.

IV. RESULTS OF EXPERIMENT ONE

Paradigm Validation

Before testing our hypotheses, we verify that three aspects of our setting work as intended: (1) reporters hold reasonably strong beliefs that one deck is better than the other (so that deviating from those beliefs could trigger deception cues); (2) initial beliefs are not correlated with the persuasion goal (so that goals can be treated as randomly assigned); and (3) recommendations are correlated with the persuasion goal (so that recommendations are, on average, deceptive).

First, untabulated analyses show that fewer than 20 percent of reporters choose a 5-5 split, and the average absolute deviation from a 5-5 split is 1.71 cards (i.e. almost a 7-3 or 3-7 split). This establishes that most reporters initially believe that one deck is better than the other.9

Second, as shown in Panel A of Table 1, neither the mean nor the median number of cards reporters draw from the commission deck (Consistency) is significantly different from 5 (all p-values > 0.50). Some participants draw significantly more than 5 cards from the commission deck, but an approximately equal number draw significantly fewer. This establishes that our random assignment of the persuasion goal was successful.

Third, as shown in Panel B of Table 1, reporters recommend that users draw an average of 14.97 cards from their commission deck and only 5.03 cards from the non-commission deck; this is significantly different from an even split (p<0.0001), and establishes that reporters are aware of the existence and sign of their goals, and that recommendations are deceptive on average.10

9 Risk aversion and the diversification heuristic would lead unsure participants to select a 5-5 split.

Reporters’ Beliefs and Persuasion Effectiveness

H1 hypothesizes that users' beliefs are more strongly associated with reporters' beliefs when the two parties meet than when they do not. To test this hypothesis, we focus on the number of cards that users draw from the commission deck (Effectiveness), which measures the effectiveness of reporters' attempts to be persuasive. If H1 is true, there should be a larger correlation between Consistency and Effectiveness when the two parties meet than when they do not. Table 2 reports the results of an ANCOVA in which Effectiveness is the dependent variable, and Meeting, Consistency and Meeting x Consistency are independent variables. Figure 2 presents scatter plots for the Meeting and No Meeting data, with best-fit regression lines for each condition. We find a significant association between the number of cards that reporters draw from the commission deck and the number of cards that users draw from the commission deck when the two parties meet (β=1.10, p<0.001, one-tailed), but an insignificant association when they do not meet (β=0.19, p>0.40, two-tailed). The difference between the slopes is significant at the p<0.05 level (two-tailed), providing support for H1. We confirm these results with a non-parametric Mantel-Haenszel chi-square test that treats Consistency and Effectiveness as ordinal variables: there is a positive association between Consistency and Effectiveness when participants meet (χ²(1) = 7.99, p<0.01, one-tailed equivalent) but not when they do not meet (χ²(1) = 0.69, p>0.40, two-tailed).

10 None of our results are influenced by the identity of the decks examined (which color or number). This variable is excluded from all of our analyses.
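The interaction test above can be sketched as a single OLS regression with a Meeting x Consistency term, whose coefficients recover the per-condition slopes. The code below is a minimal illustration on synthetic data; the variable names and data-generating values (slopes near 0.2 and 1.1, mimicking the reported pattern) are assumptions for the sketch, not the study's actual data or specification.

```python
import numpy as np

def interaction_slopes(meeting, consistency, effectiveness):
    """OLS fit of Effectiveness ~ Meeting + Consistency + Meeting x Consistency.
    Returns the slope of Effectiveness on Consistency in each condition."""
    X = np.column_stack([
        np.ones_like(consistency),  # intercept
        meeting,                    # Meeting dummy (0 = no meeting, 1 = meeting)
        consistency,                # Consistency (cards drawn from commission deck)
        meeting * consistency,      # Meeting x Consistency interaction
    ])
    b = np.linalg.lstsq(X, effectiveness, rcond=None)[0]
    return b[2], b[2] + b[3]  # (no-meeting slope, meeting slope)

# Synthetic data mimicking the reported pattern: slope near 0.2 without
# a meeting and near 1.1 with one.
rng = np.random.default_rng(7)
n = 200
meeting = rng.integers(0, 2, n).astype(float)
consistency = rng.uniform(0, 10, n)
effectiveness = (10 + 0.2 * consistency + 0.9 * meeting * consistency
                 + rng.normal(0, 0.5, n))
no_meeting_slope, meeting_slope = interaction_slopes(meeting, consistency, effectiveness)
```

The difference between the two returned slopes is the interaction coefficient itself, which is what the significance test of Meeting x Consistency evaluates.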

Our theory predicts that Consistency and Effectiveness are positively associated because users can use deception cues to determine whether reporters believe their own recommendations. However, the association could also arise because reporters' written recommendations are influenced by their beliefs, and more so when the parties meet. To rule out this alternative explanation, we perform two tests. First, we estimate the regression:

Recommendation = 5.21 - 0.55Meeting + 0.24Consistency - 0.08Meeting*Consistency

The estimates provide no evidence for this alternative explanation: Consistency has an insignificant effect on Recommendation (β = 0.24, p > 0.30), and this association is not affected by Meeting (β = -0.08, p > 0.50).

Second, we expand the ANCOVA reported in Table 2 to include Recommendation and Meeting*Recommendation. This larger model does not alter the significance of the Meeting*Consistency effect (p < 0.01, untabulated). These results support H1 and suggest that intentional communication is not the mechanism underlying the increased belief transmission.

Drinking the Kool-Aid

Having established that reporters who believe their recommendations are more persuasive in meetings, we next test whether reporters alter their beliefs to be more consistent with their persuasion goal (H2), and whether meetings increase the degree to which this belief-revision affects users' beliefs (H3). To test H2, we define Revision as the change in the reporter's choice of cards from the commission deck, signed as positive (and therefore indicating self-deception) if the reporter's second choice contains more cards from the commission deck than his first choice, and negative otherwise. Reporters who sincerely adopt their revised beliefs may not think they are deceiving themselves, but they are harming their own payoffs from the perspective of their initial beliefs, which are unaffected by any desire to engage in other-deception.

Table 3 shows how many cards, on average, a reporter draws from his commission deck in his post-goal set of draws in each experimental condition. Consistent with H2, reporters draw an average of 6.12 cards from the commission deck in their second choice, compared to 5.14 cards in their first choice (as reported in Panel A of Table 1). Thus, the average revision is 0.98 cards. A one-sample t-test confirms that this difference is significant (t=3.57, p<0.0001). The revision does not differ significantly between the Meeting and No-Meeting conditions (Mean Meeting=1.12, Mean No-Meeting=0.85, t=0.49, p>0.50, two-tailed).11

Our self-deception results are inconsistent with reporters simply hedging their bets. If reporters were simply hedging their bets, we would expect them to become less extreme in their beliefs regardless of how those beliefs related to their persuasive goals – reporters who initially drew many cards from the commission deck would draw fewer, while reporters who initially drew few cards from the commission deck would draw more. Rather, we only see movement by those reporters who initially drew five or fewer cards from what would become their commission deck. These 38 reporters, on average, draw 1.82 more cards from the commission deck in their second draw than in their first draw (paired-t=5.32, p<0.0001, one-tailed), while the 28 reporters who initially drew six or more cards from what would become their commission deck do not significantly revise their draws (average change = 0.14 fewer cards, paired-t=-0.39, p>0.50).
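The asymmetry test above is a paired t-test on each reporter's first and second draws. A minimal sketch, using made-up draw counts (not the actual data) for reporters who started at or below the neutral split:

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t-statistic for the change from first to second draws.
    Positive values mean more commission-deck cards after the goal is known."""
    diffs = [s - f for f, s in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical reporters whose initial draws were at or below the neutral
# 5-card split, and who revised toward the commission deck afterward.
first_draws = [3, 4, 5, 5, 4, 3, 5, 4]
second_draws = [5, 6, 6, 7, 5, 5, 7, 6]
t_stat = paired_t(first_draws, second_draws)
```

Running the same test separately on reporters whose initial draws were consistent versus inconsistent with the goal is what distinguishes self-deception from simple hedging.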

11 As discussed earlier, we did not make a prediction about the effect of meeting on belief-revision. The lack of difference between the two conditions, and the fact that reporters in the no-meeting condition draw more cards from the commission deck in their second draw (Mean=5.88) than in their first draw (Mean=5.03, Mean Difference=0.85, t=2.15, two-tailed p<0.05, untabulated), is consistent with Huang and Bargh's (2011) argument that self-deception is an automatic process. Huang and Bargh (2011) argue that goals (such as a persuasion goal) create an autonomous process that operates independently and can thus deceive the central functions of the self. Thus, merely presenting participants with a persuasion goal may be sufficient to cause belief-revision due to autonomous goal-pursuit.

H3 predicts that meetings will increase the degree to which belief-revision affects users' beliefs. To test this hypothesis, we conduct an ANCOVA with Effectiveness as the dependent variable and Meeting, Consistency, Revision, Meeting x Consistency and Meeting x Revision as independent variables. While there is no main effect of Revision, the Meeting x Revision interaction shows that the effect of Revision is more positive for dyads that meet than for those that do not (p = 0.067, one-tailed). Panel A of Table 4 reports the results of this regression.

One challenge in identifying revision effects is that reporters whose pre-goal beliefs were already consistent with their persuasion goal have limited ability to revise their beliefs to make them more consistent, and doing so would likely have limited effect. To better isolate revision effects, we rerun our ANCOVA excluding observations in which reporters chose more than five cards from their commission deck in their pre-recommendation draw. This allows us to examine the effect of Revision where variation in Revision should be greatest: reporters with neutral or inconsistent beliefs can revise substantially toward the goal, but may differ in the extent to which they do so. As shown in Panel B of Table 4, this subsample shows a significant Meeting x Revision interaction (p = 0.03, one-tailed), supporting our hypothesis that meetings increase the association between reporters' belief revision and users' beliefs.

V. EXPERIMENT TWO

Experiment one provided evidence that meetings increase the degree to which reporters' initial beliefs and belief-revision are associated with users' beliefs. Furthermore, these effects persisted when controlling for reporters' explicit recommendation, which allowed us to conclude that the effect was occurring through some process other than the explicit recommendation. We conduct experiment two with a twofold goal. First, we replicate the simple effects from the meeting condition in experiment one: the association between reporters' initial beliefs and users' beliefs, reporter belief-revision, and the association between reporters' belief-revision and users' beliefs. Second, we test whether the relative usage of positive and negative words mediates the relationship between reporter beliefs and user beliefs. As discussed previously, researchers have identified a large number of predictable deception cues, and one of the most robust is the frequency of negative relative to positive emotion words (DePaulo et al. 2003; Vrij 2008; Newman et al. 2003; Larcker and Zakolyukina 2012).

Our procedure for experiment two is the same as the procedure for experiment one with the following exceptions. First, we use a new subject pool (psychology undergraduates). These participants earn extra credit in addition to their compensation. Second, we replicate only the meeting condition, but do so with a larger sample to provide sufficient statistical power to test net language positivity as a mediator. Third, we record and transcribe all interactions between the reporter and the user and analyze the reporters’ language using LIWC software, counting the frequency of positive versus negative emotion words.

One hundred twenty-one individuals participated in the experiment, resulting in sixty observations. We exclude one reporter who did not write down a recommendation before meeting with the user. The video recording equipment failed for seven observations. We exclude seven additional observations due to strong personal ties between the reporter and the user (self-identified as coming to the lab together, self-identified as being good friends, or obviously knew each other as seen in the videos).12

12 We apply the same ex-ante exclusion criteria as in experiment one. The difference in the percentage of exclusions likely relates to the difference in sample populations: the first experiment was conducted on a mix of students across campus recruited primarily at new student orientation events, while the second used psychology students who needed extra credit and who were more likely to interact with each other on a regular basis. Inferences are unchanged (i.e. no p-value reported at p<0.05 rises above p<0.05) if we include all participants for whom data are available (i.e. N=60 for the main analysis, N=53 for the mediation analysis).

Table 5 presents abbreviated replication results. Consistent with our findings from experiment one, we find that both initial beliefs and belief-revision are associated with user beliefs (both ps<0.05). Untabulated results show that reporters revise their own beliefs by an average of 0.76 cards (t=2.40, one-tailed p=0.01; signed-rank p<0.01). We next test the extent to which these results are mediated by language positivity, which we define as the percentage of positive emotion words minus the percentage of negative emotion words used by the reporter. In English, the same idea can often be expressed using either positive or negative language. For example, the following actual reporter statements compared the decks using positive language: "I thought red was more interesting," "I just have a good feeling about blue," "I thought [it] would just get more views," and "all the blue ones looked like they would be good" (emphasis added). In contrast, the following actual reporter statements compared the decks using negative language: "Less viewed cards," "You would lower your chances," "They looked less popular," and "The other videos are all really bad" (emphasis added). Deception researchers have found that people use more negative words (including negations) and fewer positive words when lying (e.g. Larcker and Zakolyukina 2012), even when the same overall message is being communicated.
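The positivity measure can be sketched as a simple word-count difference. The tiny word lists below are illustrative stand-ins, not the LIWC dictionaries actually used:

```python
# Net language positivity: percentage of positive-emotion words minus
# percentage of negative-emotion words. The word sets are toy stand-ins
# for the LIWC dictionaries used in the actual analysis.
POSITIVE = {"good", "great", "interesting", "popular", "better"}
NEGATIVE = {"bad", "less", "lower", "worse", "boring"}

def net_positivity(text: str) -> float:
    words = text.lower().split()
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 100.0 * (pos - neg) / len(words)

believer = net_positivity("all the blue ones looked like they would be good")
deceiver = net_positivity("the other videos are all really bad")
```

On the two example reporter statements, the measure is positive for the believer's phrasing and negative for the deceiver's, which is the direction of the effect the mediation analysis relies on.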

Figure 3 depicts the results of the mediation analysis. Baron and Kenny (1986) suggest four requirements for mediation. First, the independent variable must affect the mediator. Second, the independent variable must affect the dependent variable when not controlling for the mediator. Third, the mediator must affect the dependent variable when controlling for the independent variable. Fourth, the effect of the independent variable on the dependent variable must be smaller when controlling for the mediating variable. All four of these conditions are met in our setting: (1) both initial beliefs and belief-revision affect language positivity (β=0.85, one-tailed p<0.01 and β=0.77, one-tailed p<0.01, respectively); (2) both initial beliefs and belief-revision affect persuasion effectiveness when we do not control for language positivity (β=0.65, one-tailed p<0.05 and β=0.71, one-tailed p<0.05, respectively); (3) language positivity affects persuasion effectiveness (β=0.41, one-tailed p<0.01); and (4) the effects of initial beliefs and belief-revision on persuasion effectiveness decrease when controlling for language positivity (new β=0.30, two-tailed p>0.20 and β=0.39, two-tailed p>0.20, respectively).

We supplement our tests with a bootstrapping analysis (see Hayes 2013 and Preacher and Hayes 2008) using 100,000 bootstrap samples. The indirect path of consistency on persuasion effectiveness through language positivity is significantly greater than zero in 99% of repetitions, as is the indirect path of belief revision on persuasion effectiveness through language positivity.13 We note that our analysis measures only one of many possible mediators. To the extent that other mediators exist and are correlated with language positivity, we may be overstating the magnitude of the indirect effect (Bullock, Green, and Ha 2010). Future research can strengthen these inferences by experimentally manipulating one mediator at a time (holding other mediators constant) to provide a more accurate estimate of the effect (Bullock et al. 2010).
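A bootstrapped indirect-effect test of this kind can be sketched as follows. This is a minimal illustration on synthetic data, using the standard a-path/b-path product of OLS slopes rather than the paper's exact specification, and far fewer repetitions than the 100,000 used above:

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product-of-slopes indirect effect: a = slope of mediator M on X;
    b = slope of outcome Y on M, controlling for X (both via OLS)."""
    a = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([np.ones_like(x), x, m]), y, rcond=None)[0][2]
    return a * b

def bootstrap_ci(x, m, y, reps=2000, alpha=0.01, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = [indirect_effect(x[idx], m[idx], y[idx])
             for idx in (rng.integers(0, n, n) for _ in range(reps))]
    return (np.percentile(draws, 100 * alpha / 2),
            np.percentile(draws, 100 * (1 - alpha / 2)))

# Synthetic data with a genuine indirect path X -> M -> Y (true effect 0.48).
rng = np.random.default_rng(1)
x = rng.normal(size=300)
m = 0.8 * x + rng.normal(scale=0.5, size=300)  # a-path
y = 0.6 * m + rng.normal(scale=0.5, size=300)  # b-path
lo, hi = bootstrap_ci(x, m, y)                 # CI should exclude zero
```

A confidence interval that excludes zero is the bootstrap analogue of the significance claim in the text; the percentile method avoids assuming the a*b product is normally distributed.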

VI. CONCLUSION

We present the results of two experiments on self-deception and persuasion. Building on self-deception theory (von Hippel and Trivers 2011), experiment one finds that meetings cause reporters to betray their true beliefs to users. However, reporters revise their beliefs to conform to their reporting goals (i.e. they 'drink their own Kool-Aid'), reducing the beneficial belief-transmission effect of meetings. Experiment two replicates these results and finds that language positivity mediates the effects of initial beliefs and belief-revision (i.e. drinking the Kool-Aid) on user beliefs.

13 Inferences are unchanged if we decile-rank the language positivity variable.

Our research extends the non-cooperative cheap talk literature by creating a rich opportunity for reporters to exhibit deception cues, and for users to detect them. Prior research has used the non-cooperative cheap talk paradigm to test the effect of accounting institutions on reporters' propensity to provide deceptive reports. However, the traditional cheap-talk paradigm has been unable to explore in great depth users' ability to detect reporter deception. Our research modifies the cheap talk paradigm to allow rich communication channels similar to those that exist in the real world (e.g. conference calls, in-person performance evaluations, government testimony). We find that rich communication greatly increases the degree to which users' beliefs are associated with reporters' beliefs.

Our study leaves a number of issues for further research. First, the face-to-face meetings in our treatment condition allow for the exhibition and detection of many behavioral cues, including visual, auditory, and linguistic ones, and the reporters' inability to prepare in advance likely made it especially hard for them to conceal those cues. Future research could examine behavior in settings in which some cues are unavailable (e.g., written narratives, reporter remarks without questions), or allow reporters to prepare in advance (see von Hippel and Trivers 2011 for a discussion of impromptu versus prepared communication). Future research could also explore the impact of the many possible mediating cues we do not examine, such as voice stress, microexpressions, pauses, concreteness, and pronoun use, and how the cues themselves are influenced by user behaviors (such as asking questions or responding skeptically).


Future research might also examine whether face-to-face meetings have different effects for different forms of deception. In the setting we examine, the reporter has an incentive to persuade users that his recommendation will result in a good outcome. However, many reporting settings (such as the budgeting process) encourage 'sandbagging', in which a reporter has an incentive to persuade users that circumstances are difficult and future performance will be poor (Jensen 2001). Given that most people are naturally optimistic, especially when they are rewarded for good performance, sandbagging reporters might find it even more difficult to avoid betraying their true beliefs.

Finally, future research could examine exactly when self-deception occurs and what drives it. We measure self-deception by eliciting the reporter's second judgment after the face-to-face meeting is complete. Revisions could therefore be driven by reporters' knowledge of what they want the user to believe, by their writing of a recommendation, and/or by their face-to-face interaction with the user. Clarifying this timing could help uncover the causes of self-deception. Trivers (1976/2006) and von Hippel and Trivers (2011) argue that self-deception is caused by the need to deceive others, which suggests that it would arise as soon as reporters know what they want users to believe, quite possibly as an automatic process that engages regardless of whether the reporter expects to meet the user face-to-face. However, self-deception could also be caused or reinforced by the meeting itself, as the reporter becomes aware of the difficulty of suppressing deception cues.


APPENDIX

ABRIDGED REPORTER INSTRUCTIONS14

Thank you for participating in this study. This study focuses on YouTube videos. From a random sample of YouTube videos, we have created several decks of playing cards. An example card is displayed below. The front of each card contains the title and thumbnail image of a YouTube video. The four letters in the bottom right corner are a lookup key so the experimenter can determine how much each card is worth. The letters are a random code to help us keep track of the cards.

Each card has a value equal to the number of times the video has been viewed on YouTube divided by 400,000, rounded to the nearest nickel.

For example, the card below represents a YouTube video that has 85,646 views. Thus, this card is worth $0.20.

No card has more than 200,000 views.

In front of you are two decks of cards.

Please take as much time as you like to look through the decks. Become familiar with both decks as the remainder of the study deals with these two decks of YouTube cards. You do not need to memorize the cards. We recommend about 4 minutes to look through the decks (2 minutes per deck).

14 For brevity we have left out the informed consent page and manipulation check questions.

You now have the opportunity to draw 10 cards from the two decks. You may draw as many cards as you wish from each deck as long as you draw a total of 10 cards (e.g. you may draw 7 cards from one deck and 3 from the other deck). The administrator will record your choice, randomly select that many cards from the deck, and pay you the value of the cards. Note: you get to choose how many cards are drawn from each deck, but you do not get to choose the specific cards.

As a reminder, each card is worth its number of YouTube views divided by 400,000, so cards with more views are worth more.

How many cards would you like from each deck? You must draw a total of 10 cards (i.e. your two numbers must sum to 10). You may continue looking through the decks while you make your choice.

In a moment you will make a recommendation to another participant regarding how many cards he/she should choose from each deck.

The cards dealt to the other participant also add to your own pay. You will be paid $0.50 for every card dealt to the other participant from one of the two decks. To determine which deck you will be paid from, the administrator will flip a coin:
- If the coin comes up heads, you will be paid $0.50 for every card the other participant takes from the Red Deck.
- If the coin comes up tails, you will be paid $0.50 for every card the other participant takes from the Blue Deck.

Please raise your hand to bring over the administrator.

What was the result of the coin flip? (participant indicates coin flip result on computer screen while administrator watches)

We will not tell the other participant which deck you are being paid from.

Information About the Other Participant:

The other participant will draw a total of 20 cards from a combination of the two decks. For example, they might choose to have 14 cards dealt from one deck and 6 cards dealt from the other. They will be paid the value of the 20 cards dealt to them.

The other participant knows that no card has more than 200,000 views and that the value of each card is equal to its number of YouTube views divided by 400,000 rounded to the nearest nickel.


However, they have only seen a few of the 18 cards from each deck, so they will be relying on your recommendation to make their decision.

[No meeting condition:] You will mark your recommendation on a slip of paper. You will not be able to talk with the other participant. If you talk to the other participant you will forfeit your entire pay for the experiment.

[Meeting condition:] You will mark your recommendation on a slip of paper. You will be able to talk with the other participant as long as you wish and they may ask you any questions they wish.

The other participant is aware that you are being paid $0.50 per card drawn from one of the two decks, but does not know which deck.

After you meet with the other participant you will be able to choose another 10 cards from the decks (as before, choosing the deck the card will be drawn from but not choosing the specific card). The other participant will never find out how many cards you drew from each deck unless you tell them.

How many cards will you recommend the other participant take from each deck? As the other participant will choose a total of 20 cards, your two numbers must sum to 20.

You now have the opportunity to pick 10 more cards from the two decks.

How many cards would you like from each deck? You must draw a total of 10 cards (i.e. your two numbers must sum to 10).

After further thought and providing a recommendation to another participant, which deck do you believe is better? (101-point scale with endpoints labeled “The Blue Deck is Better” and “The Red Deck is Better” and a midpoint “The Decks are Equivalent”)


How comfortable was your interaction with the other participant (7-point Likert scale with endpoints “Very Uncomfortable” and “Very Comfortable” and a midpoint “Neither Comfortable nor Uncomfortable”)?

How persuasive do you think you were (7-point Likert scale with endpoints “Very Unpersuasive” and “Very Persuasive”)?

How well do you know the other participant? Please select all that apply:
- Never met them
- Recognized them, but didn't know their name
- Currently in a class with them
- Knew their name
- Have their phone number
- Very good friends
- Came to the lab together

Please indicate your gender (Male/Female)

How old are you? (Options for 18-25 plus a text free response for other)

Where are you from? (Various standard options plus a text free response for other)

What is your native language? (Various standard options plus a text free response for other)

How regularly do you view YouTube videos (5-point scale labeled “Never, Rarely, A few times a month, A few times a week, and Daily”)?

Do you have any other comments or suggestions regarding the task?


REFERENCES

Baiman, S., and B. L. Lewis. 1989. An experiment testing the behavioral equivalence of strategically equivalent employment contracts. Journal of Accounting Research 27(1): 1-20.

Baron, R. M., and D. A. Kenny. 1986. The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology 51 (6): 1173-1182.

Bond, C. F. and B. M. DePaulo. 2006. Accuracy of deception judgments. Personality and Social Psychology Review 10 (3): 214-234.

Brown, J., E. Chan, J. Choi, H. Evans, and D. Moser. 2015. Working harder but lying more? How managers' effort influences their reporting. Working paper, Indiana University, The University of Texas at Austin, and University of Pittsburgh.

Bullock, J. G., D. P. Green, and S. E. Ha. 2010. Yes, but what's the mechanism? (don't expect an easy answer). Journal of Personality and Social Psychology 98 (4): 550-558.

Crawford, V.P. and J. Sobel. 1982. Strategic information transmission. Econometrica: Journal of the Econometric Society, 50 (6): 1431-1451.

DePaulo, B. M., J. J. Lindsay, B. E. Malone, L. Muhlenbruck, K. Charlton, and H. Cooper. 2003. Cues to deception. Psychological Bulletin 129 (1): 74-118.

Dunning, D. 2011. Get thee to a laboratory. Behavioral and Brain Sciences 34: 18-19.

Ekman, P. 2009. Lie catching and microexpressions. In The Philosophy of Deception, ed. C. W. Martin. Oxford University Press.

Ekman, P., and M. O'Sullivan. 1991. Who can catch a liar? American Psychologist 46 (9): 913-920.

Elliott, W. B., K. M. Rennekamp, and B. J. White. 2012. Does highlighting concrete language in disclosures mitigate home bias? Working paper, University of Illinois.

Evans III, J. H., R. L. Hannan, R. Krishnan, and D. V. Moser. 2001. Honesty in managerial reporting. The Accounting Review, 76(4): 537-559.

Festinger, L. 1957. A Theory of Cognitive Dissonance. Vol. 2. Stanford University Press.

Forsythe, R., R. Lundholm, and T. Rietz. 1999. Cheap talk, fraud, and adverse selection in financial markets: Some experimental evidence. Review of Financial Studies 12 (3): 481-518.

Frankfurt, H. G. 2005. On bullshit. Princeton, NJ: Princeton University Press.


Governmental Accounting Standards Board (GASB). 2008. Service Efforts and Accomplishments Reporting (an Amendment of GASB Concepts Statement No. 2). Concepts Statement No. 5. Stamford, Ct: GASB.

Haesebrouck, K. 2015. The effect of knowledge acquisition on managerial reporting. Working Paper, KU Leuven.

Hayes, A. F. 2013. Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Press.

Hobson, J. L., W. J. Mayew, and M. Venkatachalam. 2012. Analyzing speech to detect financial misreporting. Journal of Accounting Research 50 (2): 349-392.

Huang, J. Y., and J. A. Bargh. 2011. The selfish goal: Self-deception occurs naturally from autonomous goal operation. Behavioral and Brain Sciences 34: 27-28.

Jensen, M.C. 2001. Corporate budgeting is broken – Let’s fix it. Harvard Business Review, 79 (10): 94-101.

Kunda, Z. 1990. The case for motivated reasoning. Psychological Bulletin 108 (3): 480-498.

Larcker, D. F., and A.A. Zakolyukina. 2012. Detecting deceptive discussions in conference calls. Journal of Accounting Research, 50(2): 495-540.

Li, F. 2008. Annual report readability, current earnings, and earnings persistence. Journal of Accounting and Economics, 45(2): 221-247.

Libby, R., R. Bloomfield, and M. W. Nelson. 2002. Experimental research in financial accounting. Accounting, Organizations and Society 27: 775-810.

Mayer, R. C., and J. H. Davis. 1999. The effect of the performance appraisal system on trust for management: A field quasi-experiment. Journal of Applied Psychology, 84(1): 123.

Ng, T. B. P., and P. G. Shankar. 2010. Effects of technical department's advice, quality assessment standards, and client justifications on auditors' propensity to accept client-preferred accounting methods. The Accounting Review, 85 (5): 1743-1761.

Newman, M. L., J. W. Pennebaker, D. S. Berry, and J. M. Richards. 2003. Lying words: Predicting deception from linguistic styles. Personality and Social Psychology Bulletin, 29 (5): 665-675.

Preacher, K. J., and A. F. Hayes. 2008. Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40(3): 879-891.


Prior, M., G. Sood, and K. Khanna. 2015. You Cannot be Serious: The Impact of Accuracy Incentives on Partisan Bias in Reports of Economic Perceptions. Quarterly Journal of Political Science, 10(4): 489-518.

Rankin, F. W., S. T. Schwartz, and R. A. Young. 2008. The effect of honesty and superior authority on budget proposals. The Accounting Review, 83(4): 1083-1099.

Trivers, R. 1976/2006. Foreword. In: The selfish gene, R. Dawkins, pp. 19–20. Oxford University Press. (Original work published in 1976.)

Trivers, R. 1985. Deceit and self-deception. In: Social Evolution, 395-420.

Von Hippel, W. and R. Trivers. 2011. The evolution and psychology of self-deception. Behavioral and Brain Sciences 34: 1-16.

Vrij, A. 2008. Detecting lies and deceit: Pitfalls and opportunities. Wiley and Sons.

Vrij, A., P. A. Granhag, and S. Porter. 2010. Pitfalls and opportunities in nonverbal and verbal lie detection. Psychological Science in the Public Interest, 11 (3): 89-121.

Zuckerman, M., B. M. DePaulo, and R. Rosenthal. 1981. Verbal and nonverbal communication of deception. Advances in Experimental Social Psychology 14: 1-59.


FIGURE 1
Timeline for Experiments One and Two

[Figure: experimental timeline. User: examines 3 cards from each deck → receives the reporter's recommendation (delivered in a meeting in the Meeting condition) → card choice → debriefing questions and compensation. Reporter: examines all cards → pre-goal card choice → goal manipulation → recommendation → post-goal card choice → debriefing questions and compensation.]

FIG 1. – Experimental timeline.


FIGURE 2
Experiment One Results

[Figure: scatter plot of persuasion effectiveness. X-axis: number of cards the reporter initially draws from the commission deck (0 to 10). Y-axis: number of cards the user draws from the reporter's commission deck (0 to 25). Points plotted separately for the Meeting and No Meeting conditions.]

FIG 2. – The figure shows the relationship between how many cards a reporter initially draws from the commission deck (i.e., before he learns about the persuasion task) and how many cards the user draws from the commission deck. In the Meeting condition the reporter and user meet to discuss the reporter's recommendation; in the No Meeting condition, they do not. For ease of display, all points have a random perturbation of between 0 and 0.2 units on both the X and Y axes. All points also have 70 percent transparency, so darker points represent more observations.


FIGURE 3
Experiment Two Results

[Figure: mediation model diagram; image not recoverable from the source.]

* The top line indicates the regression coefficients and significance levels when the mediator is not included. The bottom line indicates the regression coefficients and significance levels when the mediator is included in the model. p-values in boldface indicate one-tailed tests of directional predictions.


TABLE 1
Experiment One Results

Panel A: Number of cards the reporter initially drew from what would become his commission deck

Consistency
                       No Meeting   Meeting    Total
Mean                      5.03        5.24      5.14
[Std. Deviation]         [2.51]      [2.09]    [2.29]
t-Value (Mean > 5)        0.07 ns     0.67 ns   0.48 ns
Median                    5.00        5.00      5.00
Sign Test                -1 ns        2.5 ns    1.5 ns

Meeting does not significantly affect cards drawn (t = 0.37, p > 0.50; Mantel-Haenszel χ2 = 0.14, p > 0.50).

Panel B: Number of cards that the reporter recommended the user take from the reporter's commission deck

Recommendation
                       No Meeting      Meeting        Total
Mean                     15.21          14.73         14.97
[Std. Deviation]         [4.06]         [3.33]        [3.69]
t-Value (Mean > 10)       7.37           8.16         10.94
                         p<0.0001       p<0.0001      p<0.0001
Median                    15             14            15
Sign Test                 13             14            27
                         p<0.0001       p<0.0001      p<0.0001

Meeting does not significantly affect recommendation (t = 0.53, p > 0.50; Mantel-Haenszel χ2 = 0.28, p > 0.50).

The reporter examined both decks of cards and chose how many cards he would like from each deck, for a total of 10 cards. We measured how many cards the reporter drew from what would become his commission deck (which deck serves as the commission deck is manipulated). The reporter then gave a recommendation to the user. In the Meeting condition the reporter and user met to discuss the reporter's recommendation; in the No Meeting condition, they did not. We measured how many cards the reporter recommended that the user draw from the commission deck. Finally, the user drew 20 cards. We measured how many cards the user and reporter drew from the commission deck. Neither the user nor the reporter saw which specific cards were drawn; rather, they found out the realized sum of their cards at the very end of the experiment.

All p-values comparing means to specific values are one-tailed.


TABLE 2
Experiment One Results
The Association Between Beliefs and Persuasion Effectiveness

Persuasion Effectiveness

Source                 DF   Regression   Mean      F           Two-tailed
                            Parameter    Square    Statistic   p-value
Intercept               1      1.78
Meeting                 1     -0.35        0.989      0.15        0.70
Consistency             1      0.19        6.890      0.52        0.47
Meeting*Consistency     1      0.91       68.387      5.17        0.03
Error                  62     SS = 994.621

Hypothesis Test
H2: Meeting*Consistency > 0   β = 0.91   t = 2.27   p = 0.013 (one-tailed)
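The one-tailed p-values in the hypothesis-test rows can be recovered from the reported t-statistics and the error degrees of freedom; a minimal sketch, assuming a standard upper-tail t test of the directional prediction (our reading, not stated in the table):

```python
from scipy.stats import t

# Values from Table 2: H2 interaction test.
t_stat = 2.27     # t-statistic for Meeting*Consistency
df_error = 62     # error degrees of freedom

# One-tailed p-value = upper-tail probability of the t distribution;
# should be close to the table's reported p = 0.013.
p_one_tailed = t.sf(t_stat, df_error)
print(round(p_one_tailed, 3))
```

The same calculation applies to the directional tests in Tables 3-5, with each table's own t-statistic and error degrees of freedom.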


TABLE 3
Experiment One Results
Number of cards the reporter drew from his commission deck in his second time drawing cards

                       No Meeting      Meeting        Total
Mean                      5.88           6.36          6.12
[Std. Deviation]         [2.43]         [1.98]        [2.22]
t-Value (Mean > 5)        2.07           3.95          4.11
                         p<0.05         p<0.001       p<0.0001
Median                    6              8             6
Sign Test                 5.5            8             13.5
                         p<0.05         p<0.0001      p<0.0001

Hypothesis Test
H3: Revision > 0   β = 0.98   Paired-t = 3.57   p < 0.001 (one-tailed)

Meeting does not significantly affect subsequent draws (t = 0.89, p > 0.30; Mantel-Haenszel χ2 = 0.79, p > 0.30).


TABLE 4
Experiment One Results
Does Belief Revision Improve Persuasion Effectiveness?

Panel A: Complete Sample Analysis of Covariance

Persuasion Effectiveness

Source                 DF   Regression   Mean      F           Two-tailed
                            Parameter    Square    Statistic   p-value
Intercept               1      1.94
Meeting                 1     -1.20       17.430      1.33        0.25
Consistency             1      0.11        1.73       0.13        0.72
Meeting*Consistency     1      1.33      101.935      7.8         0.01
Revision                1     -0.18        4.129      0.32        0.58
Meeting*Revision        1      0.72       30.112      2.30        0.13
Error                  60     SS = 784.207

Hypothesis Test
H4: Meeting*Revision > 0   β = 0.72   t = 1.52   p = 0.067 (one-tailed)

Panel B: Analysis of Covariance in the Subsample Where Consistency ≤ 0 (i.e., beliefs were either inconsistent or neutral)

Persuasion Effectiveness

Source                 DF   Regression   Mean      F           Two-tailed
                            Parameter    Square    Statistic   p-value
Intercept               1      1.42
Meeting                 1     -0.75        2.363      0.19        0.67
Consistency             1     -0.27        3.958      0.31        0.58
Meeting*Consistency     1      2.26       78.933      6.21        0.02
Revision                1     -0.22        4.101      0.32        0.57
Meeting*Revision        1      1.28       49.820      3.92        0.06
Error                  32     SS = 406.905

Hypothesis Test
H4: Meeting*Revision > 0   β = 1.28   t = 1.98   p = 0.028 (one-tailed)


TABLE 4, Continued

Before receiving a persuasion goal, reporters look through both decks of cards and choose how many cards they personally want to draw from each deck, drawing a total of 10 cards. We then manipulate which deck is each participant's commission deck. Consistency measures the number of cards that the reporter originally chooses from his commission deck, before learning which deck is his commission deck. We subtract 5 from all values such that -5 represents "all cards drawn from the non-commission deck" and +5 represents "all cards drawn from the commission deck." After the reporter delivers his recommendation to the user, the reporter chooses another 10 cards. Revision is the number of cards the reporter chooses from the commission deck in his second choice minus the number of cards he chose from the commission deck in his first choice. Persuasion Effectiveness is the number of cards that the user draws from the reporter's deck minus 10 (as 10 is the naïve, 50-50 split).
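The three derived measures defined above are simple arithmetic on card counts; a minimal sketch (the function and argument names are ours, for illustration, not the paper's):

```python
def consistency(initial_commission_cards: int) -> int:
    """Initial draws from the commission deck, centered at zero:
    -5 = all cards from the non-commission deck,
    +5 = all cards from the commission deck."""
    return initial_commission_cards - 5

def revision(second_commission_cards: int, initial_commission_cards: int) -> int:
    """Change in commission-deck draws after the persuasion goal is known."""
    return second_commission_cards - initial_commission_cards

def persuasion_effectiveness(user_commission_cards: int) -> int:
    """User draws from the reporter's deck relative to the naive 10/10 split
    of the 20 cards the user draws."""
    return user_commission_cards - 10

# Example: a reporter who first drew 5 commission-deck cards, then 8,
# and whose user drew 14 cards from the reporter's deck.
print(consistency(5), revision(8, 5), persuasion_effectiveness(14))  # 0 3 4
```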


TABLE 5
Experiment 2 Results
Regression testing the relationship of initial beliefs and belief revision with persuasion effectiveness

Persuasion Effectiveness

Source                 DF   Regression   Mean      F           Two-tailed
                            Parameter    Square    Statistic   p-value
Intercept               1      2.36
Consistency             1      0.65       73.44       4.79        0.03
Revision                1      0.71       73.71       4.80        0.03
Error                  42     SS = 644.59

Hypothesis Tests
Consistency > 0   β = 0.65   t = 2.19   p = 0.02 (one-tailed)
Revision > 0      β = 0.71   t = 2.19   p = 0.02 (one-tailed)
