
Opinion Dynamics with Confirmation Bias

Armen E. Allahverdyan 1) and Aram Galstyan 2); 1) Yerevan Physics Institute, Alikhanian Brothers Street 2, Yerevan 375036, Armenia; 2) USC Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90292, USA

Background. Confirmation bias is a tendency to acquire new information via confirming one's preconceptions. It is omnipresent in psychology, economics, scientific practices etc. Previous theoretical research on this phenomenon focused on its economic implications, possibly missing its potential connections with broader notions of cognitive science. Methodology/Principal Findings. We formulate a (non-Bayesian) rule for updating the subjective probabilistic opinion of an agent in the light of a persuasive opinion. The rule follows the ideas of semantic information theory: the agent does not react to information that is either far from his current opinion (confirmation bias with respect to unexpected information) or coincides with it (no new information). The model accounts for the basic phenomenology of persuasion and the social judgment theory. It allows one to describe the boomerang effect as an extreme form of the confirmation bias. The model also displays the order of presentation effect: when an agent is consecutively persuaded by two contradicting opinions, preference is given to the last opinion (recency) or the first opinion (primacy) depending on the degree of confirmation bias. We also uncover several features of the repeated persuasion process. Conclusions. A single model accounts for a spectrum of effects and relates to each other the confirmation bias, the primacy-recency phenomenon, the boomerang effect and cognitive dissonance. We point out several limitations of the model that should motivate its future development.

I. INTRODUCTION

Confirmation bias is a tendency to acquire new information in a way that confirms one's preconceptions and avoids information which contradicts prior opinions [29]. Various manifestations of this bias have been reported in cognitive psychology [2, 42], social psychology [11, 31], politics [24] and media economics [16, 28, 32, 45] (Footnote 2). Recent research suggests that scientific practices too contain a variety of confirmation biases [5, 21-23, 29], even though the imperative of avoiding precisely this bias is frequently presented as a pillar of the scientific method.

Here we are interested in the dynamics of opinion change in the presence of confirmation bias; see [6, 7, 29] for reviews.

It is assumed that an agent's uncertain opinion (Footnote 1), defined on an exhaustive set of events {A_k}, is quantified by probabilities Pr(A_k) that describe his degree of confidence in the occurrence of these events [7]. The second assumption is made within the Bayesian approach to opinion revision: the agent has joint probabilities Pr(A_k, E) for A_k and some evidence E. Once this evidence is received, the agent revises his opinion via the Bayes rule

  Pr(A_i) → Pr(A_i, E) / ∑_k Pr(A_k, E).   (1)

This is the normative theory of rational behavior: provided that the Pr(A_k, E) are available, not behaving according to (1) implies losses in certain types of economic actions [7]. The Bayes rule is by definition free of any confirmation bias. Hence economists studied the confirmation bias by reducing it to specific deviations from (1) [16, 28, 32, 45], e.g. when the joint probabilities Pr(A_k, E) are available, but agents do not combine them as in (1). Indeed, people rarely satisfy the Bayes rule, also because it is not always successful in the real world [17].

However, we note that the confirmation bias is developed with respect to essentially new information that is going to change the existing opinions; otherwise, the new information is accepted without problems and is even given some priority. Then we expect that the full probability Pr(A_k, E) may not be available [12].

Hence we abandon the premise of the Bayesian approach on the availability of the joint probabilities Pr(A_k, E). Instead, our model assumes 3 inputs: the (subjective) probabilistic opinion of the target agent P, the opinion of a persuading agent Q, and the degree of confirmation bias shown by P. We proceed within the opinion combination approach developed in statistics and applied in many fields; see [10, 15] for reviews.

We propose a set of conditions that define cognitive aspects of confirmation bias and formalize it within notions of semantic information theory. The main message of these conditions is that P does not change his opinion if the opinion of Q is either very far from his opinion or identical to it. Next we propose an opinion updating rule that models the response of P to persuasion by Q. This rule describes several key effects of the social judgment theory that attempts to explain how people react to persuasion [6]. These effects include: separation of the opinion into different latitudes, the weighted average approach, and change-discrepancy relations.

The rule also produces new results: the recency effect is related to confirmation bias; repeated persuasions are shown to hold certain monotonicity features, but do not hold the law of diminishing returns; the boomerang (backfire) phenomenon is related to confirmation bias and to the primacy effect.

This paper is organized as follows. In Section II we discuss the opinion representation via probabilities, define our axioms and introduce the confirmation biased opinion combination rule. Section III relates our set-up to the social judgment theory. The next two sections show how our model accounts for two basic results of experimental social psychology: opinion change versus discrepancy, and the order of presentation effect. Repeated persuasion is studied in Section VII. Section VIII shows that the boomerang effect—the agent changes his opinion not towards the persuasion, but against it—can appear as a form of confirmation bias. Section IX shows how our model formalizes some concepts of cognitive dissonance and outlines new scenarios of its emergence. We summarize and conclude in the last section.

Footnote 1: We use opinion instead of belief, though these terms are virtually interchangeable. We prefer the first term, because belief implies a certain commitment (based on cultural or personal faith, morality, or values), whereas opinion is vaguer, and possibly more flexible and subject to change.

Footnote 2: The bias has several different names that underline its various aspects: myside bias, affirmation bias, conservatism, disconfirmation bias, overconfidence.

II. OPINION COMBINATION VIA CONFIRMATION BIAS

A. Representation of opinions

Consider two agents P and Q, and assume they quantify their opinions via probabilities

  p = {p_k}_{k=1}^N and q = {q_k}_{k=1}^N, ∑_{k=1}^N p_k = ∑_{k=1}^N q_k = 1,   (2)

respectively, on the same set of events k = 1, ..., N, e.g. k = (rain, no rain), if these opinions are on a weather forecast.

Note that k = x can be a continuous variable, if (for example) the forecast concerns the chance of having rain or the amount of rain. Then the respective probability densities are

  p(x) and q(x), ∫dx p(x) = ∫dx q(x) = 1.   (3)

Both opinions are subjective, both are based on incomplete information and take into account the different backgrounds of P and Q. Let a decision maker want to combine them together, hoping to get a more reliable opinion. There are two basic methods of combining p and q into the opinion d of the decision maker [10, 15]: the linear method

  d_k = w_P p_k + w_Q q_k, w_P + w_Q = 1, 1 ≤ k ≤ N,   (4)

where w_P and w_Q are positive weights that quantify the importance of each agent for the decision maker, and the logarithmic method

  d_k = p_k^{w_P} q_k^{w_Q} / ∑_{l=1}^N p_l^{w_P} q_l^{w_Q}, w_P + w_Q = 1, 1 ≤ k ≤ N.   (5)

These methods are different, and they apply to different situations, because each one has its merits and drawbacks [10, 15] (Footnote 3). As shown below, their specific combination is suitable for describing confirmation bias.

Footnote 3: Note that the opinion combination problem does not admit a straightforward Bayesian representation [7]. Nevertheless, the Bayesian approach can be generalized to this case, although it requires more assumptions than usually; see e.g. [26].

B. Opinion combination rule

Let an agent P be persuaded by an agent Q, i.e. the opinion p of P is going to change under the influence of q (Footnote 4); see (2). We propose the following conditions for the combination rule of p and q.

1. The final opinion p̃_k of P reads

  p̃_k = F[p_k, q_k, ϵ] / ∑_{l=1}^N F[p_l, q_l, ϵ],   (6)

where F[x, y, ϵ] is a smooth function of 3 variables that change between 0 and 1, and

  1 > ϵ > 0   (7)

characterizes the degree of the confirmation bias of P, as will be seen below. For ϵ → 1 the opinion does not change: p̃_k = p_k for 1 ≤ k ≤ N. Hence

  F[x, y, 1] = x for 0 ≤ x, y ≤ 1.   (8)

Eq. (6) means that P first evaluates the (non-normalized) weight for the event k solely on the basis of p_k and q_k, and applies the overall normalization at the end.

Footnote 4: We assume that the opinion of Q is communicated to P. One option is that the full probability is communicated. Another option is that only some moments of the real probabilistic opinion of Q are communicated (e.g. its average and dispersion) together with the domain of events where those real probabilities of Q are strictly non-zero. Then P can approximately reconstruct the opinion of Q via the maximum entropy method, and now the opinion q of Q should refer to that reconstructed opinion. In particular, the Gaussian density (17) can be regarded as the maximum-entropy reconstruction of a probability density from its first and second moments.
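For concreteness, here is a minimal Python sketch (ours, not from the original paper) of the two pooling methods (4) and (5); the function names and the numerical opinions are hypothetical examples.

```python
import numpy as np

def linear_pool(p, q, w_p=0.5):
    """Linear method, Eq. (4): d_k = w_P p_k + w_Q q_k, with w_Q = 1 - w_P."""
    return w_p * p + (1.0 - w_p) * q

def log_pool(p, q, w_p=0.5):
    """Logarithmic method, Eq. (5): d_k proportional to p_k^{w_P} q_k^{w_Q}."""
    d = p ** w_p * q ** (1.0 - w_p)
    return d / d.sum()

p = np.array([0.7, 0.2, 0.1])   # opinion of P
q = np.array([0.1, 0.3, 0.6])   # opinion of Q
print(linear_pool(p, q))        # arithmetic mixture: [0.4, 0.25, 0.35]
print(log_pool(p, q))           # geometric mixture, renormalized
```

Note that the logarithmic pool preserves the zeros of either opinion, while the linear pool does not; this difference is exploited by the combination (13) below.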

2. If p_k = 0 for some k, then p̃_k = 0: P never changes his opinion on something that he is absolutely confident about. Thus

  F[0, y, ϵ] = 0 for 0 ≤ y ≤ 1.   (9)

3. If p_k q_k = 0 for all k, then p̃_k = p_k: P cannot be persuaded by Q if they do not share any common knowledge.

4. If p_k = q_k for all k, then p̃_k = p_k, i.e. no change is expected from combining two identical opinions. We write this feature as

  F[x, x, ϵ] = x for 0 ≤ x ≤ 1.   (10)

Note that 3 and 4 agree with the qualitative ideas of semantic information theory [8, 35]: the information people obtain from a given message is conditioned on their ontology, which is their previous knowledge on the subject of communication [41]. An agent with a limited ontology will likely not understand almost any message, while an agent with a very rich ontology will also not get much semantic information, since he already knows a lot (Footnote 5). Also, 3 and 4 agree with experimental results in social psychology showing that people are not persuaded by opinions that are either very far from, or very close to, their initial opinion [6, 9, 36].

5. Whenever P and Q update their opinions as

  p_k → γ_k p_k / ∑_l γ_l p_l, q_k → γ_k q_k / ∑_l γ_l q_l, 1 ≤ k ≤ N,   (11)

where γ_k = Pr(...|k) is a certain conditional probability, p̃ is updated via the same rule (11). Hence for 0 ≤ x, y ≤ 1:

  F[γx, γy, ϵ] = γ F[x, y, ϵ] for γ ≥ 0.   (12)

This feature means that the sought opinion combination rule is consistent with probability theory. Note that (12) leads to F[p, 0, ϵ] = p F[1, 0, ϵ], which together with (9) implies 3.

6. Eqs. (8, 9, 10, 12) do not allow to specify F uniquely. We assume that F is a combination of (4) and (5): first P "mixes" his opinion with that of Q according to (4) and then "projects" the mixture back to his previous opinion following (5):

  F[p_k, q_k, ϵ] = p_k^µ [ϵ p_k + (1 − ϵ) q_k]^{1−µ}, 0 < µ < 1.   (13)

The meaning of ϵ is that it quantifies the confirmation bias of P displayed in his interaction with Q; ϵ also includes the credibility of Q as seen by P, as well as the attention devoted by P to Q. Hence ϵ → 1 means a strongly biased agent P (or a low-credibility Q): though the opinion of P is uncertain, he does not change it in response to the opinion of Q. Likewise, ϵ = 0 means the highest credibility of Q. Note that ϵ = 0 does not mean that P fully accepts the opinion of Q. He does not, i.e. p̃ ≠ q, since he can only accept opinions on the basis of his previous knowledge.

In (13), µ characterizes the confirmation bias during the projection sub-process and thus plays nearly the same role as ϵ. Practically, it suffices to have one such parameter, so from now on we put µ = 1/2. Our final rule for confirmation biased opinion combination is

  p̃_k = √( p_k [ϵ p_k + (1 − ϵ) q_k] ) / ∑_{l=1}^N √( p_l [ϵ p_l + (1 − ϵ) q_l] ).   (14)

Eq. (14) directly translates to probability densities; see (3). Approximating p_k ≃ p(x_k) dx we get from (14)

  p̃(x) = √( p(x) [ϵ p(x) + (1 − ϵ) q(x)] ) / ∫dy √( p(y) [ϵ p(y) + (1 − ϵ) q(y)] ).   (15)

7. Note that (8, 9, 10, 12) alone are compatible with choices different from (13), e.g.

  F[p_k, q_k, ϵ] = ϵ p_k + (1 − ϵ) p_k^µ q_k^{1−µ}, 0 < µ < 1.   (16)

Here P first projects the opinion of Q onto his opinion and then mixes it. We do not know how to motivate the choice of F uniquely. We have however checked that all our qualitative conclusions below do not depend on the choice between (13) and (16). Hence for the remainder of this paper we focus on (14).
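A minimal Python sketch (ours) of the final rule (14); the function name is hypothetical, and the assertions check conditions 2 and 4 numerically.

```python
import numpy as np

def confirm_update(p, q, eps):
    """Confirmation-biased combination, Eq. (14):
    p~_k proportional to sqrt(p_k [eps p_k + (1 - eps) q_k])."""
    w = np.sqrt(p * (eps * p + (1.0 - eps) * q))
    return w / w.sum()

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.0, 0.4, 0.6])
print(confirm_update(p, q, eps=0.5))

# condition 2: P never revives an event he fully excludes (p_k = 0 stays 0)
assert confirm_update(np.array([0.0, 0.5, 0.5]), q, 0.5)[0] == 0.0
# condition 4: combining identical opinions changes nothing
assert np.allclose(confirm_update(p, p, 0.5), p)
```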

Footnote 5: The same desiderata were proposed for the pragmatic information theory [8]. This is natural, since normally understanding (semantics) should precede acting (pragmatics).

III. SOCIAL JUDGMENT THEORY AND GAUSSIAN OPINIONS

Our model can be put in the framework of the social judgment theory [6, 34]. According to this theory, when an agent is presented with a persuasive opinion, he switches on an automatic perceptual mechanism and evaluates the information by comparing it with the attitudes (opinions) he already has. An opinion contains mostly 3 latitudes: acceptance, non-commitment and rejection [6, 34]. The most acceptable opinion, or anchor, is located at the center of the latitude of acceptance. The theory states that persuasion is not efficient if it comes close to the anchor or within the latitude of rejection [6, 34]. The social judgment theory is popular, but its quantitative modeling has been scarce (Footnote 6). In particular, no attempt was made to relate its ideas to probabilities.

Footnote 6: The literature on the social judgment theory offers some formal mathematical expressions that could be fitted to experimental data [25]. There is also a more theoretical approach: the belief-adjustment model assumes discrete-time (k = 1, 2, 3, ...) dynamics for the anchor point S_k as a function of the (new) evidence x_k received at step k [19]: S_k = S_{k−1} + w[s(x_k) − R], where s(·) refers to the evaluation of the evidence, w > 0 refers to the interaction strength and R is the threshold. The model lacks a natural representation of the acceptance and rejection latitudes and of their interaction with the anchor.

To fill this gap, let us assume that k = x is a continuous variable and that p(x) and q(x) are Gaussian with means m_λ and dispersions v_λ (λ = P, Q) (Footnote 7) [26]:

  p(x) = e^{−(x−m_P)²/(2v_P)} / √(2π v_P), q(x) = e^{−(x−m_Q)²/(2v_Q)} / √(2π v_Q).   (17)

Now the anchor is the most probable opinion m_λ. The inverse dispersion 1/v_λ relates to the opinion strength, as will be confirmed below. The latitude of acceptance amounts to opinions not far from the anchor, while the latitude of rejection contains close-to-zero probability events, since P does not change his opinion on them; cf. point 2 from the previous section.

Employing the three sigma rule of statistics one can define the latitudes of acceptance and rejection by, respectively, the following formulas:

  x ∈ [m_P − 2√v_P, m_P + 2√v_P],   (18)
  x ∈ (−∞, m_P − 3√v_P] ∪ [m_P + 3√v_P, ∞),   (19)

where the latitude of non-commitment contains whatever is left out from (18, 19). Recall that the latitudes of acceptance, non-commitment and rejection carry (respectively) 95.4, 4.3 and 0.3 % of the probability.

Definitions (18, 19) are to some extent conventional. But they work well with the rule (15), e.g. if the opinions of P and Q overlap only within their rejection latitudes, then neither of them can effectively change the opinion of the other. Also, P is persuaded most efficiently if his non-commitment latitude strongly overlaps with the acceptance latitude of Q. This is seen below when studying change-discrepancy relations.

Footnote 7: If x is the logarithm of a probability, then the Gaussian assumption (17) can be deduced from the law of large numbers [26].

IV. WEIGHTED AVERAGE APPROACH

Here we demonstrate that the main quantitative theory of persuasion and opinion change—the weighted average approach [4, 14]—is a particular case of our model. We assume that the initial opinions are given by (17). If |m_P − m_Q| is sufficiently small, p̃(x) given by (17, 15) has a single peak (anchor) which is shifted towards that of q(x); see Fig. 1(a).

We now look for the maximum m̃_P of p̃(x) by using (17) in (15) and neglecting factors O[(m_P − m_Q)²/v_P] and O[(m_P − m_Q)²/v_Q]. We get

  m̃_P = (ω_P m_P + ω_Q m_Q) / (ω_P + ω_Q),   (20)
  ω_P ≡ (1 + ϵ′)/v_P, ϵ′ = ϵ + 2ϵ [−1 + √(v_Q/v_P)],   (21)
  ω_Q ≡ (1 − ϵ)/v_Q.   (22)

Eq. (20) is nothing but the main postulate of the weighted average approach; see [4, 14] for reviews. Here ω_P and ω_Q are the weights of (respectively) the initial opinion of P and of Q. We can re-write (20) as a linear relation between the change (of the anchor) and the discrepancy (initial distance between the opinions of P and Q):

  m̃_P − m_P = [ω_Q/(ω_Q + ω_P)] (m_Q − m_P).   (23)

Eq. (21) simplifies if v_P/v_Q is close to 1. Then in (21) one can put ϵ′ ≈ ϵ. The persuasion weight ω_Q increases with the source credibility 1 − ϵ and with the source self-confidence 1/v_Q. Likewise, the initial weight ω_P increases with the confirmation bias ϵ of P and with the inverse range of his acceptance latitude ≃ 1/v_P. Hence 1/v_P and 1/v_Q can be regarded as opinion strengths of P and Q, respectively.

Within this approach the weights ω_P and ω_Q are normally postulated to be independent from each other. Eqs. (20, 21, 22), which apply for v_P/v_Q not close to 1, allow one to see that both ω_P and ω_Q depend on ϵ. Furthermore, the weight ω_P of P depends on the strength 1/v_Q of Q: (21) shows that ω_P decays with 1/v_Q (Footnote 8). In contrast, ω_Q does not depend on 1/v_P.

There is another effect related to the weighted average approach. Let m_P = m_Q, but v_P ≠ v_Q. Now p̃(x), p(x) and q(x) have the same maximum m_P. But the opinion of P gets stronger (resp. weaker) if it is interacting with a strong (resp. weak) opinion:

  1/ṽ_P > 1/v_P if 1/v_Q > 1/v_P,   (24)
  1/ṽ_P < 1/v_P if 1/v_Q < 1/v_P,   (25)

where ṽ_P is the dispersion of the (non-Gaussian) p̃(x). This effect is seen in Fig. 1(b) and can also be given a simple analytical description provided that v_P/v_Q is close to 1, i.e. p(x) ≈ q(x). Expanding (15) over the first order of this difference we get

  ṽ_P = [(1 − ϵ)/2] v_Q + [(1 + ϵ)/2] v_P.   (26)

Thus the weighted average approach is a particular case of our model, where a confirmation biased agent P is persuaded by an opinion that is close to his initial opinion. Our model explains the structure of the weights and relates them to the degree of confirmation bias and the opinion strengths.

Footnote 8: Eq. (21) is specific to rule (15): if we apply rule (16) with µ = 1/2, then √(v_Q/v_P) in (21) changes to [v_Q/v_P]^{1/4}. But the qualitative form of this dependence remains the same.
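The anchor formula (20)-(22) can be checked numerically on a grid. Below is a small sketch (our illustration, with arbitrarily chosen parameters):

```python
import numpy as np

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

mP, vP, mQ, vQ, eps = 0.0, 1.0, 0.3, 0.6, 0.5
x = np.linspace(-8.0, 8.0, 160001)
p, q = gauss(x, mP, vP), gauss(x, mQ, vQ)

# rule (15); the normalization is irrelevant for locating the anchor
pt = np.sqrt(p * (eps * p + (1.0 - eps) * q))
anchor = x[np.argmax(pt)]

# weighted-average prediction (20)-(22)
eps1 = eps + 2.0 * eps * (np.sqrt(vQ / vP) - 1.0)   # Eq. (21)
wP, wQ = (1.0 + eps1) / vP, (1.0 - eps) / vQ        # Eqs. (21), (22)
print(anchor, (wP * mP + wQ * mQ) / (wP + wQ))      # agree up to O[(mP - mQ)^2]
```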

V. CHANGE-DISCREPANCY RELATION

For understanding the effectiveness of persuasion, one needs to study how the opinion change relates to the discrepancy, i.e. to the difference between the initial opinion of P and that of Q [6, 20, 25, 36, 37]. Initial studies saw a linear relationship between discrepancy and the opinion change [20]. This is what comes out from the weighted average model; see (23).

But further experiments clarified that the linear regime is restricted to small discrepancies only and that the actual behavior of the opinion change as a function of the discrepancy is non-monotonic: the opinion change h reaches its maximal value at some discrepancy m_c and then decays for m > m_c [6, 25, 36, 37].

To study this problem in our model we need a workable definition of the distance h[p, q] between two probability densities p(x) and q(x). We choose (Footnote 9)

  h[p, q] ≡ [1 − ∫dx √(p(x) q(x))]^{1/2},   (27)

  h[p, q] = [1 − ( (v_Q v_P)^{1/2} / ((v_Q + v_P)/2) )^{1/2} e^{−(m_Q − m_P)²/(4(v_Q + v_P))}]^{1/2}.   (28)

Eq. (27) is the general definition of the Hellinger distance that applies to both continuous and discrete probabilities (changing the integral in (27) to a sum). Eq. (28) is deduced from (17).

As the measure of opinion change we take the Hellinger distance h[p, p̃] between the initial and final opinion of P. The discrepancy is quantified via the Hellinger distance h[p, q] between the initial opinion of P and the persuading opinion. For concreteness we assume that the opinion strengths 1/v_P and 1/v_Q are fixed. Then h[p, q] reduces to the distance between the anchors (peaks of p(x) and q(x)): m = |m_P − m_Q|.

Fig. 2(a) shows that the change h[p, p̃] is maximal at m = m_c; it decreases for m > m_c, since the densities of P and Q lose overlap (common knowledge).

The dependence of m_c on ϵ is also non-monotonic. Fig. 2(b) shows that for a highly credible (ϵ ∼ 0) persuasion, m_c is larger than for a moderately credible one. Also, m_c is located within the latitude of non-commitment of P (except for ϵ close to 1); cf. (18, 19). These two points agree with experiments [6, 36]. Another aspect of the m_c(ϵ) curve is a new prediction of the model: the highly incredible (ϵ ∼ 1) persuasion has a larger m_c than the highly credible one; see Fig. 2(b).

Let us note an example where the change-discrepancy curve is monotonic. It is realized for m_P = m_Q (coinciding anchors), where the distance (28) between p(x) and q(x) is controlled by v_Q (for a fixed v_P). Now the change h[p, p̃] is a monotonic function of the discrepancy h[p, q]: a larger discrepancy produces a larger change.

Footnote 9: The precise form of the distance is not important for the qualitative conclusions below.
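The change-discrepancy curve of Fig. 2(a) is easy to reproduce numerically. A sketch (ours), scanning the discrepancy m at fixed opinion strengths:

```python
import numpy as np

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

def hellinger(p, q, dx):
    """Hellinger distance, Eq. (27), on a uniform grid."""
    return np.sqrt(max(0.0, 1.0 - np.sqrt(p * q).sum() * dx))

x = np.linspace(-15.0, 15.0, 30001)
dx = x[1] - x[0]
eps = 0.5
p = gauss(x, 0.0, 1.0)                      # initial opinion of P

ms = np.arange(0.0, 6.0, 0.05)
changes = []
for m in ms:
    q = gauss(x, m, 1.0)                    # persuading opinion at discrepancy m
    pt = np.sqrt(p * (eps * p + (1.0 - eps) * q))   # rule (15)
    pt /= pt.sum() * dx
    changes.append(hellinger(p, pt, dx))

print(ms[int(np.argmax(changes))])          # m_c: the discrepancy of maximal change
```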

VI. ORDER OF PRESENTATION

A. Definition

A point that makes the Bayesian theory inapplicable to the opinion dynamics of real (non-normative) humans is the order of presentation effect: it does matter in which order two persuasive opinions act [6, 7, 12, 19, 20, 27, 29]. Sometimes the first opinion matters more (primacy), in other cases the last interaction is more important (recency) (Footnote 10). It is not completely clear which experimentally (un)controlled factors are responsible for primacy and recency, but there is a widespread tendency of relating the primacy effect with confirmation bias [7, 29]. The relation involves a qualitative argument that we scrutinize below.

We now define the order of presentation effect in our situation. The agent P interacts first with Q (with probability density q(x)), then with Q′ with probability density q′(x). To ensure that we compare only the order of Q and Q′ and not different magnitudes of influences coming from them, we take both interactions to have the same parameter 0 < ϵ < 1. Moreover, we make Q and Q′ symmetric with respect to each other and with respect to P, e.g. if p(x), q(x) and q′(x) are given by (17) we assume

  v_Q′ = v_Q, m_Q′ − m_P = m_P − m_Q.   (29)

Here is then the question: is the final opinion p(x|q, q′) of P closer to q(x) (primacy) or to q′(x) (recency)?

Footnote 10: There is a viewpoint that both recency and primacy relate to (normative) irrationality; see e.g. [7]. But note that the information which came later is generally more relevant for predicting the future. Hence recency can be more rational than primacy.

B. Theoretical prediction

The answer is that in the present model (and for 0 ≤ ϵ ≤ 1) p(x|q, q′) is closer to the last opinion q′(x) (recency) both in terms of the maximally probable value and in terms of distance. Hence in terms of the Hellinger distance (27) we get

  h[p(x|q, q′), q′] < h[p(x|q, q′), q].   (30)

See Fig. 3(a) for an example (Footnote 11). To illustrate (30) analytically take

  p = (1/2, 1/2), q = (0, 1), q′ = (1, 0),   (31)

as the binary probabilistic opinions of P, Q and Q′, respectively. P is fully ignorant on a binary random variable, while Q and Q′ are fully convinced and opposite to each other. If P interacts first with Q and then with Q′ (both interactions are given by (14) with ϵ = 1/2), the opinion of P becomes (0.52727, 0.47273). This is closer to the last opinion (that of Q′).

The recency effect in our model is counterintuitive, because one would expect that it naturally supports the primacy effect: the first interaction shifts the opinion of P towards that of Q, and then the second interaction with Q′ has a smaller influence on the opinion of P due to a smaller overlap between the opinions of Q′ and P. This is the standard argument that relates primacy with the confirmation bias [7, 29]: the first interaction shapes the opinion of P, and then P is confirmation biased with respect to the existing shape. The argument is wrong, at least for the present model, because it incorrectly implies that in the first step the opinion of P as a whole moves towards that of Q; see Fig. 1(a).

To get a deeper understanding of the recency effect, expand (14) for a small η ≡ 1 − ϵ:

  p̃_k = p_k + (η/2)(q_k − p_k) + (η²/8)[p_k ∑_l (q_l − p_l)²/p_l − (q_k − p_k)²/p_k] + O[η³].   (32)

If now P interacts with an agent Q′ having opinion q′, the resulting opinion p(q, q′) reads from (32):

  p_k(q, q′) = p_k + (η/2)(q_k − p_k) + (η/2)(q′_k − p_k)
    + (η²/8)[p_k ∑_l (q_l − p_l)²/p_l − (q_k − p_k)²/p_k]
    + (η²/8)[p_k ∑_l (q′_l − p_l)²/p_l − (q′_k − p_k)²/p_k]
    + (η²/4)(p_k − q_k) + O[η³].   (33)

Hence in this limit p_k(q, q′) − p_k(q′, q) depends only on q′_k − q_k (and not e.g. on q_{l≠k}):

  p_k(q, q′) − p_k(q′, q) = η² [q′_k − q_k]/4 + O[η³].   (34)

It is seen that the more probable persuasive opinion (e.g. the opinion of Q′ if q′_k > q_k) is more efficient in changing the opinion of P if it comes later. This implies the recency effect (Footnote 12). Note that this argument on recency directly extends to more general situations, where (for instance) 4 persuasions act as q q q′ q′, and this is compared with the inverse order q′ q′ q q (Footnote 13).
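The binary example (31) can be verified in a few lines (our sketch):

```python
import numpy as np

def confirm_update(p, q, eps):
    w = np.sqrt(p * (eps * p + (1.0 - eps) * q))   # rule (14)
    return w / w.sum()

p  = np.array([0.5, 0.5])    # fully ignorant P, Eq. (31)
q  = np.array([0.0, 1.0])    # first persuader Q
q2 = np.array([1.0, 0.0])    # second persuader Q'

final = confirm_update(confirm_update(p, q, 0.5), q2, 0.5)
print(final)   # ~(0.52727, 0.47273): closer to the last opinion q' (recency)
print(confirm_update(confirm_update(p, q2, 0.5), q, 0.5))  # reversed order
```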

Footnote 11: In our model the primacy effect exists in the boomerang regime ϵ > 1; see below and Fig. 3(b).

Footnote 12: Indeed, due to the symmetry conditions, for checking the order of presentation effect we can also look at h[p(q, q′), q] − h[p(q′, q), q]. Using (34) we get for this quantity (η²/(16 h[p(q′, q), q])) ∑_k [q_k − q′_k] √(q_k/p_k) > 0, again due to the symmetry conditions.

Footnote 13: For this case we get instead of (34): p_k(q q q′ q′) − p_k(q′ q′ q q) = η² [q′_k − q_k] + O[η³].

C. Relations with experimental studies

Now we compare the above finding with experimental results on primacy and recency. They can be roughly divided into several groups: persuasion tasks [6, 27], symbol recalling [44] and impression formation [3, 4]. In all those situations one generally observes both primacy and recency, though in different proportions and under different conditions [19].

The situation is clearer for the first and second groups. Here the recency effect is observed whenever the retention time (between the last stimulus and the data taking) is short. If this time is sufficiently long, the recency effect changes to primacy [6, 27, 44]. The general interpretation of these results is that two different processes are involved which operate on separate time-scales. These processes can be conventionally related to short-time and long-time memory [44], with the primacy effect related to the long-time memory. In our model the longer-time process is absent. Hence it is natural that we see only the recency effect.

At this point let us recall the importance of the symmetry conditions [such as (29)] for the genuine order of presentation effect. In contrast, several experimental studies—in particular those on impression formation—suggest that the order of presentation effect exists due to different conditions in the first versus the second interaction [3, 6, 19, 43]. (In our context, this means different parameters ϵ and ϵ′ for each interaction.) Thus Refs. [3, 6] argue that the primacy effect is frequently caused by attention decrement (the first action/interaction gets more attention); see also [43] in this context. (This effect is trivially described by our model, if we assume ϵ < ϵ′.) For similar experiments it was shown that if the attention devoted to the two interactions is balanced, the recency effect results [18]. This is consistent with the prediction of our model. Note that yet another aspect of the order of presentation effect is studied in Appendix A.

We close by mentioning the advantages and drawbacks of the present model concerning the primacy-recency effect: the main advantage is that it robustly demonstrates the recency effect (see however Footnote 11) and shows that the well-known argument on relating confirmation bias with primacy does not hold generally. The main drawback is that the model does not involve longer time-scale processes that are supposedly responsible for the interplay between recency and primacy.

VII. REPEATED PERSUASION

Repeating the same message several times is a known way of reaching the persuasion goal. In which sense can repeated persuasions be more efficient than a single one? Expectedly, after repeating the same persuasion many times, the target opinion will converge to the persuasion goal. How does this convergence take place? These are the two main questions we answer in this section.

Assume that P updates his opinion repeatedly with the same opinion of Q. Eq. (14) implies

  p_k^[n+1] ∝ √( p_k^[n] [ϵ p_k^[n] + (1 − ϵ) q_k] ), n = 1, 2, ...,   (35)

where 1 > ϵ > 0, n is the discrete time, and we omit the normalization. For simplicity we assume

  p_k^[1] ≡ p_k > 0, q_k > 0 for 1 ≤ k ≤ N.   (36)

Eq. (35) admits only one fixed point q = {q_k}_{k=1}^N. Appendix B shows that for any convex [f″(y) ≥ 0] function f(y) one has

  Φ[p^[n+1]; q] ≤ Φ[p^[n]; q],   (37)
  Φ[p; q] ≡ ∑_{k=1}^N q_k f(p_k/q_k).   (38)

Hence Φ[p; q] is a Lyapunov function of (35). Since Φ[p; q] is a convex function of p, f(1) = Φ[q; q] is the unique global minimum of Φ[p; q]. Appendix B shows that the equality sign in (B13) holds only for p^[n+1] = p^[n]. Thus Φ[p^[n]; q] monotonically decays to f(1) = Φ[q; q], showing that the fixed point q is globally stable (Footnote 14). For illustrating (B13) take f(y) = −√y. Then (B13) amounts to the decay of the Hellinger distance (27) between p^[n] and q. Many other reasonable measures of distance are obtained under various choices of f. As expected, 0 < ϵ < 1 influences the convergence time. We checked that this time is an increasing function of ϵ: a more confirmation biased agent converges slower.

In Appendix B we also show that the convergence to the fixed point respects the Le Chatelier principle known in thermodynamics: probabilities of overestimated events (i.e. p_k^[1] > q_k) tend to decay in the discrete time (see Appendix B). Likewise, probabilities of the underestimated events (i.e. p_k^[1] < q_k) increase in time.

Define h_n = h[p^[n+1], p^[n]], the Hellinger distance (27) between two consecutive opinions of P evolving as in (35). It is now possible that

  max_{1≤n<∞}[h_n] = h_m ≠ h_1,   (39)

i.e. the largest change of the opinion of P comes not from the first, but from one of the intermediate persuasions (Footnote 15).

We conclude that though repeated persuasion does its job (of driving the opinion) monotonically in the number of repetitions (Le Chatelier principle), it is not generally true that the first persuasion makes the largest change in the opinion. Put differently, the law of diminishing returns does not hold for repeated persuasions. Thus to get the largest opinion change one should choose carefully the number of repetitions. This finding might explain the popularity of repetition in adversarial forms of persuasion. Note that the framework of (35) can be applied to studying mutual persuasion (consensus reaching); see Appendix C for details.

Footnote 14: These conclusions hold by continuity if p_k > 0 for all k, but some q_k's are zero. Likewise, if some of the p_k's nullify (but q_k > 0), they stay zero always, while the initially non-zero p_k's converge to renormalized values of q_k, e.g. if only p_1 = 0, while p_k > 0 for k > 1, we get: p_k → q_k / ∑_{l=2}^N q_l.

Footnote 15: A simple example of this situation is realized for p = (0.98, 0.01, 0.01) and q = (0.01, 0.01, 0.98) in (36). We then apply (35) under ϵ = 0.5. The consecutive distances read h_1 = 0.1456 < h_2 = 0.1567 > h_3 = 0.1295 > h_4 > .... Here the second persuasion is the most relevant one. For this to hold, the initial opinion of P has to be sufficiently far from that of Q. Otherwise we get the more expected behavior h_1 > h_2 > h_3 > h_4 > ..., meaning that the first persuasion is always the most relevant one.
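The non-monotonicity (39) can be reproduced with the example of Footnote 15; here is a short sketch (ours):

```python
import numpy as np

def confirm_update(p, q, eps):
    w = np.sqrt(p * (eps * p + (1.0 - eps) * q))   # recursion (35)
    return w / w.sum()

def hellinger(p, q):
    return np.sqrt(1.0 - np.sqrt(p * q).sum())     # Eq. (27), discrete case

p = np.array([0.98, 0.01, 0.01])   # initial opinion, Footnote 15
q = np.array([0.01, 0.01, 0.98])   # persuasion goal
for n in range(1, 5):
    pn = confirm_update(p, q, eps=0.5)
    print(n, hellinger(p, pn))     # h_1 < h_2 > h_3 > ...: no diminishing returns
    p = pn
```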
VIII. BOOMERANG (BACKFIRE) EFFECT

A. Confirmation bias and boomerang effect

Sometimes persuasion (by agent Q) brings in the opposite effect: the persuaded agent P moves his opinion away from that of Q, i.e. he enforces his old opinion [30, 33, 36, 39]. Early literature on social psychology proposed that this boomerang effect may be due to opinions placed in the latitude of rejection [36], but this was not confirmed experimentally [37].

The present model offers a possibility to look at the boomerang effect as (just) an extreme form of confirmation bias. Recall that after (13) we defined ϵ as the degree of confirmation bias, so that ϵ = 1 means a special point, where no change (of the opinion of P) is possible whatsoever. Now we propose that the boomerang effect relates to

  ϵ > 1.   (40)

This proposal is consistent with experiments [39], where the subjects showing the boomerang effect had special reasons to insist on their previous opinion. For example, the boomerang effect resulted from the fact that subjects already had announced their opinion publicly, and were not only reluctant to change it (as for the usual confirmation bias), but even enforced it in the light of the contrary evidence [39] (in these experiments the subjects who did not make their opinion public behaved without the boomerang effect). A similar situation is realized for voters who decided to support (and implicitly to defend) a certain candidate. They sometimes increase their support after hearing that the candidate is criticized [30, 33].

After the analytical continuation (40), the opinion combination rule (14) reads

  p̃_k = √( p_k |ϵ p_k + (1 − ϵ) q_k| ) / ∑_{l=1}^N √( p_l |ϵ p_l + (1 − ϵ) q_l| ),   (41)

with an obvious generalization to probability densities. The absolute values in (41) are necessary to conserve the positivity of probabilities.

Let us return to (20), which shows how the anchor (maximally probable opinion) changes under the influence of a close persuasion (m_P ∼ m_Q). For ϵ > 1, the anchor of P drifts away from that of Q due to ω_Q < 0; see (20)-(22). Likewise, (26) shows that whenever the two anchors are equal, m_P = m_Q, a weaker persuasion makes the persuaded opinion stronger, and vice versa. Hence the first inequality in (24) [and in (25)] is inverted.

Fig. 4 illustrates the shape of p̃(x) (the final opinion of P) for Gaussian opinions. It is seen that the anchor of P moves away from that of Q, while the peak of p̃(x) around the anchor is narrower than that of p(x); see also Fig. 3(b) in this context.
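A grid-based sketch (ours) of a single boomerang update, using the parameters of Fig. 4:

```python
import numpy as np

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

x = np.linspace(-8.0, 8.0, 160001)
dx = x[1] - x[0]
p, q = gauss(x, 0.0, 1.0), gauss(x, 1.0, 1.0)   # m_P = 0, v_P = m_Q = v_Q = 1
eps = 2.0                                        # boomerang regime, Eq. (40)

pt = np.sqrt(p * np.abs(eps * p + (1.0 - eps) * q))   # rule (41) for densities
pt /= pt.sum() * dx
print(x[np.argmax(p)], x[np.argmax(pt)])  # the anchor drifts away from m_Q = 1
```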
B. The order of presentation effect

We saw that for 0 < ϵ < 1 the model predicts the recency effect. For ϵ slightly above 1 we expect the recency effect to still be effective, as implied by the argument (34). However, the situation changes drastically for ϵ sufficiently larger than 1, as indicated in Fig. 3(b). Now the primacy effect dominates, i.e. instead of (30) we get the opposite inequality. Fig. 3(b) also shows that interaction with two contradicting opinions (in the boomerang regime) enforces the initial anchor of P.

To understand the primacy-recency effect analytically, consider example (31), and recall that P interacts first with Q and then with Q′ with the same parameter ϵ. The resulting opinion p(q, q′) of P reads:

  p(q, q′) = ( g(ϵ)/(g(ϵ) + 1), 1/(g(ϵ) + 1) ),   (42)

  g(ϵ) = √( |√ϵ + (1 − ϵ) √|2 − ϵ|| / (√ϵ |2 − ϵ|) ).   (43)

Fig. 5 shows how p_1(q, q′) = g(ϵ)/(g(ϵ) + 1) behaves as a function of ϵ. The recency effect holds for ϵ < 1 + √2; for ϵ > 1 + √2 we get primacy. Similar results are obtained for initially Gaussian opinions.

Thus in the present model the primacy effect (relevance of the first opinion) is related to the boomerang effect.

IX. COGNITIVE DISSONANCE

A. Moderate confirmation bias

Consider an agent whose opinion probability density has two peaks on widely separated events. Such a density—with the most probable opinion being different from the average—includes examples of cognitive dissonance, where the agent believes in mutually conflicting things [13]. For instance, a smoker opines that the probability density p(x) of the chance x of dying out of smoking is peaked at a large x (x ∼ 1), because he trusts the scientific evidence against smoking. But his subjective p(x) is also peaked at a small value of x: since he does not quit smoking, he believes that the chance of surviving is sizable after all [6, 13] (Footnote 16).

As suggested in Ref. [13], cognitive dissonance emerges when an agent (who initially holds a unique maximally probable opinion that coincides with the average) is subject to a conflicting information that he is willing to accept. Our model describes this scenario quantitatively. In (15, 17) we assume that |m_P − m_Q| is neither very large nor very small, v_Q/v_P < 1 (strong persuasion) and 0 < ϵ < 1. Now we get 2 peaks (anchors) for the final density p̃(x). The first peak is very close to that of p(x), while the second one is close to the peak of q(x); see Fig. 6(a). Thus persuasion from a strong (narrow peak of q(x)), sufficiently unexpected (but not completely foreign) source Q leads to cognitive dissonance: P holds simultaneously two different anchors, the old one and the one induced by Q.

There are two ways of reducing the cognitive dissonance: to increase ϵ making it closer to 1 (i.e. to make Q less credible) or to make the initial opinion of P stronger. The avoidance of cognitive dissonance is sometimes called the Freud-Festinger law [6, 13]. Implications of this law for formal models of decision making are studied in [45].

B. Boomerang regime

Within the boomerang regime ϵ > 1 the agent is more prone to cognitive dissonance; cf. Fig. 4 with Fig. 1. The mechanism of this proneness is explained in Fig. 4: the opinion of P is easier separated into pieces, since the probability moves away (in different directions) from the anchor of Q.

Moreover, a stronger form of cognitive dissonance is possible: the anchor moves in one direction (away from Q), while the probability density moves towards Q in terms of distance. This is illustrated by Fig. 7, where for ϵ > 6, p̃(x) (the opinion of P after one interaction) moves closer to q(x) (as measured by the Hellinger distance), while the anchor of P moves away from that of Q.

Footnote 16: This example on smoking and cognitive dissonance is mentioned in many books and papers. Still we repeat it here, because we are both smokers.
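A sketch (ours) of the repeated boomerang persuasion (44) defined in the next paragraph, with the parameters of Fig. 6(b); it shows the split of the initially Gaussian opinion into two peaks:

```python
import numpy as np

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

x = np.linspace(-8.0, 8.0, 16001)
dx = x[1] - x[0]
p, q = gauss(x, 0.0, 1.0), gauss(x, 1.0, 1.0)   # m_P = 0, m_Q = 1, v_P = v_Q = 1
eps = 2.0                                        # boomerang regime

for n in range(50):                              # recursion (44)
    p = np.sqrt(p * np.abs(eps * p + (1.0 - eps) * q))
    p /= p.sum() * dx

# locate local maxima: expect two well-separated peaks (cf. Fig. 6(b))
peaks = np.where((p[1:-1] > p[:-2]) & (p[1:-1] > p[2:]))[0]
print(len(peaks), x[peaks + 1])
```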

Let P be repeatedly persuaded with the same opinion of Q [cf. (35)]:

  p^[n+1](x) ∝ √( p^[n](x) |ϵ p^[n](x) + (1 − ϵ) q(x)| ),   (44)

where n = 1, 2, ... is the discrete time. Now the initially Gaussian opinion of P develops in time two well-separated peaks (another scenario of cognitive dissonance): the smaller one moves towards the anchor of Q and finally locates itself within the acceptance latitude of Q. The larger peak (together with the anchor of P) becomes narrower and moves away from q(x); see Fig. 6(b). After many iterations (≃ 10³ for the parameters of Fig. 6(b)) the larger peak locates itself within the rejection latitude of Q. Then p^[n](x) stops changing (stationary opinion).

X. SUMMARY AND DISCUSSION

We presented a new model for opinion change in the presence of confirmation bias. Our approach employs a subjective-probabilistic interpretation of agent opinions/beliefs, but it is non-Bayesian. Instead it is based on ideas of semantic information theory and on the opinion combination research developed in statistics. The model has 3 inputs: the probabilistic opinions of the target agent P and of a persuading agent Q, and the degree of confirmation bias displayed by P.

The model accounts for several key empirical observations that have been reported in social psychology and quantitatively interpreted within the social judgment theory. In particular, the model allows one to formalize the concept of opinion latitudes, explains the structure of the weighted average approach to opinion formation, and relates the initial discrepancy (between the opinions of P and Q) to the magnitude of the opinion change (shown by P). In all these cases our model extends and clarifies previous results, e.g. it elucidates the difference between monotonic and non-monotonic change-discrepancy relations.

New effects predicted by the model are summarized as follows.

(i) The model displays the recency effect in the order of presentation set-up. We argued that the standard argument on relating confirmation bias with the primacy effect—which was intuitively supposed to work also for this model—does not go through. We noted that our model lacks "long-term memory" processes which could be responsible for the experimental fact of changing recency to primacy (upon increasing the retention time). Introducing such processes would be one way of improving this model in the future.

(ii) For repeated persuasions we saw that there is a wide class of Lyapunov functions that describe the monotonic convergence of the target opinion to the persuasion goal. However, repeated persuasions do not hold the law of diminishing returns: a subsequent persuasion can be more efficient in changing the opinion than the previous one. We opined that these findings may contribute to understanding the widespread usage of repeated persuasions.

(iii) We proposed that the boomerang (back-reaction) effect is naturally related to extreme forms of confirmation bias, and can be described by the same model as the proper confirmation bias. In the boomerang regime we get that the order of presentation can display primacy instead of recency.

(iv) Finally, we showed that the model formalizes the lore of cognitive dissonance. It reproduces several standard features of this phenomenon and proposes new scenarios for its emergence.

Acknowledgements

We thank Seth Frey for useful remarks and suggestions. This research was supported by DARPA grant No.
W911NF–12–1–0034 and AFOSR MURI grant No. previous results, e.g. it elucidates the difference between FA9550-10-1-0569.

[1] Allahverdyan AE, Galstyan A (2011). Le Chatelier principle in replicator dynamics. Physical Review E 84: 041117.
[2] Allakhverdov VM, Gershkovich VA (2010). Does consciousness exist? In what sense? Integrative Psychological and Behavioral Science 44: 340-347.
[3] Anderson NH (1965). Primacy effects in personality impression formation using a generalized order effect paradigm. Journal of Personality and Social Psychology 2: 1-9.
[4] Anderson NH (1981). Foundations of information integration theory. Academic Press, New York.
[5] Austerweil JL, Griffiths TL (2011). Seeking confirmation is rational for deterministic hypotheses. Cognitive Science 35: 499-526.
[6] Aronson E (2007). The Social Animal. Palgrave Macmillan, 10th revised edition.
[7] Baron J (2008). Thinking and deciding. Cambridge University Press, Cambridge.
[8] beim Graben P (2006). Pragmatic information in dynamic semantics. Mind and Matter 4: 169-193.
[9] Bochner S, Insko CA (1966). Communicator discrepancy, source credibility, and opinion change. Journal of Personality and Social Psychology 4: 133-140.
[10] Clemen RT, Winkler RL (1999). Combining probability distributions from experts in risk analysis. Risk Analysis 19: 187-203.
[11] Darley JM, Gross PH (1983). A hypothesis-confirming bias in labeling effects. Journal of Personality and Social Psychology 44: 20-33.
[12] Diaconis P, Zabell SL (1982). Updating subjective probability. Journal of the American Statistical Association 77: 822-830.
[13] Festinger L (1957). A Theory of Cognitive Dissonance. Stanford University Press, Stanford, CA.
[14] Fink EL, Kaplowitz SA, Bauer CL (1983). Positional discrepancy, psychological discrepancy, and attitude change: Experimental tests of some mathematical models. Communication Monographs 50: 413-430.
[15] Genest C, Zidek JV (1986). Combining probability distributions: A critique and an annotated bibliography. Statistical Science 1: 114-135.
[16] Gentzkow M, Shapiro JM (2006). Media bias and reputation. Journal of Political Economy 114: 280-316.
[17] Gigerenzer G, Goldstein DG (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review 103: 650-669.
[18] Hendrick C, Costantini AF (1970). Effects of varying trait inconsistency and response requirements on the primacy effect in impression formation. Journal of Personality and Social Psychology 15: 158-164.
[19] Hogarth RM, Einhorn HJ (1992). Order effects in belief updating: The belief-adjustment model. Cognitive Psychology 24: 1-55.
[20] Hovland CI (editor) (1957). The order of presentation in persuasion. Yale University Press, New Haven.
[21] Jeng M (2005). A selected history of expectation bias in physics. American Journal of Physics 74: 578-582.
[22] Klayman J, Ha YW (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review 94: 211-228.
[23] Koehler JJ (1993). The influence of prior beliefs on scientific judgments of evidence quality. Organizational Behavior and Human Decision Processes 56: 28-55.
[24] Lazarsfeld PF, Berelson B, Gaudet H (1944). The People's Choice. How the Voter Makes up his Mind in a Presidential Campaign. Columbia University Press, New York.
[25] Laroche M (1977). A model of attitude change in groups following a persuasive communication: An attempt at formalizing research findings. Behavioral Science 22: 246-257.
[26] Lindley DV, Tversky A, Brown RV (1979). On the reconciliation of probability assessments. Journal of the Royal Statistical Society A 142: 146-156.
[27] Miller N, Campbell DT (1959). Recency and primacy in persuasion as a function of the timing of speeches and measurements. The Journal of Abnormal and Social Psychology 59: 1.
[28] Mullainathan S, Shleifer A (2005). The Market for News. The American Economic Review 95: 1031-1053.
[29] Nickerson RS (1998). Confirmation bias: a ubiquitous phenomenon in many guises. Review of General Psychology 2: 175-220.
[30] Nyhan B, Reifler J (2010). When corrections fail: The persistence of political misperceptions. Political Behavior 32: 303-330.
[31] Oskamp S (1965). Overconfidence in case-study judgments. Journal of Consulting Psychology 29: 261-265.
[32] Rabin M, Schrag JL (1999). First Impressions Matter: A Model of Confirmatory Bias. The Quarterly Journal of Economics 114: 37.
[33] Redlawsk DP, Civettini AJW, Emmerson KM (2010). The Affective Tipping Point: Do Motivated Reasoners Ever "Get It"? Political Psychology 31: 563-593.
[34] Social Psychology: Handbook of Basic Principles (2007). Edited by Kruglanski AW, Higgins ET. The Guilford Press, New York.
[35] Schreider YA (1970). On the semantic characteristics of information. In: Saracevic T (editor), Introduction to Information Science. Bowker, New York: 24-32.
[36] Whittaker JO (1963). Opinion change as a function of communication-attitude discrepancy. Psychological Reports 13: 763-772.
[37] Kaplowitz SA, Fink EL (1997). Message discrepancy and persuasion. In: Progress in Communication Sciences XIII. Edited by Barnett GA, Boster FJ. Ablex Publishing Corporation, Greenwich, Connecticut.
[38] Curtis JP, Smith FT (2008). Mathematical Models of Persuasion. American Conference on Applied Mathematics (MATH '08), Harvard, Massachusetts: 60-65.
[39] Sutherland S (1992). Irrationality: The Enemy Within. Constable, London.
[40] Marshall AW, Olkin I (1979). Inequalities: Theory of Majorization and its Applications. Academic Press, New York.
[41] Huhns MN, Singh MP (1997). Ontologies for agents. IEEE Internet Computing 1: 81-83.
[42] Wason PC (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology 12: 129-140.
[43] Webster DM, Richter L, Kruglanski AW (1996). On leaping to conclusions when feeling tired: Mental fatigue effects on impressional primacy. Journal of Experimental Social Psychology 32: 181-195.
[44] Wright AA, Santiago HC, Sands SF, Kendrick DF, Cook RG (1985). Memory processing of serial lists by pigeons, monkeys, and people. Science 229: 287-289.
[45] Yariv L (2002). I'll See It When I Believe It - A Simple Model of Cognitive Consistency. Cowles Foundation Discussion Paper # 1352.

Figures

FIG. 1: Opinion change after one interaction. The initial opinion of P is described by Gaussian probability density p(x) (blue curve) centered at zero. The opinion of Q amounts to Gaussian probability density q(x) (purple curve) centered at a positive value; see (17). The resulting opinion p̃(x) of P is given by (15) with ϵ = 0.5 (olive curve). (a) The opinion of P moves towards that of Q; m_P = 0, σ_P = 1, m_Q = 1, σ_Q = 0.5. (b) The maximally probable opinion of P is reinforced; m_P = 0, σ_P = 1, m_Q = 0, σ_Q = 0.25.

FIG. 2: Opinion change versus discrepancy. (a) The opinion change h = h[p, p̃] versus discrepancy m = |m_P − m_Q|. The initial opinion of the agent P is Gaussian with m_P = 0 and v_P = 1; see (17). The opinion of Q is Gaussian with m_Q = m and v_Q = 1. Thus m quantifies the initial distance between the opinions of P and Q. The opinion change of P is quantified via the Hellinger distance (27) between the old and new opinion: h = h[p, p̃], where p̃(x) is given by (14). Different curves correspond to different ϵ; from top to bottom: ϵ = 0.1, 0.5, 0.9, 0.98. The maximum of h(m) is reached at m_c. (b) m_c versus ϵ for the same parameters as for (a). m_c(ϵ) weakly grows both for ϵ → 1 and ϵ → 0, e.g. m_c(0.01) = 3.29972, m_c(0.0001) = 4.53052, m_c(0.9) = 2.94933, m_c(0.999) = 4.12861.

FIG. 3: Order of presentation effect. (a) Blue curve: the initial opinion of P is described by Gaussian probability density p(x) with m_P = 0 and v_P = 1; see (17). Purple (resp. olive) curve: the initial opinion of Q (resp. Q′), given by (17) with m_Q = 1.5 (resp. m_Q′ = −1.5) and v_Q = 0.5 (resp. v_Q′ = 0.5). Green curve: the resulting opinion of P after interacting first with Q and then with Q′. Both interactions use ϵ = 0.5. The final opinion of P is inclined to the most recent opinion (that of Q′) both with respect to its maximally probable value and distance. The final opinion of P has a larger width than the initial one. (b) The same as in (a) but for ϵ = 1.5 (boomerang regime, see Section VIII). Now the final opinion of P is inclined to the first opinion (that of Q) with respect to the distance. The initial maximally probable opinion of P is still maximally probable. Moreover, its probability has increased and the width around it has decreased. The final opinion has 3 peaks.

FIG. 4: Opinion change in the boomerang regime. Blue (resp. purple) curve: the initial opinion of agent P (resp. Q) described by probability density p(x) (resp. q(x)). Olive curve: the final opinion p̃(x) of P given by (15) with ϵ = 2. Here p(x) and q(x) are given by (17) with m_P = 0 and v_P = m_Q = v_Q = 1. The anchor (maximally probable opinion) of P not only moves away from the anchor of Q, but it is also enhanced: the (biggest) peak of p̃(x) is larger than that of p(x). The second (smaller) peak of p̃(x) arises because the initial probability of P located to the right of the anchor m_Q of Q moves away from m_Q; p̃(x) gets a local minimum close to m_Q.

FIG. 5: p_1(q, q′) = g(ϵ)/(g(ϵ) + 1), given by (42, 43), versus ϵ.

FIG. 6: Cognitive dissonance. (a) Blue (resp. purple) curve: the initial opinion of agent P (resp. Q) described by probability density p(x) (resp. q(x)). Olive curve: the final opinion p̃(x) of P given by (15) with ϵ = 0.5. Here p(x) and q(x) are defined by (17) with m_P = 0, v_P = 1, m_Q = 2, v_Q = 0.1. It is seen that the maximally probable opinion of P does not change, but the final opinion develops two peaks of comparable height (cognitive dissonance). (b) Blue and purple curves are defined as for (a), but with parameters m_P = 0, v_P = v_Q = 1, m_Q = 1. Olive curve: the opinion of P after 50 iterations (44) with ϵ = 2 (boomerang regime).

FIG. 7: Opinion change after one interaction (including the boomerang regime); see (14, 15, 41). The Hellinger distance difference h[p, q] − h[p̃, q] versus ϵ. The initial opinions p(x) and q(x) are given by (17) with m_P = 0, v_P = m_Q = v_Q = 1 (lower curve) and m_P = 0, m_Q = 1.5, v_P = v_Q = 1 (upper curve). h[p, q] − h[p̃, q] > 0 means that the final opinion of P is closer to that of Q in terms of the Hellinger distance.

Appendix A: Order of effectiveness

The order of presentation issue has another aspect: assume that Q and Q′ persuade P in the same direction—i.e. m_Q > m_P and m_Q′ > m_P—but their distances from the anchor of P are different: m_Q′ > m_Q > m_P. In which order should Q and Q′ act to bring in the maximal change in the opinion of P? (It is assumed, as above, that both interactions have the same ϵ and that v_Q′ = v_Q to make the comparison unambiguous.) The answer is again unique (but this time also intuitive) within the present model: the maximal change—as measured e.g. by the Hellinger distance—is achieved when the closer opinion acts first:

  h[p(x|q, q′), p] > h[p(x|q′, q), p].   (A1)

The same conclusion holds for v_Q′ < v_Q and m_Q′ = m_Q, where the opinion of Q′ is more distant from the initial opinion of P.

The message of (A1) is intuitive, since the interaction of P with Q′ is weaker: the interaction with Q prepares the ground for the subsequent action of Q′. But there are experimental results that seemingly contradict this result [14]. They show that when the most distant message acts before the less distant one, the opinion changes more than for the reverse order. We believe that in those experiments the above condition on the same value of ϵ did not hold. This agrees with the viewpoint expressed by the authors of [14]. If ϵ and ϵ′ are different, (A1) does not hold anymore, and our model can account for the main result of [14].

Appendix B: Lyapunov functions for repeated persuasions

1. Derivation

Here we show that in (14) the revised opinion p̃ is closer to q. Let us for simplicity assume that

  p_k > 0, q_k > 0 for 1 ≤ k ≤ N,   (B1)

and define

  z_k ≡ p_k/q_k, z̃_k ≡ p̃_k/q_k.   (B2)

We choose the indices k such that the following ordering relations hold:

  z_1 ≥ ... ≥ z_N.   (B3)

Eq. (35) implies

  z̃_k = ψ[z_k] / ∑_{l=1}^N q_l ψ[z_l], k = 1, ..., N,   (B4)
  ψ[z] ≡ √( z²ϵ + z(1 − ϵ) ).   (B5)

For z > 0 and 0 < ϵ < 1 we note the following features of ψ[z] (Footnote 17):

  dψ[z]/dz > 0, d(ψ[z]/z)/dz < 0.   (B6)

These relations imply from (B3, B4):

  z̃_1 ≥ ... ≥ z̃_N,   (B7)
  z̃_1/z_1 ≤ ... ≤ z̃_N/z_N.   (B8)

Due to

  ∑_{k=1}^N q_k z̃_k = ∑_{k=1}^N q_k z_k = 1,   (B9)

we have from (B8):

  z̃_1/z_1 ≤ 1, z̃_N/z_N ≥ 1.   (B10)

Hence there exists such a θ (1 ≤ θ < N) that

  z̃_1/z_1 ≤ 1, ..., z̃_θ/z_θ ≤ 1, z̃_{θ+1}/z_{θ+1} ≥ 1, ..., z̃_N/z_N ≥ 1.   (B11)

Eqs. (B11, B9) lead to

  ∑_{k=1}^m p̃_k ≤ ∑_{k=1}^m p_k, m = 1, ..., N − 1.   (B12)

Eqs. (B3, B7, B9, B12) imply that for any convex [f″(y) ≥ 0] function f(y) one gets (B13) [1, 40]:

  ∑_{k=1}^N q_k f(p̃_k/q_k) ≤ ∑_{k=1}^N q_k f(p_k/q_k).   (B13)

Let us demonstrate the implication explicitly, since it is a useful exercise on the features of convex functions. We define

  Q_k ≡ [f(z_k) − f(z̃_k)] / [z_k − z̃_k],   (B14)
  α_k ≡ ∑_{l=1}^k p_l, α̃_k ≡ ∑_{l=1}^k p̃_l,   (B15)
  α_0 ≡ α̃_0 ≡ 0.   (B16)

Note that whenever z_k = z̃_k (for a certain k), we define Q_k = f′(z_k) instead of (B14). We deduce from (B3, B7) and from the convexity of f(y):

  Q_1 ≥ Q_2 ≥ ... ≥ Q_N.   (B17)

The sought implication amounts to summation by parts:

  ∑_{k=1}^N q_k [f(z_k) − f(z̃_k)] = ∑_{k=1}^N [p_k − p̃_k] [f(z_k) − f(z̃_k)]/(z_k − z̃_k)
    = ∑_{k=1}^N Q_k [α_k − α_{k−1} − (α̃_k − α̃_{k−1})]
    = ∑_{k=1}^N (Q_k − Q_{k+1})(α_k − α̃_k) ≥ 0.   (B18)

Footnote 17: All results of this section hold for rules (13, 16).

The last expression is non-negative due to (B17) and (B12). The boundary terms in the summation by parts disappear due to α_N = α̃_N = 1 and to (B16).

Now recall that the inequalities in (B6) are strict. Hence if the initial conditions are chosen such that all inequalities in (B3) are strict, and also if f(y) is strictly convex, f″(y) > 0, all the inequalities leading to (B18) can be made strict, in the sense that whenever (B18) nullifies, we conclude that p = p̃.
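A numerical sanity check (ours) of the monotonic decay (37), for the convex choice f(y) = −√y, for which Φ[p; q] = −∑_k √(p_k q_k):

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.random(5); p /= p.sum()    # random initial opinion
q = rng.random(5); q /= q.sum()    # persuasion goal
eps = 0.7

def phi(p, q):
    return -np.sqrt(p * q).sum()   # Phi[p;q], Eq. (38) with f(y) = -sqrt(y)

vals = []
for _ in range(200):
    vals.append(phi(p, q))
    w = np.sqrt(p * (eps * p + (1.0 - eps) * q))   # recursion (35)
    p = w / w.sum()

assert all(a >= b - 1e-12 for a, b in zip(vals, vals[1:]))  # monotone decay (37)
print(vals[0], vals[-1])           # decays towards Phi[q;q] = -1
```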

2. Interpretations

Eq. (B7) means that the ordering (B3) stays intact. It can be given the following meaning: p_k ≠ q_k means a disagreement between the opinions of P and Q on the probability of the event k. There can be two types of disagreement: overestimation (p_k > q_k) and underestimation (p_k < q_k).

Now (B3) implies that there exists some ζ, 1 ≤ ζ < N, such that

  z_1 ≥ 1, ..., z_ζ ≥ 1, z_{ζ+1} ≤ 1, ..., z_N ≤ 1.   (B19)

All the events 1, ..., ζ (resp. ζ + 1, ..., N) are overestimated (resp. underestimated) from the viewpoint of Q.

According to (B19) the first event was overestimated. Its probability p_1 decays, as (B11) shows. Likewise, the last event was underestimated and its probability p_N increases; see (B11). Since generally ζ ≠ θ, the correlation between decay and overestimation (resp. increase and underestimation) need not hold for all other events (i.e. for 1 < k < N), but still this correlation holds in a more limited sense. Eq. (B12) means that the sum of the probabilities of the most overestimated event p_1 and its neighbours (p_2, p_3, ...) decays in time, although (say) p_2 may still indicate an overestimated event, but increase in time for some finite number of time-steps.

Following the classification of stability notions proposed in [1] for probability dynamics, (B11) can be called the strong Le Chatelier principle. The general heuristics of this principle in thermodynamics is that [1]: an external influence disturbing an equilibrium state of a system induces processes tending to diminish the results of the disturbance. For the present opinion dynamics, the equilibrium state refers to q, while the perturbation over it can be taken to be p. If the disagreement is taken to be the cause of this perturbation, then the decay of the probability of the overestimated event (resp. its increase for the underestimated event) makes sense from the viewpoint of the principle.

Appendix C: Consensus reaching

Assume the following dialogue scenario: P is persuaded by Q (with the confirmation bias ϵ_P) and simultaneously Q is persuaded by P (with the confirmation bias ϵ_Q). The same procedure is repeated in the second step and so on. Instead of (35) we get [normalization factors are omitted]

  p^[n+1](x) ∝ √( p^[n](x) [ϵ_P p^[n](x) + (1 − ϵ_P) q^[n](x)] ),   (C1)
  q^[n+1](x) ∝ √( q^[n](x) [ϵ_Q q^[n](x) + (1 − ϵ_Q) p^[n](x)] ).   (C2)

The linear—in the sense of (20)—version of this model was recently studied in [38] with the same aim of modeling consensus reaching.

For n → ∞ the recursions (C1, C2) converge to a stationary density p^[∞](x) = q^[∞](x) ≡ r(x), which depends on the initial states p^[1](x) = p(x), q^[1](x) = q(x), and on the confirmation biases ϵ_P and ϵ_Q. Let us discuss the main scenarios for the behavior of r(x), assuming the Gaussian situation (17) for the initial densities. Recall that 1/v_λ in (17) may be related to the amount of strength (self-confidence) present in the opinion.

1. This is a general feature of r(x) (for the sake of concreteness we take m_P > m_Q): r(x) is spread over the interval

  x ∈ [m_Q − 2√v_Q, m_P + 2√v_P],   (C3)

which includes the acceptance latitudes of p(x) and q(x); see (18).

2. Equally biased, equally self-confident agents. In this case ϵ_P = ϵ_Q and v_P = v_Q. If |m_P − m_Q| is not large (the initial opinions are not far from each other), r(x) is centered at (m_P + m_Q)/2, i.e. in between the two opinions. If the initial opinions are sufficiently far from each other, P and Q do develop a double-peak structure (cognitive dissonance) in their consensus opinion r(x). The two peaks of r(x) are located very close to x = m_P and x = m_Q, respectively, meaning that each agent now has two equally maximally probable opinions (anchors): his initial opinion and the initial opinion of the other agent.

3. Non-equally biased, equally self-confident agents: ϵ_P > ϵ_Q (for concreteness), but still v_P = v_Q. In the previous situation of equally biased agents, the peak of r(x) (if it was unique) was located at the average opinion (m_P + m_Q)/2. Now the peak of r(x) is shifted towards the more confirmationally biased agent P.

The convergence of p^[n](x) and q^[n](x) towards r(x) takes place in two steps: first p^[n](x) quickly spreads over the interval (C3) without changing much its maximally probable value. After that p^[n](x) ≈ r(x) does not change anymore, but q^[n](x) is gradually (i.e. over a longer time) forced to reach the same maximally probable value as p^[n](x). Thus P first accepts to an extent the opinions of Q (as well as all intermediate opinions), but then gradually forces Q towards accepting his maximally probable opinion.

4. Equally biased, non-equally self-confident agents: ϵ_P = ϵ_Q, but v_P < v_Q. Now P is more self-confident, i.e. his initial opinion is stronger. Hence in the consensus reaching, P forces Q to accept his maximally probable opinion (anchor). For not allowing the more self-confident P to impose his maximally probable opinion, Q can have a larger confirmation bias.