PNAS 2021 Vol. 118 No. 9 e2023301118 | https://doi.org/10.1073/pnas.2023301118

The echo chamber effect on social media

Matteo Cinelli^a, Gianmarco De Francisci Morales^b, Alessandro Galeazzi^c, Walter Quattrociocchi^d,1, and Michele Starnini^b

^a Department of Environmental Sciences, Informatics and Statistics, Ca' Foscari University of Venice, 30172 Venice, Italy; ^b Institute for Scientific Interchange (ISI) Foundation, 10126 Torino, Italy; ^c Department of Information Engineering, University of Brescia, 25123 Brescia, Italy; and ^d Department of Computer Science, Sapienza University of Rome, 00185 Rome, Italy

Edited by Arild Underdal, University of Oslo, Oslo, Norway, and approved January 14, 2021 (received for review November 15, 2020)

Social media may limit the exposure to diverse perspectives and favor the formation of groups of like-minded users framing and reinforcing a shared narrative, that is, echo chambers. However, the interaction paradigms among users and feed algorithms greatly vary across social media platforms. This paper explores the key differences between the main social media platforms and how they are likely to influence information spreading and echo chambers' formation. We perform a comparative analysis of more than 100 million pieces of content concerning several controversial topics (e.g., gun control, vaccination, abortion) from Gab, Facebook, Reddit, and Twitter. We quantify echo chambers over social media by two main ingredients: 1) homophily in the interaction networks and 2) bias in the information diffusion toward like-minded peers. Our results show that the aggregation of users in homophilic clusters dominates online interactions on Facebook and Twitter. We conclude by directly comparing news consumption on Facebook and Reddit, finding higher segregation on Facebook.

information spreading | echo chambers | social media | polarization

Significance

We explore the key differences between the main social media platforms and how they are likely to influence information spreading and the formation of echo chambers. To assess the different dynamics, we perform a comparative analysis on more than 100 million pieces of content concerning controversial topics (e.g., gun control, vaccination, abortion) from Gab, Facebook, Reddit, and Twitter. The analysis focuses on two main dimensions: 1) homophily in the interaction networks and 2) bias in the information diffusion toward like-minded peers. Our results show that the aggregation of users in homophilic clusters dominates online dynamics. However, a direct comparison of news consumption on Facebook and Reddit shows higher segregation on Facebook.

Author contributions: M.C., G.D.F.M., A.G., W.Q., and M.S. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This open access article is distributed under Creative Commons Attribution License 4.0 (CC BY).
^1 To whom correspondence may be addressed. Email: walter.quattrociocchi@uniroma1.it.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2023301118/-/DCSupplemental.
Published February 23, 2021.

Social media radically changed the mechanism by which we access information and form our opinions (1–5). We need to understand how people seek or avoid information and how those decisions affect their behavior (6), especially when the news cycle, dominated by the disintermediated diffusion of information, alters the way information is consumed and reported on. A recent study (7) limited to Twitter claimed that fake news travels faster than real news. However, a multitude of factors affects information spreading on social media platforms. Online polarization, for instance, may foster misinformation spreading (1, 8). Our attention span remains limited (9, 10), and feed algorithms might limit our selection process by suggesting contents similar to the ones we are usually exposed to (11–13). Furthermore, users show a tendency to favor information adhering to their beliefs and join groups formed around a shared narrative, that is, echo chambers (1, 14–18). We can broadly define echo chambers as environments in which the opinion, political leaning, or belief of users about a topic gets reinforced due to repeated interactions with peers or sources having similar tendencies and attitudes. Selective exposure (19) and confirmation bias (20) (i.e., the tendency to seek information adhering to preexisting opinions) may explain the emergence of echo chambers on social media (1, 17, 21, 22).

According to group polarization theory (23), an echo chamber can act as a mechanism to reinforce an existing opinion within a group and, as a result, move the entire group toward more extreme positions. Echo chambers have been shown to exist in various forms of online media such as blogs (24), forums (25), and social media sites (26–28). Some studies point out echo chambers as an emerging effect of human tendencies, such as selective exposure, contagion, and group polarization (13, 23, 29–31). However, recently, the effects and the very existence of echo chambers have been questioned (2, 27, 32). This issue is also fueled by the scarcity of comparative studies on social media, especially concerning news consumption (33). In this context, the debate around echo chambers is fundamental to understanding social media's influence on information consumption and public opinion formation.

In this paper, we explore the key differences between social media platforms and how they are likely to influence the formation of echo chambers or not. As recently shown in the case of selective exposure to news outlets, studies considering multiple platforms can offer a fresh view on long-debated problems (34). Different platforms offer different interaction paradigms to users, ranging from retweets and mentions on Twitter to likes and comments in groups on Facebook, thus triggering very different social dynamics (35). We introduce an operational definition of echo chambers to provide a common methodological ground to explore how different platforms influence their formation. In particular, we operationalize the two common elements that characterize echo chambers into observables that can be quantified and empirically measured, namely, 1) the inference of the user's leaning for a specific topic (e.g., politics, vaccines) and 2) the structure of their social interactions on the platform. Then, we use these elements to assess echo chambers' presence by looking at two different aspects: 1) homophily in interactions concerning a specific topic and 2) bias in information diffusion from like-minded sources. We focus our analysis on multiple platforms: Facebook, Twitter, Reddit, and Gab. These platforms present similar features and functionalities (e.g., they all allow social feedback actions such as likes or upvotes) and design (e.g., Gab is similar to Twitter) but also distinctive features (e.g., Reddit is structured in communities of interest called subreddits). Reddit is one of the most visited websites worldwide (https://www.alexa.com/siteinfo/reddit.com) and is organized as a forum to collect discussions on a wide range of topics, from politics to emotional support. Gab claims to be a social platform

PSYCHOLOGICAL AND COGNITIVE SCIENCES; COMPUTER SCIENCES

aimed at protecting freedom of speech. However, low moderation and regulation of content have resulted in widespread hate speech. For these reasons, it has been repeatedly suspended by its service provider, and its mobile app has been banned from both App and Play stores (36). Overall, we account for the interactions of more than 1 million users on the four platforms, for a total of more than 100 million unique pieces of content, including posts and social interactions. Our analysis shows that platforms organized around social networks and news feed algorithms, such as Facebook and Twitter, favor the emergence of echo chambers.

We conclude the paper by directly comparing news consumption on Facebook and Reddit, finding higher segregation on Facebook than on Reddit.

Characterizing Echo Chambers in Social Media

Operational Definitions. To explore the key differences between social media platforms and how they influence echo chambers' formation, we need to operationalize a definition for them. First, we need to identify the attitude of users at a microlevel. On online social media, the individual leaning of a user $i$ toward a specific topic, $x_i$, can be inferred in different ways, via the content produced or the endorsement network among users (37). Concerning content, we can define the leaning as the attitude expressed by a piece of content toward a specific topic. This leaning can be explicit (e.g., arguments supporting a narrative) or implicit (e.g., framing and agenda setting). Let us consider a user $i$ producing a number $a_i$ of contents, $C_i = \{c_1, c_2, \ldots, c_{a_i}\}$, where $a_i$ is the activity of user $i$, and each content leaning is assigned a numeric value. Then the individual leaning of user $i$ can be defined as the average of the leanings of produced contents,

$$ x_i \equiv \frac{\sum_{j=1}^{a_i} c_j}{a_i}. \tag{1} $$

Once individual leanings have been inferred, polarization can be defined as a state of the system such that the distribution of leanings, $P(x)$, is concentrated in one or more clusters. A possible example is the case of a single cluster, distinguishable by a single, extreme peak in $P(x)$. Another example is the typical case of topics characterized by positive versus negative stances, in which a bimodal distribution can describe polarization. For instance, if opinions are assumed to be embedded in a one-dimensional space (38), $x \in [-1, +1]$ without loss of generality, as usual for controversial topics, then polarization is characterized by two well-separated peaks in $P(x)$, for positive and negative opinions. In contrast, neutral ones are absent or underrepresented in the population. Note that polarization can happen independently from the structure or the very presence of social interactions.

Homophily in social interactions can be quantified by representing interactions as a social network and then analyzing its structure concerning the opinions of the users (18, 39, 40). Social networks can be reconstructed in different ways from online social media, where links represent social relationships or interactions. Since we are interested in capturing the possible exchange of opinions between users, we assume links as the substrate over which information may flow. For instance, if user $i$ follows user $j$ on Twitter, user $i$ can see tweets produced by user $j$, and there is a flow of information from node $j$ to node $i$ in the network. When the reconstructed network is directed, we assume the link direction points to potential influencers (opposite of information flow). Actions such as mentions or retweets may convey similar flows. In some cases, direct relations between users are not available in the data, so one needs to assume some proxy for social connections, for example, a link between two users if they comment on the same post on Facebook. Crucially, the two elements characterizing the presence of echo chambers, polarization and homophilic interactions, should be quantified independently.

Implementation on Social Media. This section explains how we implement the operational definitions defined above on different social media. For each medium, we detail 1) how we quantify users' leaning, and 2) how we reconstruct how the information spread.

Twitter. We consider the set of tweets posted by user $i$ that contain links to news outlets of known political leaning. Each news outlet is associated with a political leaning score ranging from extreme left to extreme right following the classification described in Materials and Methods. We infer the individual leaning of a user $i$, $x_i \in [-1, +1]$, by averaging the scores of the news organizations linked by user $i$ according to Eq. 1. We analyze three different datasets collected on Twitter related to controversial topics: gun control, Obamacare, and abortion. For each dataset, the social interaction network is reconstructed using the following relation, so that there is a directed link from node $i$ to node $j$ if user $i$ follows user $j$ (i.e., the source). Henceforth, we focus on the dataset about abortion, and others are shown in SI Appendix.

Facebook. We quantify the individual leaning of users considering endorsements in the form of likes to posts. Posts are produced by pages that are labeled in a certain number of categories, and, to each category, we assign a numerical value (e.g., Anti-Vax [+1] or Pro-Vax [–1]). Each like to a post (only one like per post is allowed) represents an endorsement for that content, which is assumed to be aligned with the leaning associated with the page. Thus, the user's leaning is defined as the average of the content leanings of the posts liked by the user, according to Eq. 1. We analyze three different datasets collected on Facebook regarding a specific topic of discussion: vaccines, science versus conspiracy, and news. The interaction network is defined by considering comments. In such an interaction network, two users are connected if they cocommented on at least one post. Henceforth, we focus on the dataset about vaccines and news, and others are shown in SI Appendix.

Reddit. The individual leaning of users is quantified similarly to Twitter by considering the links to news organizations in the content produced by the users, submissions, and comments. We build the interaction network considering comments and submissions. There exists a directed link from node $i$ to node $j$ if user $i$ comments on a submission or comment by user $j$ (we assume that $i$ reads the comment they are replying to, which is written by $j$). We analyze three datasets collected on different subreddits: the donald, Politics, and News. In the following, we focus on the dataset collected on the Politics and the News subreddits, and others are shown in SI Appendix.

Gab. The political leaning $x_i$ of user $i$ is computed by considering the set of contents posted by user $i$ containing a link to news outlets of a known political leaning, similarly to Twitter and Reddit. To obtain the leaning $x_i$ of user $i$, we averaged the scores of each link posted by user $i$ according to Eq. 1. The interaction network is reconstructed by exploiting the cocommenting relationships under posts in the same way as for Facebook. Given two users $i$ and $j$, an undirected edge between $i$ and $j$ exists if and only if they comment under the same post.

Comparative Analysis

In the following, we perform a comparative analysis of four different social media. We select one dataset for each social medium: Abortion (Twitter), Vaccines (Facebook), Politics (Reddit), and Gab as a whole. Results for other datasets for the same medium are qualitatively similar, as shown in SI Appendix. We first characterize echo chambers in the networks' topology, and then look

at their effects on information diffusion. Finally, we directly compare news consumption on Facebook and Reddit.

Polarization and Homophily in the Interaction Networks. The network's topology can reveal echo chambers, where users are surrounded by peers with similar leanings, and thus they get exposed, with a higher probability, to similar contents. In network terms, this translates into a node $i$ with a given leaning $x_i$ being more likely to be connected with nodes with a leaning close to $x_i$ (18). This concept can be quantified by defining, for each user $i$, the average leaning of their neighborhood, as

$$ x_i^N \equiv \frac{1}{k_i^{\rightarrow}} \sum_j A_{ij} x_j, $$

where $A_{ij}$ is the adjacency matrix of the interaction network, $A_{ij} = 1$ if there is a link from node $i$ to node $j$, $A_{ij} = 0$ otherwise, and $k_i^{\rightarrow} = \sum_j A_{ij}$ is the out-degree of node $i$. Fig. 1 shows the correlation between the leaning of a user $i$ and the leaning of their neighbors, $x_i^N$, for the four social media under consideration. The probability distributions $P(x)$ (individual leaning) and $P^N(x)$ (average leaning of neighbors) are plotted on the x and y axes, respectively. All plots are color-coded contour maps, representing the number of users in the phase space $(x, x^N)$: The brighter the area in the plane, the larger the density of users in that area. The topics of vaccines and abortion, on Facebook and Twitter, respectively, show a strong correlation between the leaning of a user and the average leaning of their nearest neighbors. Similar behavior is found for different topics from the same social media platform; see SI Appendix. Conversely, Reddit and Gab show a different picture. The corresponding plots in Fig. 1 display a single bright area, indicating that users do not split into groups with opposite leaning but form a single community, biased to the left (Reddit) or the right (Gab). Similar results are found for different datasets on Reddit; see SI Appendix.

Fig. 1. Joint distribution of the leaning of users $x$ and the average leaning of their neighborhood $x^N$ for different datasets: (A) Twitter, (B) Reddit, (C) Facebook, and (D) Gab. Colors represent the density of users: The lighter the color, the larger the number of users. Marginal distributions $P(x)$ and $P^N(x)$ are plotted on the x and y axes, respectively. Facebook and Twitter present homophilic clustering.

The presence of homophilic interactions can be confirmed by the community structure of the interaction networks. We detected communities by applying the Louvain algorithm (41), removing singleton communities with only one user. Then, we computed each community's average leaning, determined as the average of the individual leanings of its members. Fig. 2 shows the communities emerging for each social medium, arranged by increasing average leaning on the x axis (color-coded from blue to red), while the y axis reports the size of the community. On Facebook and Twitter, communities span the whole spectrum of possible leanings, but each community is formed by users with similar leanings. Some communities are characterized by a robust average leaning, especially in the case of Facebook. These results are in accordance with the observation of homophilic interactions. Instead, communities on Reddit and Gab do not cover the whole spectrum, and all show similar average leaning. Furthermore, the almost total absence of communities with leaning very close to 0 confirms the polarized state of the systems.

Effects on Information Spreading. Simple models of information spreading can gauge the presence of echo chambers: Users are expected to be more likely to exchange information with peers sharing a similar leaning (18, 42, 43). Classical epidemic models such as the susceptible–infected–recovered (SIR) model (44) have been used to study the diffusion of information, such as rumors or news (45–47). In the SIR model, each agent can

[Fig. 2 panels: (A) Twitter, (B) Reddit, (C) Facebook, (D) Gab. x axis: Community ID; y axis: Community Size (logarithmic scale). Color scales: Pro Abortion to Against Abortion (A), Pro Vaccines to Anti Vaccines (C), Extreme Left to Extreme Right (B and D).]

Fig. 2. Size and average leaning of communities detected in different datasets. A and C show the full spectrum of leanings related to the topics of abortion and vaccines, in contrast to the communities in B and D, where the political leaning is less sparse.

be in any of three states: susceptible (unaware of the circulating information), infectious (aware and willing to spread it further), or recovered (knowledgeable but not ready to transmit it anymore). Susceptible (unaware) users may become infectious (aware) upon contact with infected neighbors, with a specific transmission probability $\beta$. Infectious users can spontaneously become recovered with probability $\nu$. To measure the effects of the leaning of users on the diffusion of information, we run the SIR dynamics on the interaction networks, by starting the epidemic process with only one node $i$ infected, and stopping it when no more infectious nodes are left.

The set of nodes in a recovered state at the end of the dynamics started with user $i$ as a seed of infection, that is, those that become aware of the information initially propagated by user $i$, form the set of influence of user $i$, $I_i$ (48). Thus, the set of influence of a user represents those individuals that can be reached by a piece of content sent by that user, depending on the effective infection ratio $\beta/\nu$. One can compute the average leaning of the set of influence of user $i$, $\mu_i$, as

$$ \mu_i \equiv |I_i|^{-1} \sum_{j \in I_i} x_j. \tag{2} $$

The quantity $\mu_i$ indicates how polarized the users are that can be reached by a message initially propagated by user $i$ (18).

Fig. 3 shows the average leaning $\langle \mu(x) \rangle$ of the influence sets reached by users with leaning $x$, for the different datasets under consideration. The recovery rate $\nu$ is fixed at 0.2 for every dataset. In contrast, the ratio between the infection rate $\beta$ and average degree $\langle k \rangle$ depends on the specific dataset and is reported in the caption of each figure.

Again, one can observe a clear distinction between Facebook and Twitter, on one side, and Reddit and Gab on the other side. For the topics of vaccines and abortion, on Facebook and Twitter, respectively, users with a given leaning are much more likely to be reached by information propagated by users with similar leaning, that is, $\langle \mu(x) \rangle \approx x$. Similar behavior is found for different topics from the same social media platform; see SI Appendix. Conversely, Reddit and Gab show a different behavior: The average leaning of the set of influence, $\langle \mu(x) \rangle$, does not depend on the leaning $x$. As expected, the average leaning in these media is not zero. Still, it assumes negative (positive) values in Reddit (Gab), indicating that the users of this platform are more likely to receive left (right)-leaning content.

These results indicate that information diffusion is biased toward individuals who share a similar leaning in some social media, namely Twitter and Facebook. In contrast, in others (Reddit and Gab in our analysis), this effect is absent. The latter configuration may depend upon two factors: 1) Gab and Reddit are not bursting the echo chamber effects, or 2) we are observing the dynamics inside a single echo chamber.

Our results are robust for different values of the effective infection ratio $\beta/\nu$; see SI Appendix. Furthermore, Fig. 3 shows that the spreading capacity, represented by the average size of the influence sets (color-coded in Fig. 3), depends on the leaning of the users. On Twitter, proabortion users are more likely to reach larger audiences. The same is true for antivax users on Facebook, left-leaning users on Reddit, and right-leaning users on Gab (in this dataset, left-leaning users are almost absent).
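The influence-set procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the synchronous update order, and the toy network are hypothetical, the seed is counted as part of the recovered set, and `receivers[u]` is assumed to list the users who see content shared by `u` (that is, the reverse of the "follows" direction). The recovery probability `nu` must be positive for the loop to terminate.

```python
import random
from statistics import mean


def influence_set(receivers, seed, beta, nu, rng):
    """Simulate one SIR outbreak started at `seed`; return the recovered nodes.

    receivers[u] lists the nodes that can see content shared by u
    (assumed orientation: information flows from u to receivers[u]).
    """
    infectious = {seed}
    recovered = set()
    while infectious:
        newly_infected = set()
        for u in infectious:
            for v in receivers.get(u, ()):
                # Only susceptible nodes can become infectious.
                susceptible = v not in infectious and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        # Each infectious node recovers with probability nu.
        just_recovered = {u for u in infectious if rng.random() < nu}
        recovered |= just_recovered
        infectious = (infectious - just_recovered) | newly_infected
    return recovered


def mean_influence_leaning(influence, leaning):
    """Eq. 2: average leaning x_j over the nodes j in an influence set."""
    return mean(leaning[j] for j in influence)


# Hypothetical toy network: a -> b -> c, and d -> a.
receivers = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"]}
leaning = {"a": -1.0, "b": -0.5, "c": 0.0, "d": 1.0}

# With beta = nu = 1 the outbreak deterministically reaches every node
# downstream of the seed, which makes the sketch easy to check by hand.
reached = influence_set(receivers, "a", beta=1.0, nu=1.0, rng=random.Random(0))
mu_a = mean_influence_leaning(reached, leaning)
```

In practice one would repeat the simulation many times per seed and average $\mu_i$ over seeds with similar leaning $x$ to obtain $\langle \mu(x) \rangle$; the number of repetitions is not stated in this section, so it is left out of the sketch.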

Fig. 3. Average leaning $\langle \mu(x) \rangle$ of the influence sets reached by users with leaning $x$, for different datasets under consideration. Size and color of each point represent the average size of the influence sets. The parameters of the SIR dynamics are set to (A) $\beta = 0.10\langle k \rangle^{-1}$, (B) $\beta = 0.01\langle k \rangle^{-1}$, (C) $\beta = 0.05\langle k \rangle^{-1}$, and (D) $\beta = 0.05\langle k \rangle^{-1}$, while $\nu$ is fixed at 0.2 for all simulations.

News Consumption on Facebook and Reddit. The striking differences observed across social media, in terms of homophily in the interaction networks and information diffusion, could be attributed to the different topics taken into account. For this reason, here we compare Facebook and Reddit on a common topic, news consumption. Facebook and Reddit are particularly suited to a cross-comparison since they share the definition of individual leaning (computed by using the classification provided by mediabiasfactcheck.org; see Materials and Methods for further details) and the rationale in creating connections among users, which is based on an interaction network. Fig. 4 shows a direct comparison of news consumption on Facebook and Reddit along the metrics used in the previous sections to quantify the presence of echo chambers: 1) the correlation between the leaning of a user $x$ and the average leaning of neighbors $x^N$ (Fig. 4, Top), 2) the average leaning of communities detected in the networks (Fig. 4, Middle), and 3) the average leaning $\langle \mu(x) \rangle$ of the influence sets reached by users with leaning $x$, by running SIR dynamics (Fig. 4, Bottom). One can see that all three measures confirm the picture obtained for other datasets: On Facebook, we observe a clear separation among users depending on their leaning, while, on Reddit, users' leanings are more homogeneous and show only one peak. On the latter platform, even users displaying a more extreme leaning (noticeable in the marginal histogram of Fig. 4 B, Top) tend to interact with the majority. Moreover, on Facebook, the seed user's leaning affects who the final recipients of the information are, therefore indicating the presence of echo chambers. On Reddit, this effect is absent.

Conclusions

Social media platforms provide direct access to an unprecedented amount of content. Platforms originally designed for user entertainment changed the way information spreads. Indeed, feed algorithms mediate and influence the content promotion accounting for users' preferences and attitudes. Such a paradigm shift affected the construction of social perceptions and the framing of narratives; it may influence policy making, political communication, and the evolution of public debate, especially on polarizing topics. Indeed, users online tend to prefer information adhering to their worldviews, ignore dissenting information, and form polarized groups around shared narratives. Furthermore, when polarization is high, misinformation quickly proliferates.

Some argued that the veracity of the information might be used as a determinant for information spreading patterns. However, selective exposure dominates content consumption on social media, and different platforms may trigger very different dynamics. In this paper, we explore the key differences between the leading social media platforms and how they are likely to influence the formation of echo chambers and information spreading. To assess the different dynamics, we perform a comparative analysis on more than 100 million pieces of content concerning controversial topics (e.g., gun control, vaccination, abortion) from Gab, Facebook, Reddit, and Twitter. The analysis focuses on two main dimensions: 1) homophily in the interaction networks and 2) bias in the information diffusion toward like-minded peers. Our results show that the aggregation in homophilic clusters of users dominates online dynamics.


Fig. 4. Direct comparison of news consumption on (A) Facebook and (B) Reddit. Joint distribution of the leaning of users $x$ and the average leaning of their nearest neighbors $x^N$ (Top), size and average leaning of communities detected in the interaction networks (Middle), and average leaning $\langle \mu(x) \rangle$ of the influence sets reached by users with leaning $x$, by running SIR dynamics (Bottom) with parameters $\beta = 0.05\langle k \rangle$ for A, $\beta = 0.006\langle k \rangle$ for B, and $\nu = 0.2$ for both. Facebook presents a highly segregated structure compared with Reddit.
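The two quantities on the axes of the joint distributions, the individual leaning $x_i$ (Eq. 1) and the average leaning of the neighborhood $x_i^N$, can be computed as in the following minimal sketch. The function names and the toy data are hypothetical, and the sketch assumes that every neighbor has a classified leaning; users with no outgoing links are skipped because their out-degree is zero.

```python
from statistics import mean


def individual_leaning(content_scores):
    """Eq. 1: a user's leaning is the mean leaning of the contents they produced."""
    return mean(content_scores)


def neighborhood_leaning(out_links, leaning):
    """Average leaning of each user's out-neighbors.

    out_links[i] lists the nodes j with a link i -> j (A_ij = 1).
    Users with zero out-degree are omitted from the result.
    """
    return {
        i: mean(leaning[j] for j in neighbors)
        for i, neighbors in out_links.items()
        if neighbors
    }


# Hypothetical toy data: content leaning scores in [-1, +1] per user.
scores = {"u1": [-1.0, -0.5], "u2": [-0.5], "u3": [1.0, 0.5, 0.0]}
leaning = {user: individual_leaning(values) for user, values in scores.items()}

# Hypothetical directed interaction network.
out_links = {"u1": ["u2"], "u2": ["u1", "u3"], "u3": []}
x_neigh = neighborhood_leaning(out_links, leaning)

# Pairs (x_i, x_i^N) are the points whose density the contour maps display.
pairs = [(leaning[i], x_neigh[i]) for i in x_neigh]
```

A strong positive correlation between the two coordinates of `pairs` would correspond to the homophilic pattern reported for Facebook and Twitter, whereas a single cluster would correspond to the pattern reported for Reddit and Gab.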

However, a direct comparison of news consumption on Facebook and Reddit shows higher segregation on Facebook. Furthermore, we find significant differences across platforms in terms of homophilic patterns in the network structure and biases in the information diffusion toward like-minded users. A clear-cut distinction emerges between social media having a feed algorithm tweakable by the users (e.g., Reddit) and social media that don't provide such an option (e.g., Facebook and Twitter).

Our work provides important insights into the understanding of social dynamics and information consumption on social media. The next envisioned step addresses the temporal dimension of echo chambers, to understand better how different social feedback mechanisms, specific to distinct platforms, can impact their formation.

Materials and Methods

Here we provide details about the labeling of news outlets and the datasets considered.

Labeling of Media Sources. The labeling of news outlets is based on the information reported by Media Bias/Fact Check (MBFC) (https://mediabiasfactcheck.com), an independent fact-checking organization that rates news outlets on the basis of the reliability and of the political bias of the contents they produce and share. The labeling provided by MBFC, retrieved in June 2019, ranges from Extreme Left to Extreme Right for political bias. The total number of media outlets for which we have a political label is 2,190. A detailed description of the source labeling process and political bias distribution can be found in the SI Appendix.

Data Availability. For what concerns Gab, all data are available on the Pushshift public repository (https://pushshift.io/what-is-pushshift-io/) at this link: https://files.pushshift.io/gab/. Reddit data are available on the Pushshift public repository at this link: https://search.pushshift.io/reddit/. For what concerns Facebook and Twitter, we provide data according to their Terms of Services on the corresponding author institutional page at this link: https://walterquattrociocchi.site.uniroma1.it/ricerca. For news outlet classification, we used data from MBFC (https://mediabiasfactcheck.com), an independent fact-checking organization. Anonymized data have been deposited in Open Science Framework (10.17605/OSF.IO/X92BR) (49). For further details about data, refer to the following section.

Empirical Datasets. Table 1 reports the details of the datasets under consideration. Due to the structural differences among platforms, each dataset has different features. For Twitter, we used tweets regarding three topics collected by Garimella et al. (16), namely, gun control, Obamacare, and abortion. Tweets linking to a news source with a known bias are classified based on MBFC. Facebook datasets were created by using Facebook Graph API and were previously explored in ref. 50 (Science and Conspiracy), ref. 11 (News), and ref. 51 (Vaccines). For the two datasets Vaccines, and Science and Conspiracy, data were labeled in a binary way, respectively, provaccines/antivaccines and proscience/conspiracy, based on the page where they were posted. Posts in the dataset News were instead classified based on MBFC labeling. Reddit datasets have been obtained by downloading comments and submissions posted in the subreddits Politics, the donald, and News, and labeled according to the classification obtained from MBFC. The Gab dataset has been collected from the Pushshift repository and contains posts, replies, and quotations. Posts were labeled according to MBFC classification. Further details can be found in SI Appendix.

Table 1. Dataset details

Media      Dataset       T0              T      C               N         n_c
Twitter    Gun control   June 2016       14 d   19 million      3,963     0.93
Twitter    Obamacare     June 2016       7 d    39 million      8,703     0.90
Twitter    Abortion      June 2016       7 d    34 million      7,401     0.95
Facebook   Sci/Cons      January 2010    5 y    75,172          183,378   1.00
Facebook   Vaccines      January 2010    7 y    94,776          221,758   1.00
Facebook   News          January 2010    6 y    15,540          38,663    1.00
Reddit     Politics      January 2017    1 y    353,864         240,455   0.15
Reddit     the donald    January 2017    1 y    1.234 million   138,617   0.16
Reddit     News          January 2017    1 y    723,235         179,549   0.20
Gab        Gab           November 2017   1 y    13 million      165,162   0.13

For each dataset, we report the starting date of collection T0, the time span T (d, days; y, years), the number of unique contents C, the number of users N, and the coverage n_c. For the Twitter datasets, T represents the window to sample active users, of which we retrieve all of the tweets related to the topic via the Twitter Application Programming Interface (API) (more information in the SI Appendix).

ACKNOWLEDGMENTS. [...] precious insights for the development of this paper. We are grateful to Geronimo Stilton and the Hypnotoad for inspiring the data analysis and result interpretation.

1. M. Del Vicario et al., The spreading of misinformation online. Proc. Natl. Acad. Sci. U.S.A. 113, 554–559 (2016).
2. E. Dubois, G. Blank, The echo chamber is overstated: The moderating effect of political interest and diverse media. Inf. Commun. Soc. 21, 729–745 (2018).
3. L. Bode, Political news in the news feed: Learning politics from social media. Mass Commun. Soc. 19, 24–48 (2016).
4. N. Newman, R. Fletcher, A. Kalogeropoulos, R. Nielsen, "Reuters Institute Digital News Report 2019" (Rep. 2019, Reuters Institute for the Study of Journalism, 2019).
5. S. Flaxman, S. Goel, J. M. Rao, Filter bubbles, echo chambers, and online news consumption. Publ. Opin. Q. 80, 298–320 (2016).
6. T. Sharot, C. R. Sunstein, How people decide what they want to know. Nat. Hum. Behav. 4, 14–19 (2020).
7. S. Vosoughi, D. Roy, S. Aral, The spread of true and false news online. Science 359, 1146–1151 (2018).
8. M. D. Vicario, W. Quattrociocchi, A. Scala, F. Zollo, Polarization and fake news: Early warning of potential misinformation targets. ACM Trans. Web 13, 1–22 (2019).
9. A. Baronchelli, The emergence of consensus: A primer. R. Soc. Open Sci. 5, 172189 (2018).
10. M. Cinelli et al., Selective exposure shapes the Facebook news diet. PloS One 15, e0229129 (2020).
11. A. L. Schmidt et al., Anatomy of news consumption on Facebook. Proc. Natl. Acad. Sci. U.S.A. 114, 3035–3039 (2017).
12. M. Cinelli et al., The COVID-19 social media infodemic. Sci. Rep. 10, 16598 (2020).
13. E. Bakshy, S. Messing, L. A. Adamic, Exposure to ideologically diverse news and opinion on Facebook. Science 348, 1130–1132 (2015).
14. K. H. Jamieson, J. N. Cappella, [...] Media Establishment [...].
15. R. K. Garrett, Echo chambers online?: Politically motivated selective exposure among [...] news users. [...] (2009).
16. K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "Political discourse on social media: Echo chambers, gatekeepers, and the price of bipartisanship" in WWW ’18 [...] Conference (International World Wide Web Conferences Steering Committee, Geneva, Switzerland, 2018), pp. 913–922.
17. K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "The effect of [...] attention on controversial debates on social media" in [...] International ACM Web Science Conference ([...], New York, NY, 2017), pp. 43–52.
18. W. Cota, S. C. Ferreira, R. Pastor-Satorras, M. Starnini, Quantifying echo chamber effects in information spreading over political communication networks. [...] Sci. [...] (2019).
19. J. T. Klapper, [...].
20. R. S. Nickerson, Confirmation bias: A ubiquitous phenomenon in many guises. [...] Gen. Psychol. [...].
21. M. Del Vicario et al., [...] Facebook. [...].
22. A. Bessi et al., [...].
23. C. R. Sunstein, The law of group polarization. [...].
35 8, iespan time , c lSOne PloS fato fueswt lsie enn) For leaning). classified with users of (fraction cec scnprc:Cletv artvsi h g fmisinforma- of age the in narratives Collective conspiracy: vs Science al., et c.Rep. Sci. h fet fMs Communication Mass of Effects The 7–2 (1998). 175–220 2, cocabr:Eoinlcnainadgopplrzto on polarization group and contagion Emotional chambers: Echo al., et 0103(2015). e0118093 10, T .A hmi,F adn .M L. Gandon, F. Champin, A. P. , al eot umr ttsiso h aaesunder datasets the of statistics summary reports 1 Table 72 (2016). 37825 6, xrse ndy d ryas(y), or (d) days in expressed IAppendix SI Ofr nvriyPes 2008). Press, University (Oxford etakFbaaZloadAtnoSaafrpre- for Scala Antonio and Zollo Fabiana thank We rceig fte21 ol ieWbConfer- Web Wide World 2018 the of Proceedings coCabr uhLmag n h Conservative the and Limbaugh Rush Chamber: Echo .SiCn,Scientific Sci/Cons, ). .Cmu.Mdae Commun. Mediated Comput. J. https://doi.org/10.1073/pnas.2023301118 https://files.pushshift.io/gab AscainfrCmuigMachinery, Computing for (Association .Plt Philos. Polit. J. IAppendix SI Fe rs,1960). Press, (Free c dn,Es (International Eds. edini, ´ 7–9 (2002). 175–195 10, . PNAS eSi’7 9th ’17: WebSci oad and donald, 265–285 14, n con- and | P Data EPJ f8 of 7 Rev.

24. E. Gilbert, T. Bergstrom, K. Karahalios, “Blogs are echo chambers: Blogs are echo chambers” in 42nd Hawaii International Conference on System Sciences (IEEE Computer Society, 2009), pp. 1–10.
25. A. Edwards, (How) do participants in online discussion forums create ‘echo chambers’?: The inclusion and exclusion of dissenting voices in an online forum about . J. Argumentation Context 2, 127–150 (2013).
26. M. Grömping, ‘Echo chambers’: Partisan Facebook groups during the 2014 Thai election. Asia Pac. Media Educat. 24, 39–59 (2014).
27. P. Barberá, J. T. Jost, J. Nagler, J. A. Tucker, R. Bonneau, Tweeting from left to right: Is online political communication more than an echo chamber? Psychol. Sci. 26, 1531–1542 (2015).
28. W. Quattrociocchi, A. Scala, C. R. Sunstein, Echo chambers on Facebook. SSRN [Preprint] (2016). http://dx.doi.org/10.2139/ssrn.2795110 (Accessed 16 February 2021).
29. I. Himelboim, S. McCreery, M. Smith, Birds of a feather tweet together: Integrating network and content analyses to examine cross- exposure on Twitter. J. Computer-Mediated Commun. 18, 154–174 (2013).
30. D. Nikolov, D. F. Oliveira, A. Flammini, F. Menczer, Measuring online social bubbles. PeerJ Comput. Sci. 1, e38 (2015).
31. F. Baumann, P. Lorenz-Spreen, I. M. Sokolov, M. Starnini, Modeling echo chambers and polarization dynamics in social networks. Phys. Rev. Lett. 124, 048301 (2020).
32. A. Bruns, Echo chamber? What echo chamber? Reviewing the evidence. QUT ePrints [Preprint] (2017). https://eprints.qut.edu.au/113937/ (Accessed 16 February 2021).
33. P. Barberá, “Social media, echo chambers, and ” in Social Media and Democracy: The State of the Field, Prospects for Reform, N. Persily, J. Tucker, Eds. (SSRC Anxieties of Democracy, Cambridge University Press, Cambridge, UK, 2020), pp. 34–55.
34. A. Gollwitzer et al., Partisan differences in physical distancing are linked to health outcomes during the COVID-19 pandemic. Nat. Hum. Behav. 4, 1186–1197 (2020).
35. Y. Golovchenko, C. Buntain, G. Eady, M. A. Brown, J. A. Tucker, Cross-platform state propaganda: Russian trolls on Twitter and YouTube during the 2016 US presidential election. Int. J. Press/Politics 25, 357–389 (2020).
36. S. Zannettou et al., “What is gab: A bastion of speech or an alt-right echo chamber” in Companion Proceedings of the Web Conference 2018 (International Conferences Steering Committee, Geneva, Switzerland, 2018), pp. 1007–1014.
37. K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, Quantifying controversy on social media. TSC ACM Transactions Soc. Comput. 1, 3 (2018).
38. M. H. DeGroot, Reaching a consensus. J. Am. Stat. Assoc. 69, 118–121 (1974).
39. G. Kossinets, D. J. Watts, Origins of homophily in an evolving social network. Am. J. Sociol. 115, 405–450 (2009).
40. A. Bessi et al., Homophily and polarization in the age of misinformation. Eur. Phys. J. Spec. Top. 225, 2047–2059 (2016).
41. V. D. Blondel, J. L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks. J. Stat. Mech. Theor. Exp. 2008, P10008 (2008).
42. K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, “Reducing controversy by connecting opposing views” in WSDM ’17: 10th ACM International Conference on Web Search and Data Mining (Association for Computing Machinery, New York, NY, 2017), pp. 81–90.
43. K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, “Quantifying controversy in social media” in WSDM ’16: 9th ACM International Conference on Web Search and Data Mining (Association for Computing Machinery, New York, NY, 2016), pp. 33–42.
44. R. M. Anderson, R. M. May, Infectious Diseases in Humans (, Oxford, , 1992).
45. M. Jalili, M. Perc, Information cascades in complex networks. J. Complex. Netw. 5, 665–693 (2017).
46. L. Zhao, H. Cui, X. Qiu, X. Wang, J. Wang, SIR rumor spreading model in the new media age. Phys. Stat. Mech. Appl. 392, 995–1003 (2013).
47. C. Granell, S. Gómez, A. Arenas, Dynamical interplay between awareness and epidemic spreading in multiplex networks. Phys. Rev. Lett. 111, 128701 (2013).
48. P. Holme, Network reachability of real-world contact sequences. Phys. Rev. E 71, 046119 (2005).
49. M. Cinelli, G. De Francisci Morales, A. Galeazzi, W. Quattrociocchi, M. Starnini, The echo chamber effect on social media. Open Science Framework. https://osf.io/x92br/?view_only=cdcc87d30e404e01b6ae316db15e3375. Deposited 22 January 2021.
50. A. Bessi et al., Users polarization on Facebook and YouTube. PloS One 11, e0159641 (2016).
51. A. L. Schmidt, F. Zollo, A. Scala, C. Betsch, W. Quattrociocchi, Polarization of the vaccination debate on Facebook. Vaccine 36, 3606–3612 (2018).
