Network dynamics of social influence in the wisdom of crowds

Joshua Becker^a, Devon Brackbill^a, and Damon Centola^{a,b,1}

^a Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA 19104; and ^b School of Engineering, University of Pennsylvania, Philadelphia, PA 19104

Edited by Matthew O. Jackson, Stanford University, Stanford, CA, and approved April 14, 2017 (received for review October 8, 2016)

A longstanding problem in the social, biological, and computational sciences is to determine how groups of distributed individuals can form intelligent collective judgments. Since Galton's discovery of the "wisdom of crowds" [Galton F (1907) Nature 75:450–451], theories of collective intelligence have suggested that the accuracy of group judgments requires individuals to be either independent, with uncorrelated beliefs, or diverse, with negatively correlated beliefs [Page S (2008) The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies]. Previous experimental studies have supported this view by arguing that social influence undermines the wisdom of crowds. These results showed that individuals' estimates became more similar when subjects observed each other's beliefs, thereby reducing diversity without a corresponding increase in group accuracy [Lorenz J, Rauhut H, Schweitzer F, Helbing D (2011) Proc Natl Acad Sci USA 108:9020–9025]. By contrast, we show general network conditions under which social influence improves the accuracy of group estimates, even as individual beliefs become more similar. We present theoretical predictions and experimental results showing that, in decentralized communication networks, group estimates become reliably more accurate as a result of information exchange. We further show that the dynamics of group accuracy change with network structure. In centralized networks, where the influence of central individuals dominates the collective estimation process, group estimates become more likely to increase in error.

social networks | collective intelligence | social learning | wisdom of crowds | experimental social science

Significance

Since the discovery of the wisdom of crowds over 100 years ago, theories of collective intelligence have held that group accuracy requires either statistical independence or informational diversity among individual beliefs. Empirical evidence suggests that allowing people to observe the beliefs of others leads to increased similarity of individual estimates, reducing independence and diversity without a corresponding increase in group accuracy. As a result, social influence is expected to undermine the wisdom of crowds. We present theoretical predictions and experimental findings demonstrating that, in decentralized networks, social influence generates learning dynamics that reliably improve the wisdom of crowds. We identify general conditions under which influence, not independence, produces the most accurate group judgments.

Author contributions: J.B., D.B., and D.C. designed research; J.B. and D.B. performed research; J.B., D.B., and D.C. analyzed data; and J.B., D.B., and D.C. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

^1 To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1615978114/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1615978114

Since Galton's discovery of the "wisdom of crowds" over 100 years ago (1), results on crowdsourcing (1, 2), prediction markets (3), and financial forecasting (4, 5) have shown that the aggregated judgment of many individuals can be more accurate than the judgments of individual experts (2, 4, 6–8). Statistical explanations for this phenomenon argue that group accuracy relies on estimates taken from groups where individuals' errors are either uncorrelated or negatively correlated, thereby preserving the diversity of opinions in a population (9). Thus, although individuals may have estimates both far above and far below the true value, in aggregate these errors cancel out, leaving an accurate group judgment (2, 9, 10).

Recent experimental evidence has suggested that the wisdom of crowds may be undermined by processes of social influence, in which people exchange information about their estimates and revise their judgments to align with one another (11–13). When social influence leads to correlated errors, both independence and diversity are reduced, which has been argued to compromise the reliability of the group judgment (9, 11–18). In direct contrast with these results, however, theoretical models of social learning (19–21) have suggested that the effects of social influence on collective decisions vary based on the structure of the interaction network, predicting that, under the right conditions, social learning can lead a group's judgment to improve (20–24).

This prediction derives from the assumption that, when people learn about the beliefs of others, they revise their own beliefs to become more similar to their social referents (11, 12, 25, 26). Following the DeGroot model of social learning, this theory suggests that each individual's revisions are based on a weighted average of their own belief and the beliefs of their social referents (19). Thus, an individual's belief revision is determined in part by the amount of weight they place on their own belief relative to social information. When this "self-weight" is independently and identically distributed (i.i.d.) throughout a population, and the population is embedded in a decentralized social network (i.e., one in which everyone is equally connected), this model predicts that belief distributions will converge on the statistical mean of the initial, independent beliefs of individuals in the group (SI Appendix). Thus, if the initial group mean is accurate, exposure to social influence will lead individuals to become more accurate, improving the accuracy of the group's median, even as the group mean remains fixed (SI Appendix).

We build on the DeGroot model to generate theoretical predictions for how social influence can affect the accuracy of group judgments. We show that if individuals' self-weights are not i.i.d. in the population, but are instead correlated with individual accuracy, then social influence may not only improve the median judgment by bringing it toward the mean, but may also result in the mean of the population estimate becoming more accurate. This prediction builds on the DeMarzo et al. (20) notion of social influence weight, which identifies the amount of influence that each individual in a network has on the collective belief. Because self-weight contributes to social influence weight, a correlation between accuracy and self-weight means that more accurate individuals will be more influential in the group estimate, resulting in a direct improvement in the accuracy of the group mean.

Our predictions also show that this process can go awry if some individuals in a population are more prominent than others,

giving them disproportionate levels of social influence in the population. Theoretical results suggest that when networks are highly "centralized" in this way, instead of efficiently aggregating all available information, populations are biased toward the beliefs of the central individuals (20), which can significantly influence the accuracy of the collective judgment (SI Appendix). This effect of centralization on group estimates has been predicted by a variety of social learning models, including both fixed (20, 23) and growing (21, 24) networks, as well as models of both discrete choice (23, 24) and continuous estimation (20, 21).

We test these theoretical predictions using a web-based experimental design (27, 28). We recruited subjects to participate in a series of large-group estimation tasks and compared the effects of social influence in both centralized and decentralized networks to a control condition in which there was no social influence. Consistent with previous work, our theoretical results predict that centralized networks will exhibit a bias toward the beliefs of central individuals. However, in contrast to prior work showing that social influence undermines group accuracy (11–18), we predict that social influence in decentralized networks will improve the accuracy of the group median (SI Appendix). Moreover, we also predict that social influence can produce systematic improvements in the accuracy of the group mean if the individual revision process is not i.i.d., but is correlated with individual accuracy. As described below, our experimental design permits a direct test of these theoretical predictions based on our extensions of the DeGroot model.

Theoretical Model. We build on DeGroot's (19) formalization of local information aggregation, in which subject $i$ updates their response estimate, $R_{t,i}$, after being exposed to the estimates of their network neighbors, $\bar{R}_{t,j \in N_i}$. We define a subject's revision process with three components: their own estimate; the estimates of network neighbors; and self-weight, or the amount of weight they place on their own estimate relative to the estimates of their network neighbors. In this model, each subject responds to social information by adopting a weighted mean of their own estimate and the estimates of their neighbors, according to the rule:

$$R_{t+1,i} = \alpha_i \times R_{t,i} + (1 - \alpha_i) \times \bar{R}_{t,j \in N_i},$$

where the value $R_{t,i}$ indicates the response of subject $i$ at time $t$; $\alpha_i$ indicates the self-weight a subject places on their own initial estimate; $(1 - \alpha_i)$ indicates the weight they place on the average estimate of their network neighbors; and $\bar{R}_{t,j \in N_i}$ indicates the average estimate of subject $i$'s network neighbors at time $t$. Outcomes are therefore determined by three parameters: the communication network (i.e., who can observe whom); the distribution of initial estimates, $R_1$; and the distribution of self-weights, $\alpha_i$.

At the population level, this model states that the group estimate after $t$ revisions can be calculated as a function of a weighted, directed network of social influence (19). In this social influence network, a tie exists from node $i$ to node $j$ if $i$ can observe $j$ in the communication network. The tie has a numeric value that indicates the weight that node $i$ places on the estimate of node $j$, which is determined by $\alpha_i$. For any given node $i$, the sum of the outgoing tie weights equals $(1 - \alpha_i)$. Consistent with previous implementations of this model (19–21), we represent the self-weight that node $i$ places on its own estimate ($\alpha_i$) as a "self-tie" from $i$ to $i$. The set of each node's outgoing tie weights (including their self-tie) sums to 1. The sum of each node's incoming ties (including their self-tie) is proportional to their overall influence in the network during each round of revisions, i.e., their "social influence weight," which is defined as each subject's influence in the collective estimation process (20). Because this sum includes the subject's self-weight, each subject's influence in the collective estimation process is determined in part by how heavily they weight their own opinion compared with the social information they receive.

This concept of social influence weight comes from the properties of the DeGroot model, in which members of a population revise their estimates indefinitely according to the process above. Through this revision process, the DeGroot model predicts that, in a wide range of network structures, all members of the population will asymptotically converge on a single shared estimate (19). The collective estimate after social influence is a weighted mean of the initial independent estimates (20). Each individual's social influence weight is defined by the size of the contribution that their initial (independent) estimate makes to the final collective estimate (20). The relationship between self-weight and social influence weight reflects the fact that when a subject places more weight on their own individual belief, they adjust their belief less in response to others, and thereby contribute more weight to the group estimate (20).

In decentralized networks, defined as networks where everyone has the same number of ties (29), the properties of the model described above indicate that the arithmetic mean of a group's estimate distribution will remain unchanged, even as social influence leads individuals' estimates to become more similar. This prediction (convergence toward the mean) holds under the assumption that self-weight is i.i.d. throughout a population (SI Appendix). If this process accurately characterizes the effects of influence on the wisdom of crowds, and the initial group mean is accurate, then social influence in decentralized networks allows individuals to increase the accuracy of their estimates without any deleterious effects on group-level accuracy. One consequence of this process is that the median of the group estimate can improve, while the mean stays fixed (SI Appendix).

We extend these predictions by analyzing what happens when this i.i.d. assumption is violated, i.e., there is non-i.i.d. heterogeneity in the degree to which individuals revise their estimates based on the estimates of others. Our results predict that if an individual's self-weight is correlated with their accuracy, social influence dynamics may not only be able to improve the median judgment by bringing it toward the mean, but may also result in the mean of the population estimate becoming more accurate as a function of social influence (SI Appendix).

Experimental Design. We recruited 1,360 participants from the World Wide Web to take part in a series of estimation challenges. Subjects were randomized either to one of two experimental social network conditions or to a control condition. In all conditions, participants were prompted to complete estimation tasks and were awarded a monetary prize based on the accuracy of their final estimate. In the network conditions, participants were placed into either a decentralized network, in which everyone had equal connectivity, or a centralized network, in which a highly connected central member had a disproportionate number of connections (Materials and Methods and SI Appendix, Figs. S1 and S7).

Each social network contained 40 subjects. Within each network, all subjects were simultaneously shown the same image prompt (e.g., a plate of food) and asked to estimate a numerical quantity (e.g., the caloric content) (SI Appendix). There were three rounds for each estimation task. In round 1, participants provided an independent estimate based on the prompt. In both network conditions, participants were then shown the average estimate of the peers directly connected to them in their social network and prompted to submit their answers again in round 2. Subjects were then shown the average of their peers' revised estimates and prompted to submit a final estimate in round 3. Thus, for each question, participants provided one independent

estimate and two estimates after exposure to social information, for a total of three estimates per question. Subjects were not provided with any information about their social networks, which ensured that subject experience was identical across the two network conditions.

Subjects who were randomized to the control condition were not placed into social networks, but were still given the opportunity to answer the same questions without being exposed to social influence. These participants were instead given the opportunity to revise their initial answer two times, providing a total of three independent estimates per question. All control participants observed the same sets of questions in the same order as participants embedded within the experimental networks. More generally, subject experience in the control condition was identical to that of subjects in other conditions, except that participants were not given any social information.

Each experimental trial of the study consisted of an identical set of questions provided to one decentralized network (40 individuals) and one centralized network (40 individuals). For each set of questions that was asked in the experimental trials, we also collected responses from 40 independent individuals in the control condition, who collectively formed a "control group" for that set of questions. Each subject participated only once in our study, either in one network condition or in one control group, such that every network condition and control group comprised a unique set of 40 individuals.

Because subjects in the network conditions were not statistically independent, all analyses of collective estimates in the network conditions were conducted at the group level (SI Appendix). Moreover, because each network completed multiple estimation tasks within an experimental trial, we cluster our main analysis at the trial level such that each network provided a single, independent observation (Materials and Methods and SI Appendix). In total, we conducted 13 experimental trials, comprising 520 subjects in each experimental network condition (1,040 subjects in total). In six of the experimental trials, subjects answered four questions in each trial, where each question set was unique. In the remaining seven trials, subjects answered five questions in each trial, using two unique question sets over repeated trials. In total, this process produced eight unique question sets.

Control groups were conducted corresponding to each unique question set, producing eight control groups (320 control subjects in total). Because the control subjects were independent from each other, fewer subjects overall were required for proper comparison with the experimental trials (SI Appendix). Nevertheless, for the control conditions, we still conducted our analyses at the group level, with subjects in groups of size 40 (SI Appendix).

We measured the cumulative effect of social influence on collective judgments by comparing the initial estimates of each group (i.e., in round 1) with the final estimates of each group after two rounds of revision (i.e., round 3). For all results where we report percent change, comparisons were made between independent estimates (i.e., round 1) and final estimates (i.e., round 3), so that percent change was measured as the magnitude of the change divided by the magnitude of the initial estimate (SI Appendix). To facilitate comparisons across estimation tasks of different scales (i.e., some questions have true answers < 100, whereas some questions have true answers > 1,000), we normalized all estimates, dividing them by the standard deviation (SD) of the independent responses for each question (SI Appendix). Error was represented as the distance of each estimate from the truth. All reported changes in error were therefore measured in terms of the number of SDs away from the true answer (comparable to a z-score).
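The normalization and percent-change measures described in the text can be illustrated with a minimal sketch. It assumes the sample standard deviation (the text does not say whether the sample or population SD was used), and the response values, the `truth` value, and the function names are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def normalized_error(estimate, truth, round1_estimates):
    """Distance from the truth in units of the SD of the independent
    (round 1) responses. Assumption: sample SD."""
    return abs(estimate - truth) / stdev(round1_estimates)

def percent_change(initial, final):
    """Change from the initial to the final value, as a percentage of
    the magnitude of the initial value."""
    return 100.0 * (final - initial) / abs(initial)

# Hypothetical responses to a single question (not data from the study).
truth = 650.0
round1 = [400.0, 500.0, 600.0, 700.0, 800.0]
round3 = [550.0, 600.0, 620.0, 660.0, 700.0]

# Group-level error of the mean, before and after revision.
err_mean_r1 = normalized_error(mean(round1), truth, round1)
err_mean_r3 = normalized_error(mean(round3), truth, round1)
change_in_error = percent_change(err_mean_r1, err_mean_r3)

# The same function applied to two normalized errors reported in the
# text (0.56 SD before and 0.32 SD after social influence).
reported_change = percent_change(0.56, 0.32)
```

Both rounds are divided by the round 1 SD here so that the shrinking spread of later rounds does not change the unit of measurement; that is one plausible reading of the text rather than a confirmed detail.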
structure network Social Results itiuino siae ie,dvriyo pnos erae significantly ( decreased revision of opinions) round of each diversity after (i.e., estimates ( of ( median). truth distribution and of the mean direction when the the accurate in for more estimate became an median provided node and central mean ( the truth Both of an median). provided direction and node opposite central the the in when accurate estimate less became median and ( mean accurate more became networks 13, decentralized in estimate. estimates group of median the in to shown relative changes that truth indicating of the direction whether the on ( in ini- conditioned were the was Results by node node. determined central central was social the median of of group estimate rounds and tial mean two group over the of accurate accuracy more became ( median influence. the for and trials ( mean experimental shown. 13 are in condition network SD each and error Average networks. decentralized 1. Fig. CD C Change in Normalized Error Normalized Error of Mean A oa hnefo on orud3wt ottapd9%errbars, error 95% bootstrapped with 3 round to 1 round from change Total ) oee,ti euto ndvriyddntudriethe undermine not did diversity in reduction this However, P −0.4 −0.3 −0.2 −0.1 0.6 0.7 0.8 0.0 0.1 0.2 0.3 < .1frmean, for 0.01 P eetaie Centralized Decentralized Round One Cumulative ChangeInError feto oiliflec ngopacrc ncnrlzdand centralized in accuracy group on influence social of Effect < Median Mean Median Mean Decentralized Networks B .0,Wloo indrn et,pouiga43% a producing test), rank signed Wilcoxon 0.001, ncnrlzdntok,teefc fsca nuneo the on influence social of effect the networks, centralized In ) Error ByRound IAppendix SI

Away From Round Two ruth Tr P <

oadT uth Tr Toward Round Three .0 o ein.Frcnrlzdntok,the networks, centralized For median). for 0.001 Centralized oiliflec rmtclyreduced dramatically influence Social n D .I h eut htflo,w analyze we follow, that results the In ). nbt ewr odtos h Di the in SD the conditions, network both In ) 13, = A and P A < ndcnrlzdntok,bt the both networks, decentralized In ) B B

.0 o ohconditions). both for 0.001 Standard Deviation Normalized Error r infiat ohtema and mean the Both significant. are of Normalized Estimates 0.5 0.6 0.7 0.8 0.9 1.0 0.1 0.3 0.5 0.7 0.9 1.1 n 13, = Standard Deviation by Round Round One Round One NSEryEdition Early PNAS Decentralized Centralized Median Mean Centralized Networks P < Error ByRound Both Networks .1frbt mean both for 0.01

Round Round Two Center Away FromTruth D Center Toward Truth n w rounds two , T 12, = wo

Round Three Round Three P | n < 13 = f7 of 3 0.01 n =

produced significant improvements in individual accuracy. Across all 13 trials with decentralized networks, average individual error was significantly lower in round 3 than it was in round 1, decreasing by 23% on average (n = 13 trials, P < 0.001, Wilcoxon signed rank test). In addition to these individual-level improvements, we also found that the average error of each group's median estimate was significantly lower in round 3 (0.67 SD) than in round 1 (0.76 SD) (n = 13 trials, P < 0.001, Wilcoxon signed rank test), resulting in a 12% decrease in average error, as shown in Fig. 1 A and C.

In our analysis of how social influence produced these group-level improvements in the median, our initial expectation was that self-weights were i.i.d. within each network. On this assumption, the DeGroot (19) model predicts that social influence in decentralized networks can improve the group median by pushing it toward the mean of the group's independent estimate, which is not expected to change (SI Appendix). Remarkably, however, we found that, on average, each group's mean estimate also became more accurate. After two rounds of exposure to social influence, the average error of the group mean at round 3 (0.62 SD) was significantly lower than at round 1 (0.69 SD) (n = 13 trials, P < 0.01, Wilcoxon signed rank test), resulting in a 10% reduction in the average error of the group mean. These findings can be explained with the DeGroot model by observing that individuals' self-weights were not identically distributed in the population.

Fig. 2 shows that across all network conditions, the magnitude of an individual's revisions from round 1 to 3 was significantly correlated with the magnitude of their initial error (n = 4,340 estimates by 1,040 subjects, ρ = 0.41, 95% CI [0.39, 0.43], P < 0.001, analysis of covariance). Because each individual completed multiple estimation tasks, we measure this relationship between individual accuracy and revision magnitude after controlling for correlation between estimates by the same individual (SI Appendix). The results (Fig. 2) show that initially accurate individuals made smaller revisions to their estimates, whereas initially inaccurate individuals made larger revisions.

Fig. 2. Correlation between revision magnitude and individual error. Each point in the main graph shows the average size of individuals' revisions from round 1 to 3 for individuals located in each decile of the distribution of individual error (i.e., average distance from zero error), measured for n = 4,340 estimates provided by 1,040 individuals assigned to one of 13 decentralized networks or 13 centralized networks. The graph shows a positive "revision coefficient," such that individuals with greater error in their initial estimates made significantly larger revisions. Controlling for correlation between estimates by the same individual (SI Appendix), we find a positive correlation between individual error and individual revision magnitude (n = 4,340, ρ = 0.41, 95% CI [0.39, 0.43], P < 0.001). (Inset) On the y axis, positive values indicate larger revisions than would be expected based on the distance between an individual's estimate and their neighborhood estimate. On the x axis, positive values indicate greater initial error than would be expected given the distance between an individual's estimate and their neighborhood estimate. After controlling for the distance between each individual's initial estimate and the average estimate of their neighborhood, there is still a significant correlation between individual error and individual revision magnitude (n = 4,340, ρ = 0.25, 95% CI [0.22, 0.28], P < 0.001).

[Main graph axes: Individual Error (Distance from Truth, Grouped by Decile) against Magnitude of Revision. Inset (Partial Correlation) axes: Individual Error (Deviation from Conditional Average) against Magnitude of Revision.]

Consistent with the DeGroot model, one explanation for this revision pattern is that individuals who were more accurate had greater self-weight in their revisions than individuals who were less accurate. This explanation is consistent with the observed behavior; however, our analysis also needs to account for the observation that individuals who were more accurate also had estimates that were closer to their observed neighborhood average. Consequently, the positive correlation between error and revision magnitude may be due to the fact that subjects whose initial estimates were farther from their neighborhood average were inclined to make larger revisions, rather than to the fact that more accurate individuals had a stronger self-weighting.

To control for this potentially confounding effect, we measured the partial correlation between error and revision magnitude, while holding constant the distance between the subject's initial estimate and the initial neighborhood estimate. The Inset in Fig. 2 shows that, even with this statistical control, more accurate individuals still made smaller revisions to their estimates than less accurate individuals (n = 4,340 estimates by 1,040 subjects, ρ = 0.25, 95% CI [0.22, 0.28], P < 0.001, analysis of covariance). This result suggests that accurate individuals placed more weight on their own estimates and less weight on social information (SI Appendix). By contrast, less accurate individuals had a lower self-weight and were more influenced by social information. For clarity, we refer to this partial correlation between accuracy and self-weight as the revision coefficient.

As discussed above, each individual's social influence weight in the network is determined in part by their self-weight, so that individuals who place more weight on their own estimate are also more influential in the collective estimate. When considered in the context of our theoretical model, the correlation shown in Fig. 2 indicates that more accurate individuals had a larger social influence weight in the network, which can pull the group estimate toward a more accurate mean (SI Appendix). These analyses suggest a direct positive relationship between the average revision coefficient among the members of a group and the expected improvement in the accuracy of the group mean. Fig. 3A shows, for decentralized networks, the correlation between the improvement in the group mean for each question and the group's revision coefficient for that question for each of the 59 group estimation tasks completed in decentralized networks. Because each group completed multiple estimation tasks, these analyses control for correlations across multiple estimates made by the same group (SI Appendix).

Consistent with our theoretical expectations, the correlation shown in Fig. 3A indicates that, in decentralized networks, groups with higher revision coefficients also exhibited larger improvements in group accuracy (n = 59 estimation tasks, ρ = −0.71, 95% CI [−0.82, −0.56]). By contrast, Fig. 3B shows that centralized networks (as discussed below) exhibited no significant correlation between a group's average revision coefficient and a change in group accuracy (n = 57 estimation tasks, ρ = −0.16, 95% CI [−0.33, 0.10]).

Fig. 3A indicates that, in decentralized networks, the greater the correlation between individual accuracy and self-weight, the more likely it is that the group mean will improve. Additional

simulation analyses, which are provided in SI Appendix, show that, in decentralized networks, a positive revision coefficient is sufficient to produce increases in group accuracy, consistent with our empirical findings. Notably, across all experimental trials, the average revision coefficient for all subjects was positive (SI Appendix), suggesting that, in very large populations with decentralized networks, social influence is likely to generate consistent improvements in the accuracy of the group mean.

Control Condition. These improvements in both the mean and the median, as well as the accuracy of individuals' estimates, contrast with the results from the control condition (i.e., without social influence). Subjects in the control condition were able to revise their answers several times, but were not provided any information about the estimates of other participants. Between rounds 1 and 3, groups in the control condition showed only a small (3%) decrease in average SD (SI Appendix, Fig. S9), which was significantly smaller than the reduction in diversity in decentralized networks (43%) and centralized networks (42%) (n = 21; 13 experimental and 8 control trials, P < 0.001 for both comparisons, Wilcoxon rank sum test). The opportunity for revision produced a small (3%) decrease in average individual error even in the absence of social information (Wilcoxon signed rank test). However, this improvement was significantly smaller than the 23% improvement by individuals in the decentralized networks (n = 21; 13 experimental and 8 control trials, P < 0.001, Wilcoxon rank sum test). Moreover, in the control condition, these individual improvements produced no significant changes in the accuracy of either the group mean (n = 8 control trials, P > 0.94) or the group median (P > 0.64) (complementary analyses provided in SI Appendix). These results indicate that the improvements in collective judgment observed in decentralized networks are not explained by independent learning effects, but are due to the network dynamics of social influence.

Fig. 3. Correlations with changes in group mean. Shown are all 59 estimation tasks completed over 13 experimental trials. In centralized networks, two estimation tasks are omitted, where the central node did not provide any response. Decentralized networks show all 59 estimation tasks. (A) In decentralized networks, the revision coefficient for each group estimate (i.e., the partial correlation between individuals' accuracy and their revision magnitudes for all members of a network on a given estimation task) is highly correlated with the change in the error of the group mean (n = 59, ρ = −0.71, 95% CI [−0.84, −0.51]). On estimation tasks in which groups exhibited larger revision coefficients, they showed significantly greater improvements in the accuracy of the group mean. (B) By contrast, in centralized networks, there was no significant correlation between the revision coefficient and the change in group mean (n = 57, ρ = −0.16, 95% CI [−0.33, 0.10]). (C) In centralized networks, the change in the group mean is strongly correlated with the behavior of the central node. The difference between the initial group estimate and the initial estimate of the central node is highly correlated with the change in the group's mean (n = 57, ρ = 0.92, 95% CI [0.88, 0.95]). When the central node had an estimate larger than group mean, the group mean typically increased; when the central node was below the group mean, the group mean typically decreased.

Centralized Networks. To analyze these effects, we divided the group estimates in centralized networks into two categories, based on the initial estimate of the central nodes. In one category ("center toward truth"), the influence of the central node is expected to increase the accuracy of the group mean. This category includes estimates in which the central node was more accurate than the group mean, and also estimates in which the central node was less accurate, but was on the opposite side of the truth from the group mean. For instance, if the true value is 100 and the group mean is 90, a central node with an estimate of either 105 (more accurate) or 120 (less accurate) will pull the group toward the truth (SI Appendix, Fig. S8). The second category ("center away from truth") includes trials in which the estimate of the central node pulled the group estimate away from the truth; for instance, if the estimate of the central node is instead 70. This analytical strategy was used to identify the effects of social influence on both the group mean and the group median, as reported below.

All 13 trials produced responses to at least one question in which the central individual was away from truth relative to the group estimate, whereas only 12 trials produced responses where the central individual was toward truth. Accordingly, our analyses for each category use n = 13 trials and n = 12 trials, respectively. As shown in Fig. 1 B and C, when the central individual's estimate was toward truth, the average error of the group mean after social influence (0.32 SD) was 43% lower than the error of the group mean before social influence (0.56 SD), producing a significant increase in group accuracy (n = 12 trials, P < 0.01, Wilcoxon signed rank test). Correspondingly, the same analysis for the median showed that the error of the median also decreased significantly by 48% in these estimations from round 1 (0.70 SD) to round 3 (0.36 SD) (n = 12 trials, P < 0.01, Wilcoxon signed rank test). Similarly, when the central individual provided an estimate that was away from truth, social influence increased the error of the group mean by 19% and the error of the median by 32% (Fig. 1 B and C), significantly reducing the accuracy of both the mean and the median of estimates (n = 13 trials, P < 0.01 for both comparisons, Wilcoxon signed rank test).

Fig. 3C shows the effects of the central node on the collective estimate for each of the 57 estimation tasks in which the

central node offered a response. As described above, because each group completed multiple estimation tasks, these analyses control for correlations between multiple estimations made by the same group (SI Appendix). The positive slope in Fig. 3C (n = 57 estimation tasks, ρ = 0.92, 95% CI [0.88, 0.95]) indicates that the group estimates in centralized networks moved toward the initial belief of the central individual; i.e., higher estimates by the central node made the group mean increase, whereas lower estimates made the group mean decrease.

Robustness. To conclude our analyses, we examined the robustness of our theoretical and experimental findings under variations in the network parameters, such as average degree, graph density, and population size. Graph density and average degree had no effect on the results (SI Appendix, Figs. S13 and S14). However, we found that the effects of social influence on the wisdom of crowds are significantly strengthened with larger population sizes (SI Appendix). Our analyses indicate that recent small-group studies arguing that social influence undermines the wisdom of crowds (even in a decentralized network) (13) were insufficiently statistically powered to identify the improvements in collective accuracy that we found (SI Appendix, Fig. S15). Additional simulation analyses, as well as supplementary analyses of the publicly available data from these studies (SI Appendix), show that these effects of population size can both explain the negative findings from previous experiments using small groups and demonstrate the generalization of our positive results to larger population sizes.

Discussion

Our study differs in several respects from previous work on the network dynamics of . Unlike research on social coordination (28, 32, 33) and group problem-solving (34–36), our study does not consider situations where social interaction is necessary for groups to achieve a collective outcome. Instead, we identify how the network dynamics of social influence can affect collective estimation tasks in situations where social influence has been predicted to have a negative effect on the quality of group judgments (2, 11–18). Our finding that groups have the ability to generate accurate estimates, even in the presence of social influence, has useful implications for the design of several kinds of collective decision processes. As described in previous studies (13), if social influence did indeed undermine the wisdom of crowds, then democratic institutions and organizational decision procedures could be improved by preventing people from communicating during a voting process (13). Based on these ideas, commercial and nonprofit organizations have implemented automated aggregation tools to collect individuals' independent beliefs in ways that minimize the information exchanged between them (37). Our findings argue against this approach to aggregation. In contrast, we have shown how social learning in networks can amplify the influence of accurate individuals, leading to both individual and collective judgments that are more accurate than those that could typically be obtained by independent aggregation alone. We therefore anticipate that process interventions within political discussion settings (38) and organizational decision contexts (2, 39) may benefit more from approaches that manage communication networks, rather than approaches that attempt to increase independence in the aggregation process.

Materials and Methods

All subjects who participated in this study provided informed consent during the registration process, and all procedures in this study were approved by the Institutional Review Board of the University of Pennsylvania. Upon entering the experimental platform, participants were randomly assigned to one of three conditions: a decentralized random network, a centralized network, or a control condition (SI Appendix). Once placed into a condition, players interacted in real time for a period of approximately 15 min. For each question, participants first provided an independent estimate without any social information. In the network conditions, participants observed the average response of the peers immediately connected to them in a social network and were prompted to submit their answers again. Subjects were exposed to two rounds of social influence before they submitted their final answer, providing a total of three responses to each question. In the control condition, participants were given three opportunities to respond, but were not provided any social information. Monetary rewards were based on the accuracy of subjects' final response to each question.

To ensure that our findings are robust to variations in the distribution of estimates, we conducted two sets of experimental trials, using questions that generate distributions with different shapes. In the first set of trials, subjects were given count-based questions (e.g., "how many candies are in this jar?"). Because these are zero-bounded on the left and unbounded on the right, count-based questions generate highly skewed distributions (1, 13), in which the median is able to improve even if the mean remains unchanged (SI Appendix). In the second set of trials, we asked participants to provide responses to percentage-based questions (e.g., "what percentage of people in this photograph are wearing hats?"). These responses are constrained to fall between 0 and 100, and did not produce any systematic skew in the distribution of estimates (SI Appendix).

A single experimental trial consisted of 40 individuals placed into a decentralized network and 40 individuals placed into a centralized network, all of whom were given the same question set. A control group consisted of 40 independent individuals who were all given the same question set as the 80 subjects in the corresponding experimental trial. Because the subjects in a control trial were independent from one another, only one control trial was conducted for each question set.

In trials where we provided count-based estimation tasks, each group completed four tasks. We conducted six independent experimental trials of this kind of task, with four questions each, producing a total of 24 count-based estimations by decentralized networks and 24 count-based estimations by centralized networks. We used a unique question set for each trial, yielding six unique question sets. To create independent control groups for each question set, we ran 6 independent control groups, each with 40 individuals, producing 24 control group estimations.

In trials where we used percentage-based estimation tasks, each group completed five estimation tasks. We conducted 7 independent experimental trials of this kind of task, with 5 questions each, producing a total of 35 percentage-based estimations by decentralized networks and 35 percentage-based estimations by centralized networks. We used two unique question sets, which were randomly assigned across trials. One set was used in three of the trials; the other was used in four of the trials. To create independent control groups for each question set, we ran two independent control groups, each with 40 individuals, producing 10 control group estimations. Because control groups are composed of statistically independent individuals, we only required a single control group for each question set to compare with the experimentally replicated trials. In total, we observed 59 estimations by decentralized networks, 59 estimations by centralized networks, and 34 estimations by control groups.

ACKNOWLEDGMENTS. We thank D. Helbing, F. Schweitzer, A. van de Rijt, J. Kleinberg, S. Levin, I. Couzin, A. Kao, P. Starr, J. Jemmott, M. Delli Carpini, J. Cappella, and S. Wood for helpful suggestions; A. Wagner and R. Overby for development assistance; participants in the Warren Center for Data and Network Sciences Faculty Symposium and the Princeton University Department of Sociology Graduate Workshop for useful comments; and two anonymous reviewers for valuable suggestions that improved this article. This work was supported in part by a Robert Wood Johnson Foundation Pioneer Grant 73593.
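The estimation counts reported in Materials and Methods can be cross-checked with a short tally. The Python sketch below is illustrative only: the class and field names (`TrialSet`, `experimental_trials`, `questions_per_trial`, `control_groups`) are hypothetical labels introduced here, and only the numbers are taken from the text. It assumes that each network condition contributes one group estimation per question per trial, and that each control group answers every question in its set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialSet:
    """One family of estimation tasks (hypothetical helper, not from the paper)."""
    label: str
    experimental_trials: int   # independent experimental trials of this kind
    questions_per_trial: int   # estimation tasks completed by each group
    control_groups: int        # one control group per unique question set

    def network_estimations(self) -> int:
        # Group estimations per network condition (decentralized or centralized).
        return self.experimental_trials * self.questions_per_trial

    def control_estimations(self) -> int:
        # Group estimations contributed by the control groups.
        return self.control_groups * self.questions_per_trial


# Numbers as stated in Materials and Methods.
count_based = TrialSet("count-based", experimental_trials=6,
                       questions_per_trial=4, control_groups=6)
percentage_based = TrialSet("percentage-based", experimental_trials=7,
                            questions_per_trial=5, control_groups=2)


def totals(sets):
    """Return (estimations per network condition, control estimations)."""
    per_network = sum(s.network_estimations() for s in sets)
    control = sum(s.control_estimations() for s in sets)
    return per_network, control


per_network_total, control_total = totals([count_based, percentage_based])
print(per_network_total, control_total)  # 59 34
```

Under these assumptions the tally gives 24 + 35 = 59 estimations per network condition and 24 + 10 = 34 control estimations, which agrees with the totals stated in the text.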

1. Galton F (1907) Vox populi (The wisdom of crowds). Nature 75:450–451.
2. Sunstein CR (2006) Infotopia: How Many Minds Produce Knowledge (Oxford Univ Press, Oxford).
3. Wolfers J, Zitzewitz E (2004) Prediction markets. J Econ Perspect 18:107–126.
4. Kelley EK, Tetlock PC (2013) How wise are crowds? Insights from retail orders and stock returns. J Finance 68(3):1229–1265.
5. Nofer M, Hinz O (2014) Are crowds on the internet wiser than experts? The case of a stock prediction community. J Bus Econ 84(3):303–338.
6. Sjöberg L (2009) Are all crowds equally wise? A comparison of political election forecasts by experts and the public. J Forecast 28(1):1–18.
7. Herzog SM, Hertwig R (2011) The wisdom of ignorant crowds: Predicting sport outcomes by mere recognition. Judgm Decis Mak 6(1):58–72.

8. Mellers B, et al. (2014) Psychological strategies for winning a geopolitical forecasting tournament. Psychol Sci 25(5):1106–1115.
9. Page SE (2008) The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies (Princeton Univ Press, Princeton).
10. Hong L, Page SE (2008) Some microfoundations of collective wisdom. Collective Wisdom, eds Landemore H, Elster J (Cambridge Univ Press, Cambridge, UK), pp 56–71.
11. Jenness A (1932) The role of discussion in changing opinion regarding a matter of fact. J Abnorm Soc Psychol 27(3):279–296.
12. Myers DG, Bishop GD (1971) Enhancement of dominant attitudes in group discussion. J Pers Soc Psychol 20(3):386–391.
13. Lorenz J, Rauhut H, Schweitzer F, Helbing D (2011) How social influence can undermine the wisdom of crowd effect. Proc Natl Acad Sci USA 108(22):9020–9025.
14. Janis IL (1982) Groupthink: Psychological Studies of Policy Decisions and Fiascoes (Houghton Mifflin, Boston), p 349.
15. Schkade D, Sunstein CR, Kahneman D (2000) Deliberating about dollars: The severity shift. Columbia Law Rev 100(4):1139–1175.
16. Sunstein CR (2002) The law of group polarization. J Polit Philos 10(2):175–195.
17. Baddeley M (2010) Herding, social influence and economic decision-making: Socio-psychological and neuroscientific analyses. Philos Trans R Soc Lond B Biol Sci 365(1538):281–290.
18. Moussaïd M, Kämmer JE, Analytis PP, Neth H (2013) Social influence and the collective dynamics of opinion formation. PLoS One 8(11):e78433.
19. DeGroot MH (1974) Reaching a consensus. J Am Stat Assoc 69(345):118–121.
20. DeMarzo PM, Vayanos D, Zwiebel J (2003) Persuasion bias, social influence, and unidimensional opinions. Q J Econ 118(3):909–968.
21. Golub B, Jackson MO (2010) Naive learning in social networks and the wisdom of crowds. Am Econ J Microecon 2(1):112–149.
22. Bala V, Goyal S (1998) Learning from neighbours. Rev Econ Stud 65(3):595–621.
23. Mossel E, Sly A, Tamuz O (2015) Strategic learning and the topology of social networks. Econometrica 83(5):1755–1794.
24. Acemoglu D, Dahleh MA, Lobel I, Ozdaglar A (2011) Bayesian learning in social networks. Rev Econ Stud 78(4):1201–1236.
25. Deutsch M, Gerard HB (1955) A study of normative and informational social influences upon individual judgement. J Abnorm Psychol 51(3):629–636.
26. Price V, Nir L, Cappella JN (2006) Normative and informational influences in online political discussions. Commun Theory 16(1):47–74.
27. Centola D (2010) The spread of behavior in an online social network experiment. Science 329:1194–1197.
28. Centola D, Baronchelli A (2015) The spontaneous emergence of conventions: An experimental study of cultural evolution. Proc Natl Acad Sci USA 112(7):1989–1994.
29. Freeman LC (1978) Centrality in social networks conceptual clarification. Soc Networks 1(3):215–239.
30. Farrell S (2011) Social influence benefits the wisdom of individuals in the crowd. Proc Natl Acad Sci USA 108(36):E625–E625.
31. Gürçay B, Mellers BA, Baron J (2015) The power of social influence on estimation accuracy. J Behav Decis Mak 28(3):250–261.
32. Judd S, Kearns M, Vorobeychik Y (2010) Behavioral dynamics and influence in networked coloring and consensus. Proc Natl Acad Sci USA 107(34):14978–14982.
33. Dall'Asta L, Baronchelli A, Barrat A, Loreto V (2006) Nonequilibrium dynamics of language games on complex networks. Phys Rev E 74(3):036105.
34. Bavelas A (1950) Communication patterns in task-oriented groups. J Acoust Soc Am 22(6):725–730.
35. Lazer D, Friedman A (2007) The network structure of exploration and exploitation. Adm Sci Q 52(4):667–694.
36. Shore J, Bernstein E, Lazer D (2015) Facts and figuring: An experimental investigation of network structure and performance in information and solution spaces. Organ Sci 26(5):1432–1446.
37. Bonabeau E (2009) Decisions 2.0: The power of collective intelligence. Sloan Manage Rev 50(2):45–52.
38. Fishkin JS, Luskin RC (2005) Experimenting with a democratic ideal: Deliberative polling and public opinion. Acta Politica 40(3):284–298.
39. Green K, Armstrong J, Graefe A (2007) Methods to elicit forecasts from groups: Delphi and prediction markets compared. Foresight (8):17–20.
