<<

The Polymath Project: Lessons from a Successful Online Collaboration in Mathematics

Justin Cranshaw Aniket Kittur School of Computer Science School of Computer Science Carnegie Mellon University Carnegie Mellon University Pittsburgh, PA, USA Pittsburgh, PA, USA [email protected] [email protected]

ABSTRACT and prior knowledge, and they have high coordination de- Although science is becoming increasingly collaborative, mands. Furthermore, professional advancement in science there are remarkably few success stories of online collab- is typically subject to a highly developed and closed-loop orations between professional scientists that actually result reward structure (e.g., publications, grants, tenure). These in real discoveries. A notable exception is the Polymath factors do not align well with the open, asynchronous, in- Project, a group of who collaborate online to dependent, peer production systems that have typified social solve open mathematics problems. We provide an in-depth collaboration online. However, one notable exception is the descriptive history of Polymath, using data analysis and vi- Polymath Project, a collection of mathematicians ranging in sualization to elucidate the principles that led to its success, expertise from winners to high school math- and the difficulties that must be addressed before the project ematics teachers who collaborate online to prove unsolved can be scaled up. We find that although a small percentage mathematical conjectures. of users created most of the content, almost all users never- theless contributed some content that was highly influential There are many reasons to believe why mathematics might to the task at hand. We also find that leadership played an be well suited for large-scale online collaboration. Mathe- important role in the success of the project. Based on our maticians, unlike most empirical scientists, require little ex- analysis, we present a set of design suggestions for how fu- perimental equipment to do their work, making the asyn- ture collaborative mathematics sites can encourage and fos- chronous nature of online collaborations less of a deterrent. ter newcomer participation. In addition, mathematical ideas transfer between mathemati- cians, especially between those who specialize in the same Author Keywords sub-field, with remarkable speed and efficiency, which might large-scale collaboration, online collaborative mathematics, help minimize coordination costs [30]. online collaborative science, online communities In many ways, the deductive nature of proofs and theorems is inherently collaborative. Although proofs themselves are ACM Classification Keywords typically produced by a small number of primary contribu- H.5.2 Information Interfaces and Presentation: Miscellaneous tors, they often rely on a rich history of prior contributions to the problem made by past mathematicians. Gian-Carlo General Terms Rotta, in reference to Wiles’s proof of Fermat’s last theo- Human Factors rem describes this notion of proof as “a triumph of collab- oration across frontiers and centuries [28].” This pattern is INTRODUCTION a common one in mathematics. As Luca Trevisan explains, The Internet has made creative collaboration possible on an works of “genius” in mathematics typically occur when “a unprecedented scale: tens of thousands of people work to- large collection of ‘incremental’ results and ‘observations’ gether to consolidate knowledge (e.g. Wikipedia), or to write has been made, and all the pieces are there, ready to be put software (e.g. Apache and Linux). However, there have been together [31].” These series of incremental results are often few examples of scientific knowledge creation being done the work of a great many mathematicians sometimes span- by large online groups. Tasks such as scientific discovery ning decades or even centuries. The mathematical “genius” are difficult to carry out on a large scale for a number of is then often one who excels at putting these pieces together. reasons: they are complex, they require significant expertise According to Trevisan, massively-collaborative mathematics could re-define this idea of genius. Instead of the few who are able to “put together the zeitgeist,” genius could rather “be in the union of many minds, each doing nothing more Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are than saying what is obvious to them [31].” not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or In this paper we combine social science theory, with an in- republish, to post on servers or to redistribute to lists, requires prior specific depth descriptive analysis of data gathered from the Poly- permission and/or a fee. CHI 2011, May 7–12, 2011, Vancouver, BC, Canada. math Project, in order to explicate the factors that contributed Copyright 2011 ACM 978-1-4503-0267-8/11/05...$10.00. Thursday, January13,2011 rmepr elgss[ indistinguishable geologists virtually expert on work from craters in classifying resulting volunteers surface, Mars’ of had program thousands research ClickWorkers of scientific NASAs hundreds the example, of For parts process. crowdsource using to of stories Internet success notable the of number a been have There norms, [ and structure reward culture in institutional differences in and and also distance but geographical in differences only time not rooted [ are a success costs of have These likelihood can the on that effect costs negative significant coordination high impose [ laborations boundaries ge- institutional across expertise and and ographic resources share to agencies funding in- multiple spanning [ and stitutions continents across spread labs with knowledge of [ production the with dominating collaborative, increasingly more teams progressively becoming [ centuries is for science together worked have scientists Although WORK RELATED contributors sustained of project. number the the to increase bur- this help lessen and might that den suggestions design and of acclimated, set getting a in provide have newcomers look that we burden Finally, the at spurring comments. at non-leader effect than greater discussion a new find have leaders and quantitatively, from comments and leadership that qualitatively of both role the Polymath, examine in we on- Additionally, the task. to relevant going comments, content partici- few quality a most high contribute produce individuals, only nevertheless who of only those number even small pants, by content Polymath produced the we of First, is percentage large threefold. a are although work that find this of contributions main The future. scale the large in promote collaborations tech- may scientific that for structures implications social design and discuss tools nical also We success. its to comments. numbered 1228 of total this a During left contributors activity. 39 of of days total Terence indicate a and timeline time, Gowers the Timothy on authors, 2009, bars two Vertical of by blog Tao. May posts by and blog post February 14 blog Between were each there project. for Polymath1 comments active the of in span author time The 1. Figure 33 Post number

Post10 11 number12 13 14 10 11 12 13 14 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 .Teetasaeas eoigicesnl distributed, increasingly becoming also are teams These ]. Feb Feb − − 09 09 13 commenters 162 comments 7 commenters 66 comments 13 commenters 162 comments 13 7 commenters 66 comments 9 commenters 105 comments Mar .Hwvr ept h uhb eerhr and researchers by push the despite However, ]. 9 commenters 105 comments 7 commenters 95 comments 7 commenters 34 comments Mar − 09 9 commenters 96 comments 14 commenters 97 comments 7 commenters 95 comments 7 commenters 34 comments − 09 9 commenters 96 comments 10 commenters 96 comments 9 commenters 83 comments 14 commenters 97 comments 8 commenters 84 comments Apr − 10 commenters 96 comments 11 commenters 110 comments 9 commenters 83 comments 09 8 commenters 84 comments 18 Apr Friday, January14,2011 14 commenters 82 comments .Ohreape fctznsci- citizen of examples Other ]. − May 11 commenters 110 comments 09 − 09 14 commenters 82 comments Number of comments 10 20 30 40 0 7 commenters 42 comments May Feb Jun 8 11 commenters 76 comments − − ]. − 09 09 09 10 7 commenters 42 comments ,dsrbtdcol- distributed ], Jul Jun 11 commenters 76 comments − 09 − 09 Post author Number ofcommentsperday onTao's blog Mar − Tao Gowers 09 Jul 7 11 , − 8 09 ]. ], Apr o eryoemlinglxe [ galaxies million one network nearly observation for bird volunteer a [ eBirds, include ence their by of separated contributors the blog, and each leaders, on two day the blogs. per of comments each from of comments Number 2. Figure sntanwpeoeo omteais udeso math- of Hundreds mathematics. to scale phenomenon large new a a on not is collaboration out, pointed “ Gowers as possible? title course, provocative mathematics the collaborative with post massively blog Timothy a published when Gowers began Project Polymath The PROJECT POLYMATH THE OF BACKGROUND project. the investigating few to approach a quantitative been date have to there Project an [ Polymath While is the describing generally qualitatively research. more studies of applied how area and and improved important success be its has might underlying it they principles that in the Un- remarkable scientists. derstanding between truly collaboration scale large is at Polymath succeeded context this In researchers. between collaborations scale large rather supporting researchers than individual for resources standardizing for and vocabulary controlled [ dynamic a biology provide to aims which the [ including spurred papers has scientific exceptions, which multiple Parts, notable Biological Standard suc- some of scientific Registry are MIT collaborative There scale large to cesses. led has this How- ever, sup- globe. wikis the around and communication bound- blogs, asynchronous time email, porting as and such geographical technologies span with to Inter- aries, easier The it made rare. collabora- has been online net has scale scientists established large between successes, tion these to contrast [ In date to solved been internal have problems for world 400 intensive over labor the labs; or R&D around difficult a too scientists problems into 120,000 solve tap to than who more Innocentive, of as network such companies by sourced pro- [ predict to folding scientists tein help game online an play volunteers 29 2 − 09

, Number of comments Number of comments 10 20 30 40 10 20 30 40 0 0 ;GlxZo hc a rvddvsa classifications visual provided has which GalaxyZoo, ]; 15 Feb Feb ) oorkoldeor stefis td otk a take to study first the is ours knowledge our to ]), − − 09 09 1 .Teeecpin,hwvr ou nsharing on focus however, exceptions, These ]. Number ofcommentsperday onGowers's blog Number ofcommentsperday onTao's blog 5 May .Ee cec-ae & sbigcrowd- being is R&D science-based Even ]. Mar Mar − 09 − − 09 09 24 ;adteGn nooyproject, Ontology Gene the and ]; Apr Apr Jun − − 09 09 − 09 22 ;adFlI,i which in FoldIt, and ]; Comments by contributors Comments by Tao Comments by Gowers May May − − 09 09 ”[ 14 .Of ]. Jun Jun 21 − − 09 09 Is ]. ematicians produced over 500 publications in the second half 400 of the 20th century in a successful collaborative attempt to 250 the classify all the finite simple groups in abstract algebra. 200 300 Although this was indeed a massive collaboration, the nature 150 200 of the collaboration was quite different from what Gowers 100 100 was proposing, since in the case of finite simple group clas- 50 sification, the larger task being solved had hundreds if not 0 0 Number of comments Number thousands of natural and predefined subtasks whose solu- Participants of publications Number Identifiable parcipants tions were largely independent of the solutions to the others subtasks. Gowers was instead proposing to collaboratively Figure 3. Right: Number of Google Scholar publications of each of 29 solve a single problem that “does not naturally split up into the users who tied their blog identity to their real names, sorted from a vast number of subtasks.” lowest to highest by number of publications. Left: Count of numbered- comments left by the 39 users, sorted from lowest to highest by com- Gowers outlines three potential advantages for such large- ment count. scale collaborations in mathematics. First there is the role of chance in problem solving. Having more people work The Polymath Wiki on a problem increases the odds that one person will get A wiki for the project was created and hosted by physicist lucky and discover a great insight. Second, different mathe- Michael Nielsen1. The wiki primarily serves as a clearing- maticians have different areas of expertise, so having a large house for background material, summaries, and preliminary number of contributors work on a problem creates a collec- write-ups relevant to work being done on the blogs. Each tive expertise that cannot be achieved by small groups of problem the community decides to work on has its own de- contributors. Finally, Gowers argues that different people voted page on the Polymath Wiki page where material re- have different characteristics, and will thus assume different lated to that problem is posted. roles in the problem solving process. Some contributors will be proficient at coming up with new ideas or approaches, others will perform complex calculations, still others will The Polymath Blog “fact check” the work of their piers. In short, having a large A dedicated blog was created primarily for administrative discussions, and for high level summaries of the progress pool of people working on the problem allows for a greater 2 opportunities for people to specialize in the problem solv- being made on the other blogs . Additionally, some of the ing process. These advantages are succinctly summarized discussions about what problems the community wishes to by Gowers: work on are carried out on the Polymath Blog. Friday, January 14, 2011 If a large group of mathematicians could connect their The Ground Rules for Participation brains efficiently, they could perhaps solve problems In many ways Gowers’s proposal was an experiment. For the very efficiently as well. [14] experiment to be a success, it would not be enough to simply prove a theorem via blog comments. It is instead, the way Of course there are risks to this approach too, most notably the problem is ultimately solved that is essential to the ex- the possibility that an individual might use material being periment’s success. For instance, if one contributor decided developed on the blog to individually publish a great break- to work independently from the group continuously for days through. and solved the problem on his own, this would clearly not be an example of massively collaborative mathematics. To min- imize the chances for this type of failure, and to generally set Organization of the Polymath Project the tone of the project, Gowers proposed a set of 12 ground Following Gowers’s initial problem proposal, the commu- rules. Many were intended to guide the nature of the com- nity rapidly organized itself in order to begin work on solv- ments that participants produce to maximize the chance that ing the problem. Contributors made use of numerous and the experiment succeeds. Several rules explicitly discourage distinct web-tools available to them to fulfill their needs. The collaborators from working extensively on their own on the resulting suite of tools that they assembled was highly de- problem without discussing their progress on the blog. Ad- centralized and spread across at least 5 Internet domains. We ditionally, there are rules that are designed to make sure the describe these components of the Polymath Project below. discourse remains polite and focused, while also not discour- aging comments that may not be fully thought out. Finally Blogs of Noted Mathematicians the last ground rule specifies that any publications that re- The primary venue for participants to work on the Polymath sult from work done on the blog will be published under a problems is within the posts and comments of the personal pseudonym, and will contain links to the full discourse for blogs of noted mathematicians. Initial work began on Gow- readers to examine. ers’s blog just days after his initial post. On February 1st, mathematician wrote a general post discussing Gower’s initiative, and shortly after that Tao started an offi- cial post working on Polymath from his blog as well. Since then other mathematicians have also on occasion hosted Poly- 1http://michaelnielsen.org/polymath1 math work on their blogs. 2http://polymathprojects.org/ The Polymath Projects Previous day: bβp-value Currently there are five official Polymath Projects (denoted Comments by contributors 0.24 0.24 0.044 by Polymath1 through Polymath5). Each of these projects Comments by leaders 0.35 0.41 < 0.001 focuses on a significant unsolved problem in mathematics. These five projects make up the core of the serious work Table 1. The extent to which comments on the previous day spur con- being done by the group. In addition, there are also two tributor comments on the current day. Raw b-coefficient, standard- ized β-coefficient estimates and p-values are shown from a multiple Mini-Polymath Projects, which are intended mainly as puz- linear regression. The two independent variables are the number of zle exercises and are not serious open problems, and a wiki comments by the two leaders and the number of comments by the con- page dedicated to the discussion of the purported proof by tributors on the previous day. The dependent variable is the number of Deolalikar that P = NP [9]. To date, on the 5 Polymath comments by the contributors on the current day. Projects and 2 Mini-Polymath￿ Projects, there have been over 275 unique commenters, who have left over 5000 comments containing well over 20000 equations. attempt, whereas comments that did not receive a number were typically administrative in nature, commenting on the process, rather directly contributing to the proof. We call POLYMATH1 these two types of comments numbered comments and meta- In this work, our analysis will focus almost entirely on the comments respectively. Polymath1 project, which was the first, and to date most suc- cessful of the Polymath Projects. At the time of this writing, Since we wish to study the act of proving a theorem collab- Polymath1 was the only project that was complete, making oratively, we will primarily focus on numbered comments it a particularly appealing artifact to study in its entirety as and the participants who write these numbered comments. an example of successful collaboration. This is not to say the meta-comments are without value. On the contrary, some meta-comments played a critical role in The target problem for Polymath1 was to find a combina- the organization of the discourse. Rather, we will primar- torial proof of the density version of the Hales-Jewett the- ily treat numbered-comments as quantitative data, and meta- orem, a theorem in . Although proofs exist, comments as qualitative data. they have relied on techniques from ergodic theory [12] and are non-combinatorial. There is often great value in math- In total 39 distinct users contributed at least one numbered ematics to finding multiple proofs for one theorem. In par- comment to Polymath1. Although, this number seems small ticular, elementary proofs (ones that rely only on first prin- compared to the thousands of users who actively edit popular cipals) such as what Gowers was proposing, are often most Wikipedia articles, it is large in comparison to the number valued for their instructiveness. As G. H. Hardy said, “the of authors who traditionally collaborate on the proof of a history of mathematics shows conclusively that mathemati- theorem. The number of comments per user is shown in cians do not evacuate permanently ground which they have Figure 3 (left). Similar to other online content producing conquered once [17].” communities, a majority of the content is produced by a very small number of users. Indeed the top three commenters Success from a mathematical point of view in Gowers’s words produced 55% of the comments, and the top ten commenters would be “anything that could count as genuine progress to- produced nearly 90% of the comments. wards an understanding of the problem.” In this strict sense, Polymath1 was successful well beyond Gowers’s expecta- Of these 39 users, 29 linked their online profiles to their true tions. To date there are two journal articles that have been identity (the remaining 10 were either anonymous, or we submitted [26, 25]. Additionally, at least two new proofs were unable to map their account to a true identity). Their of the theorem, plus new bounds on Hales-Jewett numbers range of backgrounds varies widely, from Fields Medalists were discovered. In this work we will examine whether and winners to high school mathematics teachers, however the to what extent Polymath1 was successful in the larger sense, distribution of their backgrounds is relatively homogeneous; that is as a collaborative framework for doing higher mathe- almost all of the participants are either university professors matics on a large scale. or graduate students in mathematics or computer science and almost all of them are male (28 male, 1 female). Participation and demographics Between February 1 and May 23, 2009, there were 1555 What does vary fairly drastically among these 29 is their comments written on the 14 blog posts of Polymath1. These level of seniority in the higher-mathematics community as 14 blog posts were distributed across two mathematics blogs; measured by number of academic publications. To measure 8 of which were written by Timothy Gowers3 and 6 were this, we use the Google Scholar author search restricted to written Terence Tao4. Of the comments, 1228 were assigned mathematics, computer science, engineering, and physics ar- a sequential and unique “comment number” by the partic- ticles. Figure 3 (right) shows the number of Google Scholar ipants so that they could be easily and succinctly referred publications for each of the 29 identifiable participants. The to by subsequent comments. The comments that received median number of publications is 35, and the mean is 65.2. a number typically were direct contributions to the proof It is also notable that there are 4 participants with 0 publica- tions, 6 with 1 publication or less, and 8 with 5 publications 3http://gowers.wordpress.com or less, showing that there is a mix of well established math- 4http://terrytao.wordpress.com ematicians, and those who are early in their career. THE ROLE OF LEADERSHIP very important in guiding and driving the work of the con- Research in online content producing communities has shown tributors. To address this hypothesis, we examined whether that leadership can be essential to the production of high or not the leader comments on any given day actually spurred quality content [27, 23]. We find that this is also true of increased contributor activity on the next day. To control for the Polymath1 project. the fact that comments in general on any given day (both from leaders and contributors) could spur increased contrib- From its inception, leadership was essential to the Polymath utor activity on the next day, we compare the effects of leader Project. The idea was initially conceived and implemented comments to contributor comments. Table 1 shows the re- by Timothy Gowers, a well know mathematician and Fields sults of a multiple regression, where the two independent Medal winner. It is natural then that Gowers would take on a variables are the number of comments by the leaders and number of leadership duties in the Polymath1 project. In ad- the number of comments by the contributors on the previ- dition to setting the guidelines for participation, and select- ous day. The dependent variable is the number of com- ing the problem for Polymath1, it was Gowers’s intention to ments by the contributors on the current day. This shows that host the work in the comments section of his mathematics contributor and leader comments are both significant factors blog. This made him best positioned to be the person that in spurring activity on the next day. However leader com- guides the progress on the problem, if for any other reason ments have a much greater effect on spurring new comments than he has administrative privileges on the blog and others (β =0.41 for leader comments, compared to β =0.24 for do not. It was Gower’s role to summarize past progress and contributors). guide future progress with each new blog post.

Very soon after Gower’s initial post, fellow Fields medalist COORDINATION AND THREADING Terance Tao wrote a blog post discussing Gowers’s project Collaboration among many people on a task with a poten- that quickly would become a second thread to Polymath1, tially large set of interdependencies can require a great deal running in parallel to Gowers’s first post (see Figure 1). Like of coordination. Here we look for evidence of coordination Gowers, this naturally positioned Tao to take various leader- and interdependencies by examining the attempts of users to ship roles, both on his blog, and on Polymath1 in general. parallelize their efforts.

Though there were certainly other participants that at vari- Threading ous times showed leadership, Gowers and Tao were leaders The decision to have a fixed number of comments (roughly in Polymath1 throughout. For this reason, we will refer to 100) on any single blog post was consequential in opening Gowers and Tao as the leaders of Polymath1, and we refer the possibility of parallel blog threads (i.e. two or more blog to the collective readership of their blogs as the contributors. posts with active comments). Setting a fixed number of com- ments per post allowed the the of comment numbers Blog posts summarizing progress for any post to be allocated in advance, thus maintaining a In a meta-comment near comment number 350, Tao pro- sequential numbering system on parallel posts. poses that every post should be limited to 100 comments. Although this proposal was out of necessity, as it allowed At the same time, very early in the comments of the first post, threads to run in parallel while still being numbered sequen- parallel discussions began to develop on different possible tially (since the numbers for a thread could be allocated in proof approaches. Participants attempted to use the limited advance), there were important unforeseen consequences. tools made available to them by the blog site to organize Ending a thread after 100 comments allowed for the leaders these discussions. They annotated the subjects of comments to make fairly frequent summary posts, highlighting what with either a thread name, or a list of relevant comment num- they thought was relevant in the previous thread and guiding bers. It was clear very early that there was certainly potential the contributors on the next thread. This proved to be one of for some parallelism in discussing distinct approaches to the the essential roles of the leaders throughout the project. problem.

Leader comments versus contributor comments Dual threads: Gowers and Tao Aside from this very important role of summarizing the This potential for parallel discussions was partially realized progress, the leaders also made direct contributions to the when Tao and the readers of his blog began working on Poly- proofs by posting comments. On their own Gowers and math1 in parallel to Gowers. In Figure 1, note the high de- Tao were the top two commenters on Polymath1. Gowers gree of temporal overlap between work being done on Gow- contributed 285 numbered comments, and Tao contributed ers’s weblog and that being done on Tao’s weblog. 232 numbered comments, thus in total there were then 517 leader comments, which was 42% of all numbered com- Despite this overlap, it is difficult to claim that this is evi- mented (readers contributed a total of 711 numbered com- dence of coordination and organization, since the work being ments). done by Tao and his readers was related but distinctly differ- ent from the work being done on Gower’s blog. Tao and Judging by volume alone, it is clear that Gowers and Tao his readers were trying to compute exact bounds on Hales made significant contributions to the work of the proof. How- - Jewett numbers for small dimensions, while Gowers and ever we hypothesize that their comments were themselves his readers were attempting to prove the general theorem. Cross refrencing comments on other blog posts

14

13

12

11

10 In general there was little evidence of any coordination con- Cross refrencing comments on other blog posts 9 straints between work being done across the two blogs, which 14 allowed them to operate almost indendependently.8 "To" post author 13 7 Gowers 12 Tao However, aside from the parallelism6 of the dual Gowers /

Tao posts, there is very little evidence post number "To" of any successful par- 11 5 allel threads despite explicit attempts to parallelize both blogs. 10 Parallelism was attempted at least4 once by Gowers. He in- 9 tended that post number 4 proceed3 in parallel to post number 8 3 on a separate but related topic.2 However, note in Figure 1 that work on post 3 ends soon after the creation of post 4, and 7 1 well before the allocated 100 comments for that post. In a 6

meta-comment on the success of threading, Gowers remarks post number "To" 1 2 3 4 5 6 7 8 9 10 11 12 135 14 "From" post number Im not completely convinced that the multiple-threads 4 idea has worked. Or rather I think it has partially worked. 3

It seems that the upper-and-lower bounds thread was 2 sufficiently separate that it has thrived on its own. But the experience of the other two threads was that once 1 the 400 thread was opened the 300 thread died pretty 1 2 3 4 5 6 7 8 9 10 11 12 13 14 rapidly. "From" post number Tao had some success in parallelizing his blog, as post 6 was active in parallel to each of post 2 and post 7. However, Figure 4. A visualization of the structure of the comment reference graph with respect to inter- and intra-post comment references. Along the nature of post 6 was quite different than the others, as it the diagonal shows the number of intra-post comment references. Off was designed to be an interactive “reading group” to discuss of the diagonal shows inter-post references. The size of the bars is pro- background material. Aside from this one instance, all other portional to the number of such comments. posts comprised one linear thread.

Although parallel threading was not as successfulFriday, as January the par- 14, 2011 to the originating post author. Figure 4 illustrates that there ticipants had hoped, there were unintended benefits to this was a fair amount of comment referencing, highlighting the approach. Gowers, best summarizes this: “An advantage of importance and usefulness of the sequential comment num- the multiple threads was that we had multiple summaries: I bering scheme. Although inter-post comment references did think it would be quite good to be forced to summarize more exist, the majority of comment references were intra-post, often.” In this sense having multiple threads was a success, suggesting that the discussion was largely localized and fo- though not in the way originally envisioned. cused. This is further quantitative evidence of the linear nature of the work, perhaps suggesting that comment ref- erences might have been constrained by the linearity of the The comment reference graph blog. To examine the nature of dependencies in the comments, we constructed a directed graph based on comment references. Nodes of the graph are numbered comments, and directed THE BURDEN TO NEWCOMERS edges are placed from node b to node a if comment b explic- It has been shown that socializing newcomers is an essen- itly mentions the comment number for a. There were 655 tial component to sustaining a healthy and productive on- edges, 702 non-isolated vertices, and 135 weakly connected line community [4, 3]. The level of technical sophistication components of the comment reference graph. The mean of the Polymath content and the pace at which progress on component size was 5.2, the median was 2, and the standard problems moves means that it is necessary to have measures deviation was 25.5. The largest component contained 299 in place to allow newcomers to quickly and efficiently get up comments. The second largest had 15 comments. This large to speed with the material. gap in size suggests the largest component might have com- prised the “main discussion” and the remaining components Figure 5 shows the cumulative number of contributors to were side discussions. Polymath1 over time, split by the number of comments the users would ultimately contribute to the project. This figure Figure 4 shows a visualization of the structure of the com- shows that the core set of contributors started contributing ment references with respect to the 14 blog posts. The columns to the project very early. By the end of the first week of indicate the post number of the originating comment, and the project, over 2/3 of the contributors who would go on rows indicate the post number of the referenced comment. to produce at least 10 comments had already joined. This Thus the diagonal shows the number of intra-post comment could suggest that once the project started, the technical na- references, and off the diagonal shows inter-post comment ture of the generated content made it difficult for newcomers references. The size of each bar is proportional to the num- to join. This hypothesis is supported by the meta-comments. ber of such references, and the color of the bar is according Gowers, at one point remarks 5 40 Polymath1 timeline , which was created on the wiki to high-

30 light the comments that were important milestones to the proof. We will use the timeline to to better understand which 20 Users with at least 1 comment Users with at least 5 comments comments were important as judged by the participants them- 10 Users with at least 10 comments selves. In total there were 215 numbered comments listed in 0

Commenting Users Commenting the timeline, from 22 distinct users (18% of all numbered

Cumulative Number of Number Cumulative Feb-09 Mar-09 Apr-09 May-09 comments, 56% of all users). Out of the users who only commented 5 times or less, there were 6 comments listed Figure 5. Cumulative number of users who join the project grouped by in the timeline from 5 distinct users (17% of the 36 com- total number of comments. ments from these users, 23% of these users). Thus it was not uncommon for users who commented very infrequently to nevertheless have a large impact on the proof. A number of people have commented that it is difficult to catch up with the discussion if you come to it late. We should try to start each thread in a way that makes Comment graph centrality it as self-contained as possible. Of course a comment does not need to be a milestone in the proof for it to be a meaningful contribution. For instance, There is additional evidence that the work done on Gower’s imagine if someone leaves a comment that by itself is not blog required a significant amount of background knowledge a milestone, but it links together several other contributions in combinatorics in order to participate, in contrast to the in an important way. Or perhaps, the user leaves a com- problem on Tao’s blog, which was more elementary. On this ment that directly contributes to the the ideas of a subsequent issue, one participant remarked: milestone comment. To look for evidence of such contribu- tions, we can examine the structure of the comment refer- Regarding the amount of background knowledge re- ence graph. quired, while on the thread here it did stray through in- timidating territory, [...] on Terry’s blog the arguments We first compute a series of graph centrality measures on [...] were relatively elementary. each node of the comment reference graph. The centrality measures we use are: eigenvector centrality, betweenness This burden may have discouraged a number of qualified in- centrality, authority score, Bonpow centrality, closeness cen- dividuals who were interested in the project from actually trality, alpha centrality, page rank, in degree, and total de- participating. This also may explain why there were more gree. We next compute the ranking on the vertices that is individuals contributing to Tao’s blog. Over the course of the induced by each centrality measure. Finally, we aggregate project 32 people contributed to Tao’s blog, 20 contributed the rankings by averaging them to produce a single ranking to Gowers’s blog, with 14 contributing to both. of the 1228 comments.

In addition to the burden of keeping up with the fast pace, Figure 6 is a visualization of three variables that are essential and the burden of mastering the required background knowl- to understanding the nature of participation and contribution edge, an additional barrier to participation emerged from the in Polymath1. Each of the identifiable users are plotted ac- meta-comments. One participant was discouraged by the cording to the number of comments they contributed to to large volume of comments that were not fully “thought out” the project, and the maximum ranking of all of their com- that nevertheless required a significant amount of time to val- ments. Additionally, the size of each point is proportional to idate. Despite making some important contributions early the number of publications of that user according to Google on, he concluded his time was “better spent on problems that Scholar. do not get as much attention.” There are a number of observations that one can immedi- All of this evidence suggests that the project could have ben- ately make from this Figure. First, the leadership of Gowers efited from targeted mechanisms to aid and encourage new- and Tao can be seen instantly. Not only did Gowers and comers to join and continue to participate. Tao have the highest, and second highest number of com- ments respectively, they produced the second highest and third highest ranking comments respectively under our rank- DOES EVERYONE REALLY CONTRIBUTE? ing method. Additionally, it can be seen that there is a third As we noted, there were many users who only contributed a potential leader, that had the highest ranked comment, and few comments to the project. Indeed, there were 13 partici- the third highest number of comments overall. Like Gowers pants who only contributed 1 comment, and 21 participants and Tao, in addition to being a prolific contributor, this in- contributed 5 comments or less. This begs the questions, dividual took on various administrative roles in the project, did they all really contribute to solving the problem? Can including being in charge of one of the final write-ups. one or two comments from a user who never comments to the project again really have any impact on the final solution In addition to observing leadership in the upper right quad- when there are over 1200 comments in total? Perhaps all but rant, we can see that there are a number of contributors in the top few contributors were irrelevant? the bottom right quadrant. These are individuals who were To investigate these questions, we first turn to the official 5http://michaelnielsen.org/polymath1/index.php?title=Timeline Gowers !

250 took a significant number of administrative duties in order Tao Gowers ! Comments by: to guide the project to its mathematical success. Addition- ! Gowers 200 250 ally, it is evident from the number of popular press articles ! Tao Tao written about Polymath, that their collective celebrity had ! Contributors 200 a large role in both the publicity the project received and ! 150 Publication count: in the recruitment of contributors. Their involvement was ! 0 ! 150 ! 100 so significant all throughout, that it is questionable whether ! 200 ! the Polymath project would have succeeded without their ef- 100

Number of comments 300 100 ! ! forts. Whether or not large-scale collaborative mathematics

! ! ! Number of comments 400 can succeed without such high-profile leaders is of course an ! ! ! 50 open question. ! 50 ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! Incentives for participation !! ! ! ! !! !! ! ! ! ! ! In order to maintain a diverse community, incentives for par- 400 600 800 1000 400 600 800 1000 Ranking of highest ranked comment Ranking of highest ranked comment ticipation should be such that all levels of mathematicians have incentive to participate. This may be difficult to main- Figure 6. Here we show the ranking of the highest ranked comment for tain in practice unless the individuals are able to get proper each user together with the total number of comments from that user. academic attributions for the work they do in Polymath. There Points are sized proportional to the number of scholarly publications is a potentially large misalignment in incentives between the of that user (as given by Google Scholar). demands of a professional academic career in mathematics, and long-term involvement in the Polymath Project. This not prolific contributors, but who nevertheless made signif- misalignment is best expressed by an anecdote told by a icant, highly ranking comments. Finally, using the number Polymath contributor during meta-discussion. While having of Google Scholar publication as a proxy for seniority, one a conversation about the Polymath with an offline colleague, can see that the level of seniority of the participants varies the colleague asked “how will participants put their grant ac- greatly. Furthermore, if you discount Tao and Gowers, it knowledgment lines” in the final document? does not seem to correlate to volume or importance of con- tribution. This begs several other questions regarding the sustainabil- Thursday, January 13, 2011 ity of long-term involvement. How do the universities that To put this finding in more precise terms, the 662 comments employ the participants receive attribution? Could someone that received highest ranking (those above the median) were justify to their university that they deserve tenure if all of produced by 28 users. Conversely, there were only 11 users their work is published collectively? Unlike work in open whose comments all ranked less than the median ranking. source software, where there are significant economic incen- We feel this result is quite surprising given the volume of tives at play at the heart of these communities, it’s unclear comments written by the top contributors (recall 90% of the that the economics of truly large scale collaborative mathe- comments were written by the top contributors). matics are scalable absent the right incentive structure.

DISCUSSION Design recommendations for encouraging newcomers Was Polymath1 a success? From a mathematical point of We have discussed several barriers to entry that might dis- view, the answer is unequivocally, yes. In just over 6 weeks courage newcomers from participating. Here we focus on a collective group of distributed individuals accomplished three challenges: identifying the important content to read, some highly non-trivial mathematical feats. Whether or not identifying tasks to work on, and learning the required back- the project was a success as an experiment in “massive-scale” ground material. collaboration, requires a more nuanced discussion. Identifying important content Certainly, the scale of the project could hardly be considered We have shown that one potential problem discouraging new- “massive,” falling short of the vision originally put forth by comers is the large volume of content that is generated by ex- Gowers [15]. However, it is clear that the scale was large isting users, together with the lack of any mechanisms that compared to most collaborations in higher math. It is rel- sort out important content from unimportant content. Al- atively uncommon for a mathematics publication to have though it is inevitable that newcomers must read some exist- more than 3 authors, let alone 40. Additionally, although the ing content in order to become acclimated, when so much of core set of contributors by volume was small, we have shown the work done in mathematics is validating ideas that may or that several users were still able to make important contribu- may not work, participants may be discouraged by reading tions to the work, despite producing a small percentage of content that has already been invalidated by others. New- the comments. comers may feel that they are simply replicating work that has already been done by the existing users in order to get The ambiguity regarding the success of Polymath1 is ampli- acclimated to the project. Thus in addition to the cost associ- fied by the roles played by the two leaders. Not only were ated to with reading a potentially large volume of irrelevant Gowers and Tao by far the top contributors to the project content, there is the added cost associated with the redun- in terms of number of comments, but they also each under- dancy of this task. We have two proposals to address this challenge. First, there are both temporary solutions. Ideally there would be some should be better explicit linkage among the comments uti- structural mechanism in the design of the collaborative web- lizing the explicit comment referencing done by the partic- site that deals with this issue. One approach would be to ipants. For any comment, there should be automated tools have a forum section of the website which is designed to that generate and display all the comments that refer to it. discuss and ask questions related to background material. This might be accompanied through a visualization of the comment reference graph. This mechanism will allow new- This would ideally be integrated so that primary content could comers to see where each comment fits into the larger body link and refer back to the forum in a structured way. This of work. Second, there should be mechanisms for users to would help with the problem of identifying what background rate the importance of each comment. Ideally this would in- material should be studied to understand a given piece of clude both a numeric rating of importance, such as a “star” primary content. For the most part, the wiki was useful as mechanism, as well as a categorical importance rating where a clearinghouse for background material, however the lack old-timers could, for example designate a given comment as of integration between the wiki and the primary content be- “Important for getting up to speed” or “A core contribution.” ing produced on the blogs was likely prohibitive. Like com- One might also pair socially generated ratings, with algorith- ments, different pieces of background material have differ- mically generated ratings, such as the graph-based comment ing and evolving levels of importance. Tools that connect ranking that we employed in this work. the primary content back to the secondary content and visa versa, could significantly reduce the burden of newcomers Identifying tasks to work on by helping them focus on what background material they Similar structural problems make it difficult for newcom- need to understand most to contribute to any given area. ers to identify tasks to work on. Research on task identi- fication in Wikipedia suggests that automated mechanisms CONCLUSION can significantly improve productivity [6]. The manifesta- In this work, we examine the Polymath Project, an experi- tion of this challenge in Polymath was outlined in Gowers & ment in large scale collaborative mathematics, to better un- Nielsen derstand its successes and shortcomings as a online commu- nity and tool for open collaboration. We find that a large A significant barrier to entry was the linear narrative percentage of the content is produced by small number of style of the blog. This made it difficult for late entrants individuals. However, many individuals that produce a low to identify problems to which their talents could be ap- volume of content are still producing high quality, influen- plied. There was also a natural fear that they might have tial content. We also examined the role of leadership, and missed an earlier discussion and that any contribution the burden to newcomers. Based on our analysis, we gave a they made would be redundant. [15] set of design recommendations for encouraging newcomers.

They suggest using a task management system similar to REFERENCES those in use by open source software communities. To some 1. M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, extent the participants attempted to use the wiki for these H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. purposes. However, without the explicit assignments of tasks Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, to people, this likely had little affect. We agree that a task L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, management system would be of use here, however we J. E. Richardson, M. Ringwald, G. M. Rubin, and strongly suggest that such a system be designed to have ex- G. Sherlock. Gene ontology: tool for the unification of plicit tasks for acclimating newcomers. These tasks need biology. the gene ontology consortium. Nature not be content producing tasks, but could be administra- genetics, 25(1):25–29, May 2000. tive tasks, or even tasks designed to get the newcomer up 2. M. J. Barany. ’[b]ut this is blog maths and we’re free to to speed. We suspect that task assignment, or automated make up conventions as we go along’: Polymath1 and task suggestion might not be useful for experienced Poly- the modalities of ’massively collaborative math contributors, and could even discourage them, since mathematics’. In WikiSym ’10: Proceedings of the 6th they are more familiar with the content and their own abili- International Symposium on Wikis and Open ties, and thus may not work well with the rigidity of a task Collaboration, pages 1–9, New York, NY, USA, 2010. assignment system. ACM. 3. M. Burke, R. Kraut, and E. Joyce. Membership Claims Dealing with background material and Requests: Conversation-Level Newcomer Finally, it is natural that many Polymath projects will require Socialization Strategies in Online Groups. Small Group a significant amount of background material to participate, Research, 41(1):4–40, 2010. as is often the case in communities of practice. However, this should not be so prohibitive that it precludes someone with 4. M. Burke, C. Marlow, and T. Lento. Feed me: sufficient technical abilities from participating. The commu- motivating newcomer contribution in social network nity has attempted to address this issue by suggesting longer sites. In Proceedings of the 27th international “lead times” for the start of new projects, for potential par- conference on Human factors in computing systems, ticipants to research the background material. Additionally, CHI ’09, pages 945–954, New York, NY, USA, 2009. they experimented with “reading group” blog threads. These ACM. 5. S. Cooper, F. Khatib, A. Treuille, J. Barbero, J. Lee, 20. A. Kittur and R. E. Kraut. Beyond wikipedia: M. Beenen, A. Leaver-Fay, D. Baker, Z. Popovic, and coordination and conflict in online production groups. F. Players. Predicting protein structures with a In Proceedings of the 2010 ACM conference on multiplayer online game. Nature, 466(7307):756–760, Computer supported cooperative work, CSCW ’10, August 2010. pages 215–224, New York, NY, USA, 2010. ACM. 6. D. Cosley, D. Frankowski, L. Terveen, and J. Riedl. 21. K. R. Lakhani and J. A. Panetta. The principles of Suggestbot: using intelligent task routing to help distributed innovation. Innovations: Technology, people find work in wikipedia. In Proceedings of the Governance, Globalization, 2(3):97–112, July 2007. 12th international conference on Intelligent user 22. C. Lintott, K. Schawinski, S. Bamford, A. Slosar, interfaces, IUI ’07, pages 32–41, New York, NY, USA, K. Land, D. Thomas, E. Edmondson, K. Masters, 2007. ACM. R. Nichol, J. Raddick, A. Szalay, D. Andreescu, 7. J. N. Cummings and S. Kiesler. Collaborative research P. Murray, and J. Vandenberg. Galaxy zoo 1 : Data across disciplinary and organizational boundaries. release of morphological classifications for nearly Social Studies of Science, (35), 2005. 900,000 galaxies. Jul 2010. 8. J. N. Cummings and S. Kiesler. Coordination costs and 23. K. Luther and A. Bruckman. Leadership in online project outcomes in multi-university collaborations. creative collaboration. In Proceedings of the 2008 ACM Research Policy, 36(10):1620–1634, 2007. conference on Computer supported cooperative work, CSCW ’08, pages 343–352, New York, NY, USA, 9. V. Deolalikar. P neq np, unverified claim, 2010. 2008. ACM. 10. T. A. Finholt. Collaboratories. Annual Review of 24. J. Peccoud, M. F. Blauvelt, Y. Cai, K. L. Cooper, Information Science and Technology (ARIST), O. Crasta, E. C. DeLalla, C. Evans, O. Folkerts, B. M. 36:73–107, 00 2002. Lyons, S. P. Mane, R. Shelton, M. A. Sweede, and S. A. Waldon. Targeted development of registries of 11. T. A. Finholt and G. M. Olson. From laboratories to biological parts. PLoS ONE, 3(7):e2671+, July 2008. collaboratories: a new organizational form for scientific collaboration. Psychological Science, (8):28–35, 1997. 25. D. H. J. Polymath. Density hales-jewett and moser numbers, preprint, available at .org/abs/1002.0374, 12. H. Furstenberg and Y. Kaznelson. A density version of 2009. the hales-jewett theorem. J. Anal. Math, (57), 1991. 26. D. H. J. Polymath. A new proof of the density 13. O. Gassmann and M. von Zedtwitz. New concepts and hales-jewett theorem, preprint, available at trends in international randd organization. Research arxiv.org/abs/0910.3926, 2009. Policy, 28(2-3):231 – 250, 1999. 27. J. M. Reagle, Jr. Do as i do:: authorial leadership in 14. T. Gowers. Is massively collaborative mathematics wikipedia. In WikiSym ’07: Proceedings of the 2007 possible? http://gowers.wordpress.com/2009/01/27/is- international symposium on Wikis, pages 143–156, massively-collaborative-mathematics-possible/. New York, NY, USA, 2007. ACM. 15. T. Gowers and M. Nielsen. Massively collaborative 28. G. C. Rota. The phenomenology of mathematical mathematics. Nature, 461(7266):879–881, 10 2009. proof. Synthese, 111:183–196, 1997. 10.1023/A:1004974521326. 16. C. Gutwin, R. Penner, and K. Schneider. Group awareness in distributed software development. In 29. B. L. Sullivan, C. L. Wood, M. J. Iliff, R. E. Bonney, Proceedings of the 2004 ACM conference on Computer D. Fink, and S. Kelling. ebird: A citizen-based bird supported cooperative work, CSCW ’04, pages 72–81, observation network in the biological sciences. New York, NY, USA, 2004. ACM. Biological Conservation, 142(10):2282 – 2292, 2009. 30. W. Thurston. On proof and progress in mathematics. In 17. G. H. Hardy. Mathematical proof. Mind, 38(149):1–25, R. Hersh, editor, 18 Unconventional Essays on the 1929. Nature of Mathematics, pages 37–55. Springer New 18. B. Kanefsky, N. G. Barlow, and V. C. Gulick. Can York, 2006. 10.1007/0-387-29831-2 3. Distributed Volunteers Accomplish Massive Data 31. L. Trevisan. A people’s history of mathematics. Analysis Tasks? In Lunar and Planetary Institute http://lucatrevisan.wordpress.com/2009/02/01/a- Science Conference Abstracts, volume 32, page 1272, peoples-history-of-mathematics/. Mar. 2001. 32. H. Wickham. ggplot2: elegant graphics for data 19. A. Kittur and R. E. Kraut. Harnessing the wisdom of analysis. Springer New York, 2009. crowds in wikipedia: quality through coordination. In Proceedings of the 2008 ACM conference on Computer 33. S. Wuchty, B. F. Jones, and B. Uzzi. The Increasing supported cooperative work, CSCW ’08, pages 37–46, Dominance of Teams in Production of Knowledge. New York, NY, USA, 2008. ACM. Science, 316(5827):1036–1039, 2007.