The Polymath Project: Lessons from a Successful Online Collaboration in Mathematics

The Polymath Project: Lessons from a Successful Online Collaboration in Mathematics Justin Cranshaw Aniket Kittur School of Computer Science School of Computer Science Carnegie Mellon University Carnegie Mellon University Pittsburgh, PA, USA Pittsburgh, PA, USA [email protected] [email protected] ABSTRACT and prior knowledge, and they have high coordination de- Although science is becoming increasingly collaborative, mands. Furthermore, professional advancement in science there are remarkably few success stories of online collab- is typically subject to a highly developed and closed-loop orations between professional scientists that actually result reward structure (e.g., publications, grants, tenure). These in real discoveries. A notable exception is the Polymath factors do not align well with the open, asynchronous, in- Project, a group of mathematicians who collaborate online to dependent, peer production systems that have typified social solve open mathematics problems. We provide an in-depth collaboration online. However, one notable exception is the descriptive history of Polymath, using data analysis and vi- Polymath Project, a collection of mathematicians ranging in sualization to elucidate the principles that led to its success, expertise from Fields Medal winners to high school math- and the difficulties that must be addressed before the project ematics teachers who collaborate online to prove unsolved can be scaled up. We find that although a small percentage mathematical conjectures. of users created most of the content, almost all users nevertheless contributed some content that was highly influential There are many reasons to believe why mathematics might to the task at hand. We also find that leadership played an be well suited for large-scale online collaboration. Mathe- important role in the success of the project. Based on our maticians, unlike most empirical scientists, require little ex- analysis, we present a set of design suggestions for how fu- perimental equipment to do their work, making the asyn- ture collaborative mathematics sites can encourage and fos- chronous nature of online collaborations less of a deterrent. ter newcomer participation. In addition, mathematical ideas transfer between mathematicians, especially between those who specialize in the same Author Keywords sub-field, with remarkable speed and efficiency, which might large-scale collaboration, online collaborative mathematics, help minimize coordination costs [30]. online collaborative science, online communities In many ways, the deductive nature of proofs and theorems is inherently collaborative. Although proofs themselves are ACM Classification Keywords typically produced by a small number of primary contribu- H.5.2 Information Interfaces and Presentation: Miscellaneous tors, they often rely on a rich history of prior contributions to the problem made by past mathematicians. Gian-Carlo General Terms Rotta, in reference to Wiles’s proof of Fermat’s last theo- Human Factors rem describes this notion of proof as “a triumph of collaboration across frontiers and centuries [28].” This pattern is INTRODUCTION a common one in mathematics. As Luca Trevisan explains, The Internet has made creative collaboration possible on an works of “genius” in mathematics typically occur when “a unprecedented scale: tens of thousands of people work to- large collection of ‘incremental’ results and ‘observations’ gether to consolidate knowledge (e.g. Wikipedia), or to write has been made, and all the pieces are there, ready to be put software (e.g. Apache and Linux). However, there have been together [31].” These series of incremental results are often few examples of scientific knowledge creation being done the work of a great many mathematicians sometimes span- by large online groups. Tasks such as scientific discovery ning decades or even centuries. The mathematical “genius” are difficult to carry out on a large scale for a number of is then often one who excels at putting these pieces together. reasons: they are complex, they require significant expertise According to Trevisan, massively-collaborative mathematics could re-define this idea of genius. Instead of the few who are able to “put together the zeitgeist,” genius could rather “be in the union of many minds, each doing nothing more Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are than saying what is obvious to them [31].” not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or In this paper we combine social science theory, with an in- republish, to post on servers or to redistribute to lists, requires prior specific depth descriptive analysis of data gathered from the Poly- permission and/or a fee. CHI 2011, May 7–12, 2011, Vancouver, BC, Canada. math Project, in order to explicate the factors that contributed Copyright 2011 ACM 978-1-4503-0267-8/11/05...$10.00. Number of comments per day on Tao's blog 40 Number of comments per day on Gowers's blog 14 76 comments 30 11 commenters 110 comments 13 11 commenters 40 Comments by Gowers 42 comments 12 20 7 commenters 30 Comments by Tao 83 comments 11 9 commenters Comments by contributors 96 comments 10 20 10 10 commenters 84 comments 9 8 commenters Number of comments 10 0 76 comments 14 96 comments 11 commenters 8 9 commenters110 comments 13 11 commenters Feb−09 Mar−09 Apr−090 May−09 Jun−09 97 comments 42 comments Number of comments 127 14 commenters 7 commenters Feb−09 Mar−09 Apr−09 May−09 Jun−09 34 comments83 comments 116 7 commenters9 commenters 96 comments 10 95 comments10 commenters Post number Post 5 7 commenters Number of comments per day on Tao's blog 84 comments 94 105 comments8 commenters 9 commenters96 comments Post author 8 66 comments9 commenters 40 3 7 commenters97 comments 7 14 commenters Gowers 82 comments 2 34 comments 30 6 7 commenters 14 commenters Tao 162 comments95 comments 51 13 commenters7 commenters 20 Post number Post 105 comments 4 9 commenters 66 comments 10 3 Feb−097 commentersMar−09 Apr−09 May−09 Jun−09 Jul−09 82 comments 2 14 commenters 0 1 162 comments Number of comments 13 commenters Feb−09 Mar−09 Apr−09 May−09 Jun−09 Figure 1. The time span of active comments for each blog post by blog authorFeb in−09 theMar Polymath1−09 Apr project.−09 May Between−09 Jun February−09 Jul and−09 May of 2009, there were 14 blog posts by two authors, Timothy Gowers and Terence Figure 2. Number of comments per day on each blog, separated by Tao. Vertical bars on the timeline indicate days of activity. During this comments from each of the two leaders, and the contributors of their time, a total of 39 contributors left a total of 1228 numbered comments. blogs. to its success. We also discuss design implications for tech- ence include eBirds, a volunteer bird observation network nical tools and social structures that may promote large scale [29]; GalaxyZoo, which has provided visual classifications scientific collaborations in the future. for nearly one million galaxies [22]; and FoldIt, in which volunteers play an online game help scientists to predict pro- The main contributions of this work are threefold. First, we tein folding [5]. Even science-based R&D is being crowd- find that although a large percentage of the Polymath content sourced by companies such as Innocentive, who tap into a is produced by small number of individuals, most partici- network of more than 120,000 scientists around the world pants, even those only who only contribute a few comments, to solve problems too difficult or labor intensive for internal nevertheless produce highFriday, quality January content 14, 2011 relevant to the on- R&D labs; over 400 problems have been solved to date [21]. going task. Additionally, we examine the role of leadership in Polymath, both qualitatively and quantitatively, and find In contrast to these successes, large scale online collabora- Thursday, January 13, 2011that comments from leaders have a greater effect at spurring tion between established scientists has been rare. The Inter- new discussion than non-leader comments. Finally, we look net has made it easier to span geographical and time bound- at the burden that newcomers have in getting acclimated, and aries, with technologies such as email, blogs, and wikis sup- provide a set of design suggestions that might lessen this bur- porting asynchronous communication around the globe. How- den and help increase the number of sustained contributors ever, this has led to large scale collaborative scientific suc- to the project. cesses. There are some notable exceptions, including the MIT Registry of Standard Biological Parts, which has spurred multiple scientific papers [24]; and the Gene Ontology project, RELATED WORK which aims to provide a dynamic controlled vocabulary for Although scientists have worked together for centuries [11], biology [1]. These exceptions, however, focus on sharing science is becoming progressively more collaborative, with and standardizing resources for individual researchers rather teams increasingly dominating the production of knowledge than supporting large scale collaborations between researchers. [33]. These teams are also becoming increasingly distributed, with labs spread across continents and spanning multiple in- In this context Polymath is truly remarkable in that it has stitutions [13]. However, despite the push by researchers and succeeded at large scale collaboration

The Polymath Project: Lessons from a Successful Online Collaboration in Mathematics

2006 Annual Report

A Mathematical Tonic

Sir Andrew J. Wiles

The Top Mathematics Award

Ten Mathematical Landmarks, 1967–2017

On the Depth of Szemerédi's Theorem

Sir Timothy Gowers Interview

Should Mathematicians Care About Communicating to Broad Audiences? Theory and Practice

Reviewed by John P. Burgess Why Is There Philosophy of Mathematics At

The Early Career of G.H. Hardy

De Morgan Association NEWSLETTER

Fields Medal