
The Lives and Deaths of Open Source Code Forges

Megan Squire
Elon University
Elon, North Carolina, USA

ABSTRACT
Code forges are third-party software repositories that also provide various tools and facilities for distributed software development teams to use, including source code control systems, mailing lists and communication forums, bug tracking systems, web hosting space, and so on. The main contributions of this paper are to present some new data sets relating to the technology adoption lifecycles of a group of six free, libre, and open source software (FLOSS) code forges, and to compare the lifecycles of the forges to each other and to the model presented by classical Diffusion of Innovation (DoI) theory. We find that the observed adoption patterns of code forges rarely follow the DoI model, especially as larger code forges are beset by spam and abuse. The only forge exhibiting a DoI-like lifecycle was a smaller, community-managed, special-purpose forge whose demise was planned in advance. The results of this study will be useful both to practitioners building collaborative FLOSS ecosystems, such as code forges, and to researchers who study the evolution and adoption of socio-technical systems.

Author Keywords
Open source; free software; FLOSS; code forge; diffusion of innovations; technology adoption; RubyForge; Google Code; SourceForge; ObjectWeb; CodePlex; Github; software evolution

ACM Classification Keywords
D.2.9. SOFTWARE ENGINEERING: Management; H.3.5 INFORMATION STORAGE AND RETRIEVAL: Online Information Services; Data sharing.

INTRODUCTION
Because teams of developers of free, libre, and open source software (FLOSS) projects are often geographically distributed around the world, many teams choose to structure their work in an asynchronous, location-neutral way. Early web-based FLOSS hosting services, such as SourceForge and GNU Savannah, offered features such as file downloads, version control systems, mailing list software, wikis, bug tracking software, and so on. In the early years of the FLOSS phenomenon, these code forges served an important role for developers by providing a low barrier to entry to coordinate teamwork, and they served end-users by providing a centralized place to find and communicate about a variety of different FLOSS projects.

By the mid-2000s, larger software companies began to create their own software forges, such as Google Code and Microsoft CodePlex. Non-commercial special-purpose forges were also created during this time frame; for example, RubyForge was designed for projects written in a particular programming language (Ruby), and the ObjectWeb forge was designed for FLOSS middleware projects. Github was launched in 2008 to offer version control and some basic features such as wikis and file downloads, and is now by far the largest centralized software forge, with over 21 million user accounts and 57 million repositories as of this writing [1].

Though their intended audiences may differ, and the services provided by each code forge may be slightly different, the purpose of all FLOSS code forges is to host projects. Each time a project owner "chooses" to host their particular project on a code forge, this action is an indication that the code forge is still relevant in some way. Some of the oldest forges are still accepting new projects, while others have closed, merged, or otherwise transformed themselves as the FLOSS phenomenon has changed and matured. What do the project hosting rates look like in the years between a code forge's birth and its death? Do the code forges follow the same adoption or "diffusion" patterns found in other technologies, as project owners choose to adopt the technology or move to something else?

For this paper, we compare longitudinal data from six code forges to a "typical" technology adoption curve as presented in classical Diffusion of Innovation (DoI) theory. Basic models for technology diffusion were first described by Everett Rogers [2], who proposed a life cycle consisting of early adoption, adoption by the majority, then late adoption ("laggards"), and ultimately either discontinuance of the product or a saturated market. He applied this model to a variety of technologies in a variety of industries, refining the model to show that diffusion could be affected by many factors including social interactions and organizational dynamics. With time plotted on the x-axis and adoption of an innovation shown on the y-axis, Rogers proposed that the typical diffusion of an innovation over time will likely resemble a normal distribution (or an S-curve if adoption is plotted cumulatively). Figure 1 shows the two "typical" DoI technology adoption curves.

Figure 1. "Typical" Diffusion of Innovations (DoI) curves with periods of maximum growth shown (after Rogers, 1964).

In assessing the adoption or lifecycle of a technology, the steepness or shallowness of the S-curve is interesting, as is the point at which the innovation declines following a period of maximum adoption. By studying the adoption curves, we may find that some code forges reach their peak (maximum adoption) earlier or later than expected. Some forges may be kept alive well past their expected lifespan. For code forges that are not dead yet, a partial adoption curve may exhibit clues for what is to come.

Thus, this paper begins a data-driven, historical analysis of the diffusion/adoption curves of code forges. We study six in detail: RubyForge, Google Code, SourceForge, CodePlex, ObjectWeb, and Github. Our questions are:
• RQ1: What do the adoption curves for each code forge look like?
• RQ2: What factors, if any, alter the curves or affect the adoption patterns between the forges?
• RQ3: Do all code forges exhibit the same patterns of birth, growth, and death as would be expected from traditional DoI theory?

To answer RQ1, for each of six code forges, we gather metrics to describe its adoption rate and we plot the adoption rate graphically. For RQ2, we outline the various details (e.g. spam) that may explain the shapes of the curves. To answer RQ3, we compare the shape of these curves to what DoI theory would predict and discuss the possible reasons for any differences. Finally, we explore the limitations of this work and present ideas for how to advance this work in the future.

TECHNOLOGY ADOPTION DATA
FLOSSmole [3] data is used in this paper to describe the adoption rates (via new project registrations) of RubyForge, Google Code, CodePlex, and ObjectWeb. SourceForge Research Data Archive (SRDA) [4] data is used to describe the lifecycle of SourceForge. GHTorrent data [5] is used to describe the lifecycle of Github. For each forge, the data, queries, and calculations used in this paper are available for download in the FLOSSmole data repository, at http://flossdata.syr.edu/data/forgeStudies/2017deathOfForges.

RubyForge
RubyForge was launched on July 16, 2003 as a Ruby language-specific hosting site for FLOSS projects. It included collaboration tools such as file downloads, source code control software, bug tracking, and mailing lists. Project-level metadata collected from the 10 years of RubyForge's existence was gathered and described in our prior work [6]. Examples of project-level metadata we collected for RubyForge include: project name, project owners/developers, project description, project license, project registration date, and so on.

Figure 2 shows a visualization of the RubyForge monthly new project registrations found in the [6] data set, beginning with its launch in 2003 through its shutdown in 2014. The dates of two important events in RubyForge's history are overlaid on the graph: the launch of Github's gem builder in 2008, and the 2009 launch of Gemcutter (eventually renamed RubyGems). Github was a significant competitor to RubyForge, and Gemcutter/RubyGems was specifically designed by the RubyForge team to be a replacement for RubyForge. As the graph shows, the years of most intense growth at RubyForge were between 2006 and 2009.

Figure 2. Monthly new project registrations on RubyForge, 2003-2013, with key dates shown

Figure 3. Cumulative monthly new project registrations on RubyForge, 2003-2013, with month of maximum growth shown (January 2008)

Figure 3 shows the same data, but with monthly contributions comprising a cumulative total. This graph shows that RubyForge began its decline at 52% saturation of the "market". Following the launch of a competitive site (Github) and its own planned replacement (Gemcutter/RubyGems), RubyForge began a slow decline. RubyForge was eventually closed to new projects at the end of 2013, and shuttered for good in May of 2014, having hosted a total of 9,898 projects over its lifetime.

Google Code
Google Code (http://code.google.com) was launched in 2006 as a free-of-charge hosting site for any software project using an OSI-approved FLOSS license.

[...] announced [9] that the code forge would shut down completely. He explained,

"As developers migrated away from Google Code [to Github], a growing share of the remaining projects were [sic] spam or abuse. Lately, the administrative load has consisted almost exclusively of abuse management."

DiBona directly blames the demise of Google Code on both the spam/abuse problem and the existence of competitive