The Lives and Deaths of Open Source Code Forges Megan Squire Elon University Elon, North Carolina, USA [email protected]

Total Page:16

File Type:pdf, Size:1020Kb

The Lives and Deaths of Open Source Code Forges Megan Squire Elon University Elon, North Carolina, USA Msquire@Elon.Edu The Lives and Deaths of Open Source Code Forges Megan Squire Elon University Elon, North Carolina, USA [email protected] ABSTRACT file downloads, version control systems, mailing list Code forges are third party software repositories that also software, wikis, bug tracking software, and so on. In the provide various tools and facilities for distributed software early years of the FLOSS phenomenon, these code forges development teams to use, including source code control served an important role for developers by providing a low systems, mailing lists and communication forums, bug barrier to entry to coordinate team work, and they served tracking systems, web hosting space, and so on. The main end-users by providing a centralized place to find and contributions of this paper are to present some new data sets communicate about a variety of different FLOSS projects. relating to the technology adoption lifecycles of a group of six free, libre, and open source software (FLOSS) code By the mid-2000s, larger software companies began to forges, and to compare the lifecycles of the forges to each create their own software forges, such as Google Code and other and to the model presented by classical Diffusion of Microsoft CodePlex. Non-commercial special-purpose Innovation (DoI) theory. We find that the observed forges were also created during this time frame, for adoption patterns of code forges rarely follow the DoI example RubyForge was designed for projects written in a model, especially as larger code forges are beset by spam particular programming language (Ruby) and the and abuse. The only forge exhibiting a DoI-like lifecycle ObjectWeb forge was designed for FLOSS middleware was a smaller, community-managed, special-purpose forge projects. GitHub was launched in 2008 to offer version whose demise was planned in advance. The results of this control and some basic features such as wikis and file study will be useful in explaining adoption trajectories, both downloads, and is now by far the largest centralized to practitioners building collaborative FLOSS ecosystems software forge with over 21 million user accounts and 57 and to researchers who study the evolution and adoption of million repositories as of this writing. [1] socio-technical systems. Though their intended audiences may differ, and the Author Keywords services provided by each code forge may be slightly Open source; free software; FLOSS; code forge; diffusion different, the purpose of all FLOSS code forges is to host of innovations; technology adoption; RubyForge; Google projects. Each time a project owner "chooses" to host their Code; SourceForge; ObjectWeb; CodePlex; GitHub; particular project on a code forge, this action is an software evolution indication that the code forge is still relevant in some way. Some of the oldest forges are still accepting new projects, ACM Classification Keywords while others have closed, merged, or otherwise transformed D.2.9. SOFTWARE ENGINEERING: Management; H.3.5 themselves as the FLOSS phenomenon has changed and INFORMATION STORAGE AND RETRIEVAL: Online matured. What do the project hosting rates look like in the Information Services; Data sharing. years between a code forge's birth and its death? Do the INTRODUCTION code forges follow the same adoption or "diffusion" Because teams of developers of free, libre, and open source patterns found in other technologies, as project owners software (FLOSS) projects are often geographically choose to adopt the technology or move to something else? distributed around the world, many teams choose to For this paper, we compare longitudinal data from six code structure their work in an asynchronous, location-neutral forges to a "typical" technology adoption curve as presented way. Early web-based FLOSS hosting services, such as in classical Diffusion of Innovation (DoI) theory. Basic SourceForge and GNU Savannah, offered features such as models for technology diffusion were first described by Permission to make digital or hard copies of all or part of this work for Everett Rogers [2], who proposed a life cycle consisting of personal or classroom use is granted without fee provided that copies are early adoption, adoption by the majority, then late adoption not made or distributed for profit or commercial advantage and that copies ("laggards"), and ultimately either discontinuance of the bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be product or a saturated market. He applied this model to a honored. Abstracting with credit is permitted. To copy otherwise, or variety of technologies in a variety of industries, refining republish, to post on servers or to redistribute to lists, requires prior the model to show that diffusion could be affected by many specific permission and/or a fee. Request permissions from Permissions factors including social interactions and organizational @acm.org. OpenSym '17, August 23–25, 2017, Galway, Ireland dynamics. With time plotted on the x-axis and adoption of © 2017 Copyright is held by the owner/author(s). Publication rights an innovation shown on the y-axis, Rogers proposed that licensed to ACM. the typical diffusion of an innovation over time will likely ACM ISBN 978-1-4503-5187-4/17/08…$15.00 https://doi.org/10.1145/3125433.3125468 resemble a normal distribution (or an S-curve if plotted GHTorrent data [5] is used to describe the lifecycle of with a cumulative x-axis). Figure 1 shows the two "typical" GitHub. GHTorrent was created in 2013 as a source of data DoI technology adoption curves. about projects hosted on Github. For each of the five forges in this study, our data, queries, and calculations are available for download in the FLOSSmole data repository, at http://flossdata.syr.edu/data/forgeStudies/2017deathOf Forges. RubyForge RubyForge was launched on July 16, 2003 as a Ruby language-specific hosting site for FLOSS projects. It included collaboration tools such as file downloads, source code control software, bug tracking, and mailing lists. Figure 1. "Typical" Diffusion of Innovations (DoI) curves with periods of maximum growth shown (after Rogers, 1964). Project-level metadata collected from the 10 years of RubyForge's existence was gathered and described in our In assessing the adoption or lifecycle of a technology, the prior work [6]. Examples of project-level metadata we steepness or shallowness of the S-curve is interesting, as is collected for RubyForge includes: project name, project the point at which the innovation declines following a owners/developers, project description, project license, period of maximum adoption. By studying the adoption project registration date, and so on. curves, we may find that some code forges may reach their peak (maximum adoption) earlier or later than expected. Figure 2 shows a visualization of the RubyForge monthly Some forges may be kept alive well past their expected new project registrations found in the [6] data set, lifespan. For code forges that are not dead yet, a partial beginning with its launch in 2003 through its shutdown in adoption curve may exhibit clues for what is to come. 2014. The dates of two important events in RubyForge's history are overlaid on the graph: the launch of GitHub's Thus, this paper begins a data-driven, historical analysis of gem builder in 2008, and the 2009 launch of Gemcutter the diffusion/adoption curves of code forges. We study six (eventually renamed RubyGems). GitHub was a significant in detail: RubyForge, Google Code, SourceForge, competitor to RubyForge, and Gemcutter/RubyGems was CodePlex, ObjectWeb, and GitHub. Our questions are: specifically designed by the RubyForge team to be a replacement for RubyForge. As the graph shows, the most • RQ1: What do the adoption curves for each code intense growth at RubyForge occurred between 2006-2009. forge look like? • RQ2: What factors, if any, alter the curves or affect the adoption patterns between the forges? • RQ3: Do all code forges exhibit the same patterns of birth, growth, and death as would be expected from traditional DoI theory? To answer RQ1, for each of six code forges, we gather metrics to describe its adoption rate and we plot the adoption rate graphically. For RQ2, we outline the various details (e.g. abuse of the system) that may explain the Figure 2. Monthly new project registrations on RubyForge, shapes of the curves. To answer RQ3, we compare the 2003-2013, with key dates shown shape of these curves to what DoI theory would predict and discuss the possible reasons for any differences. Finally, we explore the limitations of this work and present ideas for how to advance this work in the future. TECHNOLOGY ADOPTION DATA FLOSSmole [3] data is used in this paper to describe the adoption rates - via new project registrations - of RubyForge, Google Code, CodePlex, and ObjectWeb. FLOSSmole was created in 2004 as a source for research- quality data about how free, libre, and open source software is constructed. SourceForge Research Data Archive (SRDA) [4] data is used to describe the lifecycle of SourceForge. SRDA was created in 2003 as a repository of Figure 3. Cumulative monthly new project registrations on data about projects hosted on the SourceForge system. RubyForge, 2003-2013, with month of maximum growth shown (January 2008) Figure 3 shows the same data, but with monthly but new project creations never again reached their pre- contributions comprising a cumulative total. This graph deluge levels. Nonetheless, on March 12, 2015 Chris shows that RubyForge began its decline at 52% saturation DiBona announced [9] that the code forge would shut down of the "market". Following the launch of a competitive site completely. He explained, (GitHub) and its own planned replacement "As developers migrated away from Google Code [to (Gemcutter/RubyGems), RubyForge began a slow decline. GitHub], a growing share of the remaining projects were RubyForge was eventually closed to new projects at the end [sic] spam or abuse.
Recommended publications
  • Practice Tips for Open Source Licensing Adam Kubelka
    Santa Clara High Technology Law Journal Volume 22 | Issue 4 Article 4 2006 No Free Beer - Practice Tips for Open Source Licensing Adam Kubelka Matthew aF wcett Follow this and additional works at: http://digitalcommons.law.scu.edu/chtlj Part of the Law Commons Recommended Citation Adam Kubelka and Matthew Fawcett, No Free Beer - Practice Tips for Open Source Licensing, 22 Santa Clara High Tech. L.J. 797 (2005). Available at: http://digitalcommons.law.scu.edu/chtlj/vol22/iss4/4 This Article is brought to you for free and open access by the Journals at Santa Clara Law Digital Commons. It has been accepted for inclusion in Santa Clara High Technology Law Journal by an authorized administrator of Santa Clara Law Digital Commons. For more information, please contact [email protected]. ARTICLE NO FREE BEER - PRACTICE TIPS FOR OPEN SOURCE LICENSING Adam Kubelkat Matthew Fawcetttt I. INTRODUCTION Open source software is big business. According to research conducted by Optaros, Inc., and InformationWeek magazine, 87 percent of the 512 companies surveyed use open source software, with companies earning over $1 billion in annual revenue saving an average of $3.3 million by using open source software in 2004.1 Open source is not just staying in computer rooms either-it is increasingly grabbing intellectual property headlines and entering mainstream news on issues like the following: i. A $5 billion dollar legal dispute between SCO Group Inc. (SCO) and International Business Machines Corp. t Adam Kubelka is Corporate Counsel at JDS Uniphase Corporation, where he advises the company on matters related to the commercialization of its products.
    [Show full text]
  • Proceedings of the 11Th International Symposium on Open Collaboration
    Proceedings of the 11th International Symposium on Open Collaboration August 19-21, 2015 San Francisco, California, U.S.A. General chair: Dirk Riehle, Friedrich-Alexander University Erlangen-Nürnberg Research track chairs: Kevin Crowston, Syracuse University Carlos Jensen, Oregon State University Carl Lagoze, University of Michigan Ann Majchrzak, University of Southern California Arvind Malhotra, University of North Carolina at Chapel Hill Claudia Müller-Birn, Freie Universität Berlin Gregorio Robles, Universidad Rey Juan Carlos Aaron Shaw, Northwestern University Sponsors: Wikimedia Foundation Google Inc. University of California Berkeley In-cooperation: ACM SIGSOFT ACM SIGWEB Fiscal sponsor: The John Ernest Foundation The Association for Computing Machinery 2 Penn Plaza, Suite 701 New York New York 10121-0701 ACM COPYRIGHT NOTICE. Copyright © 2014 by the Association for Computing Machin- ery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be hon- ored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM, Inc., fax +1 (212) 869-0481, or [email protected]. For other copying of articles that carry a code at the bottom of the first or last page, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, +1-978-750-8400, +1-978-750- 4470 (fax).
    [Show full text]
  • Microsoft and Cray to Unveil $25,000 Windows-Based Supercomputer
    AAll About Microsoft: l lCodeTracker A monthly look at Microsoft’s codenames and what they Areveal about the direction of the company. b o u t M i c r o s o f t : All About Microsoft CodeTracker Keeping track of Microsoft's myriad codenames is an (almost) full-time occupation. I know, as I spend a lot of my work hours tracking down the latest names in the hopes of being able to better keep tabs on what's coming next from the Redmondians. Each month, I'll be releasing an updated, downloadable version of the CodeTracker. I'll add new codenames -- arranged in alphabetical order by codename -- of forthcoming Microsoft products and technologies. I also will note timing changes (date slips, the release of a new test build, the disappearance of a planned deliverable) for entries that are already part of the Tracker. Once Microsoft releases the final version of a product or technology I've been tracking, I will remove it from the Tracker. In that way, the CodeTracker will remain focused on futures. (An aside about the Tracker: A question mark in place of an entry means I have insufficient information to hazard even an educated guess about a particular category.) If you have suggested new entries or corrections to existing ones, please drop me an e-mail at mjf at microsofttracker dot com. Thanks! Mary Jo Foley, Editor, ZDNet's "All About Microsoft" blog This Month's Theme: Big iron needs love, too If you went by nothing but blog and publication headlines, you might think mobile phones and slates are where all the innovation is these days.
    [Show full text]
  • AAS Worldwide Telescope: Seamless, Cross-Platform Data Visualization Engine for Astronomy Research, Education, and Democratizing Data
    AAS WorldWide Telescope: Seamless, Cross-Platform Data Visualization Engine for Astronomy Research, Education, and Democratizing Data The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Rosenfield, Philip, Jonathan Fay, Ronald K Gilchrist, Chenzhou Cui, A. David Weigel, Thomas Robitaille, Oderah Justin Otor, and Alyssa Goodman. 2018. AAS WorldWide Telescope: Seamless, Cross-Platform Data Visualization Engine for Astronomy Research, Education, and Democratizing Data. The Astrophysical Journal: Supplement Series 236, no. 1. Published Version https://iopscience-iop-org.ezp-prod1.hul.harvard.edu/ article/10.3847/1538-4365/aab776 Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:41504669 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#OAP Draft version January 30, 2018 Typeset using LATEX twocolumn style in AASTeX62 AAS WorldWide Telescope: Seamless, Cross-Platform Data Visualization Engine for Astronomy Research, Education, and Democratizing Data Philip Rosenfield,1 Jonathan Fay,1 Ronald K Gilchrist,1 Chenzhou Cui,2 A. David Weigel,3 Thomas Robitaille,4 Oderah Justin Otor,1 and Alyssa Goodman5 1American Astronomical Society 1667 K St NW Suite 800 Washington, DC 20006, USA 2National Astronomical Observatories, Chinese Academy of Sciences 20A Datun Road, Chaoyang District Beijing, 100012, China 3Christenberry Planetarium, Samford University 800 Lakeshore Drive Birmingham, AL 35229, USA 4Aperio Software Ltd. Headingley Enterprise and Arts Centre, Bennett Road Leeds, LS6 3HN, United Kingdom 5Harvard Smithsonian Center for Astrophysics 60 Garden St.
    [Show full text]
  • XAMARIN.FORMS for BEGINNERS ABOUT ME Tom Soderling Sr
    XAMARIN.FORMS FOR BEGINNERS ABOUT ME Tom Soderling Sr. Mobile Apps Developer @ Polaris Industries; Ride Command Xamarin.Forms enthusiast DevOps hobbyist & machine learning beginner 4 year XCMD Blog: https://tomsoderling.github.io GitHub: https://github.com/TomSoderling Twitter: @tomsoderling How Deep Pickster Spaniel Is It? THE PLAN • Introduction: Why, What, and When • Overview of Xamarin.Forms Building Blocks • Building a Xamarin.Forms UI in XAML • Data Binding • View Customization • Next Steps & Resources • Please ask any questions that come up! THE PLAN • Introduction: Why, What, and When • Overview of Xamarin.Forms Building Blocks • Building a Xamarin.Forms UI in XAML • Data Binding • View Customization • Next Steps & Resources INTRODUCTION : WHY • WET: the soggy state of mobile app development • Write Everything Twice INTRODUCTION : WHY • WET: the soggy state of mobile app development • Write Everything Twice INTRODUCTION : WHAT • What is Xamarin.Forms? • Cross-platform UI framework • Platforms: • Mobile: iOS 8 and up, Android 4.0.3 (API 15) • Desktop: Windows 10 UWP, MacOS, WFP • Samsung Smart Devices: Tizen INTRODUCTION : WHAT • Brief History: • May 2011, Xamarin founded • MonoTouch and Mono for Android using MonoDevelop IDE • February 2013, release of Xamarin 2.0 • Xamarin Studio IDE & integration with Visual Studio • Renamed to Xamarin.Android and Xamarin.iOS • May 2014, Xamarin.Forms released as part of Xamarin 3 • February 24 2016, Xamarin acquired by Microsoft • Owned, actively developed on, and supported by Microsoft • Free
    [Show full text]
  • Discovery Attender User Guide
    Welcome Sherpa Software's Discovery Attender is a software tool designed to automate the search and collection of electronically stored information across a variety of platforms. This dynamic solution sits behind the firewall and enables in-house talent to identify and gather responsive information in a timely, defensible manner. The Discovery Attender interface is designed to be intuitive and user friendly. The Welcome Screen (above) serves as the gateway to the product. From here you can directly open the PreSearch Tool or create a project. A project serves as a repository for a collection - related searches and associated result sets. Once a new pro- ject is created, the MAIN CONSOLE acts as the central hub to allow you to control the creation, processing, and organization of searches. From here, you can access a Search Wizard which guides you through the step-by- step process of creating a new search. The MAIN CONSOLE also provides access to the Result Management features to view, organize, export and report on the messages, attachments and files that are found during your custom searches. Where to Begin To effectively filter and produce data with Discovery Attender, you must first create a project . Once a project is created, follow the steps below to begin the process: Step Action Description 1 Create and Run A Search A setup wizard leads you through the process of selecting locations and criteria to include in your search. Once started, the chosen data stores are scanned to find items (called results) that match the selected criteria. Information and metadata from these results are stored in the search data- bases.
    [Show full text]
  • ACS – the Archival Cytometry Standard
    http://flowcyt.sf.net/acs/latest.pdf ACS – the Archival Cytometry Standard Archival Cytometry Standard ACS International Society for Advancement of Cytometry Candidate Recommendation DRAFT Document Status The Archival Cytometry Standard (ACS) has undergone several revisions since its initial development in June 2007. The current proposal is an ISAC Candidate Recommendation Draft. It is assumed, however not guaranteed, that significant features and design aspects will remain unchanged for the final version of the Recommendation. This specification has been formally tested to comply with the W3C XML schema version 1.0 specification but no position is taken with respect to whether a particular software implementing this specification performs according to medical or other valid regulations. The work may be used under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported license. You are free to share (copy, distribute and transmit), and adapt the work under the conditions specified at http://creativecommons.org/licenses/by-sa/3.0/legalcode. Disclaimer of Liability The International Society for Advancement of Cytometry (ISAC) disclaims liability for any injury, harm, or other damage of any nature whatsoever, to persons or property, whether direct, indirect, consequential or compensatory, directly or indirectly resulting from publication, use of, or reliance on this Specification, and users of this Specification, as a condition of use, forever release ISAC from such liability and waive all claims against ISAC that may in any manner arise out of such liability. ISAC further disclaims all warranties, whether express, implied or statutory, and makes no assurances as to the accuracy or completeness of any information published in the Specification.
    [Show full text]
  • Understanding the Challenges of Collaborative Evidence-Based
    Do you have a source for that? Understanding the Challenges of Collaborative Evidence-based Journalism Sheila O’Riordan Gaye Kiely Bill Emerson Joseph Feller Business Information Business Information Business Information Business Information Systems, University Systems, University Systems, University Systems, University College Cork, Ireland College Cork, Ireland College Cork, Ireland College Cork, Ireland [email protected] [email protected] [email protected] [email protected] ABSTRACT consumption has evolved [25]. There has been a move WikiTribune is a pilot news service, where evidence-based towards digital news with increasing user involvement as articles are co-created by professional journalists and a well as the use of social media platforms for accessing and community of volunteers using an open and collaborative discussing current affairs [14,39,43]. As such, the boundaries digital platform. The WikiTribune project is set within an are shifting between professional and amateur contributions. evolving and dynamic media landscape, operating under Traditional news organizations are adding interactive principles of openness and transparency. It combines a features as participatory journalism practices rise (see [8,42]) commercial for-profit business model with an open and the technologies that allow citizens to interact en masse collaborative mode of production with contributions from provide new avenues for engaging in democratic both paid professionals and unpaid volunteers. This deliberation [19]; the “process of reaching reasoned descriptive case study captures the first 12-months of agreement among free and equal citizens” [6:322]. WikiTribune’s operations to understand the challenges and opportunities within this hybrid model of production. We use With these changes, a number of challenges have arisen.
    [Show full text]
  • Licensing Information User Manual Release 9.1 F13415-01
    Oracle® Hospitality Cruise Fleet Management Licensing Information User Manual Release 9.1 F13415-01 August 2019 LICENSING INFORMATION USER MANUAL Oracle® Hospitality Fleet Management Licensing Information User Manual Version 9.1 Copyright © 2004, 2019, Oracle and/or its affiliates. All rights reserved. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error- free. If you find any errors, please report them to us in writing. If this software or related documentation is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs.
    [Show full text]
  • Software License Agreement (EULA)
    Third-party Computer Software AutoVu™ ALPR cameras • angular-animate (https://docs.angularjs.org/api/ngAnimate) licensed under the terms of the MIT License (https://github.com/angular/angular.js/blob/master/LICENSE). © 2010-2016 Google, Inc. http://angularjs.org • angular-base64 (https://github.com/ninjatronic/angular-base64) licensed under the terms of the MIT License (https://github.com/ninjatronic/angular-base64/blob/master/LICENSE). © 2010 Nick Galbreath © 2013 Pete Martin • angular-translate (https://github.com/angular-translate/angular-translate) licensed under the terms of the MIT License (https://github.com/angular-translate/angular-translate/blob/master/LICENSE). © 2014 [email protected] • angular-translate-handler-log (https://github.com/angular-translate/bower-angular-translate-handler-log) licensed under the terms of the MIT License (https://github.com/angular-translate/angular-translate/blob/master/LICENSE). © 2014 [email protected] • angular-translate-loader-static-files (https://github.com/angular-translate/bower-angular-translate-loader-static-files) licensed under the terms of the MIT License (https://github.com/angular-translate/angular-translate/blob/master/LICENSE). © 2014 [email protected] • Angular Google Maps (http://angular-ui.github.io/angular-google-maps/#!/) licensed under the terms of the MIT License (https://opensource.org/licenses/MIT). © 2013-2016 angular-google-maps • AngularJS (http://angularjs.org/) licensed under the terms of the MIT License (https://github.com/angular/angular.js/blob/master/LICENSE). © 2010-2016 Google, Inc. http://angularjs.org • AngularUI Bootstrap (http://angular-ui.github.io/bootstrap/) licensed under the terms of the MIT License (https://github.com/angular- ui/bootstrap/blob/master/LICENSE).
    [Show full text]
  • Using Shape Expressions (Shex) to Share RDF Data Models and to Guide Curation with Rigorous Validation B Katherine Thornton1( ), Harold Solbrig2, Gregory S
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Repositorio Institucional de la Universidad de Oviedo Using Shape Expressions (ShEx) to Share RDF Data Models and to Guide Curation with Rigorous Validation B Katherine Thornton1( ), Harold Solbrig2, Gregory S. Stupp3, Jose Emilio Labra Gayo4, Daniel Mietchen5, Eric Prud’hommeaux6, and Andra Waagmeester7 1 Yale University, New Haven, CT, USA [email protected] 2 Johns Hopkins University, Baltimore, MD, USA [email protected] 3 The Scripps Research Institute, San Diego, CA, USA [email protected] 4 University of Oviedo, Oviedo, Spain [email protected] 5 Data Science Institute, University of Virginia, Charlottesville, VA, USA [email protected] 6 World Wide Web Consortium (W3C), MIT, Cambridge, MA, USA [email protected] 7 Micelio, Antwerpen, Belgium [email protected] Abstract. We discuss Shape Expressions (ShEx), a concise, formal, modeling and validation language for RDF structures. For instance, a Shape Expression could prescribe that subjects in a given RDF graph that fall into the shape “Paper” are expected to have a section called “Abstract”, and any ShEx implementation can confirm whether that is indeed the case for all such subjects within a given graph or subgraph. There are currently five actively maintained ShEx implementations. We discuss how we use the JavaScript, Scala and Python implementa- tions in RDF data validation workflows in distinct, applied contexts. We present examples of how ShEx can be used to model and validate data from two different sources, the domain-specific Fast Healthcare Interop- erability Resources (FHIR) and the domain-generic Wikidata knowledge base, which is the linked database built and maintained by the Wikimedia Foundation as a sister project to Wikipedia.
    [Show full text]
  • Gforge-Lists-Mailman Apache2 0. Mailman 0. Gforge-Db-Postgresql
    llvm-2.8-dev mongodb-clients libfpdi-php lazarus-ide-gtk2 libecpg6 libtulip-ogl-dev gnome-mime-data 0. 0. 0. libparrot3.3.0 libwnck-3-0 0. 0. libjaxp1.3-java-gcj 0. libusbmuxd1 erlang-public-key 0. 1.93236714976 0. llvm-2.8 mongodb-dev libfpdf-tpl-php 0. libecpg-compat3 0. lazarus-ide libgemplugin-ruby1.8 0. 0. 2.24719101124 0. libtulip-dev 3.33333333333 0. 0. 0. 0. 0. 0. 0. 0. libgnomevfs2-0 gnome-desktop3-data 0. 0. 0. 1.68269230769 libopenr2-3 gir1.2-wnck-3.0 0. libxalan2-java-gcj 2.22222222222 libimobiledevice1 3.22580645161 erlang-inets 0. libgpg-error0 libblkid1 0.24154589372 libparrot-dev parrot llvm-2.8-runtime mongodb-server pnp4nagios-web libpgtypes3 libtulip-qt4-dev libnet-daemon-perl lazarus-src 0.0. 0. 0. 0. libgemplugin-ruby kdebase-workspace-data libecore-input1 0. libbft-dev libgtk2.0-common 0. 0. 0. 0.70949940875 0.749506903353 libeet-dev python-lazr.uri 0. 4.54674623473 0. cl-alexandria libxau6 0. pia freespacenotifier libksieve4 libtasn1-3 libgnutls26 0.45045045045 libwnck-3-common usbmuxd erlang-ssl libxerces2-java-gcj libgnomevfs2-common 0.0426257459506 0. gir1.2-gnomedesktop-3.0 fxload 0. dahdi-linux 0. 0. libcgns-dev 0. 0. 0. 0. mount initscripts sysvinit-utils 0. 0. reportbug libdbi-perl geant321-data 0.0394477317554 0.355029585799 0. 0. 4.07786302927 llvm-2.9 libecore-evas1 libecore-fb1 0. 0. 0.1574803149611.06709047565 libatk1.0-0 0. sugar-presence-service-0.90 0. 0. 0. 0. parrot-minimal mongrel 1.41899881751.45956607495 libgcrypt11 0. ksysguard kdebase-workspace-bin 0.
    [Show full text]