Community Code Engagements: Summer of Code & Hackathons for Community Building in Scientific Software

Erik H. Trainer, Chalalai Chaihirunkarn, Arun Kalyanasundaram, James D. Herbsleb
Institute for Software Research
Carnegie Mellon University
5000 Forbes Avenue, Pittsburgh, PA 15213
{etrainer, cchaihir, arunkaly, jdh}@cs.cmu.edu

This is a preprint version of a paper to be published in the 2014 ACM Conference on Supporting Group Work. Please do not quote or distribute.

ABSTRACT
Community code engagements (short-term, intensive software development events) are used by some scientific communities to create new software features and promote community building. But there is as yet little empirical support for their effectiveness. This paper presents a qualitative study of two types of community code engagements: Google Summer of Code (GSoC) and hackathons. We investigated the range of outcomes these engagements produce and the underlying practices that lead to these outcomes. In GSoC, the vision and experience of core members of the community influence project selection, and the intensive mentoring process facilitates creation of strong ties. Most GSoC projects result in stable features. The agenda setting phase of hackathons reveals high priority issues perceived by the community. Social events among the relatively large numbers of participants over brief engagements tend to create weak ties. Most hackathons result in prototypes rather than finished tools. We discuss themes and tradeoffs that suggest directions for future empirical work around designing community code engagements.

Categories and Subject Descriptors
H.5.3 [Information Interfaces and Presentation (e.g., HCI)]: Group and Organization Interfaces – computer supported cooperative work, organizational design.

Keywords
Community code engagements, Google Summer of Code (GSoC), hackathons, scientific software.

1. INTRODUCTION
How do you go from a small number of people with a common interest to a full-fledged community? The active body of research on this problem (e.g., [17, 30, 31]) is a testament to the role of community building in collaborative work practices.

Active communities are essential to the sustainability of software. Without a community around the code distribution, key issues of the software’s future may not be addressed, e.g., in four years’ time, will the software still be available? Will it work? Will there be a pool of participants with the right set of technical skills who can respond to bug reports and feature requests? Successful communities find ways to get code contributions from their members and to incorporate successive generations of newcomers after the original developers leave [30].

There is an additional twist for software that scientists write. Although scientists are directly funded to produce new knowledge, they spend significant time searching for, using, and developing software that enables those results. The sustainability of scientific software – the ability to maintain the software in a state where scientists can understand, replicate, and extend prior reported results that depend on that software – has sometimes been an afterthought because scientists are rewarded for the publications they write, not the software they create and support [25, 26]. This software, however, is a critical link in the chain of evidence establishing new scientific knowledge, and thus other scientists need to be able to run this software in order to understand and replicate this new knowledge, and apply it to new problems.

A few scientific communities in the life sciences have begun experimenting with short-term focused community engagements, such as Google Summer of Code (GSoC) and hackathons. Although there is reason to believe from previous research on online communities (e.g., [30]) that these engagements may enhance the sustainability of scientific software, there is as yet little empirical support. Moreover, evidence about when various types of engagements are likely to succeed is scant.

In this paper, we aim to understand the range of outcomes these engagements produce and the underlying practices that lead to these outcomes. We hope to highlight concrete engagement design issues that community leaders and funding agencies might consider in order to optimize the outcomes they desire.

2. BACKGROUND

2.1 Sustainability of Scientific Software
Software is of vital importance to science. The role of software in data analysis, simulation, and visualization is widely acknowledged [9, 27, 38]. A 2005 NSF Workshop Report [5] clarified the importance of software in cyberinfrastructure, which is the “infrastructure based upon distributed computer, information, and communication technology” [2]. Much scientific software, however, is not infrastructural. For instance, there are many “workbench” applications for end user scientists (e.g., Dan Gezelter’s directory [34] lists almost 500 programs). This list does not even include the myriad scripts and data conversion utilities scientists write to translate data into intermediate forms required by tools in the later stages of workflows [26]. Although NSF-sponsored workshops have repeatedly called attention to cyberinfrastructure maintenance [5], there is less clarity about how scientific software more generally can be sustained over time, even though it is a crucial part of scientific research, development, and delivery [44].

Informal evidence suggests that scientific software is increasingly a key problem of interest to individual researchers and research institutions. For example, the First Workshop on Sustainable Software for Science [51] was recently held collocated with an annual conference on High Performance Computing. A simple head count revealed that one third of the workshop participants attended the conference only for the workshop. As further evidence, the Water Science Software Institute has developed a model for software development specifically aimed at supporting the maintenance of scientific software [6].

Scientific software exists in a variety of states from “as persistent as the next grant supporting maintenance” to “supported by a very small community of volunteers” [44]. But if that software has gained widespread use outside of the lab and served a valuable role in assisting other scientists in making new discoveries, it should be able to be refined and extended for use by other scientists who can use it to produce new knowledge.

2.2 The Promise of Open-Source
In discussions of software sustainability, the open-source software model is invariably held up as a promising approach. For instance, position papers from the 2009 NSF-funded workshop on “Cyberinfrastructure Software Sustainability and Reusability” [44] led to the report’s recommendation that cyberinfrastructure software should be released under an open-source license.

Directly applying the open-source model to scientific software development, however, neglects crucial differences between open-source scientific software and open-source software in general. The primary difference is in the incentive structure for contribution [25, 26]. For open-source developers, reputation in the open-source community is a primary motivation, where the number of “followers” a developer has is a symbol of social status [13]. Scientists who write software, in contrast, operate in a [...]

[...] common interest, purpose, or goal. Practitioners who learn from each other to develop themselves personally and professionally (e.g., a less experienced scientist developer works with a more experienced developer to develop features of increasing difficulty) constitute a community of practice [33], whereas people who share information with others but who are not necessarily practitioners themselves (e.g., an end user scientist answers a question about the tool’s installation procedure on the mailing list) constitute a community of interest [24]. In a community of practice, learning is always situated in practice. Prior studies of open-source software communities have used situated learning to describe the socialization and sustained participation of newcomers [17]. This suggests that community of practice is the type of community important for the sustainability of scientific software.

The literature on communities suggests that in order to be successful, communities must address two primary challenges: receiving contributions and attracting newcomers.

2.3.1 Contributions
Communities need contributions from participants ([30], pp. 2-5). In software development, contributions might comprise a working base of source code that serves people significantly better than the competition. For example, BLAST, the widely used open-source tool for comparing biological sequence information [1], achieved such early success because it was so much faster than previous algorithms available at the time. Assuming a newcomer is at least interested in modifying or contributing to the software, the barrier to entry must be low to modify and extend the existing contributions to meet their own needs. Because most contributors to open-source software projects leave after their personal needs are met [41], it is important to attract enough newcomers in order to find those who will continue to contribute and act as stewards of the code base.

2.3.2 Attracting Newcomers
To survive over the long term, communities must find ways to attract new generations of members to replace the ones who leave ([30], p. 179). For example, in recent years leaders of the online encyclopedia Wikipedia have expressed discontent over