The Secret Life of Software Communities: What We Know and What We Don’T Know
Total Page:16
File Type:pdf, Size:1020Kb
The Secret Life of Software Communities: What we know and What we Don’t know Gemma Catolino1, Fabio Palomba2, Damian A. Tamburri3 Delft University of Technology1, University of Salerno2, Eindhoven University of Technology & Jheronimus Academy of Data Science (JADS)3 [email protected], [email protected], [email protected] Abstract—Communities of software practice are increasingly II. PRACTICAL RECOMMENDATION playing a central role in the development, operation, mainte- nance, and evolution of good-quality software, as well as DevOps A. FACT 1. Software Communities Have Types pipelines, lean Organizations and Global Software Development. Several community types have been studied in organiza- However, it is still unknown the structures and characteristics behind such communities. For this reason, in this paper, we tried tional structures and social networks research - Figure 1 offers to explore the organizational secret of communities, trying to an overview of these types extracted from a systematic studies offer a few practical extracts of (1) what we know and is known, on the matter [11], [12]. In Layman’s terms, a community (2) what we know to be unknown, and (3) what we know to type is a series of characteristics, a pattern, which remain be tentatively discoverable in the near future from an empirical constantly true across the social network underlying that research point of view. Moreover, the paper provides a number of recommendations for practitioners to help and be helped in community. What is still understood fairly little in the state their community endeavors. of the art of software engineering research is that, on the one Index Terms—Community Types; DevOps; Road Map. hand, software communities not surprisingly exhibit the same types already known in literature, but, on the other hand, the I. INTRODUCTION role of community types and characteristics for the benefit Communities are the foundation and backbone of organized (or fallacy) of software code qualities as well as software societies and just as much as our software-driven civil societies processes remains mostly unknown. are dense with communities, the software makers are also What we don’t know: (a) the influence of community types organized into communities of practice (e.g., in open-source), over software qualities as well as the qualities of software of intent (e.g., in standardization groups), of purpose (e.g., processes; (b) algorithms and measurements to precisely pin- as part of agile movements). Recent studies showed how the point type shifts across the community’s life-cycle; (c) design community’s health can reflect the quality of the software patterns for community structures which are contiguous with produced [5], [6], for this reason research community tried to design patterns in underlying software architectures. deeper analyze and characterize the structure of software com- What should practitioners do: (1) follow community munities (i.e., defined as social units of size, dense strongly- tracking and measurement initiatives such as OPENHUB 1 typed and diverse interactions across diverse roles and fluid or BITERGIA 2 to gather insights over their own community characteristics) [12]. and act upon it - this ensures that progress in community In this paper, we discuss about a few basic facts and management, measurement, and steering practices are har- observations about the secret life of software communities nessed; (2) engage in community quality and health initiatives which we were able to empirically establish over previous such as CHAOSS 3 - this ensures that initiatives struggling research studies as well as our own practice as part of software to crack the code of sustainable software communities are communities themselves. well fed with engaged practitioners; (3) manifest their own We outline the roundup to encourage further research into organizational and socio-technical issues as much as the code- understanding and harnessing the (still) very much secret life quality issues - this ensures that social software engineering of open-source communities, as well as their structural, socio- [13] researchers have both quantities and qualities of the right technical, and health characteristics. Furthermore, we aim to data and evidence to study. call for help from the practitioners themselves in supporting their own software community activity and therefore, we B. FACT 2. Types Influence Software Qualities provide practitioners with practical recommendations or calls We conducted several experiments to exploratively figure for help defined with the intent of engaging their interest out the boundaries around the influence that software com- around the matter and/or allowing further understanding their munity types may be playing over the qualities for software needs from a research perspective, such that we as a research production and its maintenance. Figure 2, for example, shows community can aid them in a better fashion. Moreover, we the influence of several types over the rate of re-opened provide a brief road map for further research along this intriguing emerging topic. 1 http://openhub.net/ 2 https://bitergia.com/ 3 http://chaoss.community/ Fig. 1. An Overview of Community Types from the State of the Arts [12] issues investigated over 25 open-source communities sam- What we don’t know: (a) a general quality model of what pled according to community participants age, size, gender software qualities are influenced by which community struc- diversity, community programming language, type of product, ture qualities; (b) which community structure characteristics geographical dispersion, and activity, the last one measured as play a role for software communities; (c) how to quantify the mean activity stemming from BITERGIA and OPENHUB. what is the additional cost or socio-technical debt connected The Box-plot in Figure 2 was generated by operationalizing to sub-optimal characteristics; (d) how to measure a type and use those measurements as devices for improved governance or organizational structure agility. What should practitioners do: (1) open issues on issue- trackers not only for software code but also for perceived community structure issues - they are as much impactful as they are dangerous and can even lead to breaking the internet (see the NPM incident4); (2) report software community accidents on your version control system as much as you do on STACKOVERFLOW - researchers need to study both to come up with proper empirically established ground truths as well as practical outputs to support your work. C. FACT 3. Types narrow with Practitioners’ Experience Similarly to results in Figure 2, we report that the number of types intermixed in the same community types, that is, the number of attributes diversity across a community reduces as the practitioners’ age (or their experience, skills, community Fig. 2. The influence of community types over re-opened software issues participation). In the same study that led us to prepare the plot in Figure 2, we reported an inverse correlation of ∼ −0:40 a series of metrics to measure essential characteristics from (p-value << 0:05) between the mean developers experience Figure 1 and match them with corresponding types as well as with the community (i.e., the total time they spent working in their mean proneness to re-opening issues, that is, the mean the community) across our projects sample and the amount of ratio of re-opened issues over 6 months worth of releases community types and characteristics that manifest explicitly in the projects lifetime. The plot shows that something is across the community. in fact going on - in this case, the data shows that Formal- What we don’t know: (a) how to quantify mentorship and Networks (second bar from the left) may be exhibiting an experience as assets across software communities and how to organizational behavior which tends to re-open issues more reward both in a proper fashion; (b) how to infer community often than other types. Conversely, the opposite seems true for Informal-Communities (third bar from the left). 4 https://tinyurl.com/yaferj3b evangelists and use them as thought leaders for their software to think that this endeavor is pointless or less important communities; (c) what skills and characteristics matter more than, say, software testing or maintenance, please read your than others in software communities and which skills grow literature, e.g., a couple of hints are in Smite et al. [10] or with age and which others do not, as well as how to foster the [8]. growth and nourishment of skills that cannot grow with time What we don’t know: (a) a systematically-generated, gen- (e.g., knowledge communication) eral quality model for software communities with supporting What should practitioners do: (1) prepare codes of con- evaluation and analytic tools and automatons; (b) a practi- ducts [1] which reflect also the mentorship, governance struc- cal implementation of sustainability in software communities ture, experience and reward management as well as the code which are currently managed in a rather trial-and-error fashion, contribution policies or good-behavior across the community; with due exceptions (e.g., the Apache Software Foundation); (2) make the software code reflect the code of conduct, namely, (c) the dimensions, qualities, and metrics of software commu- use comments in software code to reprimand or exercise social nity sustainability, to be used jointly with a quality model for sanctioning [9], [11] wherefore codes of conducts are violated. performing, long-lived, high-quality software communities. D. FACT 4. Communities with more formal characteristics What should practitioners do: we don’t know, yet. may result more often into abandon-ware III. CONCLUSIONS We observed that 82% of the abandonware communities in As it turns out, there is still information that need to be our sample exhibited a formal type in the last 6 months of their further investigate . As a matter of fact, our non-exhaustive life. This is in line with previous research and observations list cannot convey enough that there is much more we don’t from Crowston et al.