The Secret Life of Communities: What we know and What we Don’t know

Gemma Catolino1, Fabio Palomba2, Damian A. Tamburri3 Delft University of Technology1, University of Salerno2, Eindhoven University of & Jheronimus Academy of Data Science (JADS)3 [email protected], [email protected], [email protected]

Abstract—Communities of software practice are increasingly II.PRACTICAL RECOMMENDATION playing a central role in the development, operation, mainte- nance, and evolution of good-quality software, as well as DevOps A. FACT 1. Software Communities Have Types pipelines, lean Organizations and Global Software Development. Several community types have been studied in organiza- However, it is still unknown the structures and characteristics behind such communities. For this reason, in this paper, we tried tional structures and social networks research - Figure 1 offers to explore the organizational secret of communities, trying to an overview of these types extracted from a systematic studies offer a few practical extracts of (1) what we know and is known, on the matter [11], [12]. In Layman’s terms, a community (2) what we know to be unknown, and (3) what we know to type is a series of characteristics, a pattern, which remain be tentatively discoverable in the near future from an empirical constantly true across the social network underlying that research point of view. Moreover, the paper provides a number of recommendations for practitioners to help and be helped in community. What is still understood fairly little in the state their community endeavors. of the art of research is that, on the one Index Terms—Community Types; DevOps; Road Map. hand, software communities not surprisingly exhibit the same types already known in literature, but, on the other hand, the I.INTRODUCTION role of community types and characteristics for the benefit Communities are the foundation and backbone of organized (or fallacy) of software code qualities as well as software societies and just as much as our software-driven civil societies processes remains mostly unknown. are dense with communities, the software makers are also What we don’t know: (a) the influence of community types organized into communities of practice (e.g., in open-source), over software qualities as well as the qualities of software of intent (e.g., in standardization groups), of purpose (e.g., processes; (b) algorithms and measurements to precisely pin- as part of agile movements). Recent studies showed how the point type shifts across the community’s life-cycle; (c) design community’s health can reflect the quality of the software patterns for community structures which are contiguous with produced [5], [6], for this reason research community tried to design patterns in underlying software architectures. deeper analyze and characterize the structure of software com- What should practitioners do: (1) follow community munities (i.e., defined as social units of size, dense strongly- tracking and measurement initiatives such as OPENHUB 1 typed and diverse interactions across diverse roles and fluid or BITERGIA 2 to gather insights over their own community characteristics) [12]. and act upon it - this ensures that progress in community In this paper, we discuss about a few basic facts and management, measurement, and steering practices are har- observations about the secret life of software communities nessed; (2) engage in community quality and health initiatives which we were able to empirically establish over previous such as CHAOSS 3 - this ensures that initiatives struggling research studies as well as our own practice as part of software to crack the code of sustainable software communities are communities themselves. well fed with engaged practitioners; (3) manifest their own We outline the roundup to encourage further research into organizational and socio-technical issues as much as the code- understanding and harnessing the (still) very much secret life quality issues - this ensures that social software engineering of open-source communities, as well as their structural, socio- [13] researchers have both quantities and qualities of the right technical, and health characteristics. Furthermore, we aim to data and evidence to study. call for help from the practitioners themselves in supporting their own software community activity and therefore, we B. FACT 2. Types Influence Software Qualities provide practitioners with practical recommendations or calls We conducted several experiments to exploratively figure for help defined with the intent of engaging their interest out the boundaries around the influence that software com- around the matter and/or allowing further understanding their munity types may be playing over the qualities for software needs from a research perspective, such that we as a research production and its maintenance. Figure 2, for example, shows community can aid them in a better fashion. Moreover, we the influence of several types over the rate of re-opened provide a brief road map for further research along this intriguing emerging topic. 1 http://openhub.net/ 2 https://bitergia.com/ 3 http://chaoss.community/ Fig. 1. An Overview of Community Types from the State of the Arts [12] issues investigated over 25 open-source communities sam- What we don’t know: (a) a general quality model of what pled according to community participants age, size, gender software qualities are influenced by which community struc- diversity, community , type of product, ture qualities; (b) which community structure characteristics geographical dispersion, and activity, the last one measured as play a role for software communities; (c) how to quantify the mean activity stemming from BITERGIA and OPENHUB. what is the additional cost or socio-technical debt connected The Box-plot in Figure 2 was generated by operationalizing to sub-optimal characteristics; (d) how to measure a type and use those measurements as devices for improved governance or organizational structure agility. What should practitioners do: (1) open issues on issue- trackers not only for software code but also for perceived community structure issues - they are as much impactful as they are dangerous and can even lead to breaking the (see the NPM incident4); (2) report software community accidents on your version control system as much as you do on STACKOVERFLOW - researchers need to study both to come up with proper empirically established ground truths as well as practical outputs to support your work. C. FACT 3. Types narrow with Practitioners’ Experience Similarly to results in Figure 2, we report that the number of types intermixed in the same community types, that is, the number of attributes diversity across a community reduces as the practitioners’ age (or their experience, skills, community Fig. 2. The influence of community types over re-opened software issues participation). In the same study that led us to prepare the plot in Figure 2, we reported an inverse correlation of ∼ −0.40 a series of metrics to measure essential characteristics from (p-value << 0.05) between the mean developers experience Figure 1 and match them with corresponding types as well as with the community (i.e., the total time they spent working in their mean proneness to re-opening issues, that is, the mean the community) across our projects sample and the amount of ratio of re-opened issues over 6 months worth of releases community types and characteristics that manifest explicitly in the projects lifetime. The plot shows that something is across the community. in fact going on - in this case, the data shows that Formal- What we don’t know: (a) how to quantify mentorship and Networks (second bar from the left) may be exhibiting an experience as assets across software communities and how to organizational behavior which tends to re-open issues more reward both in a proper fashion; (b) how to infer community often than other types. Conversely, the opposite seems true for Informal-Communities (third bar from the left). 4 https://tinyurl.com/yaferj3b evangelists and use them as thought leaders for their software to think that this endeavor is pointless or less important communities; (c) what skills and characteristics matter more than, say, software testing or maintenance, please read your than others in software communities and which skills grow literature, e.g., a couple of hints are in Smite et al. [10] or with age and which others do not, as well as how to foster the [8]. growth and nourishment of skills that cannot grow with time What we don’t know: (a) a systematically-generated, gen- (e.g., knowledge communication) eral quality model for software communities with supporting What should practitioners do: (1) prepare codes of con- evaluation and analytic and automatons; (b) a practi- ducts [1] which reflect also the mentorship, governance struc- cal implementation of sustainability in software communities ture, experience and reward management as well as the code which are currently managed in a rather trial-and-error fashion, contribution policies or good-behavior across the community; with due exceptions (e.g., the Apache Software Foundation); (2) make the software code reflect the code of conduct, namely, (c) the dimensions, qualities, and metrics of software commu- use comments in software code to reprimand or exercise social nity sustainability, to be used jointly with a quality model for sanctioning [9], [11] wherefore codes of conducts are violated. performing, long-lived, high-quality software communities. D. FACT 4. Communities with more formal characteristics What should practitioners do: we don’t know, yet. may result more often into abandon-ware III.CONCLUSIONS We observed that 82% of the communities in As it turns out, there is still information that need to be our sample exhibited a formal type in the last 6 months of their further investigate . As a matter of fact, our non-exhaustive life. This is in line with previous research and observations list cannot convey enough that there is much more we don’t from Crowston et al. [14] where informality is signaled as an know with respect to how much we do know; most especially, essential software community health parameter. we don’t know a lot about what should practitioners do What we don’t know: (1) whether it is in fact true if to cater for their communities, striving for healthy types formal characteristics and types lead to abandon ware — matching their organizational requirements over time and in empirical software engineering needs to be instrumented to a sustainable fashion. Our conclusion is not only that more actually assess and establish this link; (2) what ‘formal’ means, work is needed to explore the aspects we non-exhaustively in the software engineering world and in terms of socio- pointed at — rather, the most dire observation is that, in order technical and organizational relations — typically in software to better support the (secret) life of software practitioners engineering a formal formulation involves proofs, foundational in their communities, practitioners themselves may be well theories, but from the realm of social-networks analysis and encouraged to treat the community as a software-influencing organizations research, the word and concept of “formality” artifact itself, thus, for example, opening issues if a community assumes a sensibly different meaning; (3) the measured role issue or smell does in fact manifest. An increased practitioner of “informal” as opposed to formal — how can one foster community consciousness will increase awareness over the informal? Is Informal beneficial as formal is not? These and problem, making it more explicit, measurable, and hence similar research questions are still out there for the taking; What should practitioners do: we don’t know, yet. improvable by practitioners as well as for researchers. In the future it is our ultimate intention to further pursue the road map E. FACT 5. Communities can be healthy and sustainable that emerges from the previous identified shortcomings, and, at Many studies in the literature identify healthy values for the same time, work to improve software forges, collaborative several community characteristics, e.g., socio-technical con- coding environments, IDEs or other software equipment that gruence [2] or community structure verification [3]. These practitioners may use to build, maintain, or work upon their indicators, stemming from top studies, lead to argue that code as part of a lively and healthy community and with more software communities, like other thriving communities in our appropriate software engineering ‘ergonomics’, intended as the societal structures, can be turned healthy and made sustainable disciplines of software engineering which engage into design- with appropriate and dedicated efforts — sites such as BITER- ing or arranging software commons, communities, workplaces, GIA,OPENHUB, as well as initiatives and research projects work-products, and working systems so that they fit the people such as OSSMETER 5 are fundamental assets to drive the and goals around them [14]. societal challenge of making software communities aware and sensible to their health, as sustainable communities should be. REFERENCES In this respect, the state of the art in organizations research, [1] C. M. ACM. Acm code of ethics and professional conduct. Code of social networks analysis as well as emerging disciplines such Ethics, 1992. [2] M. Cataldo, J. D. Herbsleb, and K. M. Carley. Socio-technical con- as sustainable community development [4], [14] can aid but gruence: a framework for assessing the impact of technical and work even typical, structured approaches used in software engi- dependencies on software development productivity. In Proceedings of neering such as formalization [7], e.g., to better understand the Second ACM-IEEE international symposium on Empirical software engineering and measurement, pages 2–11. ACM, 2008. and measure the socio-organizational dynamics playing a role [3] M. Joblin, W. Mauerer, S. Apel, J. Siegmund, and D. Riehle. From across software communities. And for those skeptical enough developer networks to verified communities: a fine-grained approach. In Proceedings of the 37th International Conference on Software 5 http://ossmeter.org/ Engineering-Volume 1, pages 563–573. IEEE Press, 2015. [4] J. Lucena, J. Schneider, and J. A. Leydens. Engineering and sustainable action. Dynamic Games and Applications, 1(1):149–171, 2011. community development. Synthesis Lectures on Engineers, Technology, [10]D. Smiteˇ and Z. Galvin¸a. Socio-technical congruence sabotaged by and Society, 5(1):1–230, 2010. a hidden onshore outsourcing relationship: lessons learned from an [5] F. Palomba, D. A. Tamburri, A. Serebrenik, A. Zaidman, F. A. Fontana, empirical study. In International Conference on Product Focused and R. Oliveto. How do community smells influence code smells? In Software Process Improvement, pages 190–202. Springer, 2012. ICSE (Companion Volume), pages 240–241. ACM, 2018. [11] D. A. Tamburri, P. Lago, and H. Van Vliet. Uncovering latent social [6] F. Palomba, D. A. A. Tamburri, F. A. Fontana, R. Oliveto, A. Zaidman, communities in software development. IEEE software, 30(1):29–36, and A. Serebrenik. Beyond technical aspects: How do community smells 2012. influence the intensity of code smells? IEEE transactions on software [12] D. A. Tamburri, P. Lago, and H. v. Vliet. Organizational social structures engineering, 2018. for software engineering. ACM Computing Surveys (CSUR), 46(1):3, [7] D. Rombach and F. Seelisch. Formalisms in software engineering: Myths 2013. versus empirical facts. In IFIP Central and East European Conference [13] W. Tracz. Lord of the files: essays on the social aspects of software on Software Engineering Techniques, pages 13–25. Springer, 2007. engineering by russel ovans. ACM SIGSOFT Software Engineering [8] J. Rost and R. L. Glass. The dark side of software engineering: evil on Notes, 36(6):31–31, 2011. computing projects. John Wiley & Sons, 2011. [14] R. T. Watson, M.-C. Boudreau, and A. J. Chen. Information systems [9] K. Sigmund, C. Hauert, A. Traulsen, and H. De Silva. Social control and and environmentally sustainable development: Energy informatics and the social contract: the emergence of sanctioning systems for collective new directions for the is community. MIS quarterly, 34(1), 2010.