A Descriptive Process Model for Open-Source Software Development
University of Calgary
PRISM: University of Calgary's Digital Repository

Graduate Studies
Legacy Theses

2001

A descriptive process model for open-source software development

Johnson, Kim

Johnson, K. (2001). A descriptive process model for open-source software development (Unpublished master's thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/22282
http://hdl.handle.net/1880/41007
master thesis

University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.

Downloaded from PRISM: https://prism.ucalgary.ca

The author of this thesis has granted the University of Calgary a non-exclusive license to reproduce and distribute copies of this thesis to users of the University of Calgary Archives. Copyright remains with the author.

Theses and dissertations available in the University of Calgary Institutional Repository are solely for the purpose of private study and research. They may not be copied or reproduced, except as permitted by copyright laws, without written authority of the copyright owner. Any commercial use or publication is strictly prohibited.

The original Partial Copyright License attesting to these terms and signed by the author of this thesis may be found in the original print version of the thesis, held by the University of Calgary Archives. The thesis approval page signed by the examining committee may also be found in the original print version of the thesis held in the University of Calgary Archives.
Please contact the University of Calgary Archives for further information. Telephone: (403) 220-7271 Website: http://www.ucalgary.ca/archives/

THE UNIVERSITY OF CALGARY

A Descriptive Process Model for Open-Source Software Development

by

Kim Johnson

A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

DEPARTMENT OF COMPUTER SCIENCE

CALGARY, ALBERTA
JUNE, 2001

©Kim Johnson 2001

Abstract

Open Source is a term used to describe a tradition of open standards, shared source code, and collaborative software development. However, the methodology itself has yet to be captured definitively in writing. The single best description is Eric Raymond's (1998a) The Cathedral and the Bazaar, and while excellent, it is not an academic work but more a pseudo-evangelical report from the field. Consequently, the current perception of what constitutes open-source software development remains somewhat subjective.

This thesis attempts to describe an introductory process model for open-source software development. Common characteristics are identified and discussed with specific examples from various open-source projects. The results lend support to suggestions that open-source software development follows an adaptive lifecycle, with a flexible management model emphasizing leadership, collaboration, and accountability. Moreover, open source would seem to represent an alternative approach to distributed software development, able to offer useful information about common problems as well as possible solutions.

Acknowledgements

This work would not have been possible without the guidance and support of many people. I would first like to thank my supervisor, Dr. Rob Kremer, for giving me an opportunity to research a somewhat unconventional subject. Thanks especially for a flexible yet supportive advisory style.
My appreciation and respect to those who have pioneered open-source software development. It is a truly unique approach and a fascinating area for research. In particular, thanks to the following people for taking time to review an early draft of this work: Alan Cox, Brian Behlendorf, Roy Fielding, Michael Johnson, David Lawrence, Jason Robbins, Guido van Rossum, Erik Troan, and Paul Vixie.

I would also like to thank Mildred Shaw, Alfred Hussein, and the other early adopters at SERN for an excellent introduction to the complex subject of software engineering. It has provided me with a solid foundation for continued learning, and I hope it has made me a better practitioner.

And last but certainly not least, most heartfelt thanks to Tera and Kylan for their motivation and continued tolerance of long hours at the keyboard.

... when men were men and wrote their own device driver ...
Linus Torvalds

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations and Nomenclature

Chapter 1 Introduction
    1.1 Aim
    1.2 Motivation
    1.3 Open-Source Software
    1.4 Software Process Models
    1.5 Approach
    1.6 Objectives
    1.7 Thesis Structure
    1.8 Summary

Chapter 2 Open-Source Software Development
    2.1 History
    2.2 Definition
    2.3 The Cathedral and the Bazaar
    2.4 Projects
    2.5 Summary

Chapter 3 State View
    3.1 Closed Prototyping
    3.2 Iterative and Incremental Enhancement
    3.3 Concurrent Development
    3.4 Large-Scale Peer Review
    3.5 User-Driven Requirements
    3.6 Summary

Chapter 4 Organizational View
    4.1 Decentralized Collaboration
    4.2 Trusted Leadership
    4.3 Internal Motivation
    4.4 Asynchronous Communication
    4.5 Summary

Chapter 5 Control View
    5.1 Informal Planning
    5.2 Tiered Participation
    5.3 Modular Design
    5.4 Ubiquitous Tool Support
    5.5 Shared Information Space
    5.6 Summary

Chapter 6 Evaluation
    6.1 Key Strengths
    6.2 Key Weaknesses
    6.3 Summary

Chapter 7 Conclusions
    7.1 Addressing the Objectives
    7.2 Future Directions
    7.3 Thesis Summary

Bibliography

Appendices
    A.1 Open Source Chronology (Selected Events)
    A.2 Open Source Projects
    A.3 Open Source Definition
    A.4 GNU General Public License

List of Tables

Table 1. Characteristics of selected open-source projects
Table 2. Distribution of sources by software engineering validation method
Table 3. Comparison of various free software licensing practices
Table 4. Typical change request
Table 5. Comparison of defect density measures between commercial projects and Apache
Table 6. Timeline of a bug fix
Table 7. Comparison of code productivity of the top Apache developers and the top developers in several commercial projects
Table 8. Levels of participation in open-source projects
Table 9. Top 5 languages and testing tools used in a small-scale survey on quality related activities in open-source development
Table 10. Apache shared information space

List of Figures

Figure 1. Various categories of free software
Figure 2. Market share for top HTTP servers across all domains
Figure 3. Comparison of evolutionary development vs. waterfall life cycle
Figure 4. Growth of the compressed tar file for the full Linux kernel source release
Figure 5. Typical build cycle
Figure 6. Proportion of changes closed within a given number of days for Apache
Figure 7. E-mail discourse
Figure 8. List server discourse
Figure 9. Activity for the Python mailing list
Figure 10. Mozilla milestone schedule for 2001
Figure 11. Cumulative distribution of contributions to the Apache code base
Figure 12. Histogram of LOC added per programmer for the GNOME project
Figure 13. Cumulative distribution of PR related changes to the Apache code base
Figure 14. Mozilla ownership architecture
Figure 15. Linux ownership architecture

List of Abbreviations and Nomenclature

API (Application Programming Interface) - Prescribed by an operating system or application, defining the rules for interaction with other software.

build - A compiled program intended for distribution.

Brooks's Law - "Adding more manpower to a project makes it later." The perceived benefit of adding more programmers to a project is outweighed by the cost of coordinating and merging their work.

bus syndrome - Refers to a process that has become too dependent on the input of one individual.

C2Net - A software company whose flagship product is a commercial version of the Apache Web server. Acquired by Red Hat in 2000.

CGI (Common Gateway Interface) - A standard for interfacing external applications with Web servers.

Conway's Hypothesis - States that the organization of a software system will be congruent to the organization of the group that designed the system.

Copyleft - A general method for making a program free software, and requiring all modified and extended versions to be free software as well.

cost - Effort cost, or the number of hours required to perform a task.

CPAN (Comprehensive Perl Archive Network) - A large collection of Perl software and documentation.

CVS (Concurrent Versions System) - The dominant version control system for open-source software development.

Cyclic - A software company that originally sold support for CVS. Acquired by SourceGear in 1999.

Cygnus - A software company credited with pioneering the commercialization of open-source software. Acquired by Red Hat in 2000.

commit-then-review - Changes are deemed inherently acceptable and are applied, with testing and review afterwards.

GPL (General Public License) - A license typically used for free software.

GNU (GNU's Not Unix) - Used to reference the GNU Project, a development effort to produce a free Unix-like operating system. It is pronounced "guh-NEW." See also: FSF

FAQ (Frequently Asked Questions) - Documents that list and answer the common questions on a particular subject.

feature creep - The tendency to continually add features at the expense of elegance and simplicity.

Free Software - Refers to the users' freedom to run, copy, distribute, study, change, and improve software. See also: GNU

FSF (Free Software Foundation) - A non-profit organization that raises funds for work on the GNU Project.