Collaborative Software Design and Development Open Source Systems 4

Collaborative Software Design & Development

Open Source Systems 4

The FreeBSD Project Apache Field Support

Kyle Cullen and Sabrina Smith 2/21/08

© 2006, Dewayne E Perry EE 382V – Spring 08

The papers we’ll be covering:

• The FreeBSD Project: A Replication Case Study of Open Source Development
  • Trung T. Dinh-Trong - Research scientist at Avaya Labs
  • James M. Bieman - Professor at Colorado State University and Director of the Software Assurance Laboratory
• How open source software works: “free” user-to-user assistance
  • Karim R. Lakhani - Assistant professor in the Technology and Operations Unit at Harvard Business School
  • Eric von Hippel - Professor at the MIT Sloan School of Management and Head of the Innovation and Entrepreneurship Group

Advantages / Disadvantages

• Advantages
  • Freedom to work on whatever you want
  • Reviewing others’ work can improve your overall software development skills
  • You can work at your own pace, with no deadlines
• Disadvantages
  • Lack of formal process
  • Poor design and architecture
  • Lack of development tools comparable to closed-source industry tools
  • Lack of knowledge of the users’ needs

The FreeBSD Project: Main Idea

• A study of FreeBSD to determine the factors that contribute to successful OSS projects
  • These are characteristics of previously successful OSS projects
  • They are not guaranteed to be necessary or sufficient for future successful OSS projects
• How?
  • Do the 7 hypotheses presented by Mockus et al. in “Two Case Studies of Open Source Software Development: Apache and Mozilla” hold up for other projects?
  • Specifically, FreeBSD

The FreeBSD Project: Data Sources

• , Bug Reports, and CVS
• Java program used to search through these data sources and extract relevant information
  • the number of committers
  • the number of deltas committed by each person
  • etc.
• Questionnaire sent to each of the FreeBSD core developers
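The study used a Java program for this extraction step. As a rough illustration only, the Python sketch below tallies the two quantities named above: the number of committers and the number of deltas per committer. The record format, the sample names, and the function name are assumptions made for this example, not the study's data or tooling.

```python
from collections import Counter

# Hypothetical commit records as (committer, changed file) pairs.
# Parsing real CVS logs into this shape is omitted here.
SAMPLE_DELTAS = [
    ("alice", "sys/kern/sched.c"),
    ("bob", "bin/ls/ls.c"),
    ("alice", "sys/kern/lock.c"),
    ("carol", "usr.bin/make/main.c"),
    ("alice", "bin/ls/ls.c"),
    ("bob", "bin/cp/cp.c"),
]


def deltas_per_committer(deltas):
    """Return a Counter mapping each committer to their number of deltas."""
    return Counter(committer for committer, _path in deltas)


delta_counts = deltas_per_committer(SAMPLE_DELTAS)
num_committers = len(delta_counts)

print(num_committers)
print(delta_counts.most_common())
```

A Counter keeps the tally simple and makes it easy to rank committers by activity afterwards.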

The FreeBSD Project: Questionnaire

• Question 1: “What was the process used to develop FreeBSD?”
• Roles of FreeBSD participants
  • core developers – a small group of senior developers who are responsible for deciding the overall goals and direction of the project
  • committers – developers who have the authority to commit changes to the project CVS repository
  • contributors – people who want to contribute to the project but do not have committer privileges
• How do individuals identify what they will work on?
  • contributors do what is interesting
    • come up with a new feature
    • fix a bug they have found
    • search through the list of current bugs and desired features
  • core developers do the rest (mundane tasks, etc.)

The FreeBSD Project: Questionnaire

• Question 1: “What was the process used to develop FreeBSD?” cont.
• Code release deadlines (days before the final release)
  • 45 days – integration notifications sent out (15 days left!)
  • 30 days – peer reviews and last-minute bug fixes
  • 15 days – code freeze and weekly distributions start (beta versions)
  • 0 days – final release

The FreeBSD Project: Questionnaire

• Question 2: “How many people wrote code for new functionality? How many people reported problems? How many people repaired defects?”
  • 354 committers added code between 1993 and April 2003
  • 337 committers checked in new features
  • 6082+ unique individuals reported problems
  • 224 committers fixed problems

The FreeBSD Project: Questionnaire

• Question 3: “Were these functions carried out by distinct groups of people, that is, did people primarily assume a single role? Did large numbers of people participate somewhat equally in these activities, or did a small number of people do most of the work?”
  • 220 of the 354 committers did both bug fixing and new feature development
  • 47 committers contributed 80% of the deltas
  • End result: FreeBSD has many more top committers than Apache
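A figure such as "47 committers contributed 80% of the deltas" can be derived by ranking committers by delta count and accumulating until the target share is reached. The sketch below shows one way to do that. The function name and the sample counts are hypothetical and do not reproduce the FreeBSD data.

```python
def committers_needed_for_share(delta_counts, share_percent=80):
    """Smallest number of top committers whose deltas reach share_percent of the total."""
    total = sum(delta_counts.values())
    if total == 0:
        return 0
    running = 0
    ranked = sorted(delta_counts.values(), reverse=True)
    for rank, count in enumerate(ranked, start=1):
        running += count
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= share_percent * total:
            return rank
    return len(ranked)


# Hypothetical delta counts per committer (not FreeBSD data).
example_counts = {"alice": 50, "bob": 35, "carol": 10, "dave": 5}
print(committers_needed_for_share(example_counts))
```

With the sample counts, the two most active committers already cover 85 of 100 deltas, so the function returns 2.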

The FreeBSD Project: Questionnaire

• Question 4: “Where did the code contributors work in the code? Was strict code ownership enforced on a file or module level?”
  • Contributors were free to work on all parts of the code – ownership was not enforced
  • 30% of files were only changed by one developer

The FreeBSD Project: Questionnaire

• Question 5: “What is the defect density of FreeBSD code?”
  • Defect density is similar to that of Apache, and is lower than that of commercial products.

The FreeBSD Project: H1

• “Open source developments will have a core of developers who control the code base, and will create approximately 80 percent or more of the new functionality. If this core group uses only informal ad hoc means of coordinating their work, the group will be no larger than 10 to 15 people.”
  • Core group for FreeBSD consisted of 36 people
  • Core group for FreeBSD only wrote 47% of code
• Edited hypothesis: Core developers (< 15) will control the direction of the project while a larger group of top developers (< 50) will contribute approximately 80% of the code base, and their sum is less than 25% of all developers.
  • Spirit of this edited hypothesis matches the spirit of the original one, so the hypothesis is confirmed

The FreeBSD Project: H2

• “If a project is so large that more than 10 to 15 people are required to complete 80 percent of the code in the desired time frame, then other mechanisms, rather than just informal ad hoc arrangements, will be required in order to coordinate the work. These mechanisms may include one or more of the following: explicit development processes, individual or group code ownership, and required inspections.”
  • The high number of top committers in FreeBSD required more mechanisms for coordinating work than were needed in Apache; however, these methods were informal and not enforced.
  • H2 should be edited to suggest that more mechanisms are needed, but they don’t need to be formal.

The FreeBSD Project: H3

• “In successful open source developments, a group larger by an order of magnitude than the core will repair defects, and a yet larger group (by another order of magnitude) will report problems.”
  • This hypothesis holds for FreeBSD.
• Could other organizational structures also work?
• Is a good organizational structure necessary for success?

The FreeBSD Project: H4

• “Open source developments that have a strong core of developers but never achieve large numbers of contributors beyond that core will be able to create new functionality but will fail because of a lack of resources devoted to finding and repairing defects.”
  • FreeBSD had many contributors, so H4 could not be evaluated.

The FreeBSD Project: H5

• “Defect density in open source releases will generally be lower than commercial code that has only been feature-tested, that is, received a comparable level of testing.”
  • FreeBSD supports this hypothesis.
• This hypothesis refers to unstable OSS releases (those that have only been feature-tested), so the authors extend the hypothesis to include stable releases as follows:
  • “Defect density in OSS releases will be lower than commercial code that has only been feature-tested. If an OSS has a mechanism to separate unstable code from stable code or “official” releases, then the defect density of the stable code releases will be equivalent to that of commercial code after release.”

The FreeBSD Project: H6

• “In successful open source developments, the developers will also be users of the software.”
  • Developers of FreeBSD were also users, so H6 holds true.
• Can OSS developers who are not users be successful if they listen carefully to user feedback? In other words, is it necessary for OSS developers to also be users?

The FreeBSD Project: H7

• “OSS developments exhibit very rapid responses to customer problems.”
  • The authors didn’t have access to data about the responsiveness of FreeBSD developments to customer problems, so H7 could not be confirmed.

The FreeBSD Project: Final Questions

• Are the 7 hypotheses presented here necessary conditions for success?
• Are they sufficient?
• Do the hypotheses hold for projects of any size? For example, can a small OSS project that has only a small number of contributors and bug reporters be successful?
• Is the method for testing OSS projects more reliable than commercial testing practices?

Why work on OSS?

• Why would you develop a software product for free when you could be getting paid?
• Motivations:
  • User’s direct need for the software and software improvements worked upon
  • Enjoyment of the work itself
  • Enhanced reputation that may flow from making high-quality contributions to an open source project
• But do these motivations translate to the “necessary-but-mundane” tasks?
  • Not really
• So how do these tasks get accomplished?

Apache Field Support: The Study

• Apache field support is the “necessary-but-mundane” task examined in this study
• The Apache Development Group does not provide field support for Apache, so how do users get assistance when they hit defects in the program or simply do not understand how to use it?
• This study attempts to answer that question and determine what motivates people to voluntarily offer help to those who need it.

Apache Field Support: The Need for

• The Apache Development Group states:
  • “There is no official support for Apache. None of the developers want to be swamped by a flood of trivial questions that can be resolved elsewhere. Bug reports and suggestions should be sent via the bug report page. Other questions should be directed to the... newsgroup where some of the Apache team... or other HTTPd gurus who should be able to help.”
• The Usenet newsgroup is the place where
  • Information seekers post questions related to Apache
  • Information providers voluntarily answer those questions for all to see
• What motivates the information providers to devote their time/resources to help out the information seekers?

Apache Field Support: Research Methods

• Collect usage data from the newsgroup over a period of 4 years (1996-1999)
  • Number of unique questions posted
  • Number of information seekers/providers
  • Percentage of questions asked/answered by individual information seekers/providers
  • Length of time between a question being asked and being answered
  • Etc.
• Questionnaire sent to each individual who posted a question or an answer from Oct. 1, 1999 to Feb. 15, 2000
  • Used to gain insight into motives of information seekers and providers

Apache Field Support: Data from 1996-1999

• Information seekers and providers divided into 4 groups
  • Frequent seekers – seekers who posted 4 or more questions to the newsgroup in this time period (10% of seekers)
  • Other seekers
  • Frequent providers – providers who posted 10 or more answers to questions within the time period (10% of providers)
  • Other providers
• 50% of questions were asked by 24% of seekers
• 50% of answers were posted by only 2% of providers
• 50% of questions were answered within 24 hours
• 25% of questions did not receive responses at all
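The groupings and percentages above are simple tallies over the newsgroup posts. The Python sketch below shows how such figures could be computed. The record layout, the sample data, and the function names are assumptions for illustration, and only the thresholds (4 questions, 10 answers, 24 hours) come from the slide.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical question records: (seeker, asked_at, first_answer_at or None).
T0 = datetime(1999, 1, 1, 9, 0)
QUESTIONS = [
    ("s1", T0, T0 + timedelta(hours=2)),
    ("s1", T0, T0 + timedelta(hours=20)),
    ("s1", T0, T0 + timedelta(hours=72)),
    ("s1", T0, None),  # never answered
    ("s2", T0, T0 + timedelta(hours=1)),
]


def frequent_posters(post_counts, threshold):
    """Names whose post count meets the threshold (4 for seekers, 10 for providers)."""
    return {name for name, count in post_counts.items() if count >= threshold}


def share_answered_within(questions, limit):
    """Fraction of all questions whose first answer arrived within `limit`."""
    fast = sum(
        1
        for _seeker, asked, answered in questions
        if answered is not None and answered - asked <= limit
    )
    return fast / len(questions)


def share_unanswered(questions):
    """Fraction of all questions that never received an answer."""
    missing = sum(1 for _seeker, _asked, answered in questions if answered is None)
    return missing / len(questions)


questions_per_seeker = Counter(seeker for seeker, _asked, _answered in QUESTIONS)
frequent_seekers = frequent_posters(questions_per_seeker, threshold=4)

print(frequent_seekers)
print(share_answered_within(QUESTIONS, timedelta(hours=24)))
print(share_unanswered(QUESTIONS))
```

The same `frequent_posters` helper would apply to providers by passing a count of answers per provider and a threshold of 10.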

Apache Field Support: Questionnaire

• 2 versions of the questionnaire:
  • 1 for seekers
  • 1 for providers
• Questionnaire focuses on:
  • motivations
  • time spent posting
  • how much work the providers had to do in order to answer the question on the newsgroup
• Each person was asked to fill out the questionnaire only once, no matter how many times they posted to the newsgroup
• If a user answered a seeker questionnaire but turned out to be a provider more often than a seeker, the questionnaire was thrown out, and vice versa.

Apache Field Support: Questionnaire Results

• From the perspective of the seeker:
  • What is the cost of posting a question?
    • Only the cost of forming and posting the question. (Majority of information seekers – 90.4% – were not posting time-critical questions.)
    • Approximately 11.5 minutes to post
  • Would have taken about 115 minutes to find the information on their own
  • Benefit to cost ratio is very good!
• From the perspective of the provider:
  • Cost of answering a question broken into 2 parts
    • An able and willing provider must be matched with a seeker
    • An answer must be provided
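The seeker-side figures above imply a benefit-to-cost ratio of roughly 10 to 1. The short sketch below works through that arithmetic. It assumes the benefit is simply the search time the seeker avoids and ignores the time spent waiting for an answer, and the function and parameter names are hypothetical.

```python
def benefit_to_cost_ratio(minutes_to_self_solve, minutes_to_post):
    """Ratio of search time avoided to time spent writing the question."""
    if minutes_to_post <= 0:
        raise ValueError("minutes_to_post must be positive")
    return minutes_to_self_solve / minutes_to_post


# Figures reported on the slide: about 115 minutes saved for about 11.5 minutes spent.
ratio = benefit_to_cost_ratio(minutes_to_self_solve=115, minutes_to_post=11.5)
print(ratio)
```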

Apache Field Support: Questionnaire Results

• Cost of matching a provider to a seeker
  • Cost would be high if providers scanned the newsgroup only to find questions they could answer
  • Actually, providers read the newsgroup to gain valuable information about problems others are encountering and how they may be solved. This information is often valuable for managing and upgrading their own websites.
  • Providers stumble upon questions that they already know the answers to while they are reading, so matching providers to seekers adds no extra cost.
• Cost of answering the question
  • 50% of frequent providers spent < 1 minute answering questions
  • 80% of other providers spent < 5 minutes
  • Reason for low cost is that providers already know the answers
• Cost is low, but not zero, so why do it?

Apache Field Support: Provider Motivations

• What motivates providers to answer questions posted by seekers?
• According to the questionnaires, the primary motivations are:
  • The provider expects reciprocity
  • The provider is “helping the cause”
  • The provider will gain reputation or enhance career prospects
  • The provider finds answering questions to be intrinsically rewarding
  • It is the provider’s job (this is the least of the motivations)
• Are these responses valid?
  • Respondents may simply be saying what they believe to be “socially correct”

Apache Field Support: The Issues

• This study should be relevant to other OSS projects that employ voluntary online support for users, but is that guaranteed to be true? What are the potential issues?
  • The success of Apache field support is largely due to the fact that providers spend a lot of time reading the newsgroup in order to learn. What if we have a project where there is less to learn?
  • Will the newsgroup be as effective if question loads are much greater? Would there be enough providers?
  • What if the questions being asked are unique to the seekers and thus the providers do not have “off-the-shelf” answers for them?

Apache Field Support: Conclusions

• The “necessary-but-mundane” task of field support for Apache was accomplished using voluntary online support.
• Why were providers willing to give information for free?
  • “The public posting of both questions and answers created a site that potential information providers wanted to visit and study in order to gain valuable information for themselves.”
  • The public posts also included names of the providers, which allowed providers to gain reputation for posting.
  • Providers felt that since they were helped by reading the newsgroup, they should help others in kind.

Apache Field Support: Conclusions Cont.

• Many providers indicated that they found answering questions to be fun or intrinsically rewarding. In that sense, is the task truly mundane?
• What do you do with tasks that nobody finds fun?
• Are there other ways to recruit volunteers to perform mundane tasks?
• What other OSS-related “necessary-but-mundane” tasks might exist that people would or wouldn’t volunteer for?
