Florida State University Libraries

Electronic Theses, Treatises and Dissertations The Graduate School

2007 Is Bigger Always Better?: Toward a Resource-Based Model of Open Source Software Development Communities. Glen Sagers

THE FLORIDA STATE UNIVERSITY

COLLEGE OF BUSINESS

IS BIGGER ALWAYS BETTER? TOWARD A RESOURCE-BASED MODEL OF OPEN SOURCE SOFTWARE DEVELOPMENT COMMUNITIES.

By

GLEN SAGERS

A Dissertation submitted to the Department of Management Information Systems in partial fulfillment of the requirements for the degree of Doctor Of Philosophy

Degree Awarded: Spring Semester, 2007

The members of the Committee approve the Dissertation of Glen Sagers defended on March 12th, 2007.

David Paradice ______Professor Directing Dissertation

G. Stacy Sirmans ______Outside Committee Member

Molly Wasko ______Committee Member

Katherine Chudoba ______Committee Member

Approved:

______

Caryn Beck-Dudley, Dean, College of Business

The Office of Graduate Studies has verified and approved the above named committee members.

ACKNOWLEDGEMENTS

I would like to thank my dissertation co-chairs, Drs. Molly Wasko and David Paradice, for their patience, support and guidance in this process, and for their aid in clarifying my thinking on this study. I would like to thank my dissertation committee, Drs. G. Stacy Sirmans and Kathy Chudoba, for their advice on making this research more focused. I appreciate the advice and commiseration of my fellow doctoral students. I would like to thank Dr. Terry Dennis and the faculty at Illinois State University for believing in me during the final stages of this dissertation. I wish to thank my wife, Sharon, and my mother and father, Diane and Larry, for their proofreading of various drafts of this document. Finally, I wish to thank my wife and children for their infinite patience over the last few years; this would not have been possible without them to lean on.

I would also like to thank those individuals who have contributed to open source software. They not only made this dissertation topic possible, they made it much more pleasant to write. I would particularly like to thank those who have contributed to the OpenOffice.org project, the excellent office suite used for the writing and figures in this document. I also wish to thank those who have contributed to Bibus, a great citation manager that works perfectly with OpenOffice.org. Finally, I would like to thank those who have contributed to , KDE, Mozilla (Firefox especially) and the many utilities that make up a modern Linux distribution. In the six or seven years that I have used these software packages, they have progressed from programmers' hobbies to production-quality software, thanks to the contributions of many thousands of individuals.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT

1. INTRODUCTION

2. CONCEPTUAL MODEL & THEORETICAL FOUNDATION
  2.1 Background
    2.1.1 The OSS Concept
    2.1.2 OSS vs. Proprietary Software Development
  2.2 The Social Dilemma of OSS
    2.2.1 Public Goods
    2.2.2 Production of Public Goods
    2.2.3 Consumption of Public Goods
    2.2.4 Sustaining Public Goods
  2.3 Defining OSS Projects
  2.4 Prior Research
    2.4.1 Business and Advocacy Studies
    2.4.2 Demographic & Motivational Studies
    2.4.3 Economic Studies
    2.4.4 Software Quality Studies
    2.4.5 OSS Project Organization and Governance Studies
  2.5 Gap in Prior Research
    2.5.1 Communities
    2.5.2 Research Questions
  2.6 Resource-Based Model of Online Social Structures
    2.6.1 Resource Availability
    2.6.2 Benefit Creation Process
    2.6.3 Attraction and Retention
  2.7 Adapted Theoretical Model
    2.7.1 Assessing OSS Software Success
    2.7.2 Adapted Model

3. RESEARCH MODEL AND HYPOTHESES
  3.1 Hypotheses
    3.1.1 Resources and Communication Activities
    3.1.2 Communication Activities and Success
    3.1.3 Success and Sustainability

4. METHODOLOGY
  4.1 Sample and Procedures
  4.2 Measures
    4.2.1 Resources
    4.2.2 Support Communication Activities
    4.2.3 Development Communication Activities
    4.2.4 Software Success
    4.2.5 Community Success
    4.2.6 Attraction and Retention

5. ANALYSIS AND RESULTS
  5.1 Preliminary Analyses
  5.2 PLS Results
  5.3 Model Testing
  5.4 Exploratory Analysis

6. DISCUSSION
  6.1 Supported Hypotheses
  6.2 Partially Supported Hypotheses
  6.3 Non-supported Hypotheses

7. CONCLUSIONS
  7.1 Limitations
  7.2 Contributions
  7.3 Future Research
  7.4 Concluding Remarks

APPENDIX A. HISTORY OF OSS
APPENDIX B. OPEN SOURCE DEFINITION
APPENDIX C. SURVEY
APPENDIX D. HUMAN SUBJECTS DOCUMENTATION
  Informed Consent Documentation
  Human Subjects Approval

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table 1. Summary of Existing Literature
Table 2. Summary of Hypotheses
Table 3. Items Measuring Subjective Performance
Table 4. Items Measuring Trust in the Ability of Others
Table 5. Items Measuring Sense of Belonging
Table 6. ICC Values and Descriptive Statistics
Table 7. Average Variance Explained & Composite Reliabilities
Table 8. Correlation of Constructs and AVE Values
Table 9. Factor Loadings and Cross-Loadings for Multi-Item Scales
Table 10. Summary of Hypothesis Testing
Table 11. Step 1 - Impact of Resources on Communication Activity
Table 12. Step 2 - Impact of Resources on Success
Table 13. Step 3 - Impact of Resources on Attraction and Retention
Table 14. Step 4 - Impact of Communication Activity on Success
Table 15. Step 5 - Impact of Communication Activity on Attraction and Retention
Table 16. Step 6 - Impact of Success on Attraction and Retention
Table 17. Final Model
Table 18. Comparison of Hypothesis and Exploratory R² Values

LIST OF FIGURES

Figure 1. Composition of OSS Projects
Figure 2. Resource-Based Model of Online Social Structures
Figure 3. Theoretical Model
Figure 4. Hypothesized Path Model
Figure 5. Typical Sourceforge Project Page
Figure 6. Typical Discussion Thread
Figure 7. Herfindahl-Hirschman Index
Figure 8. Typical Bug Reports
Figure 9. Full Hypothesized Model
Figure 10. Data Collection Time Line
Figure 11. Path Model Results
Figure 12. Exploratory Analysis Model
Figure 13. OSS survey
Figure 14. OSS survey, continued
Figure 15. OSS survey, continued
Figure 16. OSS survey, continued
Figure 17. OSS Survey, Final Portion
Figure 18. Informed Consent Document
Figure 19. Human Subjects Approval

ABSTRACT

Open Source Software (OSS) has exploded over the last few years as a means of producing high-quality software. The members of OSS project communities develop and support the software on a mostly volunteer basis, usually with no financial remuneration. This software is then made freely available (in both monetary terms and licensing terms) to those who wish to utilize it. Much has been written about the use of OSS in business, motivations of the volunteers, OSS software quality and how OSS communities are organized and governed. Two aspects of OSS that remain unexplored revolve around how an OSS project community is sustained, and whether such a community is necessary for the success of the software. These questions form the basis for this study.

In this study, OSS is first demonstrated to have many properties of a public good, with the associated attributes of non-rivalry and non-excludability. Unlike typical public goods, OSS is not subject to underproduction as it may be disjunctively produced. It is not subject to overutilization either, since multiple copies may be made for essentially zero cost. The key issue to be investigated in OSS is neither production nor consumption of the public good, but rather how to sustain the project community which writes, supports, and improves the software. Sustaining this community is possible due to network effects – that is, the software becomes more useful as more individuals use it. Among this body of users are some individuals who are willing to donate their time and talents to the community.

A model of community success which proposes that resources furnished by project members are converted into benefits to the community through communication activities is utilized to answer the research questions driving this study. A community must maintain access to a pool of resources such as the time, energy, knowledge and material resources of its members. These resources are converted into benefits for the community through communication activities. Increased communication activities about support and development issues relating to the software lead to a more successful software product and a more successful community, as indicated by higher levels of social capital within the community. A more successful OSS project – in terms of both software and community – will be able to grow through retention of existing members and attraction of new members. These individuals in turn increase the resources available to the community.

Objective and survey data from 39 projects hosted on Sourceforge are examined longitudinally to determine whether the number of members in an OSS community influences communication activities within the community. The effects of communication activities among project members on software and community success are then measured. The influence of software and community success on the projects' ability to retain and attract members is assessed. Findings indicate that community size is crucial to maintaining communication activities among members and that increased size also leads directly to retention of existing members. The variety of topics in communications between members negatively influences some aspects of community success, while the number of bugs reported increases one measure of software success. Success of both the community and the software affects the attraction of new members, while only community success leads to the retention of existing users. Overall, the results indicate that size of the community does matter in writing, supporting and improving the software and show that an active community is crucial in sustaining the OSS project.

1. INTRODUCTION

Open Source Software (OSS) is a form of software which is mainly produced by voluntary, decentralized communities whose members meet virtually over the Internet. These individuals collaborate to produce a free (in monetary terms) software product which rivals its commercial counterparts and which is also freely (as in free speech) modifiable by those with interest and knowledge in software development.

Currently, OSS looms large in the public eye. There are many OSS projects which are increasingly utilized on personal computers, such as the OpenOffice.org office suite, the GAIM instant messaging client, the Bittorrent file-sharing program and the Mozilla Firefox web browser. On the server side, the Apache web server powers over 65% of the world's websites (Netcraft 2005), Sendmail touches over 70% of all emails sent across the Internet (Klimas et al. 2003) and Samba allows seamless access for Windows clients to material stored on UNIX servers (Samba Project 2005). Many computer magazines review OSS software on a regular basis (Keizer 2005; Sarrel 2005), indicating its prominence as an alternative to commercial software.

Governments around the world have investigated the benefits of OSS and many are deciding to implement open source programs. In Munich, Linux replaced Windows on some 14,000 desktops (USA Today 2003) and the Brazilian government estimates it could save $120 million per year by adopting Linux instead of Windows (Kingstone 2005). Closer to home, the state of Massachusetts is currently grappling with the issue of whether to specify an open document format (Savvas 2006) and following recent debacles with voting machines whose accuracy cannot be verified, several states are pushing for open source voting machines (BlackBoxVoting 2006).

In summary, OSS seems to be creating a revolution in how software is developed, maintained and acquired. From a practical standpoint, it has the potential to revolutionize the software industry and save millions of dollars in research and development costs (Ghosh 2006). From a theoretical standpoint, organizations and academics are investigating the ways in which OSS projects are organized and how other products and services – even university courses – can be made open (MIT 2006).

At the most basic level, the production of open source software (OSS) is a story of a mostly volunteer workforce uniting to produce valuable software which may be downloaded from the Internet at no cost. Looking beyond this level, OSS production flies in the face of traditional economic logic which states that it is difficult to induce a “community of human beings to organize and sustain organization for the production and maintenance of public goods” (Weber 2004, p.9). Traditional economic theory suggests that goods which are subject to collective provision are even more difficult to provide (Olson 1965). Contrary to these pessimistic predictions, the success of OSS suggests that collective action can be sustained by a community of volunteers.

Much has been written about OSS and many research projects are underway to understand issues surrounding OSS. Some streams of research examine how OSS is utilized in businesses and other endeavors (Adelstein 2004; Knight 2004; Mims 2004). Other research investigates why individuals voluntarily donate their efforts to OSS development (Casadesus-Masanell and Ghemawat 2006; Economides and Katsamakas 2006; Ghosh et al. 2002; Lakhani et al. 2002). Another area of interest examines the economics surrounding the production of OSS (Hawkins 2004; Kuan 2004), the quality of the software produced (Kuan 2002; Wheeler 2005; Zhao and Elbaum 2003) and how the communities which develop OSS are organized (Markus et al. 2000; Shah 2006). Much has been learned from these studies, but many productive avenues for research remain. One aspect of OSS development that remains unexplored involves how the active community of individuals who write and support OSS software is sustained and how that community influences the ultimate success of the software.

The key difficulty in sustaining a community to support, maintain and improve software lies in the negative impacts of size. These include difficulty in communication (Levine and Moreland 1990) and social loafing (Markus and Connolly 1990; Rafaeli and LaRose 1993). While these are major problems for typical communities, some authors have proposed that larger OSS communities will reap benefits such as more efficient distributed debugging (Raymond 1999), user-to-user support of the software (Lakhani and von Hippel 2003) and other network effects that result from many users (Weber 2004). If these benefits truly exist in OSS, a larger community should be more beneficial in providing them. A number of companies apparently believe that they can reap these benefits, including Netscape, who opened the source code to their flagship browser that has since grown into the Mozilla project and produced Firefox (a web browser making inroads against the dominance of Internet Explorer), Thunderbird (an email client), and other software. Similarly, Sun Microsystems experimented with OSS by releasing the code to their office suite, Star Office. This became OpenOffice.org which today is a full-featured office suite which has been improved, supported and maintained by volunteers from around the world. Sun has recently reaffirmed that they see tangible benefits provided by OSS communities by opening the source code of their flagship products, Solaris and Java. These projects have already attracted a community to maintain, improve and support them. Other companies have acted similarly, releasing software as open source to reap the benefits a community of users can provide (Bonaccorsi et al. 2006).

The two key questions to be investigated by this research are how an OSS project community is sustained, and whether an active community contributes to the success of an OSS project. Answering these questions will provide insight into whether OSS communities are able to manage the logistical complexity that often accompanies increases in size, and whether increased size provides the claimed benefits to the community in the form of more individuals supporting, maintaining, and improving the software through access to the source code. If larger communities are more successful in these endeavors, this would suggest that OSS projects must grow to be successful, and that commercial software companies would be well advised to utilize communities in testing their software, regardless of whether the source code is publicly available (Cothrel and Williams 1999; McDermott 2000).

2. CONCEPTUAL MODEL & THEORETICAL FOUNDATION

2.1 Background

This chapter introduces the basic concepts of OSS and compares and contrasts open source software development with proprietary software development. The dilemmas surrounding the production, consumption, and maintenance of OSS are discussed against a theoretical background of public goods, volunteerism and communities. A picture of the current state of OSS research is presented. Finally, the research questions guiding this study are presented and a theoretical model is developed.

2.1.1 The OSS Concept

Imagine, if you will, a “typical” home personal computer (PC) user. This user has purchased a new PC and wishes to use it for basic tasks such as managing personal finances and writing letters. Many current PCs come bundled with software for these tasks, but often these programs are “trial”, “basic” or otherwise feature-limited versions of the software to encourage users to buy the full versions. To be sure, the basic version of a financial program may suffice for many users to track their checking accounts, but will likely be insufficient for advanced users with more complex needs. At this point – so the software vendor hopes – most people will have become accustomed to the interface and features of the bundled program and will purchase the full version. However, the customer has other choices; she can go back to her previous pencil and paper method of tracking the account, or she can purchase a different piece of software.

Let us further suppose that our hypothetical user is “sold” on the advantages of keeping her account records electronically, but feels that the full version of the bundled software does not fulfill her needs or is too expensive. She visits her local computer store and with the help of the friendly salesperson, decides to purchase another title. She installs it, (hopefully) easily converts her data to the new format and proceeds to use the software. As time passes, if she is satisfied with the software, she will likely purchase upgraded versions of that software title with additional features. This pattern of purchasing commercial software as a commodity is commonly accepted as the main way to obtain software, but is certainly not the only method.

Besides illegal copying, which will not be discussed here, there are at least two other means for an end user to procure software. A second form of software is so-called “freeware”. As the name implies, no fee is charged to purchase freeware programs; they are usually freely downloadable from the Internet. Our typical user, who has decided that the bundled program which came with the shiny new PC no longer meets her needs, now has a second choice. Instead of going to her local software emporium, she may decide to search the Internet for a free program to manage her finances. Many pieces of freeware exist and it is likely that she will find something to meet her needs. Having found a suitable piece of software, she proceeds to download it in a format that her computer can understand and execute; this form is known as a “binary”. She installs the binary, converts existing data and uses the program. As time passes, she downloads updates to the program that add new functionality and fix software bugs and again is presumably satisfied with the software.

Both “payware” and freeware programs – hereinafter referred to as proprietary software – share the commonality that they are distributed primarily in binary form. The reasons for this are two-fold. First, binary code is not readable by humans, so the author's ideas and intellectual property may be protected from exploitation by others. Second, in the case of non-free programs, safeguards can be built into software binaries that help prevent piracy of the program, allowing the author to collect profits by selling many copies of the software to multiple users.¹

A third class of software exists in addition to these two types of proprietary software. It is this class of software, known as open source software (OSS), which may be changing not only the business of writing software, but also the core concepts of intellectual property. In OSS, rather than distributing the binary code and obscuring the ideas of the programmer, the human-readable source code is also distributed to allow – and even encourage – others to make use of, change and improve the software. In OSS, the goal is to distribute the programmers’ work, rather than to protect their ideas.

At its simplest, OSS is software for which the source code is freely available. There are some caveats to this simple definition, however. OSS software is developed by communities of programmers – mostly volunteers – who meet mainly via the Internet to create software (Ghosh et al. 2002; Lakhani et al. 2002). This code is copyrighted by the author, as soon as it is written down, just like any other idea which is fixed in tangible form. In order for others to legally access this code, permission must be granted by the programmers. For both proprietary and OSS software, copyright is superseded by license agreements. Proprietary license agreements typically allow certain rights, such as the ability to make a single backup copy and to install the software on a single PC. OSS licenses, on the other hand, allow a number of additional concessions, which are enumerated below.

¹ There is a further possibility for obtaining software which is mainly used in business-to-business situations, rather than as a commodity to consumers. In this situation, a contractor develops a custom software package for a customer. The customer may be provided only a binary, or may receive the source code to the program to allow them to fix bugs and add new features. These exchanges are governed by some type of contract that specifies what each party will receive and what may be done with the software, for example whether it may be resold, transferred or similar considerations.

In order to facilitate easy licensing by individuals writing software, a number of advocates of the idea of sharing software have banded together to create multiple OSS licenses. This group is known as the Open Source Initiative (OSI). This non-profit organization manages and promotes the Open Source Definition (OSI 2006b), a community-accepted definition of what OSS licenses should allow. For a more complete history of the influences that led to the creation of the OSI, see Appendix A.

The stated goal of the OSI is to provide a way of certifying that software that claims to be open source is consistent with the ideals of the OSS movement, by ensuring that OSS is released under licenses that agree with the principles of the Open Source Definition. The Open Source Definition states that OSS licenses must provide, in part, at least the following basic rights to users of the software, in addition to simple access to source code (OSI 2006b):

1. Free Redistribution – Everyone must be able to sell or give away the software.
2. Source Code – The source code must be included, or freely available (via download, for example). Additionally, the license must allow for redistribution of said source code, not just the compiled, binary form of the program.
3. Derived Works – The license must allow for modifications and derivative works; these works must be redistributable under the same license.
4. No Discrimination Against Persons or Groups – No person or group of persons may be discriminated against, nor may any field of endeavor be excluded from utilizing the software.

There are currently 58 licenses that meet these and the remaining requirements of the Open Source Definition. While these licenses differ on some fundamental points, all are considered “open source” licenses by the OSI. The full text of the Open Source Definition appears in Appendix B.

In summary, OSS is software distributed under a license which meets the Open Source Definition. In other words, OSS is software for which the source code is freely available; this source code is modifiable and redistributable by anyone with the desire and ability to do so. This contrasts sharply with proprietary software, where the source code is treated as a trade secret of the company or individual producing the software and typically protected at all costs. In OSS, the goal is to distribute the works and ideas of the programmers to allow others to benefit and to allow improvements by other programmers.

2.1.2 OSS vs. Proprietary Software Development

Besides the differences in the way in which OSS and proprietary software are licensed, there are differences in how each is developed. There are many software development methodologies; a full contextualization of all of them is beyond the scope of this research, but some comparisons will help illustrate the key differences between OSS and proprietary development.

Proprietary software is typically developed by teams of individuals working within a firm. As members of a firm, they are bound by contract to the firm, have roles which are largely defined by their position in the hierarchy of the firm, and work on tasks assigned by a manager of some type (Weber 2004). The product is designed in a “top-down” mode, then programmers write code which implements this design (Neus and Scherf 2005). The programmers are subject to schedules and have a specific list of deliverables (Mockus et al. 2002). The core goal of proprietary development is to deliver a finalized product for a profit (Feller and Fitzgerald 2000).

In stark contrast to this model, OSS is developed largely by volunteers organized into OSS projects² (Ghosh et al. 2002; Lakhani et al. 2002). These developers are not bound to the project by any type of formal contractual agreement. The tasks each developer wishes to perform are typically self-assigned (Crowston et al. 2005b; Mockus et al. 2002). Rather than explicit design, most OSS projects rely on writing code, releasing it for testing and making improvements in an evolutionary fashion, a strategy often called “release early, release often” (Raymond 1999, p. 7). Any deadlines or lists of specific features to be included in a given program version are determined primarily by users of the software (Mockus et al. 2002). Finally, as an evolutionary product, the goal of OSS is not necessarily to develop a finalized product; requirements are continuously elaborated and development and improvement continue as long as there are interested developers (Feller and Fitzgerald 2000).

² A full definition of the composition of OSS projects is given in section 2.3. Briefly, a project consists of the software, documentation about the software, and the developers and users of the software.

2.2 The Social Dilemma of OSS

Because OSS is developed by the collective action of a mostly volunteer workforce, it may be considered an example of a public good (Bitzer and Schroder 2005). A definition of public goods is presented to frame the discussion in this section. Two social dilemmas commonly associated with public goods are discussed. Based on this discussion, it is shown that these dilemmas are not problematic for OSS, unlike most public goods, but that OSS faces a different dilemma of maintaining a community of volunteers to maintain, improve and support the software.

2.2.1 Public Goods

Public goods are typically characterized by two properties, non-rivalry and non-excludability (Samuelson 1954). Non-rival means that a good is not used up in its consumption. All software is a non-rival good; one individual’s use of a copy of a program does not interfere with another's utilization of a copy of the same software. While making copies of proprietary software is often illegal, this has no bearing on whether the good is non-rival, as multiple users can utilize multiple copies – legal or illegal – without infringing on others' ability to do the same.

Since a true public good is not used up in its consumption, it makes little sense to incur additional costs by trying to exclude individuals from use of the good. Thus, the second characteristic associated with true public goods is non-excludability. Many physical public goods are de facto non-excludable; public parks, bridges and roads are generally open to all (Hardin 1982). Open source software is a non-excludable good by license. As outlined above, in order for a license to be approved by the OSI, it must fit the Open Source Definition, which mandates that no individual may be excluded from utilizing the software on any grounds (OSI 2006b). Since OSS is non-rival and non-excludable, it has the key characteristics that define a public good.

Public goods are typically subject to social dilemmas which center on free riding behavior. Free riders are those who utilize the efforts of others without helping to provide the good themselves (Dennis et al. 1990; Olson 1965). Free riders exist in many situations, including OSS. In OSS there are users – who may even make up the majority of all users – who simply utilize the software without contributing back to the OSS project (Weber 2004). The social dilemma occurs as follows: if everyone acts in their own self-interest, all individuals free ride on the efforts of others. However, if everyone free rides, the public good is not produced. This is a situation in which individual rationality leads to collective irrationality – i.e., a social dilemma.
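The logic of this dilemma can be illustrated with a small numerical sketch. The game below is a hypothetical linear public goods game and is not drawn from the cited sources: the group size of 10, the endowment of one unit and the multiplier of 3 are assumptions chosen only for illustration. Each member either keeps the endowment or contributes it; contributions are multiplied and shared equally among all members, contributors and free riders alike.

```python
def payoff(contributes, others_contributing, n=10, multiplier=3.0):
    """Payoff to one member of a hypothetical linear public goods game.

    contributes: True if this member gives up the endowment of 1.0.
    others_contributing: how many of the other n - 1 members contribute.
    All numeric values are illustrative assumptions.
    """
    endowment = 1.0
    total_contributed = others_contributing + (1 if contributes else 0)
    kept = 0.0 if contributes else endowment
    # The pooled contributions are multiplied and split equally among all n.
    return kept + multiplier * total_contributed / n


def free_riding_dominates(n=10, multiplier=3.0):
    """True if free riding pays more than contributing, whatever others do."""
    return all(
        payoff(False, k, n, multiplier) > payoff(True, k, n, multiplier)
        for k in range(n)
    )


everyone_contributes = payoff(True, 9)   # all ten members contribute
everyone_free_rides = payoff(False, 0)   # no member contributes
```

Under these assumed values, free riding yields the higher individual payoff whatever the others do, yet universal contribution (3.0 each) leaves every member better off than universal free riding (1.0 each).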

2.2.2 Production of Public Goods

The first part of this social dilemma involves the production of a public good. In order for any good to be provided, someone must bear the costs. Because public goods are non-excludable, it is in the best interest of each individual to allow others to bear the whole cost of providing the good, since they can reap the benefits of the good whether or not they contribute. Since each individual is interested in his or her own economic welfare, each individual is motivated to act in precisely the same way and allow others to provide the good. This individual rationality leads to collective irrationality with the result that no one will provide the good, creating a social dilemma which has been termed the social fence (Olson 1965; Yamagishi and Sato 1986). Additionally, as community size increases, each individual's contribution is proportionately smaller and less visible, thus the temptation to free ride on the efforts of others is larger (Gallupe et al. 1992; Olson 1965). Despite these theoretical difficulties in producing public goods, roads, bridges and parks all exist. These public goods are usually provided by government bodies, which have the legal authority to levy taxes and thus do not need to rely on voluntary contributions, eliminating the possibility of free riding.

No authoritative body compels individuals to write software and subsequently release it under OSS licenses. Under these circumstances, OSS ought not exist (Weber 2004), since self-interested individuals should free ride upon the efforts of others and not contribute code in the first place; but OSS definitely exists and appears to be flourishing. This seeming paradox is explained by the private production of a public good. Under certain circumstances a public good can be produced without outside bodies mandating contribution. This situation is known as disjunctive production. In disjunctive production, an individual – or a community – can cooperatively provide the good that benefits all (Yamagishi and Sato 1986). In this case, free riders are not detrimental, as the good will be produced by those few members. In fact, free riding is nearly certain to occur in the case of disjunctive production, due to the non-excludability of a public good, but the good will still be produced by the community. As a disjunctively produced public good, OSS is not subject to a lack of production due to free riding. A relatively small group – the OSS project community – can and does produce the good (Bitzer and Schroder 2005).

2.2.3 Consumption of Public Goods

The second part of this social dilemma is related to the consumption of the public good by free riders. When individuals utilize a good without contributing, there is a tendency to deplete the public good. Despite the definition of a public good as a non-rival good, most physical public goods are rival, meaning that they are used up in their consumption. Most physical public goods suffer from the effects of overuse and are subject to overcrowding. If too many people go to a park, it will become overused; if too many drivers are on a road, traffic will be heavy and wear and tear will increase. This situation is described in the “Tragedy of the Commons” (Hardin 1968), in which a group of herdsmen have common pasture land. This land must be shared among all the herdsmen, but the pasture cannot support all the animals that each herdsman wishes to graze. To avoid overgrazing, the herdsmen must each exercise individual restraint in how many animals they allow into the pasture. However, since each herdsman wants to maximize his own personal profit and since the group lacks formal contracts and enforcement, each herdsman will place all of his animals on the pasture and overgrazing will result. This state of over-utilization is quite common in many physical goods, but other public goods exist which are truly non-rival, as the theory of public goods suggests. OSS, like all software, is non-rival; an unlimited number of individuals may utilize multiple copies of a piece of software without hampering utilization by other users. As a virtual good, software – including OSS – is not subject to depletion by overuse; hundreds or thousands of copies can be made for essentially zero cost (Weber 2004).

To summarize, a community of developers can bear the entire effort of producing the software which may then be used by all. Thus, free riding in OSS does not impair production of the software. Since all software is non-rival, free riders do not pose a consumption problem. The key issue to be investigated, then, is neither initial provision nor consumption, but rather sustaining and maintaining the volunteer efforts of an OSS project community to ensure the ongoing development, support and improvement of the software.

2.2.4 Sustaining Public Goods

OSS projects depend mostly on voluntary collective action for the continued maintenance of both the software good and the associated community resources (Ghosh et al. 2002; Lakhani et al. 2002). Thirty years have been invested into research to explain how public goods may be produced despite the difficulties in inducing contributions. Successfully maintaining a public good without force remains a difficult proposition (Weber 2004). And yet, OSS communities exist and are flourishing. How can we explain the ability of the OSS project community to sustain and continue to add value to a public good in the face of this social dilemma?

The answer to how OSS can maintain a collective good is that OSS may be considered to be not only a non-rival good, but actually an “anti-rival” good (Weber 2004). While this term is somewhat unwieldy, it nicely captures the idea behind the concept more commonly known as network externalities. When goods are subject to network externalities, their value increases slightly with each additional user of the good. Commonly cited examples of anti-rival goods are telephones and fax machines. The first person with a telephone was not able to call anyone, but as soon as another individual installed a telephone, instantaneous communication was possible. As the number of subscribers to telephone service grew, each person was able to contact more people, increasing the value of the network.

OSS benefits from network externalities in that debugging becomes more efficient as more users utilize the software (Weber 2004). More famously, this claim is stated as “With enough eyeballs, all bugs are shallow” (Raymond 1999, p. 8). In other words, as more individuals use the software, more bugs will be exposed. Another benefit of large numbers of users is that when many people use a given piece of software, it becomes easier to exchange data with these other users. Based on these benefits, one would expect that OSS is an anti-rival good that actually benefits from network externalities and more users, even when most of those users do not actively contribute to the maintenance, support or improvement of the software. A large percentage of the overall number of users of the software will simply free ride on the efforts of others and never contribute back to the software; however, some small number of users will provide something of value to the OSS project, even if it means simply reporting a bug out of frustration (Weber 2004).

These users who report bugs, ask for new features and contribute fixes to the software are typically volunteers (Ghosh 2006). As such, they are individuals who perform some kind of productive activity at below market value (Wilson and Musick 1997). This definition does not preclude volunteers from benefiting from their work; it merely means that they are not directly remunerated for their activities (Wilson 2000). Volunteers perform their work for a variety of personal reasons, including social motivations, career expansion, as an expression of deeply-held convictions, and to enhance their own understanding and practice new skills (Clary et al. 1998). These motivations are consistent with developer motivations found in OSS literature such as enhanced reputation, development of software coding skills and belief in the principles that underlie OSS (Ghosh et al. 2002; Lakhani et al. 2002). As such, for the purposes of this study, it will be assumed that individuals volunteer for these reasons. Whatever their initial motivations to volunteer, individuals may remain active in the community and make a commitment to “an ongoing helping relationship that may entail considerable personal costs of time, energy and opportunity” (Clary et al. 1998, p. 1517). When individuals make such a commitment to an OSS community, they become the means of maintaining the software and associated community resources. Indeed, one definition of volunteering is “acting to produce a 'public' good” (Wilson 2000, p. 216), which fits well with the tasks performed by volunteers in OSS communities.

The volunteers who produce OSS band together into communities to produce, maintain and support the software. Communities are groups of individuals who work together to help each other achieve a common purpose (Cothrel and Williams 1999). Communities exist in many forms, including traditional “physical” communities who meet face-to-face (Levine and Moreland 1990) and online “virtual” communities who meet primarily or solely via computer mediated communication (Wenger 1998). Both physical and virtual communities typically form as individuals seek out other like-minded people with whom to communicate about their topic of choice. As individuals interact within a community, they form social bonds with others. These bonds, often referred to as social capital, are the basis for individuals feeling a sense of belonging to the community and trusting other members (Nahapiet and Ghoshal 1998). This social capital acts as the “glue” that holds individuals to a community; thus, volunteers may initially join a community for a variety of reasons, but stay due to the relationships they form with other community members. Social capital can enhance members' reputations and status within the community, but does not transfer well should an individual move from one community to another (Bourdieu 1986).

Communities go through a life cycle which includes a number of stages, such as “forming, storming, norming, performing, and adjourning” (Tuckman and Jensen 1977, p. 427), which correspond to start-up, resolution of conflict issues, adjustment of members to community norms, achievement of collective goals and finally the dissolution or death of the community. OSS communities have been shown to go through stages of forming around an idea or initial piece of software (Raymond 1999), conflict resolution (Elliot and Scacchi 2002) and achievement of collective goals (Crowston et al. 2003; Stewart and Ammeter 2002). Sometimes OSS projects are not sustained and dissolve; for evidence of this, one need look no further than the thousands of “dead” projects hosted on Sourceforge (Sourceforge 2006). The life cycle of a community is thus integral to the growth and development of that community; therefore attraction and retention of membership is vital to sustaining the community.

Regardless of their makeup or location, to sustain a membership and achieve common purposes, communities must first engage and involve their members (Cothrel and Williams 1999). All types of communities, including OSS communities, must provide something of value to participants in order to retain their membership (Butler 2001; Moreland and Levine 1982). Communities furnish this value by providing the very things that volunteers seek such as social contact (McClelland 1985), an opportunity to enhance their reputation (Lakhani et al. 2002; Lakhani and von Hippel 2003), and a means to become part of something larger than themselves by participating in collective efforts (Content Team 2006; Ostrom 1990). As communities fill the needs of members, they will be able to retain existing members and attract new members and thus grow in size. However, size has been shown to be detrimental to many communities, causing stress, reduced performance (especially in complex tasks) and reduced ability to communicate (Levine and Moreland 1990). Larger communities are also more subject to social loafing (Markus and Connolly 1990; Rafaeli and LaRose 1993), leading to reduced contributions per member. While computer-mediated communications can improve the ability to communicate effectively (Dennis et al. 1990; Gallupe et al. 1992; George et al. 1990), larger communities often have trouble retaining and attracting members even in online settings (Butler 2001).

The volunteers who form the OSS project community are drawn from a much larger pool of all users of the software which includes free riders (Weber 2004). The pool is heterogeneous in nature; some individuals are non-technical users, others may be considered power-users who have extensive application knowledge and still other individuals may be expert software developers who love to apply their skills to solving difficult problems. Among this large and diverse set of all users of the software is a relatively small number of individuals who are willing to expend their time and energy to contribute to the software development effort (Weber 2004). This phenomenon of needing a large pool of potential contributors in order to find and attract an appropriate number of volunteers is not unique to OSS; volunteerism studies consistently find that only half of Americans volunteer for any cause, and it may safely be said that the percentage who are interested in and able to contribute to OSS is much smaller (Clary et al. 1998; Wilson 2000; Wilson and Musick 1997). The Internet has enabled OSS projects to communicate more readily with this enormous pool of potential talent and connect with those who are willing and able to help; these interested individuals form OSS project communities (Weber 2004). These individuals are willing to participate in the disjunctive production of a public good even if it is not obviously in their best self-interest to do so.

In conclusion, the key issue in viewing OSS as a public good is not a question of provision or consumption; it is a question of sustainability. Regardless of whether a project is started by an individual or a community of like-minded individuals coming together to create a public good through collective action, sustaining the OSS project community through the actions of volunteers is critical for the support, maintenance and continued improvement of the software.

2.3 Defining OSS Projects

It is easy to delineate the boundaries of a typical proprietary software development team; it consists of the programmers and project managers who are paid for their services. Since there are no contractual mechanisms binding developers to OSS projects, it is more difficult to say who is a community member. As an ideal type, an OSS project consists of a software program (licensed under an “approved” license) developed by a community of one or more (mostly volunteer) developers who meet (primarily) via the Internet to design and author the software (Weber 2004). While simple, this definition fits many of the OSS projects in existence.

OSS projects exist within a larger “OSS movement” which, for purposes of this study, is defined as all software development projects whose software is distributed under an OSI-approved license. This OSS movement shapes attitudes of participants in OSS projects, partially through concepts embodied in the Open Source Definition about how OSS should work (OSI 2006b). Within the OSS movement, there are literally hundreds of thousands of OSS projects; one large OSS hosting site alone lists over 100,000 projects, in various stages of completion (Sourceforge 2005b). Sourceforge is just one of many options for online hosting of OSS projects and does not represent the sum total of all OSS projects (Berlios 2006; Freshmeat 2006; Savannah 2006).

Given that there are hundreds of thousands of OSS projects within the larger movement, it is safe to assume that not all projects are the same. Despite these differences, the historical influences that shaped OSS have created commonalities among OSS projects. Given the similarities among OSS projects, for the purposes of this study an OSS project is defined as consisting of a website (i.e., a virtual meeting space), the software itself, any codified knowledge about the software, the OSS “project community” and other users. Some of the elements included in this definition of an OSS project may not be present across all OSS projects, but these components are typical. Each component of this definition is described below.

As previously noted, the ideal type of an OSS project is composed of volunteers who collaborate virtually via the Internet to produce OSS. Since communications and development activity take place virtually, the project maintains a presence on the Internet. This website serves as a virtual meeting place for those interested in the software and provides a location from which anyone can download the software. In addition to creating software, OSS projects produce codified knowledge in the form of software documentation, email lists, bug and feature request databases and archived electronic communications between members, such as discussion forums.

Anyone interested may use these electronic resources to obtain the software, report bugs, request new features, communicate with others and find solutions to problems which may emerge as they utilize the software.

Producing software and all associated documentation and codified knowledge generally requires the efforts of a community. Many previous studies have included only “developers” – those who are primarily responsible for writing most of the software code – as members of the OSS project community (Crowston et al. 2005b; Hars and Ou 2001; Hertel et al. 2003). Recently it has been noted that developers are only one portion of the OSS project community (Crowston et al. 2006b). It has been proposed that users of the software find and fix bugs and request new features (Raymond 1999). In addition, these users share their knowledge to aid and provide support to other users (Lakhani and von Hippel 2003). Given the apparent value of contributions from those typically considered users of the software, for the purposes of this study, those users of the software who communicate with the developers (and each other) will be termed “active users” and included with developers as part of the OSS project community.

Not all users of OSS contribute by communicating with the community; some simply download and use the software. They do not contribute bug reports or ideas back to the community. As such, they would typically be considered free riders. Despite their lack of direct contributions to a given project, they serve as evangelists of the software, providing word-of-mouth advertising (Weber 2004). These free riders provide network effects and are the pool from which volunteers are drawn to maintain, support, and improve OSS. Because free riders provide network effects, they are included as members of OSS projects and as members of the OSS movement, but not included in the OSS project community.

To summarize, this research entails investigation of the OSS project as a whole. The project consists of much more than the software and is more than just the OSS project community. Taken as a whole, an OSS project includes the software, the project community and the store of knowledge and skills about the software. Some of this knowledge is codified in email list archives, discussion forums, bug databases, documentation and similar material found on the project website. The components included in this definition of an OSS project within the larger OSS movement are illustrated in Figure 1.

[Figure: the OSS Movement contains many OSS Projects. Each OSS Project comprises a Project Website, the Software Product, Codified Knowledge (Documentation, Email Lists, Forums, Bug Tracking), the OSS Project Community (Developers and Active Users) and Other Users (Free Riders).]

Figure 1. Composition of OSS Projects

2.4 Prior Research

A review of the existing literature surrounding OSS reveals five major streams of study. First are business or advocacy case studies of the putative advantages of OSS. Next come studies that investigate the demographics and motivations of individuals participating in OSS project communities. Third, a number of studies investigate the economics behind the production of OSS. Fourth, many studies have investigated the quality of software produced by OSS projects. The final stream examines how OSS projects are organized and governed and how work is coordinated within the project. Each of these research streams will be examined below. A summary of some representative papers is found in Table 1. The papers selected for this table represent the state of research over the past 6 years or so (2000-2006). During the first four years of this period, there were only a few papers published each year. Since 2004, there have been dozens of papers published which highlight the impacts of the principles underlying OSS on fields as diverse as education, computer software and management. Despite the diversity of outlets in which OSS literature has been published, the main research streams have remained consistent, as OSS has common impacts across fields of study in terms of economics and governance practices. The following summary does not include all literature on OSS development, but rather highlights each research stream.

2.4.1 Business and Advocacy Studies

The first stream of literature includes business and advocacy studies which typically appear in trade journals, magazines and other popular press outlets. Often, these articles discuss the reasons for a corporation's decision to adopt open source software and what the company expects to achieve as a result of that adoption. Other articles discuss such matters as whether Linux needs a “killer app” – that is, an application so exceptional that it makes the underlying operating system stand out in comparison with MacOS or Windows (Adelstein 2004; Knight 2004). Another common article topic in this stream involves comparing features of a given product vis-à-vis a proprietary product; comparing Linux with Windows or Apache with IIS, for example. A key feature of most of these studies is that they involve a single firm using OSS software and focus on technical issues surrounding the software.

These studies are useful to practitioners or potential adopters of OSS. By discussing potential pitfalls and advantages of implementing OSS products these articles make it possible for implementers to make more accurate decisions about whether OSS will be useful in their specific context. Due to their often narrow scope – one or a few pieces of software – they are of only limited value in scholarly research, except to illustrate trends.

2.4.2 Demographic & Motivational Studies

The second stream of OSS research examines the demographic composition of OSS development projects and investigates the personal motivations of those involved in creating OSS (Ghosh et al. 2002; Lakhani et al. 2002). Most of these studies are based on surveys of individuals who are participants in an OSS project and ask questions such as “Please indicate your top three reasons for contributing to this project.” and “Have you been financially compensated in any way for participating in this project?” (Lakhani et al. 2002; Roberts et al. 2006). Various demographic questions are often asked, such as where the individual resides, educational levels, age, gender, and so forth.

These studies provide a good baseline for comparing OSS projects and have shown that volunteers, who are not paid for their contributions, form the highest percentage of contributors to most OSS projects³. Further, these studies have yielded valuable data to suggest that certain intrinsic and extrinsic rewards motivate individuals to volunteer their efforts to OSS projects (Ghosh et al. 2002). The key contribution of these demographic and motivation studies lies in identifying some of the many reasons for individual contributions, as seen by the participants themselves (Bagozzi and Dholakia 2006; Weber 2004). Some of the motivations and rewards identified by these types of studies include recognition from peers, the desire to improve one's skills, a better reputation within the community and belief in the OSS philosophy (Lakhani et al. 2002). While a complete analysis of these motivations is outside the scope of this study, the fact that some individuals are motivated to participate is invaluable; indeed, these volunteers make up a large percentage of OSS project communities (Ghosh 2006).

2.4.3 Economic Studies

The third research stream investigates the economics of OSS. Two main approaches are taken within this stream. First, some scholars investigate the economic rationale for individuals voluntarily giving away the fruits of their labors. The findings of these studies indicate that individuals accrue some type of private benefit from their actions, such as improving their reputation, showcasing their skill in writing software, or other intrinsic rewards (Lerner and Tirole 2002; Weber 2004). The second approach to the economics of open source considers why a firm would voluntarily open the code to a previously proprietary product. Again, the answer seems to be that firms can reap some type of private benefit from opening their source code and allowing others to read and contribute to it. These benefits include tapping the skills of many volunteer coders to debug and improve their product (Hawkins 2004). In both the individual and firm cases, there is some economic (if not financial) benefit to contributing to OSS projects.

2.4.4 Software Quality Studies

A fourth research stream investigates the quality of the code produced by OSS projects. This is typically done by comparing the code from one OSS product with the code from a proprietary piece of software that is similar along some dimensions. Some typically used dimensions are size, usually measured as LOCs or KLOCs (lines of code/kilo lines of code) (Mockus et al. 2002), number of bugs or bug fix rates (Kuan 2002), idealized measures of development effectiveness (Stamelos et al. 2002) and software complexity (Kuan 2001).

The findings from this research stream seem to indicate that OSS code is usually, but not always, better than proprietary code. The findings of these studies seem to depend to some extent upon which dimensions are chosen for comparison. Increased utilization of OSS software in the last few years is perhaps the most telling evidence that OSS code is generally of good quality. To name just two examples, OpenOffice.org has seen steadily increasing downloads of their office suite (OpenOffice.org 2006) and the Apache web server powers more websites every year (Netcraft 2005).

³ Some articles suggest that while unpaid volunteers make up the bulk of contributors, they do not make up the bulk of contributions. This finding seems very dependent on which particular projects are studied, as some large projects have paid developers, but many smaller projects do not.

2.4.5 OSS Project Organization and Governance Studies

The fifth research stream examines how OSS development projects are organized and governed and how work is coordinated. It has been suggested that OSS projects will naturally have a flat organizational structure with users and developers interacting without the restrictions normally imposed by hierarchical organizational structures (Raymond 1999). However, this vision of classless interaction does not seem to be the norm among OSS projects. Various other authors have found that OSS development projects are organized as hierarchies (Gacek and Arief 2004; Kuk 2006), in onion-like layers (Mockus et al. 2002), or as social networks (Crowston and Howison 2006).

OSS project communities must rely on non-contractual mechanisms to induce volunteers to write code. This leads to the use of social norms to induce compliance with the expectations of the community (Elliot and Scacchi 2002; Sagers 2004) and the use of network forms of governance (Markus et al. 2000; Sagers 2004). OSS project communities utilize many coordination mechanisms that are similar to the mechanisms used in traditional proprietary software development (Crowston et al. 2005b). However, since there are no contracts binding the majority of developers to a particular project, personal interest on the part of the volunteer plays a major role in assigning tasks; that is, tasks are typically self-assigned. Other coordination mechanisms include asking at large within the community for volunteers for a given task, asking within a select group for volunteers and asking potential contributors to consult and coordinate with others before starting a task (Crowston et al. 2005b). In general, many OSS project communities utilize governance and coordination mechanisms that rely on non-contractual means of inducing compliance with the projects' aims.

In summary, there have been many excellent studies of the software produced by OSS processes, of individual motivations to contribute to OSS development and of how OSS projects are governed. This prior research identified many salient features of OSS projects and development, such as the composition of OSS projects and the value of the created product. Additionally, prior research indicates that the practices used in developing OSS can provide benefits to many other fields of endeavor, as elaborated in the following section.

Table 1. Summary of Existing Literature

Paper Level of Analysis/Methods Findings Business and Advocacy Case Studies Knight, 2004 Level: Software Discusses the trend of applications originally written for Does Linux really Method: Advocacy case Linux being ported to Windows. need a “killer app” to study succeed? Gives support for the idea that applications, not the underlying platform are the reason people start using OSS programs (i.e., users are introduced to Firefox on Windows, OpenOffice.org on Windows, not these programs under Linux). Adelstein, 2004 Level: Software Identifies specific needs of corporate and home users of Desktop Linux: The Method: Advocacy case Linux as a desktop platform, such as improved support for Final Hurdles study laptops, more uniformity in the printing system and support for more devices.

Claims that with these items in place, Linux may take off as platform of choice. Mims, 2004 Level: OSS vendors News article concerning a partnership between several Novell, IBM and HP Method: News Article Linux vendors to allow the hardware companies to ship and unite efforts to put support SUSE Linux on their servers. Linux on top Demographic & Motivational Studies Ghosh, Glott, Krieger, Level: Individual Motivations: Learning and personal skill development, & Robles, 2002 Method: Surveyed share knowledge with community, enjoy OSS methods, Free/Libre and Open developers from a selection free-software ideology. Source Software: of OSS projects Survey and Study Demographics: Average age 27, 33% bachelor's degree, (a.k.a. FLOSS Survey) 28% masters, 53% professional programmers, worldwide distribution. Lakhani, Wolf, Bates Level: Individual Motivations include intellectual stimulation, improving & DiBona, 2002 Method: Surveyed 10% (coding) skill, need software for work or play, free-software The Boston Consulting random selection of projects ideology, feel obligated to give back.. Group Hacker Survey on Sourceforge, personal email link to web-based Demographics: Average age 30, 98% male, worldwide survey distribution, 45% professional programmers (not necessarily for OSS), 70% volunteer/30% paid.

20 Table 1. Continued

Paper: Hertel, Niedner & Herrmann, 2003. Motivation of Software Developers in Open Source Projects: An Internet-based Survey of Contributors to the Linux Kernel
Level: Individual
Method: Surveyed members of several kernel mailing lists. Respondents were self-categorized as “developers” or “interested readers”
Findings: Motivated by identification with community, pragmatic motives to improve software, self-efficacy and belief in free-software ideology. No demographics reported.

Paper: Bagozzi & Dholakia, 2006. Open Source Software User Communities: A Study of Participation in Linux User Groups
Level: Individual
Method: Survey
Findings: Surveyed participants in Linux user groups rather than development projects, meaning that “typical” reasons for participation, such as enhanced reputation and prestige, were presumably absent. Found that members contributed because they felt social identity with the community.

Paper: Ghosh, 2006. Study on the: Economic Impact of Open Source Software on Innovation and Competitiveness of the Information and Communication Technologies (ICT) Sector in the EU
Level: Software Project
Method: Surveyed OSS developers, also conducted economic analysis.
Findings: Although primarily an economic impact study, findings for demographics are also reported, indicating that 65% of OSS is written by individuals, firms contribute 15%, and institutions another 20%. Demographics: Europe is the leading region for globally-collaborating developers, followed closely by North America. Average age, educational levels and professions of OSS developers similar to previous FLOSS and BCG surveys.

Paper: Roberts, Hann & Slaughter, 2006. Understanding the Motivations, Participation, and Performance of Open Source Software Developers: A Longitudinal Study of the Apache Projects
Level: Individuals
Method: Archival & survey data of participants in Apache projects.
Findings: Utilize mostly previously-developed measures of motivations. Developers' motivations are related in complex ways. Extrinsic motivations (payment) for contributions to Apache projects do not diminish intrinsic motivations for contribution. The contribution levels of paid developers are higher than unpaid, but intrinsic motivation levels do not impact contribution levels for either paid or unpaid developers.

Economic Studies

Paper: Lerner & Tirole, 2002. Some Simple Economics of Open Source
Level: Individual developers & individual firms
Method: Economic analysis
Findings: Individuals should participate if benefits of contribution exceed opportunity costs, corporations should open code if it can boost profits in excess of costs.

Paper: Hawkins, 2004. The Economics of Open Source Software for a Competitive Firm
Level: Individual firms
Method: Economic Analysis
Findings: Firms should open source code if it appears possible to attract contributions from community that will offset costs of in-house development.

Paper: Bitzer & Schröder, 2005. Bug-Fixing and Code-Writing: The Private Provision of Open Source Software
Level: OSS project contributors
Method: Simulation
Findings: Provide economic arguments supporting the reasons that individuals would contribute their efforts to privately provide a public good. Simulation results indicate that OSS will be provided by young, low-cost individuals who gain private benefits related to signaling their programming skills.

Table 1. Continued

Paper: Bonaccorsi, Giannengeli & Rossi, 2006. Hybrid Business Models in the Open Source Software Industry
Level: Firms
Methodology: Surveys of Italian software firms
Findings: Many Italian software firms have adopted a hybrid model of licensing for their products and revenue streams. In other words, some of their products are licensed under OSS licenses with revenues coming from providing commercial support, while other products are proprietarily licensed and income comes from sales.

Paper: Casasdesus-Masanell, Ghemawat, 2006. Dynamic Mixed Duopoly: A Model Motivated by Linux vs. Windows
Level: Firms
Methodology: Economic analysis
Findings: Analyze the competitive relationship between Linux and Windows, and find that Linux is an (economically) viable competitor to Windows, but unlikely to totally displace it from the market. Forecasts a continuing duopoly in the operating systems market. (Note: Does not take other operating systems, such as Mac OS or various Unix versions, into account.)

Paper: Economides & Katsamakas, 2006. Two-Sided Competition of Proprietary vs. Open Source Technology Platforms and the Implications for the Software Industry
Level: Firms
Methodology: Economic simulation
Findings: Found that proprietary software platforms often have equilibrium price below marginal cost. Open source platforms have a larger variety of applications available, but lower total (economic) profits than proprietary platforms. When OSS competes with proprietary software, the proprietary system is likely to dominate in terms of market share and economic profits.

Paper: Ghosh, 2006. Study on the: Economic Impact of Open Source Software on Innovation and Competitiveness of the Information and Communication Technologies (ICT) Sector in the EU
Level: Software Project
Method: Surveyed individual OSS developers & analyzed economic impact.
Findings: Found direct economic impacts of Euro 12 billion to reproduce OSS directly. OSS code base overall has doubled every 18-24 months over past eight years, and is projected to continue this growth for the next few years. OSS potentially saves industry 36% in research & development investment.

Software Quality Studies

Paper: Schmidt & Porter, 2001. Leveraging Open Source Communities To Improve The Quality And Performance Of Open Source Software
Level: Software
Method: Continuous code quality analysis
Findings: Use the SKOLL project to perform continual testing and profiling of two other OSS projects' code. Initial findings indicate that continual regression testing and profiling finds bugs that are not exposed by the ad-hoc quality assurance processes used by the two open source projects.

Paper: Mockus, Fielding & Herbsleb, 2002. Two Case Studies of Open Source Software Development: Apache and Mozilla
Level: Software
Method: Studied defect density and time required to resolve defects, compared to several closed-source products. Developed hypotheses based on findings
Findings: Defect density will generally be lower in OSS and OSS will generally respond very quickly to customer problems. There was some variance between Apache and Mozilla, which the authors ascribe to organizational structure differences.

Table 1. Continued

Paper: Stamelos, Angelis, Oikonomou & Bleris, 2002. Code Quality Analysis In Open Source Software Development
Level: Software
Method: Use software measurement tools to compare quality of 100 Linux applications with industrial standards
Findings: Find that the applications are higher quality than would be expected given lack of controls over development process, but lower than the industrial standard used.

Paper: Kuan, 2004. Is Open Source Software “Better” Than Closed Source Software? Using Bug-Fix Rates to Compare Software Quality
Level: Software
Method: Compares bug fix rates of pairs of open and closed source web servers, operating systems and user interfaces, using a hazard rate model. (Note: includes what are often considered feature requests as bugs)
Findings: Found that in two of three cases, OSS fixed bugs more quickly than comparable closed source program. Also noted that more bugs were found in closed source than OSS products, contrary to Raymond's prediction, but does not follow up on this statement.

Paper: Zhao & Elbaum, 2003. Quality Assurance Under The Open Source Development Model
Level: OSS Project
Method: Survey of developers
Findings: Surveyed developers about quality assurance practices in OSS projects. Findings indicate that OSS has introduced new methods of quality assurance, but that they may not be exploitable under some proprietary scenarios. Also note that OSS quality assurance activities are evolving.

Paper: Wheeler, 2005. Why Open Source Software / Free Software (OSS/FS)? Look at the Numbers!
Level: Meta-analysis
Method: Summaries of existing white papers and academic studies
Findings: Summarizes dozens of white papers and academic studies with goal of showing that OSS is better than proprietary software in the areas of market share, reliability, performance, security, scalability and total cost of ownership.
OSS Project Organization, Governance and Coordination

Paper: Raymond, 1999. The Cathedral and The Bazaar
Level: OSS project
Method: A treatise on how OSS should work. Bazaar model often touted as how OSS does work; lots of individuals contributing, each with own approach; releasing code early and often.
Findings: Proposes that OSS communities will essentially be flat, with users interacting with developers and project leaders, without an organizational hierarchy. Also proposes that users will contribute bug reports and code. Basis for claims of “With enough eyeballs, all bugs are shallow” and “Every good piece of software starts by scratching a developer's personal itch”.

Paper: Markus, Manville & Agres, 2000. What Makes a Virtual Organization Work?
Level: OSS project and OSS movement
Method: Case study
Findings: Discusses governance methods used in OSS projects to coordinate and motivate a volunteer workforce. Motivation is discussed in terms of items thought to be important to OSS developers, such as reputation, social status, OSS ideology and increased skill sets. Coordination and governance are framed in terms of network governance factors, such as monitoring and collective sanctions, vetting process, culture and rules. Suggests ways in which these governance styles may be applied to other organizations.

Table 1. Continued

Paper: Elliot & Scacchi, 2002. Communicating and Mitigating Conflict in Open Source Software Development Projects
Level: OSS project
Method: Archival data (chat logs, email list archives and electronic forums)
Findings: Investigates how conflict is mitigated and resolved in OSS projects. Find that belief in , in open disclosure of ideas and in the value of freedom of choice in choosing work assignments are valuable to OSS developers. Overall, the OSS ideology aids in resolving conflict, as does the archiving of messages, which may contain discussions about the conflict.

Paper: Mockus, Fielding & Herbsleb, 2002. Two Case Studies of Open Source Software Development: Apache and Mozilla
Level: OSS project
Method: Investigated internal structure of Apache and Mozilla through published descriptions from each project; compared this with who actually contributed code.
Findings: Discovered that in Apache, multiple developers added code to any given module, rather than strictly enforcing “code ownership”. Mozilla had more core individuals with more enforced “ownership”. Posit that projects are organized hierarchically, with core developers, an order-of-magnitude larger group that fixes bugs and another order-of-magnitude group which reports problems. This model is part of the basis of the “onion-like layers” models of OSS community structure.

Paper: Crowston & Howison, 2005. Hierarchy and Centralization in Free and Open Source Software Team Communications
Level: OSS project
Method: Use SNA to determine centrality in bug-fixing lists for OSS project communities (projects selected from Sourceforge)
Findings: Centrality varied widely, meaning that different communities communicated in different ways; some coordinated efforts through a small number of individuals, in others, everyone talked to everyone else. Most were somewhere in between, suggesting a development team which coordinates efforts. Raises interesting questions about how much community influences the claimed advantages of OSS.
Paper: Crowston & Howison, 2005. Coordination of Free/Libre Open Source Software Development
Level: OSS project
Method: Coding of email interactions between developers in OSS projects
Findings: Uses coordination theory approach to determine how developers manage coordination of software development tasks within projects. Found that tasks are mostly self-assigned, however some assignment of tasks to specific individuals occurs, as does asking for volunteers to perform a task.

Paper: Wasko, Sagers & Dickey, 2005. Network Governance in Open Source Software Development Projects
Level: OSS Project
Method: Survey & archival data from Sourceforge
Findings: Examined OSS projects based on theoretical framework of network governance. Tested whether social controls and trust influence coordination & conflict management within OSS projects and whether coordination & conflict management influence project success. Higher levels of network density among OSS project members yield greater concern about individual reputations, but does not lead to less access to development team. Restricted access, concern about reputation and trust lead to better coordination, but not to coordination and better conflict management. Better coordination leads to a more successful OSS project, conflict management does not improve project success.

Table 1. Continued

Paper: Kuk, 2006. Strategic Interaction and Knowledge Sharing in the KDE Developer Mailing List
Level: Individual developers
Method: Coding of contents of mailing list postings
Findings: Shows that while the majority of contributions to the KDE project come from a small number of developers, these individuals interact strategically with other key developers to share knowledge. While this approach is effective in expanding knowledge sharing, evidence is presented that extreme concentration of developers may actually reduce knowledge sharing. The input of other community members cannot be ignored with impunity.

Paper: Grewal, Lilien, Mallapragada, 2006. Location, Location, Location: How Network Embeddedness Affects Project Success in Open Source Systems
Level: Individuals within projects
Methodology: Social network analysis to determine embeddedness of individuals within and between projects
Findings: Found that the relationships among developers spanned different projects (i.e., developers contributed to more than one project), and that the level of embeddedness of developers does influence project success. Findings show that levels of embeddedness have different effects on commercial and technical success.

Paper: Shah, 2006. Motivation, Governance, and the Viability of Hybrid Forms in Open Source Software Development
Level: Individual developers
Method: Survey, archival, and interview data from Sourceforge
Findings: Compares motivations for contribution to two OSS communities with different governance structures. The main driver for contributions to both communities was the desire for software-related improvements. The majority of participants left the community once their software needs were met; however, a small subset remained involved. The motivations of developers changed over time and for those who remained with the project, development became a hobby, but governance structures affected these changes in motivations.

2.5 Gap in Prior Research

2.5.1 Communities

To date, many studies have proposed that an active community will improve the success of OSS projects, but no studies have investigated this claim empirically. The purpose of this dissertation research is to fill that gap. It has been posited that a community of users aids an OSS project in such areas as distributed debugging (Raymond 1999), system testing (Mockus et al. 2002), providing assistance to other users (Lakhani and von Hippel 2003) and in providing novel ideas for improving the software (Raymond 1999). A larger community has a larger volunteer labor force to perform these tasks and therefore should be more efficient in supporting, maintaining and developing the software, but since size has negative impacts on many communities, the effect of community size on the ultimate success of OSS projects must be empirically tested.

The value of an active community has been demonstrated in a number of fields. In high-tech industries, a community of early adopters of products (termed “lead users”) provides ideas and working prototypes to manufacturers, who incorporate the improvements into their products (von Hippel 1986). Similarly, in the sporting goods industry, windsurfers designed innovations which were incorporated into new product releases (Franke and Shah 2003). Only one study has examined the extent to which OSS users contribute to the project in the form of bug reports and fixes and in the form of feature requests or code to add new features (Mockus et al. 2002). The findings of this study indicate that many individuals in the community reported bugs, while a smaller group of individuals fixed them. While this study is a valuable first contribution, the findings may not be widely applicable, since the OSS projects studied were Apache and Mozilla, large, complex projects which are likely atypical compared to the hundreds of thousands of less “visible” projects. Consequently, a gap remains in determining the effects of communities on the success of OSS projects.

2.5.2 Research Questions

As a disjunctively produced public good, OSS software may be produced by a single individual. However, such a course of action makes one individual responsible for writing, supporting, fixing, and improving the software. These tasks can be shared among members of a community, but it is not certain that the advantages of a community sharing tasks outweigh the extra costs of coordination and difficulties in communication. Communities are hypothesized as important to performing the necessary tasks to support, maintain, and improve the software. Since the value of communities has been demonstrated in other fields (Franke and Shah 2003) and is hypothesized as important to OSS projects (Lakhani and von Hippel 2003; Mockus et al. 2002; Raymond 1999), this study attempts to discover empirically whether an active community aids OSS projects in ways that influence the ultimate success of the project. If, as previously theorized, an active OSS project community truly helps by supporting, maintaining, and improving the software, it is of utmost importance that the project maintain a thriving community. This leads to the general research questions driving this study; namely:

1. How is an OSS project community sustained?
2. How does an active community contribute to the success of an OSS project?

2.6 Resource-Based Model of Online Social Structures

In order to address these general research questions, a theoretical basis is necessary that explains how communities are sustained and add value to the production of a good or service. One such theoretical framework, the Resource-Based Model of Online Social Structures, proposes that to be sustainable, a community must provide positive benefits to its members (Butler 2001). These benefits must outweigh membership costs in order for the volunteers to remain with the community and allow growth (Moreland and Levine 1982). The core premise of this model is that a community must “maintain access to a pool of shared resources and support the social processes that convert those resources into valued benefits for the participants” (Butler 2001, p. 347) to retain existing and attract new members.

2.6.1 Resource Availability

The resources available to communities are typically provided by the members of the community and include the “knowledge, time, energy, money and material resources” of individuals (Butler 2001, p. 347). The precise form resources take depends somewhat on the goals of the community; for example, in online interest communities there must be members who are knowledgeable about the topics of interest (Wasko and Faraj 2005). Similarly, in a support group, some members must be willing to expend the time and energy to be supportive (Butler 2001). In OSS communities, the individuals must be interested in supporting, maintaining, and improving the software.

By aggregating the time, knowledge, talents and other resources of its members, a community can draw from a larger pool to create benefits (Rice 1982). In cases where members are the primary source of resources, membership size is the primary determinant of resource availability – larger communities may possess greater economic resources (McPherson 1983) or more information about the problem at hand (Wittenbaum and Stasser 1996). However, as previously noted, larger communities may have trouble with communication and social loafing; thus, resources alone are not sufficient to sustain the community. The resources must be converted to benefits through some process (Butler 2001).

2.6.2 Benefit Creation Process

Communities, whether online or traditional, create many benefits for their members. Some of these benefits include interpersonal relationships and companionship (McClelland 1985; Wellman et al. 1996), social capital and increased reputation (Lakhani et al. 2002; Nahapiet and Ghoshal 1998), increased ability to spread information (Kaufer and Carley 1993) and a support structure for collective action (Ostrom 1990). Members must communicate with each other to successfully convert member-provided resources into benefits for individuals or the entire community. This communication is the heart of the benefit creation process for all online or traditional communities (Butler 2001).

Communication is essential to share knowledge resources, to coordinate activities and for the social support that develops relationships among members (Wellman and Wortley 1990). Communication activities are important in such functions as problem-solving efficiency (Shaw 1981) and community cohesiveness (Shaw 1981; Stogdill 1959). Therefore, in addition to the level of resources available to the community, the level of communication activities is vital in converting resources into benefits. No matter what resources are available to communities – online or traditional – without communication the resources remain dormant and members receive no benefits (Butler 2001). While communication typically becomes more difficult as the community grows, there are ways to mitigate the negative effects of size, such as providing formal structures for communication and reporting (Galbraith 1974), or utilizing alternative communication technologies such as computer mediated communication, which tends to lead to an organization with less internal hierarchy (Davidow and Malone 1992). If a community can manage the increased communication activity that may come with increased size, it can be more successful in converting resources into benefits for members.

2.6.3 Attraction and Retention

When a community has a pool of potential resources and supports the communication activities needed to convert these resources into member benefits, the community will be better able to attract and retain participants and thus increase in size (Butler 2001). As the community grows, more potential resources are available. This basic theoretical framework, incorporating resources, communication activities and sustainability, was developed and empirically tested in online Listserv communities by Butler (2001), as shown in Figure 2.

[Diagram: boxes labeled Resource Availability, Benefit Creation Process, and Member Attraction & Retention (Gain & Loss), with a link labeled +/-]

Figure 2. Resource-Based Model of Online Social Structures

2.7 Adapted Theoretical Model

Butler tested this original model in an online setting (a large number of Listserv communities) by collecting data over a three month period to measure “size, communication activities and membership change” (2001, p. 353). These represent resource availability, the benefit creation process and member attraction and retention, respectively. Butler further notes that this theory is “applicable to a wide variety of social structures” (Butler 2001, p. 352); thus, the model will be adapted to study the type of online social structure exemplified by OSS project communities. Butler uses the term “social structure” to refer to the Listserv communities in his study. Since the social structure at the focus of this study is the OSS project community, this research uses the term “community” instead.

One of the key differences between the online interest communities investigated by Butler and OSS projects is that OSS projects are engaged in creating a specific product – software. In Butler's (2001) study, the outcome was the change in size of the community as the benefit creation process (communication activities) varied. In this study, the outcome is sustaining the community through attraction and retention of members. Sustaining that community depends upon software success, as production of the software is the reason for the existence of the community. Production of the software requires the collective efforts of a community. This link between an actual product and the community that produces it is perhaps best articulated by OSS proponent Tim O'Reilly. In his words, the “community-development aspect of open source means that user communities, not the products themselves, may be the key determinants of a project's success” (O'Reilly 1999, p. 32).

2.7.1 Assessing OSS Software Success

A large body of literature exists which seeks to quantify success in software development (Davis 1989; DeLone and McLean 1992; Delone and McLean 2002; Seddon 1997). The most recognized review of software success literature indicates that success includes the quality of the software (sometimes termed system quality), information quality, use, user satisfaction, individual impact and organizational impact (DeLone and McLean 1992). Some constructs applicable to typical systems development are either difficult to translate to the OSS setting or are simply not applicable to OSS (Crowston et al. 2003).

A comparison of OSS and proprietary software development success measures was undertaken by Crowston et al. (2003). The findings of that study indicate that code quality (and a related construct, documentation quality) and user satisfaction (e.g., Hartwick 1994) are important to OSS development and are relatively easy to utilize as indicators of OSS success. Other constructs suggested by DeLone and McLean are also important assessments, but are very difficult to collect empirically. For example, use of a given piece of OSS software would be an excellent indicator of whether a given OSS project is successful, but actual use is very difficult to determine, due to three factors. First, OSS is freely downloadable and requires no registration or other notification to the developers that the software is being used. Second, OSS is often distributed through multiple channels (i.e., included in many Linux distributions and downloaded from multiple mirror websites); therefore, even the download numbers obtained from the OSS project website are not necessarily accurate indicators of total downloads. Third, a single download does not necessarily translate to one user: a single copy of an OSS program may be installed on any number of computers and, conversely, a downloaded program may never be utilized, since it may not meet the downloader's needs and be deleted. Despite these difficulties in obtaining absolute numbers, the relative number of downloads can still be a useful proxy for comparing different pieces of software under some circumstances (Crowston et al. 2003).

Beyond simply discussing whether existing software success constructs fit the unique context of OSS, Crowston et al. (2003) propose a number of alternative means of assessing OSS success. A few examples highlight some of the differences between OSS and traditional software success – especially the concept that both community and software success are indicators of OSS success. The number of developers – and a corollary construct, the number of active participants in mailing lists and forums – indicates whether an OSS project is successful in attracting and retaining a community of volunteers to write and test software. Similarly, the level of communication activities between these individuals indicates whether they are actively engaged in developing and maintaining the software. Finally, the ratios between open and resolved bugs and between feature requests and features added indicate whether software development is progressing. OSS success shares some similarity with proprietary software success, but, just as OSS development has unique features that set it apart from proprietary software development, not all traditional system success constructs fit the context of OSS.
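As an illustration only, the alternative indicators described above could be computed from project tracker and mailing list counts along the following lines. This is a minimal sketch, not the measurement procedure used in this study: the field names, the example counts, and the particular ratio definitions (resolved bugs as a share of all bugs, features added per feature request) are assumptions introduced here for clarity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectSnapshot:
    """Hypothetical counts for one OSS project over one observation period."""
    developers: int
    active_participants: int   # people posting to mailing lists and forums
    messages: int              # messages posted during the period
    bugs_open: int
    bugs_resolved: int
    feature_requests: int
    features_added: int


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def success_indicators(s: ProjectSnapshot) -> dict:
    """Compute illustrative community and software indicators (assumed definitions)."""
    return {
        # Community indicators: size of the volunteer base.
        "developers": s.developers,
        "active_participants": s.active_participants,
        # Communication activity, normalized by the number of participants.
        "messages_per_participant": ratio(s.messages, s.active_participants),
        # Software progress: share of known bugs that have been resolved.
        "bug_resolution_share": ratio(s.bugs_resolved, s.bugs_open + s.bugs_resolved),
        # Software progress: features added per feature requested.
        "feature_fulfillment": ratio(s.features_added, s.feature_requests),
    }


# Made-up example values, purely for demonstration.
example = ProjectSnapshot(developers=5, active_participants=40, messages=200,
                          bugs_open=10, bugs_resolved=30,
                          feature_requests=20, features_added=5)
indicators = success_indicators(example)
```

Returning None for an undefined ratio (for example, a project with no recorded feature requests) keeps such projects distinguishable from projects whose ratio is genuinely zero.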

In summary, success in OSS projects is multifaceted (Crowston et al. 2003) and is not only dependent on the success of the software, but also on the success of the project community (O'Reilly 1999). Given the definition of OSS projects used in this study – which includes the project community as an integral part of the project – it is vital to include the success of both the project community and the software as antecedents to sustaining the project.

2.7.2 Adapted Model

Since OSS projects are organized around creating software, this study includes the success of the software as an adaptation of Butler's original model of community sustainability. Further, since OSS projects need a viable community to carry out the activities which allow continued development of the software and user support, the success of the community is included as an additional indicator of OSS project success. Therefore, Butler’s original model is extended by examining not only how the community is sustained, but also how a community aids in the creation of the software by investigating the impact of communication activities on software and community success. This adapted model is presented in Figure 3 (see footnote 4).

[Diagram: boxes labeled Resource Availability, Benefit Creation Process, Product & Community Success, and Member Attraction & Retention (Gain & Loss)]

Figure 3. Theoretical Model.

4 In the adapted model the explicit feedback connection between member attraction and retention and higher resource availability is made implicit, as indicated by the dotted line, as this study is longitudinal. Membership attraction and retention, as the dependent variables in the model, measure the size of the community at time 2 and are an indicator of the level of member-provided resources available at that time.

This research makes an important contribution to the resource-based model of communities by applying this model to OSS development. By applying this model to a different form of electronic community, this research serves as additional confirmation of Butler's original findings. Further, by applying this model to OSS project communities, the basic model of sustaining a community is extended to include the production of a good or service, which will answer the research questions guiding this study. In the following chapter, formal hypotheses are developed by applying the theoretical framework to the specific context of OSS projects.

3. RESEARCH MODEL AND HYPOTHESES

3.1 Hypotheses

According to Butler's original model, a community must have a set of available resources which can be converted into benefits for members through benefit creation processes, which are based around communication activities. In this chapter formal hypotheses are developed based upon the underlying theories presented in the prior chapter.

3.1.1 Resources and Communication Activities

Depending on the type of community, the available resources take a number of forms. Typical resources include knowledge, time, energy, money and material resources (Rice 1982). For example, in a professional development group, the knowledge of more experienced members may be leveraged as a resource for newer members (Butler 2001). The resources available to a community facilitate the accomplishment of its collective goals. As in other communities, the contribution of individual resources is essential to achieving OSS project goals. Some especially important resources for OSS project communities are the time and effort that members devote to it and the knowledge and skills of the members about the software (Butler 2001; Lakhani et al. 2002).

When members are the primary source of community resources, the size of the community is an indicator of the amount of resources available (Butler 2001). As the number of members in an OSS project community increases, the total available knowledge within that community increases (Wittenbaum and Stasser 1996). When more members are available, the chances also increase that one will have the information needed to solve a specific problem (Butler 2001). Finally, more users improve quality control by testing the software and reporting bugs.
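The claim that a larger membership raises the chance that someone holds the needed information can be illustrated with a simple calculation. The sketch below rests on a strong simplifying assumption that is not part of the source argument: each member independently has the same probability of knowing the answer to a given problem. The function name and the example probability of 0.02 are hypothetical.

```python
def chance_someone_knows(members: int, p_individual: float) -> float:
    """Probability that at least one of `members` holds the needed information.

    Assumes (for illustration only) that each member independently knows the
    answer with the same probability `p_individual`.
    """
    if members < 0:
        raise ValueError("members must be non-negative")
    if not 0.0 <= p_individual <= 1.0:
        raise ValueError("p_individual must be between 0 and 1")
    # Complement of the event that no member knows the answer.
    return 1.0 - (1.0 - p_individual) ** members


# Hypothetical community sizes; the chance rises with size but with
# diminishing returns as it approaches 1.
sizes = [1, 10, 50, 200]
chances = [chance_someone_knows(n, 0.02) for n in sizes]
```

Under these assumptions the benefit of each additional member shrinks as the community grows, which is consistent with the caution raised earlier that size alone does not guarantee that resources are converted into benefits.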

A popular view of how the distributed debugging process works stems from OSS proponent and community leader Eric S. Raymond, who paints a picture of many skilled individuals reading the source code of OSS to find bugs and then fix them. He states that “a lot of users are hackers” (Raymond 1999, p. 6) and, “given enough eyeballs, all bugs are shallow” (Raymond 1999, p. 8). While there are undoubtedly some skilled users of OSS software who are capable of finding and fixing bugs by reading through source code, this is probably not the most important method of finding bugs. Instead, most bugs are exposed while using the program.

Due to the complexity of modern software, there are many thousands of paths through a given program and a small community of developers could not possibly test all possible permutations within that software, let alone the potential interactions with all other software that might be running concurrently (Weber 2004). Debugging such software requires utilization by a large number of users, which exposes a greater number of possible bugs in the software. As these users see bugs and limitations in the software, some will report the bugs and ask for new features. Therefore, the time and energy spent simply using the software together with members' knowledge and skills in software testing and development represent invaluable resources in OSS project communities, and are dependent on the number of members in the community.

The various resources provided by community members are brought to bear only when communication takes place between individuals (Butler 2001). In the previous example of a professional development group, the knowledge of more experienced members is of little value to the community unless it is communicated to newer members. In OSS projects, two primary types of communication activities are important. The first centers around providing help to software users and the second focuses on developing and improving the software5.

Support communication activities take place in OSS when project members help others with problems that arise during software use. Support communication activities are a “mundane but necessary task” within the OSS project community (von Krogh et al. 2003, p. 923). While much of the prestige in OSS projects seems to flow to those who write software code, many of the day-to-day activities in a project community involve assisting individuals having trouble with the software. These user-to-user support communication activities fill a valuable role in OSS projects, since, unlike proprietary software, OSS projects typically support their product for free (Lakhani and von Hippel 2003). In addition, support communication activities allow all community members to participate in sustaining the community. Not all project members have the skills to write software, but by providing help to others, these members benefit the community by serving as an online help desk. Another benefit of support communication activities is that these exchanges are often archived, creating a knowledge repository that functions as a help manual.

5 There are almost certainly other types of communication which take place in OSS project communities, and these may increase the cohesiveness of the community. These (off-topic) communications are often frowned upon in project mailing lists, and are quickly shut down, or may lead to flaming of the offending member (Markus et al. 2000). Since these communications seem to represent the exception rather than the rule in official project communication forums, and since many personal exchanges are probably carried out “off-list”, they fall outside the scope of this study.

In order to develop software, the OSS project community must discuss software development issues such as new functionality, project goals, software coding standards and technical implementation details. In discussing these issues, the project community may capture the knowledge of experienced members in the development process (Weber 2004). Development communication activities include filing formal bug reports and feature requests, coordinating activity among developers to be certain that those issues are addressed and writing code to be incorporated into the project (Raymond 1999). Writing code is often considered the “glamorous” portion of OSS software development, but other components of development work within the project community – such as finding bugs and coordinating efforts – are equally important in software development, since bugs can not be fixed until they are found (Weber 2004).

Studies show that in electronic settings, larger communities generate a greater quantity of communication, in part due to the potential of electronic systems to archive messages and buffer communications, thus reducing the logistical problems that occur in traditional communities (Butler 2001; Gallupe et al. 1992; Turoff 1991). One can expect that in OSS projects, a larger number of members will lead to an increased quantity of overall communication within the project. In OSS communities, increased communication activities indicate that members are helping other members who have support issues and are discussing issues relating to software development, thus:

H1a: Larger OSS communities will have more support communication activities than smaller OSS communities.

H1b: Larger OSS communities will have more development communication activities than smaller OSS communities.

3.1.2 Communication Activities and Success

When members of an OSS community support other users of the software and discuss development issues, it is expected that the OSS project will ultimately be more successful. As previously noted, the success of an OSS project is multi-faceted. First, the success of the software is a key part of OSS project success, as software development is the primary reason for the project's existence. Indicators of software success for OSS include objective measures like fixing bugs, adding new features and actual use of the software, as well as subjective measures such as the perceived utility of the software (Crowston et al. 2003). However, software success alone does not tell the full story. In order to maintain and further develop the software, a community is necessary – thus another element of OSS project success is success of the project community (O'Reilly 1999). For the OSS project community, success is indicated by the social capital created within the community (Fischer et al. 2002). Given the importance of both software and community success, both will be considered in this study. This section develops the proposed relationships between the two types of communication activities and the two components of success.

Support communication activities consist of OSS project members aiding other members with problems they encounter while using the software. These problems may be due to errors in the software or due to incomplete understanding on the part of the software user (Lakhani and von Hippel 2003). When problems arise due to bugs in the software, the issue must be brought to the attention of members of the project community who have the ability to fix the software code.

When an inexperienced user encounters an error, often they may not know if the problem was due to human or software error, much less how to file an official bug report. These inexperienced users may turn to the discussion forums (which are often more accessible to novices than bug databases) and report the problem. From there, they can receive help to resolve the issue, whether that help consists of correcting an error on the user's end or fixing the software. Many such posts to the OSS project forums contain such text as “Is anyone else experiencing this?” and, if necessary, sometimes include replies such as “Yep, it's a bug, I'll file a bug report” (Sourceforge 2005a). Through this somewhat indirect route, valuable information is delivered to those within the project who have the ability to fix errors in the software.

Development communication activities allow information to take a more direct route to the individuals who write code. Development communication activities include filing formal bug reports and requesting new features. The typical process for bug reporting begins with a report filed by a user. This report contains details as to how the bug occurs (to allow replication), specifics about the problem behavior and perhaps a proposed resolution or code to fix the error. After the report is filed, someone within the project with the responsibility for fixing bugs evaluates the bug to make certain it really is a software issue. They then can either assign the bug to the individual responsible for that section of the code, or leave it in the “open” category to allow others to self-select the task for resolution (Crowston et al. 2005b). After fixing the bug, the individual making the repair “closes” the bug, ending the bug-fixing process. Adding new features is similar, but perhaps less urgent, due to the relative amount of work.

36 As bugs are resolved and new features added, the software becomes more successful in several senses. First, as bugs are fixed, the software should become more stable (Mockus et al. 2002). Second, fixing bugs is generally accepted as leading to higher quality software, which is a potential indicator of OSS success (Crowston et al. 2003). Third, as new features are added to the software, the software presumably becomes more functional for users.

As bugs are fixed and features added, common practice is to increment the version number of the software and to assign a descriptive development status identifier such as “alpha”, “beta” and “stable/mature”. These stages of software development correspond with a product which may be considered very unstable, reasonably stable (but with a few remaining bugs) and very stable (i.e., production quality), respectively. The progression of software through the stages of development is proposed as one indicator of software success (Crowston et al. 2003). A final indicator of software success is the number of individuals using the software. It is difficult to determine the absolute number of users of OSS – as the software is freely available – but the number of downloads of the software is a market-based measure of the relative popularity of the software (Grewal et al. 2006). These software success indicators all hinge on continued product development, based upon communication activities among project members. As bugs are fixed and more features are added, the overall quality of the software increases (Schmidt and Porter 2001), thus:

H2a: Greater amounts of support communication activity lead to a more successful software product.

H2b: Greater amounts of development communication activity lead to a more successful software product.

Just as communication activities within the community lead to software success, it is expected that communication activities will lead to community success through increased social capital. Social capital is a source of credentials within communities, which entitles members to “credit, in the various senses of the word” (Bourdieu 1986, p. 249). Unlike other forms of capital based on individual assets, such as human capital or financial capital, social capital is created and derived from the relationships among people. For the purposes of this study, the definition of social capital proposed by Nahapiet & Ghoshal is used. Their definition is that social capital is “the sum of the actual and potential resources embedded within, available through and derived from the network of relationships possessed by an individual or social unit” (1998, p. 243). Communication activities within communities create the foundation for social capital by providing a means to perform work, socialize members and cement relationships within the community (Shaw 1981). Social capital is embedded in the interactions between members of a community, giving the community cohesiveness and thereby facilitating the pursuit of collective goals (Adler and Kwon 2002). As the number of communications (interactions) between members increases, the aggregate level of social capital in the community increases (Wasko and Faraj 2005). Social capital can take many forms, such as trust in others, a sense of belonging to the community and identification with the community and commitment to its goals (Nahapiet and Ghoshal 1998). Two forms of social capital which have been identified as important in online communities are trust and a sense of belonging (Jones and Harrison 1996; Ridings et al. 2002; Wasko and Faraj 2005). Because of trust and the sense of belonging to a community, individuals are willing to engage in pro-social behaviors which benefit the community (Glaeser et al. 2002).

Research shows trust to be a complex phenomenon exhibiting multiple dimensions. Trust has been studied in many fields, including sociology (Rotter 1980), economics (Williamson 1993), management (Gulati 1995) and social psychology (Deutsch 1958). Many definitions have been proposed across these fields and all share the commonality of describing trust as a “willingness to act based on a state of confidence about the other party's motives and the degree of risk involved” (Stoecklin-Serino 2005, p. 4). Trust is important in facilitating social exchange in communities (Nahapiet and Ghoshal 1998) and prior research has indicated that trust is related to success in online communities (Jarvenpaa et al. 1998). In OSS communities, trust in others enables individuals to believe that answers they receive to support questions are correct (or at least not intentionally misleading) and that the volunteer effort they spend in contributing information (bug reports, feature requests, code, and so on) to the community will be acted upon.

As individuals communicate with others in the community and their contributions are acknowledged by others, they develop a sense of belonging to the community (Jones and Harrison 1996). This sense of a shared social identity with other community members induces members to contribute to the collective goals of the community, even if it is not in their direct self-interest to do so (Constant et al. 1996; Jones and Harrison 1996). A sense of belonging to the community is another essential form of social capital for sustaining communities, as it leads volunteer members to continue to contribute to the community.

The communication activities created when individuals give back to OSS project communities by discussing support and development issues enable social capital to form within the community. Social capital in the forms of trust and a sense of belonging to the community is embedded in these interactions. As members communicate about support and development, aggregate levels of social capital increase; thus, it is predicted that:

H3a: Greater amounts of support communication activity lead to more social capital within the community.

H3b: Greater amounts of development communication activity lead to more social capital within the community.

3.1.3 Success and Sustainability

In order for a community to be sustained, it is generally necessary to retain existing members and attract new members (Butler 2001). Both are necessary to sustaining the community since all members may not be equally capable. Older members of the community understand community norms. Newer members may need time to become acclimated to these norms and learn needed skills. Older members may engage in groupthink while new members can bring in fresh ideas and skill sets (Crowston and Howison 2006). Therefore, in order to sustain a community, the involvement of individuals is necessary in terms of both retention and attraction. This section proposes relationships between both types of success (software success and social capital/community success) and the OSS project community's ability to sustain itself by retaining and attracting members.

Software success impacts the retention of OSS project members. If members do not perceive the software as valuable, they will look elsewhere for software to fulfill their needs6. Conversely, as members of the OSS project community see that their efforts are leading to improvements in the software, they will remain involved in the project and continue to exert themselves in finding and fixing bugs and asking for and adding new features (Fang and Neufeld 2006).

Just as success of the software influences individuals to stay with a project, the success of the project community helps retain members. Since social capital does not readily transfer across communities, if an individual leaves, that investment in social capital is lost (Nahapiet and Ghoshal 1998). As long as members' needs from the community are met, they have no need to look elsewhere for the value the community provides. For example, if support groups are supportive, members will continue attending meetings (Butler 2001). As trust and a sense of belonging increase within a community, members will be more inclined to remain active within the community and volunteer their efforts to ensure its continued success (Kanter 1968). As individuals trust each other more and feel a greater sense of belonging, one would expect that they will want to remain a part of the community; therefore:

6 As anecdotal evidence of this phenomenon, one need look no further than the thousands of “dead” OSS projects in existence. Many of the over 100,000 projects on Sourceforge are in pre-alpha or alpha stages, and have remained that way for years (Sourceforge 2005b). In many of these cases, an individual had a great idea, and released an early version of the software, but was unable to attract a community to sustain it.

H4a: A community with a more successful software product will be able to retain more existing members.

H4b: Communities higher in social capital will be able to retain more existing members.

Retaining existing members is important to sustaining a community, but at some point members may no longer have need of the community and will leave. As members leave, available resources, communication activities and social capital decrease. First, since members are a primary source of resources, membership loss leads to a decrement in the resources available to the community (Butler 2001). Second, as membership decreases, the total number of potential communication activities, which convert resources into benefits for the project community, also decreases7. Third, since social capital is embedded in the interactions between members and fewer members means fewer interactions, membership loss decreases total social capital. To avoid these losses, a community must attract new members.
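As a rough illustration of the second point, the sketch below applies the n*(n-1) count of possible interactions given in footnote 7 to a community that loses members. The function name and the community sizes are hypothetical and are not drawn from the study data.

```python
def possible_interactions(n: int) -> int:
    """Possible member-to-member interactions, using the n*(n-1) count."""
    if n < 0:
        raise ValueError("membership size cannot be negative")
    return n * (n - 1)


# Hypothetical community that shrinks from 50 to 40 members.
before = possible_interactions(50)
after = possible_interactions(40)

# Fraction of possible interactions lost when 20% of members leave.
share_lost = (before - after) / before
print(before, after, round(share_lost, 3))
```

Under this count, the loss of one fifth of the members removes more than one third of the possible interactions, which is why membership loss may be more costly than the raw headcount suggests.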

Just as success of the software leads to the retention of existing members, one can expect software success to increase the attraction of new members to the OSS project community. The perceptions of users relating to the utility and value of the software are evangelized to other individuals outside the community (Weber 2004). As a product is more successful, its utility increases and more individuals will be drawn to use that product8.

Attracting individuals to download and use the software is not sufficient to sustain the community. Many individuals may download the software, but unless they contribute to the community, they remain free riders. Among the set of all users of the software, which includes free riders, are individuals who are motivated to contribute by communicating with others. While all who download and use the software will not donate their efforts to the community, only a certain number is needed to support and develop the software. A large pool of free riding users is needed to reach that comparatively small fraction of individuals who will volunteer their efforts and become contributing project members (Weber 2004).

7 The number of possible interactions within a community changes significantly with the number of members. Specifically, the number of possible interactions = n*(n-1), where n is the number of members.

8 In a practical example of this phenomenon, the OpenOffice.org mailing list membership has increased from 12,000 to 34,000 in approximately 2.5 years. Over the same time period, the number of downloads of the software increased from 14 million to over 53 million; a similar phenomenon was observed in the Mozilla project, among others (Mockus et al. 2002; OpenOffice.org 2006).

Not only is software success important, but community success is important to attract new members (O'Reilly 1999). An OSS project that includes useful software – but no active project community9 – will not likely be able to attract new members. The social capital created within the OSS project community signals outsiders that the project has a viable community. A viable project community attracts individuals to participate, since individuals want to be part of something larger than themselves – even in software development (Content Team 2006). When outsiders see the OSS project's success, they will be attracted to use the software and possibly join the community. Thus, it is proposed that:

H5a: A community with a more successful software product will be able to attract more new members.

H5b: Communities higher in social capital will be able to attract more new members.

To summarize, the goal of this research is to further our understanding of how OSS project communities are sustained and how a viable community adds value to the software. Prior research suggests that members of the community provide valuable potential resources that can be converted into community benefits through communication activities (Butler 2001). A primary contribution of this research is to apply this model to OSS development. This includes integrating both software and community success as key constructs in predicting the attraction and retention of community members. The hypothesized path model, derived by applying the above hypotheses to the theoretical model, is presented in Figure 4. This model indicates the proposed relationships between resources, communication activities, software and community success and the ability to attract and retain members. The hypotheses are summarized in Table 2. The next chapter describes the methodology and measures used to test these hypotheses.

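9 Projects with no active project community are not the only ones unable to attract new members; projects whose membership is too exclusive face the same problem. There have been cases in the OSS movement of project communities that refused to accept bug reports or code and were perceived as uncaring and aloof. In one case, the developers realized that this, in their words, “screw(ed) up a free software project” (Cox 1998).

[Figure 4 path diagram. Stages and constructs: Community Resources (Resources); Benefit Creation Process (Support Activity, Development Activity); Software & Community Success (Software Product, Social Capital); Member Gain & Loss (Attraction, Retention). Paths: H1a and H1b from Resources to Support Activity and Development Activity; H2a and H2b from the two activities to Software Product; H3a and H3b from the two activities to Social Capital; H4a and H4b from Software Product and Social Capital to Retention; H5a and H5b from Software Product and Social Capital to Attraction.]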

Figure 4. Hypothesized Path Model

Note: Just as in the theoretical model (Figure 3), the feedback loop is implicit. Resources at Time 2 are indicated by membership size at that point, and are dependent on longitudinal member attraction and retention.

Table 2. Summary of Hypotheses

Hypothesis   Effect Type   Relationship Between Constructs (Direction of Relationship)
H1a          Direct        Resources – Support Activity ( + )
H1b          Direct        Resources – Development Activity ( + )
H2a          Direct        Support Activity – Software Product Success ( + )
H2b          Direct        Development Activity – Software Product Success ( + )
H3a          Direct        Support Activity – Social Capital ( + )
H3b          Direct        Development Activity – Social Capital ( + )
H4a          Direct        Software Product Success – Retention ( + )
H4b          Direct        Social Capital – Retention ( + )
H5a          Direct        Software Product Success – Attraction ( + )
H5b          Direct        Social Capital – Attraction ( + )

4. METHODOLOGY

4.1 Sample and Procedures

As previously defined, OSS projects consist of the project community, free riders, software, documentation and a project website. This website serves many purposes for the project. It is the official software repository, provides a virtual meeting place for active participants and is the home of documentation for the software. Thus, the website supports the communication processes which convert resources into a valuable product and which sustain OSS development. In short, the website is the focal point and meeting area of OSS projects.

Many of the large “poster children” OSS projects, such as Linux, Apache, Mozilla, OpenOffice.org, KDE, MySQL and GNOME maintain their own websites. The labor required to maintain a large website and the funds required to pay for hosting and bandwidth mean that large dedicated sites may be out of reach for many of the hundreds of thousands of smaller projects that exist within the OSS movement. Since OSS is written primarily by volunteers who “meet” mainly via the Internet, a website is vital to the creation of the OSS community and software. To meet this need, a number of dedicated hosting sites have been formed to help facilitate OSS projects. Two such sites are the GNU project's Savannah and Sourceforge. Typically these sites – which host thousands of OSS projects – provide web space, tools for managing the content of websites and tools such as CVS10 (Concurrent Versions System) for managing source code. Both sites provide these tools at no charge to the OSS projects.

Sourceforge is currently the largest OSS hosting site. As of February, 2007, more than 140,000 projects were registered (Sourceforge 2005b). The projects hosted on Sourceforge cover a wide range of software types, from small utilities to large ERP (Enterprise Resource Planning) systems, from operating systems to music and graphics software and from peer-to-peer file-sharing programs to games. A typical Sourceforge project page is shown in Figure 5, illustrating the common tools and attributes shared by all Sourceforge OSS projects.

10 CVS and similar systems allow developers to check out code modules (much like a library book; that is, they obtain a complete copy of the source code), make changes to that code, and then check the code back in. Version control systems keep track of the changes made to the files in the source code tree, and aid in resolving differences in changes made by different developers.

Figure 5. Typical Sourceforge Project Page

Sourceforge maintains statistics on all projects which are useful for comparing projects. These statistics were utilized in the selection of projects for this study. An initial set of projects was chosen in early 2004 on the basis of communication levels. Communication levels were initially assessed by examining projects from the “Top Forum Posts Count” listing maintained by Sourceforge. This criterion ensured that all projects included in the sample had active participants who communicated with each other. This “Top Forum Posts Count” list included 100 projects. These projects were reviewed and 44 projects were selected on the basis of the presence of an “Open Discussion” forum and no utilization of outside mailing lists, forums, or other venues for communication. Additionally, each project had a minimum of 10 messages posted to the Open Discussion forum during a six-week period in early 2004. This was done to ensure that active communication took place, to ensure that communication was not occurring across multiple venues and to allow for capture of all communication during that time period, standardizing the data collection.
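The selection criteria described above can be expressed as a simple filter. The following is a minimal sketch only: the `Project` record, its field names and the example projects are hypothetical, and the actual screening was performed by reviewing the Sourceforge listing rather than by running code of this kind.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    has_open_discussion_forum: bool
    uses_outside_venues: bool   # outside mailing lists, forums, or other venues
    open_discussion_posts: int  # posts during the six-week screening period


def select_projects(candidates, min_posts=10):
    """Keep projects that satisfy all three screening criteria."""
    return [
        p for p in candidates
        if p.has_open_discussion_forum
        and not p.uses_outside_venues
        and p.open_discussion_posts >= min_posts
    ]


# Hypothetical candidates taken from a "Top Forum Posts Count" style listing.
candidates = [
    Project("alpha-tool", True, False, 42),
    Project("beta-suite", True, True, 310),   # excluded: outside mailing list
    Project("gamma-lib", False, False, 75),   # excluded: no Open Discussion forum
    Project("delta-app", True, False, 6),     # excluded: fewer than 10 posts
]
selected = select_projects(candidates)
print([p.name for p in selected])
```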

While selecting only projects which had active project communities biases the sample towards active projects, active communication forums are the focus of this study. Further, while communication levels were relatively high across all projects in the sample during the initial selection, the projects varied in many other dimensions. For example, the project community size varied from 6 to 271 members. Projects showed a wide diversity of types, from a small utility that allowed users to install larger hard drives in a certain digital video recorder (DVR) system to a full-featured enterprise resource planning (ERP) system. The size of the software varied from a few kilobytes for small utilities to hundreds of megabytes for the largest systems. Projects also varied in terms of the stage of completion, from alpha to beta to mature. Including a wide variety of projects with different characteristics helps to ensure greater generalizability of the study results.

The data for this study were collected longitudinally. The specific data collection periods were selected to match the relationships between the constructs. The initial size of the group had to be determined prior to measuring communication activity levels, which in turn preceded community and software success, while attraction and retention were measured last. The time line for data collection and the actual measures used are detailed in the following section.

4.2 Measures

4.2.1 Resources

The amount of resources available to the project was assessed by determining the number of active members in the OSS project community at the beginning of the study (Butler 2001). All messages posted between September 15th and October 31st, 2003 were collected. The header information of each message was analyzed to identify the username of the individual posting. In most of the forums, anonymous posting was allowed under the username of “nobody”. These individuals were dropped, as their true identities are impossible to determine. The alternatives of counting each “nobody” as a unique user, or all “nobodies” as a single user, were discarded, as these would likely overcount and undercount the true number of users, respectively. This approach of dropping “nobody” is documented in the literature as having minimal effects on measurement results (Crowston and Howison 2005). Some projects disallowed anonymous posting – in this case, no usernames were discarded. Thus, the initial size of the community was measured as the count of unique usernames within the project.
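The community size measure can be sketched as a count of distinct usernames after anonymous posts are dropped. This is an illustrative sketch, not the procedure actually used for data collection; the function name, the exact-match comparison on “nobody” and the example usernames are assumptions.

```python
def community_size(usernames, anonymous_name="nobody"):
    """Count unique posters, dropping posts made under the anonymous username."""
    return len({name for name in usernames if name != anonymous_name})


# Hypothetical usernames extracted from message headers, one per post.
posters = ["alice", "bob", "nobody", "alice", "carol", "nobody"]
print(community_size(posters))
```

In this example the two anonymous posts are ignored and the repeated poster is counted once, giving a community size of three.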

4.2.2 Support Communication Activities

Support communication activities can be assessed in terms of both the quantity of communication and number of different discussion topics. The overall communications volume indicates how often individuals communicate. The amount of topic variation – how many different topics are discussed – gives a relative measure of whether the same issues are discussed at length or whether many different issues are discussed.

Overall support communication activities were assessed by counting the number of posts to the “Open Discussion” forum during the six-week period from November 15th through December 31st, 2003. This time period was chosen to separate the time of resource availability assessment from the communication activities that turn the resources into benefits for the project. All message headers posted during this time period were collected, including the entire thread as long as the seed message was posted during the study period. These headers included the topic of the post, the username of the poster and the date the message was posted. Message threading was preserved. A screenshot of a typical discussion thread, illustrating the information available, is shown in Figure 6.
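One way to read the collection rule is sketched below. It assumes that a thread is counted in full when its seed message falls inside the study period, and that for any other thread only the messages dated inside the period are counted; the second half of that rule is an interpretation, not something the text states. The thread representation (a list of posting dates with the seed first) and the example dates are hypothetical.

```python
from datetime import date

STUDY_START = date(2003, 11, 15)
STUDY_END = date(2003, 12, 31)


def in_study_period(posted: date) -> bool:
    return STUDY_START <= posted <= STUDY_END


def count_support_posts(threads):
    """Count posts; each thread is a list of posting dates, seed message first."""
    total = 0
    for dates in threads:
        if not dates:
            continue
        if in_study_period(dates[0]):
            # Seed posted in the study period: keep the entire thread.
            total += len(dates)
        else:
            # Otherwise keep only messages dated inside the study period.
            total += sum(1 for d in dates if in_study_period(d))
    return total


# Hypothetical threads.
threads = [
    [date(2003, 11, 20), date(2003, 12, 1), date(2004, 1, 5)],  # whole thread kept
    [date(2003, 11, 1), date(2003, 11, 16)],                    # one reply kept
    [date(2004, 1, 2)],                                         # nothing kept
]
print(count_support_posts(threads))
```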

Figure 6. Typical Discussion Thread.

In addition to the total number of messages posted in the forums, the diversity of topics within the forum benefits the community. Increased topic diversity indicates that members of the community are surfacing more issues, meaning that more potential issues can be addressed (Butler 2001). To measure the diversity of topics discussed in the forum, an established measure of message dispersion within discussion threads was used (Butler 2001). This measure is based upon the Herfindahl-Hirschman (HH) index, which is typically utilized in economics to determine the concentration of firms within an industry and by the U.S. Department of Justice to determine whether a proposed merger will lead to a monopolistic situation (i.e., highly concentrated around a single firm) (U.S. Department Of Justice 2007). A slightly adapted version of the HH index may also be used as a measure of the concentration of messages within a thread and as an indicator of the number of threads (Davies 1979). Topic variation was calculated by reversing the HH index (1-HHI) as shown in Figure 7, where Si is the proportion of the total messages [0...1] which are part of discussion thread i and MsgCount is the total number of messages posted during that time period.

Figure 7. Herfindahl-Hirschman Index

The topic variation measure varies between 0 and 1, with lower numbers indicating low topic variation, or in other words, that only a few topics are being discussed, while higher numbers mean that many topics are discussed. When more topics are discussed the potential to benefit many subsets of the community membership increases (Butler 2001). The topic variation measure was scaled by a factor of 10,000 to bring the values into the range of the other measures used in this study.
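The calculation can be sketched as follows. This sketch assumes the standard form of the index, the sum of squared shares, with each share computed as a thread's message count divided by MsgCount; the slightly adapted form shown in Figure 7 may differ in detail. It also assumes that scaling by a factor of 10,000 means multiplication. The function name and the example thread sizes are hypothetical.

```python
def topic_variation(thread_sizes, scale=10_000):
    """Reversed Herfindahl-Hirschman index (1 - HHI) over discussion threads.

    thread_sizes holds the number of messages in each thread.
    """
    msg_count = sum(thread_sizes)
    if msg_count == 0:
        return 0.0
    # Share of all messages that falls in each thread, then sum of squares.
    hhi = sum((size / msg_count) ** 2 for size in thread_sizes)
    return (1.0 - hhi) * scale


# All messages in a single thread: no topic variation.
print(topic_variation([10]))

# Messages spread evenly over four threads: higher topic variation.
print(topic_variation([5, 5, 5, 5]))
```

With a single thread the index is 1 and the reversed measure is 0; with four equally sized threads the index is 0.25 and the reversed measure is 0.75 before scaling.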

4.2.3 Development Communication Activities

Activity in the bug report and feature request databases was utilized to analyze development communication activities among members (Crowston et al. 2003). The bug report and feature request systems represent developmental communication, rather than support for software users. The number of bugs reported and the number of new features requested indicate the level of communication within the community about development issues. All bug reports and feature requests were collected for the same period as for support communication activities, that is, November 15th to December 31st, 2003. This ensured that the same individuals were measured as contributing to both support and development communication activities, while keeping the measurement of these activities independent from the initial assessment of resource availability. For each bug report and feature request, the Sourceforge-assigned identification number, the date the report was filed, the identity of the individual submitting the report and the subject line of the report were collected. The number of bug reports and feature requests was summed to form the measure of development communication activity. A typical project's bugs are shown in Figure 8; feature requests are similar.

Figure 8. Typical Bug Reports.

4.2.4 Software Success

To measure software success, two objective and two subjective measures were used. The first objective measure is a ratio calculated as the number of closed bug reports and feature requests divided by the total number of bug reports and feature requests (Crowston et al. 2003). To determine the ratio of closed to total bugs, the status of each bug report and feature request (those previously collected to measure development communication activity from November 15, 2003 to December 31st, 2003) was determined as of February 18th, 2004. This date was selected to separate success measures from the communication activity that led to the success and corresponds to a survey – described below – which measured user perceptions of software success.

The second objective measure of software success is the number of individuals using the software. While determining the exact number of individuals using a given piece of software – especially OSS, which typically has no compulsory licensing schemes – is very difficult, the number of times the software is downloaded may be utilized as a proxy for the number of users. While not an absolute count of users, it may be used to determine a relative number of users compared to other pieces of software (Crowston et al. 2006a). The number of downloads during January 2004 was chosen to allow correlation of downloads with subjective measures of software success assessed via survey.

Data for the subjective measures of software success was collected via a survey of individuals posting to the project's “Open Discussion” forum. Six weeks' worth of message headers were collected from January 1, 2004 to February 18, 2004, again including the topic, date and identity of the poster. These dates were chosen to separate the measurement of software success from the preceding communication activity. Each individual (excluding, of course, “nobody”) who posted a message to this forum was invited via an email containing a personalized link to take part in a web-based survey. To provide incentive to participate in the study, each respondent earned a one dollar (US) donation to his or her OSS project and the three projects with the highest response rates were given an additional donation of one hundred dollars (US), via the Sourceforge donation system.

The first subjective measure of software success was obtained by asking project members to rate software performance and utility. The questions used for this measure were taken from published scales and appear in Table 3 (Hartwick and Barki 1994). All items were assessed via a 7-point Likert scale. Reverse-coded items were recoded prior to analysis.

Table 3. Items Measuring Subjective Performance

Indicate your feelings concerning the software developed by this project. I consider the system to be...

● good .... bad

● terrible .... terrific

● useful .... useless

● worthless .... valuable

A second subjective measure of software success is indicated by the development stage of the software[11] (Crowston et al. 2003). Sourceforge tracks development status on a 7-point scale – planning, pre-alpha, alpha, beta, production/stable, mature and inactive. A move from alpha to beta or beta to mature status indicates that the developers feel that the software is progressing in its development. The change in development status was measured from the initial date of the determination of the community size (September 15th, 2003) to the closing date of the survey (February 18th, 2004); this interval was selected to allow sufficient time for enough development to take place to justify a status change.

4.2.5 Community Success

Two measures of social capital were selected that have previously been shown to be important in the development of online communities. The first measure, trust, is a complex construct that has been defined and studied along many dimensions, at multiple levels of analysis, using many different methodologies. For this study, trust in the ability of others was selected as the most theoretically appropriate measure. Members' perceptions of trust in the ability of others were measured via survey using items taken from published scales in information systems research as listed in Table 4 (Jarvenpaa et al. 1998). The second measure of social capital, members' sense of belonging to the community, was also measured via survey using items from Jones et al. (1996) as listed in Table 5.

[11] While it may be argued that changes in developmental stage of the software can be objectively determined, in the case of Sourceforge, this status is determined by the project administrator, making it subjective.

Table 4. Items Measuring Trust in the Ability of Others

● I feel very confident about the project members' skills

● Project members have much knowledge about the work that needs to be done

● Project members have specialized capabilities that can increase the project's performance

● Project members are very capable of performing their tasks

Table 5. Items Measuring Sense of Belonging

● I feel a sense of belonging to this project

● I feel that I am a member of this project

● I see myself as part of this project

4.2.6 Attraction and Retention

The dependent measures in this study are the attraction of new members and the retention of existing members. The sum of these two sets of members represents the resources available to the community at time 2 and is conceptually the same as the feedback loop from success to resources in the theoretical model. To assess attraction and retention longitudinally, the messages posted during the six-week period from June 1st to July 15th, 2004 were collected. This time period was selected to show true retention and attraction effects. Nine months had passed since the initial size determination, which allowed time for those volunteers not committed to the project to leave and new members to become truly involved with the project. Just as with previous data collections, individual usernames were again extracted and this list of users was compared with the list of users from the initial six-week period. Attraction was measured as the count of new individuals who were not active during the first six weeks (September 15th to October 31st, 2003), but who were active in the community during the second six-week data collection period (June 1st to July 15th, 2004). Retention was measured as the count of members active during the first six weeks and still active during the second six-week period.
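The comparison of the two username lists described above amounts to two set operations. The sketch below is illustrative only: the function name and the usernames are hypothetical, and any preprocessing of usernames performed in the study is not modeled.

```python
def attraction_and_retention(period1_users, period2_users):
    """Count members attracted and retained between two collection periods.

    Both arguments are iterables of usernames, one entry per posted message.
    """
    first = set(period1_users)
    second = set(period2_users)
    attraction = len(second - first)  # active only in the second period
    retention = len(first & second)   # active in both periods
    return attraction, retention


# Hypothetical posters for one project
sept_oct_2003 = ["alice", "bob", "carol", "bob"]
jun_jul_2004 = ["bob", "dave", "erin", "dave"]
print(attraction_and_retention(sept_oct_2003, jun_jul_2004))
```

Because sets discard duplicates, a member who posts many messages in a period is still counted once.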

The full hypothesized model showing all paths and constructs is shown in Figure 9, and the time line for collection of data is shown in Figure 10.

Figure 9. Full Hypothesized Model. [Path diagram: Resources (Size) lead to the Benefit Creation Process (Support Activity: Topic Variation, Total Messages; Development Activity: Bug Reports, Feature Requests) via H1a and H1b; these lead to Software & Community Success (Product Success: Downloads, Subjective Performance; Social Capital: Trust In Ability, Sense Of Belonging) via H2a, H2b, H3a and H3b; these in turn lead to Member Gain & Loss (Attraction, Retention) via H4a, H4b, H5a and H5b.]

Figure 10. Data Collection Time Line. [Timeline: Initial Resources (Size), Sept 15-Oct 31, 2003; Support & Development Communication Activity (Bug & Feature Requests), Nov 15-Dec 31, 2003; Software Success (Downloads), Jan 2004; Software Success (Subjective Performance), Jan 1-Feb 18, 2004; Attraction & Retention, Jun 1-Jul 15, 2004.]

5. ANALYSIS AND RESULTS

This chapter describes the results of data analysis. The preliminary analysis of responses from the survey is presented in the first section. The analysis using partial least squares (PLS) is presented in the second section, followed by the results of hypothesis testing in the third section. The fourth and final section of the chapter presents the results of an exploratory analysis conducted to allow theory building.

5.1 Preliminary Analyses

A total of 6,227 messages were posted to the forums of the 44 projects during the six-week period from January 1st to February 18th, 2004. These messages were posted by 1,724 unique individuals. Each individual was sent a personalized email invitation to take part in the web-based survey; 122 of the initial emails were returned as undeliverable, for a total of 1,602 potential respondents. A total of 355 usable responses were returned, giving an overall response rate of 22%. Five projects were later dropped from the analysis, because they had moved away from Sourceforge or had abandoned the forums by the time of the data collection for attraction and retention (June 1st to July 15th, 2004), leaving 39 projects for analysis. These projects averaged 45 members each (initial size, measured September 15th through October 31st, 2003). An average of 158 messages per project were posted during the time period from November 15th to December 31st, giving an average of 3.5 messages posted per project member during that period. The survey collected an average of 8 responses per project. The average project response rate for the survey was 18%.
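The response figures above follow from simple arithmetic, reproduced here as a check. The variable names are illustrative; the input values are those reported in the paragraph.

```python
unique_posters = 1724      # individuals who posted during the survey window
undeliverable = 122        # invitations returned as undeliverable
usable_responses = 355

potential_respondents = unique_posters - undeliverable
response_rate = usable_responses / potential_respondents

mean_project_size = 45     # average members per project
mean_messages = 158        # average messages per project, Nov 15 to Dec 31
messages_per_member = mean_messages / mean_project_size

print(potential_respondents)
print(f"{response_rate:.1%}")
print(round(messages_per_member, 1))
```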

To test for response bias, demographic variables were collected from respondents. These were then compared with published accounts which surveyed only developers in other OSS projects. In this sample, the mean age was 37.6 years – somewhat older than the mean of 30 years found by Lakhani et al. (2002). The respondents in this sample were 96% male, slightly less than the 98 to 99% found by Lakhani et al. (2002) and Ghosh et al. (2002). The respondents to this survey lived in the following areas (numbers from Lakhani et al. (2002) in parentheses): Americas 47.9% (46.9%), Europe 39.3% (42.4%), Africa, Asia and Pacific 10% (10.7%). A Chi-square analysis between the two surveys suggests no significant response bias compared to findings of other research (χ2 = 0.286, df = 2).

5.2 PLS Results

Since the hypothesized model is at the project level of analysis, it was first necessary to aggregate the survey data by project, using the means of individual responses. Before performing such an aggregation, the intraclass correlations (ICC) must be evaluated. The ICC value indicates whether members of a given community responded similarly about the variable and whether members from other communities responded differently, meaning that there is a unique influence due to the community. If the ICC is zero, individuals within a project are no more alike than individuals not associated with that project; if the ICC value is one, individuals within a project responded identically. Table 6 provides the ICC values and descriptive statistics for all variables used in this study. The range of ICC values was 0.74-0.97, indicating a high level of agreement among individuals within projects.

Table 6. ICC Values and Descriptive Statistics

| Construct | Mean | SD | Range | Skewness | Kurtosis | ICC |
|---|---|---|---|---|---|---|
| Size | 45.33 | 70.37 | 2-399 | 3.70 | 16.84 | N/A |
| Topic variation | 9,957.03 | 96.70 | 9,444.44-9,999.85 | -4.32 | 21.50 | N/A |
| Number of messages, 11/03 | 157.87 | 162.53 | 6-663 | 1.50 | 1.64 | N/A |
| Bug reports | 26.82 | 57.93 | 0-346 | 4.71 | 25.33 | N/A |
| Feature requests | 8.79 | 22.57 | 0-137 | 5.12 | 28.92 | N/A |
| Downloads | 30,364.13 | 70,138.16 | 28-370,574 | 3.79 | 15.51 | N/A |
| Resolved vs. total issues | 0.36 | 0.32 | 0.00-0.95 | 0.31 | -1.22 | N/A |
| Development status | 4.49 | 0.78 | 2.00-5.00 | -1.52 | 1.69 | N/A |
| Subjective performance | 6.18 | 0.57 | 4.58-7 | -0.59 | 0.27 | 0.74 |
| Trust in ability of others | 4.05 | 0.47 | 2.58-4.75 | -1.28 | 1.97 | 0.79 |
| Sense of belonging | 3.09 | 0.60 | 1.92-4.14 | -0.16 | -0.66 | 0.97 |
| Attraction (Individuals) | 41.28 | 57.11 | 0-217 | 2.13 | 3.85 | N/A |
| Retention (Individuals) | 14.41 | 20.11 | 0-99 | 3.03 | 10.11 | N/A |
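The text does not state which form of the ICC was computed. As a hedged illustration only, the sketch below derives two common forms from a one-way analysis of variance: ICC(1), the similarity of individual responses within a project, and ICC(2), the reliability of the project means. The function name and the response data are hypothetical, and the choice of these two forms is an assumption rather than the procedure used in the study.

```python
from statistics import mean


def icc_one_way(groups):
    """Return (ICC1, ICC2) from a one-way ANOVA over groups of scores."""
    k_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k_groups - 1)
    ms_within = ss_within / (n_total - k_groups)
    # Effective group size, which allows for unequal numbers of respondents
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k_groups - 1)
    icc1 = (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2


# Hypothetical 7-point responses from three projects
responses = [[6, 7, 6], [3, 4, 3], [5, 5, 6]]
icc1, icc2 = icc_one_way(responses)
print(round(icc1, 3), round(icc2, 3))
```

Values near one indicate that most of the variance lies between projects rather than within them, which is the condition that justifies aggregating individual responses to project means.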

Partial least squares (PLS) structural equation modeling was chosen to test the hypotheses. PLS is a structural equation modeling technique that simultaneously assesses the reliability and validity of the measures of theoretical constructs and estimates the relationships among these constructs (Wold 1982). PLS was chosen due to its robustness to small sample sizes and to non-normal distributions such as those found in Table 6 (Chin and Todd 1995). PLS can be used to analyze measurement and structural models with multi-item constructs, including direct, indirect and interaction effects and is widely used in IS research (Ahuja et al. 2003; Chin and Todd 1995; Keil et al. 2000). PLS requires a sample size consisting of 10 times the number of predictors, using either the indicators of the most complex formative construct or the largest number of antecedent constructs leading to an endogenous construct, whichever is greater (Wold 1982). This limitation meant that only two predictors of software success could be included. An initial PLS model which included all four software success measures showed that the number of downloads was a better objective predictor of software success than the ratio of closed to total bugs and feature requests. Similarly, subjective assessment of performance, measured via survey, was better than development status as a subjective measure of success; thus, only downloads and subjective performance were included as software success measures in the structural model.

Although the measurement and structural parameters are estimated together, a PLS model is analyzed and interpreted in two stages: first, the reliability and validity of the measurement model are assessed, followed by the assessment of the structural model. In terms of the measurement model, the first step in PLS analysis is to assess the convergent validity of the constructs of interest by examining the average variance extracted (AVE). The AVE measures the amount of variance that a latent variable captures from its indicators relative to the amount due to measurement error. Acceptable AVE values should exceed the generally recognized 0.50 cut-off, indicating that the majority of the variance is accounted for by the construct. All AVE values were above this cut-off, indicating adequate convergent validity. In addition, the individual survey items that make up a theoretical construct must be assessed for inter-item reliability. In PLS, the internal consistency of a given block of indicators can be calculated using the composite reliability (ICR) (Werts et al. 1973). Acceptable values of an ICR should exceed 0.7 (Fornell and Larcker 1981) and are interpreted like Cronbach’s alpha coefficient. All ICR values were above the 0.7 cutoff, indicating reliability in the measurement model. Table 7 summarizes these results.

Table 7. Average Variance Extracted & Composite Reliabilities

| Construct | # of Items | AVE | ICR |
|---|---|---|---|
| Size | 1 | N/A | N/A |
| Topic variation | 1 | N/A | N/A |
| Number of messages, 11/03 | 1 | N/A | N/A |
| Bug reports | 1 | N/A | N/A |
| Feature requests | 1 | N/A | N/A |
| Downloads | 1 | N/A | N/A |
| Subjective performance | 4 | 0.621 | 0.81 |
| Trust in ability of others | 4 | 0.782 | 0.94 |
| Sense of belonging | 3 | 0.831 | 0.95 |
| Attraction | 1 | N/A | N/A |
| Retention | 1 | N/A | N/A |
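For readers unfamiliar with these two statistics, the sketch below shows the standard formulas for the AVE and the composite reliability, assuming standardized loadings so that each indicator's error variance is one minus its squared loading. The function names are illustrative. The loadings are the sense of belonging loadings as printed (rounded to two decimals) in Table 9, so the results may differ slightly from the values in Table 7, which were presumably computed from unrounded PLS output.

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings.

    Assumption: error variance of each indicator is 1 - loading**2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Sense of belonging loadings as printed in Table 9
sense_of_belonging = [0.83, 0.94, 0.96]
print(round(ave(sense_of_belonging), 3))
print(round(composite_reliability(sense_of_belonging), 3))
```

Both statistics rise as the loadings approach one, which is why constructs whose items load strongly clear the 0.50 and 0.7 cut-offs comfortably.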

Discriminant validity indicates the extent to which a given construct is different from other constructs. The measures of each construct should be distinct and all indicators should load on the appropriate construct. One criterion for adequate discriminant validity is that the construct should share more variance with its measures than with the measures of other constructs in the model. To evaluate discriminant validity, the AVE may be compared with the square of the correlations among the latent variables (Chin 1998); equivalently, the square root of the AVE may be compared with the correlations themselves. The diagonal of Table 8 contains the square root of the AVE. All of these diagonal values are greater than the off-diagonal elements in the corresponding rows and columns, demonstrating adequate discriminant validity.

A second way of evaluating discriminant validity is to analyze the factor loadings of the indicators (Chin 1998). Every indicator should have a higher loading on the desired construct than on any other factor. Factor loadings and cross-loadings for the multi-item measures were calculated from the PLS output and are presented in Table 9. An examination of the loadings and cross-loadings confirms that the observed indicators load more highly on their own construct than any other and thus demonstrate adequate discriminant and convergent validity.
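The first discriminant validity criterion can be expressed as a simple check. The sketch below uses the diagonal values and correlations reported in Table 8 for the three multi-item constructs only; correlations with the single-item constructs are omitted for brevity, and the function name is illustrative.

```python
def sqrt_ave_exceeds_correlations(sqrt_ave, correlations):
    """Map each construct to True if its sqrt(AVE) exceeds every
    absolute correlation in which the construct takes part."""
    result = {}
    for construct, root in sqrt_ave.items():
        related = [abs(r) for pair, r in correlations.items() if construct in pair]
        result[construct] = all(root > r for r in related)
    return result


# Diagonal entries and correlations as reported in Table 8
sqrt_ave = {
    "Subjective performance": 0.841,
    "Trust in ability": 0.884,
    "Sense of belonging": 0.843,
}
correlations = {
    ("Subjective performance", "Trust in ability"): 0.435,
    ("Subjective performance", "Sense of belonging"): 0.403,
    ("Trust in ability", "Sense of belonging"): 0.496,
}
check = sqrt_ave_exceeds_correlations(sqrt_ave, correlations)
print(check)
```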

5.3 Model Testing

The theoretical model and the hypothesized relationships among the constructs were tested using the bootstrap technique in PLS Graph 3.00, build 1126, which estimates the precision of PLS tests. For this study, 200 iterations were used, which results in 200 estimates of each parameter in the measurement model (Chin and Frye 1996). The results of the bootstrap give path estimates and t-statistics for each of the hypothesized relationships. The significant paths revealed by this test are shown in Figure 11. To test the specific hypotheses, the t-statistics for each path coefficient are evaluated and a p-value is calculated for each. For this sample, a two-tailed significance threshold of p < 0.1 was chosen to guard against Type II errors, given the relatively small sample size.
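The general logic of bootstrapping a path estimate can be sketched as follows. This is not the PLS Graph procedure, which re-estimates the full PLS model on every resample; as a simplifying assumption, the sketch bootstraps a single standardized regression path between two observed variables and uses a normal approximation for the two-tailed p-value. All names and the data are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def std_beta(x, y):
    """Standardized slope of y on x (equal to Pearson's r for one predictor)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_path(x, y, iterations=200, seed=0):
    """Return (beta, standard error, t-statistic, approximate two-tailed p)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < iterations:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # skip degenerate resamples that have no variance
        estimates.append(std_beta(xs, ys))
    beta = std_beta(x, y)
    se = stdev(estimates)
    t = beta / se
    p = math.erfc(abs(t) / math.sqrt(2))  # normal approximation, two-tailed
    return beta, se, t, p


# Hypothetical data for 39 projects: message volume grows with community size
data_rng = random.Random(42)
size = list(range(1, 40))
messages = [2 * s + data_rng.gauss(0, 5) for s in size]
beta, se, t, p = bootstrap_path(size, messages)
print(f"beta={beta:.3f} se={se:.3f} t={t:.2f} p={p:.4f}")
```

The t-statistic is the original estimate divided by the standard deviation of the 200 bootstrap estimates, and a path is treated as significant when the resulting p-value falls below the chosen threshold.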

Table 8. Correlation of Constructs and AVE Values

| Construct | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 Size | 1 | | | | | | | | | | | | |
| 2 Topic variation | 0.217 | 1 | | | | | | | | | | | |
| 3 Number of messages | 0.660** | 0.363* | 1 | | | | | | | | | | |
| 4 Bug reports | 0.804** | 0.132 | 0.491** | 1 | | | | | | | | | |
| 5 Feature requests | 0.837** | 0.079 | 0.391* | 0.912** | 1 | | | | | | | | |
| 6 Downloads | 0.440** | 0.123 | 0.575** | 0.575** | 0.350* | 1 | | | | | | | |
| 7 Resolved vs. total issues | 0.126 | 0.019 | 0.222 | 0.308 | 0.165 | 0.287 | 1 | | | | | | |
| 8 Development status | 0.093 | 0.121 | 0.200 | 0.113 | 0.103 | 0.178 | 0.130 | 1 | | | | | |
| 9 Subjective performance | -0.060 | -0.256 | 0.057 | 0.003 | -0.010 | -0.005 | 0.271 | 0.108 | 0.841 | | | | |
| 10 Trust in ability | -0.012 | 0.011 | 0.027 | 0.012 | 0.048 | -0.040 | 0.109 | 0.136 | 0.435** | 0.884 | | | |
| 11 Sense of belonging | -0.244 | -0.265 | -0.081 | -0.103 | -0.148 | 0.125 | 0.073 | 0.068 | 0.403* | 0.496** | 0.843 | | |
| 12 Attraction | 0.732** | 0.236 | 0.684** | 0.651** | 0.489** | 0.766** | 0.197 | 0.130 | -0.160 | -0.099 | -0.183 | 1 | |
| 13 Retention | 0.287 | 0.065 | 0.257 | 0.172 | 0.189 | 0.015 | 0.098 | 0.108 | -0.049 | 0.255 | -0.083 | 0.081 | 1 |

* Correlation is significant at the 0.05 level (2-tailed). ** Correlation is significant at the 0.01 level (2-tailed).

Table 9. Factor Loadings and Cross-Loadings for Multi-Item Scales

| Indicator | Size | Topic variation | Downloads | Attraction | Total messages | Subjective performance | Retention | Bug reports | Feature requests | Trust in ability of others | Sense of belonging |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Size of community | 1.00 | 0.22 | 0.44 | 0.73 | 0.66 | -0.06 | 0.29 | 0.80 | 0.84 | 0.00 | -0.24 |
| Topic variation | 0.22 | 1.00 | 0.12 | 0.24 | 0.36 | -0.28 | 0.06 | 0.13 | 0.08 | 0.03 | -0.27 |
| Number of downloads | 0.44 | 0.12 | 1.00 | 0.77 | 0.57 | -0.01 | 0.01 | 0.57 | 0.35 | -0.03 | 0.13 |
| Attraction | 0.73 | 0.24 | 0.77 | 1.00 | 0.68 | -0.16 | 0.08 | 0.65 | 0.49 | -0.09 | -0.18 |
| Messages, 11/03 | 0.66 | 0.36 | 0.57 | 0.68 | 1.00 | 0.04 | 0.26 | 0.49 | 0.39 | 0.04 | -0.08 |
| I consider the software to be Valuable … Worthless | -0.03 | -0.31 | -0.03 | -0.14 | 0.03 | 0.82 | 0.10 | 0.04 | 0.03 | 0.25 | 0.25 |
| I consider the software to be Useful … Useless | -0.10 | -0.21 | -0.01 | -0.15 | 0.01 | 0.89 | -0.20 | -0.03 | -0.03 | 0.33 | 0.36 |
| I consider the software to be Good … Bad | -0.03 | -0.11 | 0.03 | -0.11 | 0.10 | 0.81 | -0.05 | -0.01 | -0.03 | 0.52 | 0.41 |
| I consider the software to be Terrific … Terrible | -0.02 | 0.12 | 0.09 | -0.08 | 0.19 | 0.60 | 0.26 | 0.04 | 0.05 | 0.52 | 0.27 |
| Retention | 0.29 | 0.06 | 0.01 | 0.08 | 0.26 | -0.06 | 1.00 | 0.17 | 0.19 | 0.27 | -0.08 |
| Number of bug reports | 0.80 | 0.13 | 0.57 | 0.65 | 0.49 | 0.00 | 0.17 | 1.00 | 0.91 | 0.02 | -0.10 |
| Number of feature requests | 0.84 | 0.08 | 0.35 | 0.49 | 0.39 | -0.01 | 0.19 | 0.91 | 1.00 | 0.06 | -0.15 |
| I feel very confident about the project members' skills | -0.02 | -0.06 | 0.02 | -0.08 | -0.04 | 0.38 | 0.28 | 0.04 | 0.08 | 0.87 | 0.45 |
| Project members have much knowledge about the work that needs to be done | 0.10 | 0.11 | -0.02 | -0.01 | 0.14 | 0.32 | 0.27 | 0.06 | 0.08 | 0.90 | 0.36 |
| Project members have specialized capabilities that can increase the project's performance | -0.07 | -0.10 | -0.07 | -0.14 | -0.02 | 0.38 | 0.16 | -0.02 | 0.00 | 0.85 | 0.59 |
| Project members are very capable of performing their tasks | -0.03 | 0.11 | -0.07 | -0.12 | 0.04 | 0.35 | 0.21 | -0.03 | 0.01 | 0.92 | 0.36 |
| I feel a sense of belonging to this project | -0.16 | -0.23 | 0.13 | -0.15 | -0.02 | 0.33 | -0.03 | -0.08 | -0.10 | 0.33 | 0.83 |
| I feel that I am a member of this project | -0.26 | -0.23 | 0.07 | -0.19 | -0.12 | 0.37 | -0.14 | -0.12 | -0.17 | 0.46 | 0.94 |
| I see myself as part of this project | -0.24 | -0.27 | 0.15 | -0.15 | -0.08 | 0.36 | -0.05 | -0.08 | -0.13 | 0.51 | 0.96 |

Note: The four subjective performance items share the stem "Indicate your feelings concerning the software developed by this project. I consider the software to be".

Figure 11. Path Model Results. [Path diagram of the significant paths and the R2 value of each endogenous construct, as reported in the text. Legend: *** p < 0.01, ** p < 0.05, * p < 0.1.]

The explanatory power of the structural model is assessed by examining the R2 value, which indicates the variance accounted for in the dependent constructs. The results of specific hypothesis tests appear below.

The first hypothesis, H1a, predicted that increased resources would increase support activity within the project. This hypothesis was supported; both paths between size and the indicators of support activity were significant. The R2 value for topic variation was 0.047, while the R2 for total messages was 0.436, meaning that size predicted 4.7% of the variance in the topic variation (β = 0.217, p < 0.01) and 43.6% of the variance in the total number of messages posted to the forums (β = 0.660, p < 0.01). In other words, size of the community predicted less than half of the variation in volume of communication; smaller communities may talk as much or more than larger communities. H1b predicted a positive relationship between resources and development activity. This hypothesis was also supported, as the paths between size and both indicators of development activity were significant. R2 was 0.647 for bug reports and 0.701 for feature requests, meaning that community size predicted 64.7% of the variance in the number of bug reports (β = 0.804, p < 0.05) and 70.1% of the variance in the number of feature requests (β = 0.837, p < 0.05) respectively. Thus, a larger community reports more bugs and asks for more new features.
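Because each of these four constructs has community size as its only predictor, the R2 of each path is simply the square of its standardized coefficient. The short check below illustrates this using the values reported above; small differences in the third decimal place arise from rounding of the reported coefficients. The dictionary name is illustrative.

```python
# (standardized path coefficient from size, reported R2)
single_predictor_paths = {
    "topic variation": (0.217, 0.047),
    "total messages": (0.660, 0.436),
    "bug reports": (0.804, 0.647),
    "feature requests": (0.837, 0.701),
}

for construct, (beta, reported_r2) in single_predictor_paths.items():
    implied_r2 = beta ** 2  # R2 with a single standardized predictor
    print(f"{construct}: beta^2 = {implied_r2:.3f}, reported R2 = {reported_r2:.3f}")
```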

The R2 values for downloads and subjective performance were 0.593 and 0.106 respectively. In other words, all support and development activities accounted for a total of 59.3% of the variance in the number of downloads but only 10.6% of the variance in subjective measures of performance. Hypothesis 2a predicted that support activity would have a positive effect on software product success. No support for this hypothesis was found (topic variation – subjective performance, β = -0.346, ns; topic variation – downloads, β = -0.100, ns; total messages – downloads, β = 0.355, ns; total messages – subjective performance, β = 0.184, ns). Hypothesis 2b predicted a positive link between development activity and software product success. This hypothesis was partially supported, with one of four potential paths significant (bug reports – downloads, β = 1.269, p < 0.1; bug reports – subjective performance, β = 0.046, ns; feature requests – downloads, β = -0.938, ns; feature requests – subjective performance, β = -0.097, ns).

The R2 values for trust in the ability of other members and a sense of belonging were 0.012 and 0.102 respectively, meaning that support and development communication activities accounted for 1.2% and 10.2% of the variance in the two measures of community success.

Hypothesis 3a predicted a positive relationship between support activity and social capital. One path was significant, but in the opposite direction from that expected (topic variation – trust in the ability of others, β = 0.021, ns; topic variation – sense of belonging, β = -0.286, p < 0.1; total messages – trust in the ability of others, β = 0.047, ns; total messages – sense of belonging, β = 0.043, ns). Hypothesis 3b predicted a positive relationship between development activity and social capital. This hypothesis was not supported (bug reports – trust in the ability of others, β = -0.241, ns; bug reports – sense of belonging, β = 0.262, ns; feature requests – trust in the ability of others, β = 0.256, ns; feature requests – sense of belonging, β = -0.381, ns).

The R2 values for attraction and retention were 0.677 and 0.149 respectively, meaning that software and community success account for 67.7% of the variance in the attraction of new members and 14.9% of the variance in the retention of the existing members in the community at time 2. Hypothesis 4a predicted a positive link between software product success and retention of existing project members. This hypothesis was not supported (downloads – retention, β = -0.014, ns; subjective performance – retention, β = -0.195, ns). Hypothesis 4b predicted a positive relationship between social capital and retention. This hypothesis was partially supported, as one of two potential paths was significant (trust in the ability of others – retention, β = 0.444, p < 0.01; sense of belonging – retention, β = -0.197, ns).

Hypothesis 5a predicted a positive relationship between software success and the attraction of new project members. This hypothesis was partially supported as one of two potential paths was significant (downloads – attraction, β = 0.808, p < 0.01; subjective performance – attraction, β = -0.066, ns). Hypothesis 5b predicted a positive relationship between social capital and attraction. One path was significant, but in the opposite direction from that expected (trust in the ability of others – attraction, β = 0.101, ns; sense of belonging – attraction, β = -0.310, p < 0.1).

In summary, the hypothesis tests provided partial support for the original model. Specifically, increased size of the project community leads to greater support activity between members and more bug reports and feature requests. There is some support for the idea that increased support and development activity lead to a more successful product and to a more successful community. Similarly, there are indications that a more successful software product and a more successful project community can subsequently increase the size of the community. Table 10 presents a summary of hypothesis support.

Table 10. Summary of Hypothesis Testing

| Hypothesis | Effect Type | Relationship Between Constructs (Direction of Relationship) | Supported (Direction) |
|---|---|---|---|
| H1a | Direct | Resources – Support Activity ( + ) | Yes (+) |
| | | Size – Topic Var., β = 0.217, p < 0.01 | |
| | | Size – Total Msg, β = 0.660, p < 0.01 | |
| H1b | Direct | Resources – Development Activity ( + ) | Yes (+) |
| | | Size – Bugs, β = 0.804, p < 0.05 | |
| | | Size – Feature Req., β = 0.837, p < 0.05 | |
| H2a | Direct | Support Activity – Software Success ( + ) | No |
| | | Topic Var. – Subj. Perf., β = -0.346, ns | |
| | | Topic Var. – Downloads, β = -0.100, ns | |
| | | Total Msg – Downloads, β = 0.355, ns | |
| | | Total Msg – Subj. Perf., β = 0.184, ns | |
| H2b | Direct | Development Activity – Software Success ( + ) | Partially (+) |
| | | Bugs – Downloads, β = 1.269, p < 0.1 | |
| | | Bugs – Subj. Perf., β = 0.046, ns | |
| | | Feat. Req. – Downloads, β = -0.938, ns | |
| | | Feat. Req. – Subj. Perf., β = -0.097, ns | |
| H3a | Direct | Support Activity – Social Capital ( + ) | Partially (-) |
| | | Topic Var. – Trust in Ability, β = 0.021, ns | |
| | | Topic Var. – Sense of Bel., β = -0.286, p < 0.1 | |
| | | Total Msg – Trust in Ability, β = 0.047, ns | |
| | | Total Msg – Sense of Bel., β = 0.043, ns | |
| H3b | Direct | Development Activity – Social Capital ( + ) | No |
| | | Bugs – Trust in Ability, β = -0.241, ns | |
| | | Bugs – Sense of Bel., β = 0.262, ns | |
| | | Feat. Req. – Trust in Ability, β = 0.256, ns | |
| | | Feat. Req. – Sense of Bel., β = -0.381, ns | |
| H4a | Direct | Software Success – Retention ( + ) | No |
| | | Downloads – Retention, β = -0.014, ns | |
| | | Subj. Perf. – Retention, β = -0.195, ns | |
| H4b | Direct | Social Capital – Retention ( + ) | Partially (+) |
| | | Trust in Ability – Ret., β = 0.444, p < 0.01 | |
| | | Sense of Bel. – Ret., β = -0.197, ns | |
| H5a | Direct | Software Success – Attraction ( + ) | Partially (+) |
| | | Downloads – Attr., β = 0.808, p < 0.01 | |
| | | Subj. Perf. – Attr., β = -0.066, ns | |
| H5b | Direct | Social Capital – Attraction ( + ) | Partially (-) |
| | | Trust in Ability – Attr., β = 0.101, ns | |
| | | Sense of Bel. – Attr., β = -0.310, p < 0.1 | |

5.4 Exploratory Analysis

Because fewer paths were significant than hypothesized, an exploratory analysis of other possible paths between model constructs was used to determine whether a better-fitting model existed for the data. PLS analysis of the fully saturated model suggested a significant path that was not tested in the hypothesized model. The results of this analysis are presented below.

The exploratory analysis consists of multi-step regression analyses, using PLS Graph, in which the impact of each set of variables is assessed on one other set of variables, to allow estimation of the model parameters between each set of variables. This allows for each set of variables to account for all possible variance between predictor and dependent variables, without violating the sample size restriction of ten times the number of predictor variables – a key concern in this research given the relatively small sample size of 39 OSS projects. In the first step, the effects of resources on support activity and development activity were examined. The second step evaluated the impact of resources on software and community success measures. The third step investigated the impacts of size on attraction and retention. The fourth step examined the effects of support activity and development activity on software and community success. The fifth step evaluated the influence of support and development activity on attraction and retention. The sixth step assessed the impact of software and community success on attraction and retention. In the last step, each path that was shown to be significant in each of the prior steps was included to create a final model. This model was then evaluated in the same way as the hypothesized model. A detailed description of each step of the model is provided below.

Each step of the model was estimated using the bootstrapping function of PLS Graph, with 200 iterations (Chin and Frye 1996). The explanatory power of each path model is indicated by the R2 value. This value represents the amount of variance accounted for in the final construct of each model. The significance of each path in each model was evaluated, using a two-tailed p-value cutoff of 0.10. This relatively high value was chosen to avoid Type II errors, given the exploratory nature of the research and the small sample size.

The first step of the model examined the question of whether project resources had an impact on support communication levels and development communication levels. The results of this step of the analysis are presented in Table 11. The findings suggest that project resources play a major role in the levels of both support and development communication activity within a project, especially on the total number of messages posted to the forums and on the total number of bug reports and feature requests filed, just as reported in the original analysis.

Table 11. Step 1 - Impact of Resources on Communication Activity

| | Topic variation | Total messages | Bug reports | Feature requests |
|---|---|---|---|---|
| Size | 0.22*** | 0.66*** | 0.80** | 0.84** |
| R2 | 0.05 | 0.44 | 0.65 | 0.70 |

*** p < 0.01 ** p < 0.05 * p < 0.1

The second step of the model evaluated the influence of resources on software and community success. The results of this step are presented in Table 12. These results suggest that size has a modest impact on software success as measured by downloads and that size has a small negative effect on project members' sense of belonging. Size had no effect on either subjective feelings of performance or trust in the ability of others.

Table 12. Step 2 - Impact of Resources on Success.

        Downloads   Subjective performance   Trust in ability   Sense of belonging
Size    0.44*       -0.08                    -0.23              -0.25*
R²      0.19        0.01                     0.05               0.06

*** p < 0.01  ** p < 0.05  * p < 0.1

The third step examined the effect of resources on attraction of new members and retention of existing members. The results are presented in Table 13. The findings suggest that larger projects are better able to attract new members and that size also plays a role in the retention of existing members.

Table 13. Step 3 - Impact of Resources on Attraction and Retention.

        Attraction   Retention
Size    0.73***      0.29**
R²      0.54         0.08

*** p < 0.01  ** p < 0.05  * p < 0.1
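Because size is the only predictor in Steps 1 through 3, each reported R² should be close to the square of the corresponding standardized path coefficient. The sketch below checks this against the values printed in Tables 11 through 13. It assumes the coefficients are standardized and that each construct has a single predictor in these steps; the tolerance allows for the two-decimal rounding of the printed values. The names `single_predictor_paths` and `TOLERANCE` are illustrative only.

```python
# construct: (path coefficient from Size, reported R^2), as printed in Tables 11-13
single_predictor_paths = {
    "Topic variation": (0.22, 0.05),
    "Total messages": (0.66, 0.44),
    "Bug reports": (0.80, 0.65),
    "Feature requests": (0.84, 0.70),
    "Downloads": (0.44, 0.19),
    "Subjective performance": (-0.08, 0.01),
    "Trust in ability": (-0.23, 0.05),
    "Sense of belonging": (-0.25, 0.06),
    "Attraction": (0.73, 0.54),
    "Retention": (0.29, 0.08),
}

TOLERANCE = 0.015  # covers rounding of both the coefficient and R^2 to two decimals

# Gap between the squared coefficient and the printed R^2 for each construct.
gaps = {
    construct: abs(beta ** 2 - r2)
    for construct, (beta, r2) in single_predictor_paths.items()
}

for construct, gap in gaps.items():
    beta, r2 = single_predictor_paths[construct]
    print(f"{construct:<24} beta^2 = {beta ** 2:.3f}  reported R^2 = {r2:.2f}")

all_consistent = all(gap <= TOLERANCE for gap in gaps.values())
```

The same identity does not apply to Steps 4 through 6, where several correlated predictors share the explained variance.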

The fourth step investigated the effects of support and development communication activity on software and community success. The findings are presented in Table 14. They indicate that topic variation has a negative influence on members' sense of belonging to the project and that the number of bug reports positively influences the number of software downloads.

Table 14. Step 4 - Impact of Communication Activity on Success

                   Downloads   Subjective performance   Trust in ability   Sense of belonging
Topic variation    -0.10       -0.36                    0.13               -0.29**
Total messages     0.36        0.18                     0.10               0.05
Bug reports        1.27**      0.06                     -0.33              0.26
Feature requests   -0.94       -0.10                    0.31               -0.38
R²                 0.59        0.11                     0.04               0.10

*** p < 0.01  ** p < 0.05  * p < 0.1

The fifth step examined the impact of support and development communication activity on attraction of new members and retention of existing members. The results of this step are presented in Table 15. The findings suggest that both the total number of messages in the discussion forum and the number of bug reports positively influenced attraction of new members. No communication activity variable predicted retention of existing members.

Table 15. Step 5 - Impact of Communication Activity on Attraction and Retention.

                   Attraction   Retention
Topic variation    0.00         -0.02
Total messages     0.45***      0.26
Bug reports        0.86**       0.20
Feature requests   -0.47        0.28
R²                 0.63         0.08

*** p < 0.01  ** p < 0.05  * p < 0.1

The sixth step investigated the influence of software and community success on attraction and retention. The results are presented in Table 16. These findings suggest that the number of downloads positively impacts the attraction of new members, that a sense of belonging among current members negatively impacts the attraction of new members and that trust in the ability of other members has a positive impact on the retention of existing members.

Table 16. Step 6 - Impact of Success on Attraction and Retention.

                         Attraction   Retention
Downloads                0.82***      0.03
Subjective performance   -0.10        0.36
Trust in ability         0.13         0.29*
Sense of belonging       -0.35**      -0.24
R²                       0.68         0.26

*** p < 0.01  ** p < 0.05  * p < 0.1

In the seventh and final step, the significant paths from the previous six steps were integrated to form a measurement model, which was then analyzed with PLS Graph. Figure 12 shows this model, with significant paths in bold. The model results are given in Table 17. Table 18 compares the R² values from the hypothesis model with those from the final exploratory model.

Table 17. Final Model

                    Topic      Total      Bug        Feature               Sense of
                    variation  messages   reports    requests   Downloads  belonging  Attraction Retention
Size                0.22***    0.66***    0.80**     0.84**     -0.06      -0.20      0.52       0.29**
Topic variation     -          -          -          -          -          -0.22*     -          -
Total messages      -          -          -          -          -          -          0.52       -
Bug reports         -          -          -          -          0.63*      -          -0.17      -
Feature requests    -          -          -          -          -          -          -          -
Downloads           -          -          -          -          -          -          0.62*      -
Trust in ability    -          -          -          -          -          -          -          0.27*
Sense of belonging  -          -          -          -          -          -          -0.14*     -
R²                  0.05       0.44       0.65       0.70       0.33       0.11       0.81       0.16

*** p < 0.01  ** p < 0.05  * p < 0.1
(- = path not included in the final model)
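The rule used to assemble the final model, namely keeping every path that reached significance in Steps 1 through 6, can be sketched as a simple filter over the step results. The cell values below are transcribed from Tables 11 through 16; the data structure and the names `STEP_TABLES` and `significant_paths` are hypothetical conveniences for illustration, not part of the original analysis, which was carried out in PLS Graph.

```python
# (step, predictor, {outcome: printed cell}); a trailing "*" marks p < 0.1 or better.
STEP_TABLES = [
    (1, "Size", {"Topic variation": "0.22***", "Total messages": "0.66***",
                 "Bug reports": "0.80**", "Feature requests": "0.84**"}),
    (2, "Size", {"Downloads": "0.44*", "Subjective performance": "-0.08",
                 "Trust in ability": "-0.23", "Sense of belonging": "-0.25*"}),
    (3, "Size", {"Attraction": "0.73***", "Retention": "0.29**"}),
    (4, "Topic variation", {"Downloads": "-0.10", "Subjective performance": "-0.36",
                            "Trust in ability": "0.13", "Sense of belonging": "-0.29**"}),
    (4, "Total messages", {"Downloads": "0.36", "Subjective performance": "0.18",
                           "Trust in ability": "0.10", "Sense of belonging": "0.05"}),
    (4, "Bug reports", {"Downloads": "1.27**", "Subjective performance": "0.06",
                        "Trust in ability": "-0.33", "Sense of belonging": "0.26"}),
    (4, "Feature requests", {"Downloads": "-0.94", "Subjective performance": "-0.10",
                             "Trust in ability": "0.31", "Sense of belonging": "-0.38"}),
    (5, "Topic variation", {"Attraction": "0.00", "Retention": "-0.02"}),
    (5, "Total messages", {"Attraction": "0.45***", "Retention": "0.26"}),
    (5, "Bug reports", {"Attraction": "0.86**", "Retention": "0.20"}),
    (5, "Feature requests", {"Attraction": "-0.47", "Retention": "0.28"}),
    (6, "Downloads", {"Attraction": "0.82***", "Retention": "0.03"}),
    (6, "Subjective performance", {"Attraction": "-0.10", "Retention": "0.36"}),
    (6, "Trust in ability", {"Attraction": "0.13", "Retention": "0.29*"}),
    (6, "Sense of belonging", {"Attraction": "-0.35**", "Retention": "-0.24"}),
]


def significant_paths(step_tables):
    """Keep (predictor, outcome, coefficient) for every starred cell."""
    paths = []
    for _step, predictor, row in step_tables:
        for outcome, cell in row.items():
            if cell.endswith("*"):
                paths.append((predictor, outcome, float(cell.rstrip("*"))))
    return paths


final_model_paths = significant_paths(STEP_TABLES)
for predictor, outcome, coefficient in final_model_paths:
    print(f"{predictor} -> {outcome}: {coefficient:+.2f}")
```

The filter retains 15 paths, which matches the number of coefficients reported in Table 17. The coefficients themselves differ in the final model because all retained paths are estimated together there.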

Figure 12. Exploratory Analysis Model. [Path diagram with four blocks: Resources (size); Benefit Creation Process (support activity: topic variation, total messages; development activity: bug reports, feature requests); Software & Community Success (product success: downloads, subjective performance; social capital: trust in ability of others, sense of belonging); Member Gain & Loss (attraction, retention). Significant paths shown: size to topic variation 0.217***, size to total messages 0.660***, size to bug reports 0.804**, size to feature requests 0.837**, size to retention 0.287**, bug reports to downloads 0.625*, topic variation to sense of belonging -0.223*, downloads to attraction 0.620*, trust in ability of others to retention 0.269*, sense of belonging to attraction -0.144*. *** p < 0.01  ** p < 0.05  * p < 0.1]

Table 18. Comparison of Hypothesis and Exploratory R² Values.

Construct                    Hypothesis model R²   Exploratory model R²
Topic variation              0.047                 0.047
Total messages               0.436                 0.436
Bug reports                  0.647                 0.647
Feature requests             0.701                 0.701
Downloads                    0.593                 0.331
Trust in ability of others   0.012                 -0.167
Sense of belonging           0.102                 0.107
Attraction                   0.677                 0.811
Retention                    0.149                 0.155
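The comparison in Table 18 is easier to read as a change in R² per construct. The short sketch below computes that difference from the values exactly as printed, including the negative entry for trust in ability of others, which is reproduced here without interpretation. The names `R2_VALUES` and `r2_change` are illustrative only.

```python
# construct: (hypothesis model R^2, exploratory model R^2), as printed in Table 18
R2_VALUES = {
    "Topic variation": (0.047, 0.047),
    "Total messages": (0.436, 0.436),
    "Bug reports": (0.647, 0.647),
    "Feature requests": (0.701, 0.701),
    "Downloads": (0.593, 0.331),
    "Trust in ability of others": (0.012, -0.167),
    "Sense of belonging": (0.102, 0.107),
    "Attraction": (0.677, 0.811),
    "Retention": (0.149, 0.155),
}

# Positive values mean the exploratory model explains more variance.
r2_change = {
    construct: round(exploratory - hypothesis, 3)
    for construct, (hypothesis, exploratory) in R2_VALUES.items()
}

for construct, change in sorted(r2_change.items(), key=lambda item: item[1]):
    print(f"{construct:<28} {change:+.3f}")
```

The four communication activity constructs are unchanged because size is their only predictor in both models.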

In summary, the results of this exploratory testing indicate that the resources available to an OSS project form a key asset that influences the support that users give each other and the development activity that occurs. Available resources also affect the ability of the project to retain members, but not the project's ability to attract new members. Topic variation has a negative influence on project members' sense of belonging to the project. The number of bugs reported has a positive effect on the number of downloads of the software. The number of downloads has a positive effect on the attraction of new members. Trust in the ability of other project members allows the project to retain members; i.e., when project members trust the ability of other members, they are likely to remain with the project. Finally, when members have a sense of belonging to the project, the project is less likely to attract new members.

6. DISCUSSION

Comparison of the results of the hypothesized path model and the fully-saturated path model formed by the exploratory analysis shows remarkable similarity. In fact, all paths that were significant in the hypothesized model were significant when the constructs were allowed to freely associate with each other in the exploratory analysis. This means that for the paths that showed significance, the hypothesized path model performed extraordinarily well. Additionally, the exploratory analysis showed only one possible path not accounted for by the hypothesized model, the path between size and retention of members. Each of the paths, significant and non-significant, and their theoretical implications are discussed in detail below.

6.1 Supported Hypotheses

Hypothesis 1a, that more resources lead to increased support communication activity, was supported. Larger communities communicated more overall and also discussed a wider variety of topics related to support. As communities communicate more, the time, energy, knowledge and other resources provided by members will be converted to benefits for the members and for the community as a whole (Butler 2001). In OSS projects this means that the “mundane but necessary task” (von Krogh et al. 2003, p. 923) of supporting the project's software is better handled by a larger community. While not all community members are able to write software, these members carry out the important function of supporting other members (Lakhani and von Hippel 2003). By talking more about a greater variety of topics, members of the community can distribute the work of supporting the software.

Hypothesis 1b stated that larger OSS communities will have more development communication activities. This hypothesis was also supported, meaning that a larger community can expose more bugs and suggest more new features. This is to be expected, both in terms of “many eyeballs making all bugs shallow”, as expressed by Raymond (1999, p. 8) and by the practical mechanism of many individuals using the software in different ways (Weber 2004). As community size increases, there are simply more people available to aid in the distributed debugging process and more users interested in using the software in novel ways, leading to more feature requests.

The fully saturated model indicated one additional path that was not proposed in the hypothesized model. The path between size and retention of existing members was significant. This finding contradicts those of Butler (2001), who found that increased size led to increased member loss in Listserv communities, in both relative and absolute terms. The difference between the findings across the two studies may be due to the difference in what is produced in each type of community. In a Listserv, the “product” is knowledge; that is, individuals presumably become more knowledgeable about the topic(s) discussed between members. In OSS project communities, the “products” include not only knowledge from participating in discussion forums (Lakhani and von Hippel 2003), but also the software produced. The continued maintenance, improvement and support of the software is visible, concrete evidence that the project community is attaining its goals. Individual knowledge in discussion forums, on the other hand, is very personal and difficult for other members to observe. Since the software is visible evidence of progress, individuals are more likely to remain with the project, even though it is often more difficult to feel part of larger communities, due in part to the number of possible communication partners. Butler also suggests that community size forms an “audience resource” which is important for individuals seeking visibility or an outlet for their ideas (2001, p. 348). These larger audiences give more exposure to the volunteer members, perhaps inducing them to stay involved in larger communities. A larger community also provides better visibility for showcasing skills and improving reputation, which have been shown to be important to OSS developers (Ghosh et al. 2002; Lakhani et al. 2002).

The largely volunteer makeup of OSS projects means that they depend on efforts from many members to maintain, support, and improve the software. As a disjunctively produced public good, OSS can be privately provided for the good of all, but the sheer volume of support requests and bug reports could quickly overwhelm a small group of developers. When this happens the developers could simply ignore these requests, but as Alan Cox has noted, in at least one case where this occurred, a group of Linux kernel developers realized that this was a good way to “screw up a free software project” (1998). In order for software improvements to be made – for bugs to be fixed and new features added – the bugs must first be exposed by the community and then reported. Like many communities which communicate virtually, larger OSS communities are apparently able to overcome the difficulties in communicating that often accompany increases in size (George et al. 1990; McPherson 1983) and effectively communicate about issues relating to the complex tasks of software development (Levine and Moreland 1990).

6.2 Partially Supported Hypotheses

Hypothesis 2b stated that increased development activity would lead to increased software success. Partial support for this hypothesis was found, in the form of the path from the number of bug reports to the number of downloads. As bugs are found and fixed in software, the software evolves and becomes more usable and stable, leading more individuals to download and use the software (Crowston et al. 2003; Mockus et al. 2002). The other paths suggested by this hypothesis were not significant. More bug reports did not lead to increased subjective perceptions of performance. Increased feature requests did not increase downloads, nor lead to higher perceptions of performance. This research only measured the relationship between reports and software success, not the effects of resolved issues on software success. Simply having bugs reported and new features requested is likely not enough for individuals to feel the software is more successful, whereas actually fixing bugs and adding new features would likely have greater impact on software success. Surprisingly, although neither the path from feature requests to downloads nor from feature requests to subjective performance was significant, the relationships were both negative. This might suggest that users are more willing to utilize software with bugs than software which does not have all the features they would like, although this conclusion cannot be definitively drawn without significant paths.

Hypothesis 4b indicated a relationship between social capital and the retention of individuals within the group. This hypothesis was partially supported in the form of a path between trust in the ability of others and retention. As members trust the ability of others to develop, support and maintain the software, they are more likely to remain with the group (Jarvenpaa et al. 1998). The path between a sense of belonging and retention was not significant, indicating that individuals do not necessarily remain with a project even if they feel they are a part of it. Given the predictions of volunteerism literature, this lack of a path is especially surprising. Individuals may volunteer for a variety of reasons but typically only stay with a community if they feel a part of it (Clary et al. 1998). One potential explanation for individuals not remaining with the community is that there are different kinds of volunteers. The first type, called spot volunteers, volunteer casually and target specific needs. A second type of volunteer fills a more formal role, remains with the organization a longer time, and experiences a sense of accomplishment and gratification. The final type comprises individuals who volunteer due to pressure from peers or employers (Powers 1998). Published descriptions of task assignment in OSS projects indicate that tasks are largely self-selected (Crowston et al. 2005a). This suggests that many, perhaps most, individuals who participate in OSS projects are spot volunteers, and simply volunteer occasionally as they are affected by a bug or need a new feature (Raymond 1999; Weber 2004). Those who remain involved with the project may move into more formal roles such as developers (Markus et al. 2000; Stewart 2003).

Hypothesis 5a, which predicted a link between software success and the attraction of new members, was partially supported. The path between the number of downloads and the attraction of new members was significant. The number of downloads, as a proxy for the number of users of the software, does not necessarily equate with the number of participants in the project community. Most individuals who download the software will simply use it and not give anything back; they are free-riders. However, as the pool of users increases in size, the fraction of users who are motivated to give back to the community, whether to ask questions, answer questions or contribute code, also increases (Weber 2004). The link between subjective performance and attraction of new members was not supported. This indicates that members of the community may not actively spread the word about their project, even though they feel that the software meets their needs. Network effects are important in computer use (when many individuals utilize the same software, data interchange is eased), so one would expect community members to ask others to utilize the software, if only out of self-interest. However, just because members tell others that a certain piece of software meets their needs does not mean that others will adopt it. One individual's perception that the software is successful does not mean that individual will be able to convince others of that.

6.3 Non-supported Hypotheses

Hypothesis 2a predicted that increased support activity would lead to greater software success; however, no paths were significant. This means that neither a variety of topics in the discussion forums nor a large number of posts to those forums directly drives software success. Although this hypothesis proposed an indirect path from discussions of software problems in the forums to the filing of formal bug reports, based on qualitative evidence found in Sourceforge forums, it is possible that this happens only rarely. Further, as those individuals who participate in the discussion forums are already users of the software, they may not need to download the software, and thus would not show up in download numbers, given that download data was collected immediately after discussion participation was measured. Finally, members who have to post support requests may be less satisfied with the software as a direct result of their difficulties. However, the lack of a path between support communication activity and perceptions of performance indicates that a potential bias to the survey results, caused by selection of only individuals who actively participate in the project, does not exist. It is possible that the active users surveyed would skew their views of the performance of the software based on their involvement with the project, akin to asking mothers what they think of their children. This bias to self-rated perceptions could be a major threat to the validity of any survey involving project members, but the lack of this path indicates that no such bias exists.

Hypothesis 3a suggested a link between support activity levels and social capital in OSS communities. This hypothesis was not supported. However, the path between topic variation and a sense of belonging to the project was significant, but opposite the expected direction; i.e., higher levels of topic variation in the discussion forums led to decreased feelings of belonging among members. This reverse relationship is similar to Butler's finding that high topic variation resulted in member loss; however, he does not offer a specific explanation for this phenomenon (2001). One possible explanation of this effect is that the diversity of topics makes it harder for community members to find common interests within the community. The other paths that were part of this hypothesis were not supported. The lack of a significant path between topic variation and trust in the ability of others shows that members do not rate their trust in others based on the variety of things they say. The lack of paths between the total messages and trust and sense of belonging similarly indicates that social capital is not necessarily generated based on the quantity of interactions that take place in the discussion forums. The formation of social capital is based on relationships between community members, but not necessarily on the number of communication activities (Nahapiet and Ghoshal 1998). Instead, feelings of respect, gratitude and friendship are important in the development of social capital (Bourdieu 1986). Development of these emotional ties would involve repeated communication with a limited number of others, not a large volume of communication with many forum participants.

No support was found for Hypothesis 3b, which stated that increased development communication should result in increased social capital. One possible explanation for the lack of a relationship is that all participants in the community were surveyed, rather than just those traditionally considered developers. It is likely that most community members do not interact enough within the bug reporting and feature request systems to build social capital. Much like a high volume of communication did not lead to the development of social capital in Hypothesis 3a, simply reporting a bug or asking for a feature is not enough to form the relationships that make up social capital. Bug reports and feature requests are acted upon within the developer community, rather than among all users. Developers communicate with each other as the software is improved, and may communicate enough with each other to develop social capital. It is possible that if only developers were surveyed, the results would show a relationship between development communications and increased social capital.

No support was seen for Hypothesis 4a, which predicted a relationship between software success and the retention of members. Neither the path between subjective performance and retention nor the path between downloads and retention was significant; however, both paths were negative. This finding is counterintuitive: why would someone who is happy with the software's performance no longer be a participant in the community? One possible answer is that they do continue to use the software (which is impossible to track), but do not feel a need to participate in the forums, because the software “just works” for them. This pattern of “significant churn, with members coming and going” from active participation was suggested by Butler as a possible feature of all online communities (2001, p. 358). In other words, this may just be a part of life for an OSS community, with new volunteer talent joining the community and other members no longer actively participating. As spot volunteers, individuals participate for a time on a specific task; once the task is completed they stop contributing. This does not mean they are finished being volunteers; they may come back later to perform other specific tasks. Tracking this pattern of participation would require more frequent sampling of contributions in the community over an extended period of time.

Hypothesis 5b proposed that increased social capital led to increased attraction of new members. No support was found for this hypothesis. The path between a sense of belonging and attraction of new members was significant, but negative. It was expected that a sense of belonging to the community would induce members to spread the word about the project, thus attracting new members. The fact that the reverse occurred could indicate that communities whose members feel a strong sense of belonging become closed. Existing members may not accept newcomers, thus the social capital that binds the group together becomes a barrier to entry for new members (Locke 1999). The path between trust in the ability of others and attraction was not significant. This may mean that members' trust in the ability of others is not apparent to outsiders and therefore cannot induce them to join. Volunteering is linked to social capital, in the sense that individuals with more social ties (those of higher socioeconomic status, extroverts, those who belong to organized religions, or who otherwise have more social ties) volunteer more often (Wilson 2000). The narrow focus of this study investigates only levels of social capital within the OSS project community, not generalized social capital community members may have in their private lives. Like all social capital, the trust generated in this community does not transfer well from community to community (or to the physical world) where it could induce others to participate in the project community (Bourdieu 1986).

Overall, although not all paths in this model were supported, the findings begin to answer the research questions. OSS project communities must grow in size to sustain the communication that supports the community. The exact size of the community likely depends on the scope and other factors unique to the project; that is, there is not necessarily a specific size required for success for all OSS projects. It appears that a critical mass of community members is necessary to support the software and continue to develop it. Unlike Listserv communities, there were no negative effects associated with increased size of the community, perhaps because OSS communities focus on the software product, rather than just knowledge. As previously noted, this software is visible evidence that the project community is effectively reaching its goals, leading individuals to remain with the project, despite the difficulties in communication that usually accompany an increase in size (Levine and Moreland 1990).

It also appears that both support and development communication activities have some effect on the ultimate success of OSS projects. Support activities are a factor in sustaining the community and development activities are a factor in maintaining and advancing the software. It is unlikely that a community could be successful without both types of communication. Software success, community success and the size of the community also appear to play roles in the gain and loss of members from the community. These elements appear to be critical to sustaining the community over time.

7. CONCLUSIONS

7.1 Limitations

This study extends prior research by examining a large sample of OSS projects longitudinally, using both survey and objective data. However, there are some important limitations in this design that should be noted. The first limitation of this research is the small sample size. While this study investigated a comparatively large number of projects relative to other OSS research, it is difficult to generalize from this sample to all projects hosted on Sourceforge, or to OSS in general, given the variety of project types and organization and communication structures. Additionally, given the size of the sample, there is insufficient statistical power to detect small effect sizes, meaning that some of the non-significant findings may be due to the small sample size. Indeed, several paths were “almost” significant, such as the paths between topic variation and subjective performance, between total messages and number of downloads, between feature requests and downloads and between a sense of belonging and retention. A larger sample might show significance for these paths.
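The power limitation can be made concrete with a rough calculation. The sketch below is an approximation under stated assumptions, not a power analysis of the PLS models: it treats a single path as a simple correlation across the 39 projects, uses the Fisher z approximation, applies the two-tailed cutoff of 0.10 used in this study, and takes Cohen's conventional small, medium and large correlations (0.1, 0.3, 0.5) as illustrative effect sizes. The function name `correlation_power` is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

N_PROJECTS = 39  # number of projects in the sample
ALPHA = 0.10     # two-tailed cutoff used in the exploratory analysis


def correlation_power(r, n, alpha=ALPHA):
    """Approximate two-tailed power to detect a true correlation r with n cases."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Fisher z: atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    shift = atanh(r) * sqrt(n - 3)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power_by_effect = {r: correlation_power(r, N_PROJECTS) for r in (0.1, 0.3, 0.5)}
for r, power in power_by_effect.items():
    print(f"true correlation {r:.1f}: approximate power {power:.2f}")
```

Under these assumptions, the chance of detecting a small effect with 39 projects is well under one in four, while a large effect would be detected most of the time, which is consistent with the caution expressed above about non-significant paths.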

This study was longitudinal in nature, but only covered a fairly limited time period. The nine months during which these projects were studied may not have been enough to see much of the life cycle of an OSS project. For example, during this period, several projects transitioned from active participation to a fairly dormant state. These projects left no real indications behind in their discussion forums as to why members stopped participating. This is a good example of the fluidity of these voluntary communities. Without contractual obligations, the volunteers who make up the community may come and go as they please, making it difficult to sustain membership. To better see patterns in members joining and leaving, as well as stages in the life cycle of the OSS project, more frequent samples could be taken over a longer period of time.

One possible threat to the statistical validity of this study is non-response bias. Overall, 355 responses were collected out of a total of 1602 surveys delivered, a response rate of roughly 22%. While demographic variables collected matched earlier studies well, it is possible that non-respondents differed significantly from respondents in ways that would materially affect the outcomes of this study. For example, it is possible that respondents felt a sense of belonging to the community or trusted the abilities of others, while non-respondents did not.

A final limitation of this study is that it focused on only the “Open Discussion” forums on Sourceforge. This allowed uniform data collection, but may not have captured all public communication about the project. The selection criteria for the projects surveyed required that projects use their “Open Discussion” forums and not utilize outside venues of communication, but these selection criteria were applied at only one point in time. During the study period, several projects moved away from the Sourceforge forums, while others started external mailing lists and opened other Sourceforge forums such as “Help”, “Users Helping Users” and language-specific forums for international users. These actions are likely simply part of the project lifecycle as the project grows too large for one forum to handle all communication relating to the project, but excluding this data means that some support communication activities may have been overlooked.

7.2 Contributions

This research aimed to fill the gap left by prior studies relating to the importance of a community in OSS projects. Raymond (1999) suggested that users find, and sometimes fix, bugs in software. Mockus et al. (2002) made initial measurements that supported this proposition, but otherwise very little additional research was undertaken to back up this popular conception of how OSS should work. The results of this research provide confirmation for the findings by Mockus et al. (2002) and support Raymond's (1999) claim. Users do find, and may fix, software bugs.

These results partially answer the initial questions guiding this research. The first research question was “How is an OSS project community sustained?” It appears that OSS project communities are sustained by the interaction of members, specifically, that member growth and retention depend on the social capital generated as members interact. Sustaining the community is also somewhat dependent upon the success of the software. Without viable software, there is no reason for OSS projects to exist, thus, in order to attract new members, there must be something to download. After downloading, some will volunteer their efforts and become community members by communicating with others. The community is sustained by having a large pool of users to draw from, some of whom will volunteer (Weber 2004).

The second research question was “How does a community contribute to the success of an OSS project?” It appears that the size of the community is very influential in supporting the software and finding and reporting issues that arise with its use. While these were not the ultimate measures of success for the project, issues cannot be resolved without first being reported. The communication activities that turn the resources of members into the benefits for the community have some effect on software success and community success. Thus, a community contributes to the success of the OSS project by communicating, just as the theoretical model predicted. In short, it appears that an active community does matter in producing, supporting and sustaining OSS.

The research presented in this study extends theory by taking a novel approach to the study of OSS communities. Communities were examined through the lens of the resource-based model of community sustainability proposed by Butler (2001). In implementing this study, both objective and subjective data were collected longitudinally to predict the sustainability of the OSS project community. The findings presented show that a community must “maintain access to a pool of shared resources and support the social processes that convert those resources into valued benefits for the participants” (Butler 2001, p. 347). This study thus appears to support Butler's (2001) study, although since software and community success have been added to the model, the effects of size are somewhat different than Butler noted. In OSS, the size of the community does matter in support, maintenance, and improvement of the software. OSS project leaders would be well advised to care for those in the project community, rather than alienating them. As is the case for all volunteers, unless OSS community members feel that their time and efforts are being utilized well, and are recognized for their accomplishments, they will stop volunteering (Shin and Kleiner 2003).

The study results are novel in showing the role that size plays in communication patterns and how that communication influences the ultimate success of the project. The findings show that a large, active community is important to OSS software development, a relationship proposed by von Hippel et al. (2003) in OSS projects. Prior to this dissertation, this relationship had been empirically found only in other fields of endeavor. This finding is relevant to OSS project leaders, in that it shows that feedback from users is important in creating software that others will download, and that some fraction of those users will eventually become contributing members of the community. As such, OSS leaders should continue to provide forums for and otherwise actively promote support activities, as well as heed the bug reports and feature requests contributed by members. Likewise, non-OSS communities and commercial software firms can benefit by listening to their users and integrating their suggestions. In that sense, this research provides confirmation that users can be a valuable source of innovations, as suggested by von Hippel (1986).

7.3 Future Research

The first step to extending this research should be to replicate and extend the results by selecting a larger number of projects, sampled from Sourceforge as well as other hosting venues. By including more projects, multiple sites and projects at different levels of activity, it will be possible to overcome some of the limitations present in this study, such as low statistical power. This would also result in findings that can be more easily generalized to other OSS projects and hosting venues.

Any extension utilizing a larger number of projects should also utilize a longer time period, preferably more than one year. This is a daunting task, due to the fluid nature of participation in OSS communities, but the task may be simplified by the collection of some archival data. Such a study of size and community sustainability over a period of years could help to capture elements of the life cycle of OSS projects not apparent in this study such as patterns of “spot volunteering”.

Finally, given the robustness of the hypothesized model as compared to the fully saturated model, another productive research effort would be to utilize different constructs for software success and community success, as well as to measure support and development communication differently. As a simple example, one could include all discussion forums (not just “Open Discussion”) and email lists for the software, or categorize messages based on content to better determine whether the message thread was about support, development, or some other topic. Doing so with a larger sample would add statistical power and capture all communication relating to the software, which may well show that some of the “almost” significant paths are indeed significant.
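The content-based categorization suggested above could be prototyped along the following lines. This is only a minimal sketch, not a procedure used in this study: the keyword lists, the function names, and the tie-breaking rule are all hypothetical, and an actual study would need to develop and validate its own coding scheme (for example, by checking it against human coders).

```python
import re
from collections import Counter

# Hypothetical keyword lists, for illustration only; not taken from this study.
SUPPORT_TERMS = {"install", "error", "help", "crash", "problem"}
DEVELOPMENT_TERMS = {"patch", "commit", "refactor", "feature", "release"}


def categorize_message(text):
    """Label one message 'support', 'development', or 'other' by keyword counts."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    support = sum(counts[word] for word in SUPPORT_TERMS)
    development = sum(counts[word] for word in DEVELOPMENT_TERMS)
    if support == development:
        # No keywords at all, or an even split: treat the message as uncategorized.
        return "other"
    return "support" if support > development else "development"


def categorize_thread(messages):
    """Label a thread with the most frequent label among its messages."""
    if not messages:
        return "other"
    labels = Counter(categorize_message(message) for message in messages)
    return labels.most_common(1)[0][0]
```

Threads labeled in this way could then be counted separately as support or development communication, rather than relying on the forum in which they happened to be posted.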

7.4 Concluding Remarks

This study was able to meet its design goals of determining the role of a community in OSS projects and providing answers to some questions of how such a community may be sustained. Like most research, there were limitations that are a threat both to statistical validity and to the generalizability of the findings. These limitations are just that: limits to our ability as researchers to definitively provide answers to all the questions we dare to ask. The research presented in this dissertation advances the theoretical state of the art in OSS research, examines some practical implications for organizers of OSS projects and provides an additional theoretical framework for future examination of OSS project communities.

APPENDIX A. HISTORY OF OSS

Most OSS is developed by communities of programmers who volunteer their time to write software for no financial gain. These programmers, organized into so-called “OSS projects”, meet virtually via the Internet and coordinate their efforts to produce high-quality software. This freely-available software forms an alternative to proprietary software. To understand how OSS programs are produced, a brief history of computer software is necessary. This history is not intended to be exhaustive, but hopefully will aid the reader's understanding. For a thorough and very readable history of OSS, see Weber (2004); much of the following narrative is adapted and paraphrased from that discussion.

Although the popular press touts open source software as a new idea, the practice of sharing source code is actually comparatively old. In the early days of computing, software was distributed only as human-readable source code, which was then compiled to create the binary form. When computers filled rooms and only a few were available, the users grouped together to build shared software utilities to extend the built-in functionality of the machines and allow full utilization of the power of the very expensive hardware. Changes made to the software by one user were typically shared with all other users and each user compiled the software into binary form themselves.

This practice of sharing software continued throughout the 1950's and 1960's and ballooned in the 1970's, which saw the introduction of comparatively cheap hardware, such as the DEC PDP-11. These machines were quickly purchased by university departments and private corporations, resulting in many more computers and users. This explosion of (for that time) cheap, high-powered hardware led to the development of the UNIX operating system at Bell Telephone Labs in 1969. The source code for this operating system was available from AT&T (who owned Bell Labs) at no charge and no binaries were distributed. It was up to the individual installing UNIX to compile it and install it on the particular computer they owned.

UNIX grew rapidly during the 1970's. As users wrote enhancements to the basic operating system and utility software to utilize the computer more effectively, they made the source code of this software available to other UNIX users. With the source code in hand, other users could improve upon the initial version of the software and share the code to these enhancements. During the 1970's, UNIX forked several times – in other words, several different communities took the software and developed specialized versions to handle specific tasks. Perhaps the most famous of these was the Berkeley Software Distribution (BSD), maintained by the University of California, Berkeley. Like all other UNIX versions at the time, the source code to BSD UNIX was freely available.

In the early 1980's, AT&T was embroiled in a long-running antitrust suit with the US government. The allegations were that AT&T, Bell Labs and Western Electric had acted as monopolists in a broad variety of telecommunications services. The incoming Reagan administration took a different view of the problem than had its predecessors and rather than dissolve AT&T the courts decided to break off the Bell companies as their own operating companies. AT&T management, no longer constrained by the 1956 Consent Decree that limited its activities to telecommunications, decided to take UNIX in a commercial direction and licensing fees for UNIX skyrocketed to hundreds of thousands of dollars within a few years. The new licensing terms restricted the source code and mainly binary forms were distributed. Other software companies who made software for microcomputers also took this approach to software and licensed their programs in binary form only.

BSD UNIX, as a descendant of AT&T's variant of UNIX, shared much of its code with the newly-commercialized version. This meant that those who wanted to utilize BSD had to pay license fees to AT&T for the code. In June, 1989, the Berkeley group released a version of BSD that consisted of only the portions of the operating system free from AT&T's proprietary code. Although incomplete, this version was useful and was well received by users. Given this reception, the Berkeley group decided to rewrite many of the tools for UNIX that still contained AT&T intellectual property in a “clean-room” implementation. This meant that individuals would reverse-engineer programs based only on published documentation, not on existing source code. The group asked for help from the user community to write these tools and received many contributions over the next two years, finishing the colossal task in 1991.

By this time, PCs had come into their own and were starting to appear in many businesses and schools. Other users of BSD ported this code from the original DEC architectures to the IBM x86 architecture. Other companies released variants of BSD UNIX under the terms of the Berkeley license, which essentially allowed for free redistribution of the operating system and source code, as long as credit was given to the original authors. These licensing terms became known as the BSD license and later formed the basis of some other OSS licenses. The ideals of sharing code that originated with these programmers formed part of the basis of the current OSS movement.

The several BSD variants are neither the only open source operating systems, nor even the best-known OSS operating systems today. In 1991, a student at the University of Helsinki purchased a personal computer with a 386 processor for use in his classes. Like many computers at the time, it came with Microsoft's DOS operating system. The student, Linus Torvalds, didn't want to use DOS for his programming and the commercial UNIX variants for the Intel 386 processor at the time were simply too expensive for a student budget (Torvalds notes that if he had been aware of 386/BSD at the time, he probably would have utilized it and never looked further). There was, however, a teaching version of UNIX called Minix available. Minix was available for approximately $100 for floppy disks with source code. Linus purchased and installed Minix and then decided to write his own system based on Minix. He soon abandoned the Minix underpinnings and in fall of 1991 announced his new operating system on an Internet newsgroup.

In his release announcement on USENET, Torvalds noted that he would accept modifications and integrate them into his system. The response was extraordinary, with 100 people joining the newsgroup and contributing fixes and enhancements by year’s end. Since that time Linux has become one of the poster children of OSS development. Linus Torvalds still maintains and coordinates development of the source code and accepts contributions from volunteers who wish to write code. The community of users and developers that surround the operating system shares the ethos of sharing their code with the world. They differ from the BSD group in some details and in licensing terms, but many of the same ideals are in place.

A third group that has influenced the development of the current OSS movement was formed in 1984 at MIT. Richard Stallman, frustrated by his inability to obtain source code for a buggy printer driver from Xerox, decided to form the Free Software Foundation (FSF), a non-profit organization devoted to the goal that software should be a common good and freely available to all. The group wrote many tools to run on the UNIX operating system and named the collective project the GNU project (a recursive acronym for GNU's Not UNIX). They ensured that these utilities would remain free with the GNU General Public License (GPL). The group continues to maintain these tools and provides web hosting and other resources for OSS projects. More influential than the tools written by the FSF is the zealotry with which they defend the ideal that software should be available to all.

The Free Software Foundation treats the license under which software is utilized as a moral issue and sees all proprietary, binary-only software licenses as a moral bad, even if they are pragmatically good. The GPL license in particular requires that any changes to a GPL program must remain freely available. The FSF is an outspoken critic of many other types of software and has gained some notoriety based on their zealotry. They do not represent the whole of the OSS movement, but due to their visibility, they are often seen as spokespersons for the movement. Since the Linux operating system is licensed under the GPL, it has become the de-facto GNU operating system, even though it has no official standing as such. This further perpetuates an existing myth that all OSS is composed of anti-intellectual-property, anti-copyright groups.

The final group which has influenced the OSS movement is the Open Source Initiative (OSI). This non-profit organization manages and promotes the Open Source Definition, a community-accepted definition of what OSS licenses should allow (the full text of the Open Source Definition appears in Appendix B). The OSI was founded in 1998 by several leaders of prominent OSS projects including Eric S. Raymond, author of the fetchmail program, and Bruce Perens of the Debian project. Notably absent from the organization is Richard Stallman, who continues to lead the FSF (OSI 2006a). The organization's stated goal is to provide a way of certifying that software claiming to be open source really matches up with the ideals of the community. Unlike the FSF, the OSI is not rabidly anti-intellectual property; rather, “the OSI is a marketing program for free software. It's a pitch for 'free software' on solid pragmatic grounds rather than ideological tub-thumping” (OSI 2006a). This group promotes OSS in all its forms and apparently has had great success in creating public acceptance of the term “open source” rather than “free software” (OSI 2006a).

The influences of early program sharing, UNIX, Linux, the FSF (and the associated GNU project and GPL license) and the OSI have all interacted to make open source what it is today. The projects and organizations have created a legacy of philosophies surrounding the OSS projects and OSS developers. From the FSF comes a certain amount of – perhaps fanaticism is too strong a word – but certainly idealism. From Linux comes a very strong contributory spirit and from BSD comes a strong heritage of sharing code. Finally, the OSI has become an umbrella organization that allows OSS developers to share their many commonalities, despite differences on some points of licensing. While OSS projects and licenses differ, a strong commitment to keeping the human-readable source code available flows through all projects.

APPENDIX B. OPEN SOURCE DEFINITION

The Open Source Definition is a list of conditions that software licenses must meet in order to receive the Open Source Initiative's stamp of approval as “Open Source Licenses” and be allowed to use the “Open Source” trademark. The Open Source Definition ensures that the beliefs of the community about what freedoms OSS should allow are adhered to. The version appearing below was annotated by the Open Source Initiative, explaining the rationale for each condition (OSI 2006b). Some minor formatting changes have been applied so that it fits the formatting of this document.

The Open Source Definition, Version 1.9

The indented, italicized sections below appear as annotations to the Open Source Definition (OSD) and are not a part of the OSD.

Introduction

Open source does not just mean access to the source code. The distribution terms of open-source software must comply with the following criteria:

1. Free Redistribution – The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.
    Rationale: By constraining the license to require free redistribution, we eliminate the temptation to throw away many long-term gains in order to make a few short-term sales dollars. If we did not do this, there would be lots of pressure for cooperators to defect.

2. Source Code – The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost–preferably, downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.
    Rationale: We require access to un-obfuscated source code because you can not evolve programs without modifying them. Since our purpose is to make evolution easy, we require that modification be made easy.

3. Derived Works – The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
    Rationale: The mere ability to read source is not enough to support independent peer review and rapid evolutionary selection. For rapid evolution to happen, people need to be able to experiment with and redistribute modifications.

4. Integrity of The Author's Source Code – The license may restrict source-code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.
    Rationale: Encouraging lots of improvement is a good thing, but users have a right to know who is responsible for the software they are using. Authors and maintainers have reciprocal right to know what they are being asked to support and protect their reputations. Accordingly, an open-source license must guarantee that source be readily available, but may require that it be distributed as pristine base sources plus patches. In this way, "unofficial" changes can be made available but readily distinguished from the base source.

5. No Discrimination Against Persons or Groups – The license must not discriminate against any person or group of persons.
    Rationale: In order to get the maximum benefit from the process, the maximum diversity of persons and groups should be equally eligible to contribute to open sources. Therefore we forbid any open-source license from locking anybody out of the process. Some countries, including the United States, have export restrictions for certain types of software. An OSD-conformant license may warn licensees of applicable restrictions and remind them that they are obliged to obey the law; however, it may not incorporate such restrictions itself.

6. No Discrimination Against Fields of Endeavor – The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
    Rationale: The major intention of this clause is to prohibit license traps that prevent open source from being used commercially. We want commercial users to join our community, not feel excluded from it.

7. Distribution of License – The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.
    Rationale: This clause is intended to forbid closing up software by indirect means such as requiring a non-disclosure agreement.

8. License Must Not Be Specific to a Product – The rights attached to the program must not depend on the program's being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.
    Rationale: This clause forecloses yet another class of license traps.

9. License Must Not Restrict Other Software – The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.
    Rationale: Distributors of open-source software have the right to make their own choices about their own software. Yes, the GPL is conformant with this requirement. Software linked with GPLed libraries only inherits the GPL if it forms a single work, not any software with which they are merely distributed.

10. License Must Be Technology-Neutral – No provision of the license may be predicated on any individual technology or style of interface.
    Rationale: This provision is aimed specifically at licenses which require an explicit gesture of assent in order to establish a contract between licensor and licensee. Provisions mandating so-called "click-wrap" may conflict with important methods of software distribution such as FTP download, CD-ROM anthologies, and web mirroring; such provisions may also hinder code re-use. Conformant licenses must allow for the possibility that (a) redistribution of the software will take place over non-Web channels that do not support click-wrapping of the download, and that (b) the covered code (or re-used portions of covered code) may run in a non-GUI environment that cannot support popup dialogues.

APPENDIX C. SURVEY

The entire survey administered to OSS community members in February, 2004 is included below for reference. The survey has been split for inclusion in this document, and comprises Figures 13-17, inclusive.

Figure 13. OSS survey.

Figure 14. OSS survey, continued.

Figure 15. OSS survey, continued.

Figure 16. OSS survey, continued.

Figure 17. OSS survey, final portion.

APPENDIX D. HUMAN SUBJECTS DOCUMENTATION

Informed Consent Documentation

Figure 18. Informed Consent Document.

Human Subjects Approval

Figure 19. Human Subjects Approval.

REFERENCES

Adelstein, T. "Desktop Linux: The Final Hurdles," 2004 (available online at http://lxer.com/module/newswire/view/25901/; accessed April 2006).
Adler, P. S. and Kwon, S.W. "Social Capital, Prospects For A New Concept," Academy of Management Review, (27:1), 2002, pp. 17-40.
Ahuja, M., Galletta, D. and Carley, K. "Individual Centrality And Performance In Virtual R&D Groups: An Empirical Study," Management Science, (49:1), 2003, pp. 21-38.
Bagozzi, R. P. and Dholakia, U.M. "Open Source Software User Communities: A Study Of Participation In Linux User Groups," Management Science, (52:7), 2006, pp. 1099-1115.
Berlios "Berlios Developer: Welcome," 2006 (available online at http://developer.berlios.de/; accessed April 2006).
Bitzer, J. and Schroder, P.J.H. "Bug Fixing And Code-Writing: The Private Provision Of Open Source Software," Information, Economics, and Policy, (17:3), 2005, pp. 389-406.
BlackBoxVoting "Black Box Voting," 2006 (available online at http://blackboxvoting.com/s9/; accessed April 2006).
Bonaccorsi, A., Giannangeli, S. and Rossi, C. "Entry Strategies Under Competing Standards: Hybrid Business Models In The Open Source Software Industry," Management Science, (52:7), 2006, pp. 1085-1098.
Bourdieu, P. "The Forms Of Capital," in Handbook of Theory and Research for the Sociology of Education, J. Richardson (Ed.), Greenwood Press, 1986, pp. 241-258.
Butler, B. S. "Membership Size, Communication Activity, And Sustainability: A Resource-Based Model Of Online Social Structures," Information Systems Research, (12:4), 2001, pp. 346-362.
Casadesus-Masanell, R. and Ghemawat, P. "Dynamic Mixed Duopoly: A Model Motivated By Linux Vs. Windows," Management Science, (52:7), 2006, pp. 1072-1084.
Chin, W. W. "The Partial Least Squares Approach To Structural Equation Modeling," in Modern Methods for Business Research, G. A. Marcoulides (Ed.), Lawrence Erlbaum, Mahwah, New Jersey, 1998, pp. 295-336.
Chin, W. W. and Frye, T. "PLS Graph, 2.91.03.04," University of Calgary, Calgary, Canada, 1996.
Chin, W. W. and Todd, P.A. "On The Use, Usefulness, And Ease Of Use Of Structural Equation Modeling In MIS Research: A Note Of Caution," MIS Quarterly, (19:2), 1995, pp. 237-246.
Clary, E. G., Snyder, M., Ridge, R. D., Copeland, J., Stukas, A. A., Miene, P. and Haugen, J. "Understanding And Assessing The Motivations Of Volunteers: A Functional Approach," Journal of Personality and Social Psychology, (74:6), 1998, pp. 1516-1530.
Constant, D., Sproull, L. and Kiesler, S. "The Kindness Of Strangers: The Usefulness Of Electronic Weak Ties For Technical Advice," Organization Science, (7:2), 1996, pp. 119-135.
Content Team "What Is The Reason Behind The Success Of Open Source & Open Source Languages?," 2006 (available online at http://www.indicthreads.com/discuss/420/open_source_languages_success.html; accessed April 2006).
Cothrel, J. and Williams, R.L. "On-Line Communities: Helping Them Form And Grow," Journal Of Knowledge Management, (3:1), 1999, pp. 54-60.
Cox, A. "Cathedrals, Bazaars And The Town Council," 1998 (available online at http://features.slashdot.org/article.pl?sid=98/10/13/1423253; accessed April 2006).
Crowston, K. and Howison, J. "The Social Structure Of Open Source Software Development Teams," First Monday, (10:2), 2005, p. (none).
Crowston, K. and Howison, J. "Hierarchy And Centralization In Free And Open Source Software Team Communications," Knowledge, Technology and Policy, (18:4), 2006, pp. 65-85.
Crowston, K., Annabi, H. and Howison, J. "Defining Open Source Project Success," in 2003 International Conference On Information Systems, S. T. March, A. Massey and J. I. DeGross (Eds.), Seattle, Washington, 2003, pp. 327-340.
Crowston, K., Annabi, H., Howison, J. and Masango, C. "Effective Work Practices For FLOSS Development: A Model And Propositions," in Proceedings Of The Thirty-Eighth Annual Hawaii International Conference On System Sciences (HICSS '05), Sprague, Ralph H., Jr (Ed.), Waikoloa, Hawaii, 2005a, p. (none).
Crowston, K., Howison, J. and Annabi, H. "Information Systems Success In Free And Open Source Software Development: Theory And Measures," Software Process: Improvement and Practice, (11:2), 2006a, pp. 123-148.
Crowston, K., Wei, K., Li, Q. and Howison, J. "Core And Periphery In Free/Libre And Open Source Software Team Communications," in Thirty-Ninth Hawaii International Conference On System Sciences, R. Sprague (Ed.), Poipu, Kauai, HI, 2006b, p. (none).
Crowston, K., Wei, K., Li, Q., Eseryel, U. and Howison, J. "Coordination Of Free/Libre Open Source Software Development," in Twenty-Sixth International Conference On Information Systems, W. R. King and R. Torkzadeh (Eds.), Las Vegas, NV, 2005b, pp. 181-193.
Davidow, W. H. and Malone, M.S. The Virtual Corporation, HarperCollins, New York, NY, 1992.
Davies, S. "Choosing Between Concentration Indices: The Iso-Concentration Curve," Economica, (46:181), 1979, pp. 67-75.
Davis, F. D. "Perceived Usefulness, Perceived Ease Of Use, And User Acceptance Of Information Technology," MIS Quarterly, (13:3), 1989, pp. 319-339.
DeLone, W. H. and McLean, E.R. "Information Systems Success: The Quest For The Dependent Variable," Information Systems Research, (3:1), 1992, pp. 60-94.
DeLone, W. H. and McLean, E.R. "Information Systems Success Revisited," in Proceedings Of The Thirty-Fifth Hawaii International Conference On System Sciences, J. Nunamaker and R. Sprague (Eds.), Waikoloa, Hawaii, 2002, p. (none).
Dennis, A. R., Valacich, J. S. and Nunamaker, J. F. "An Experimental Investigation Of The Effects Of Group Size In An Electronic Meeting Environment," IEEE Transactions on Systems, Man and Cybernetics, (20:5), 1990, pp. 1049-1057.
Deutsch, M. "Trust And Suspicion," Journal of Conflict Resolution, (2:4), 1958, pp. 265-279.
Economides, N. and Katsamakas, E. "Two-Sided Competition Of Proprietary Vs. Open Source Technology Platforms And The Implications For The Software Industry," Management Science, (52:7), 2006, pp. 1057-1071.
Elliot, M. and Scacchi, W. "Communicating And Mitigating Conflict In Open Source Software Development Projects," Unpublished Paper, University of California, Irvine, California, 2002.
Fang, Y. and Neufeld, D.J. "Should I Stay Or Should I Go? Worker Commitment To Virtual Organizations," in Proceedings Of The Thirty-Ninth Hawaii International Conference On System Sciences, R. Sprague (Ed.), Poipu, Kauai, HI, 2006, p. (none).
Feller, J. and Fitzgerald, B. "A Framework Analysis Of The Open Source Software Development Paradigm," in 2000 International Conference On Information Systems, W. J. Orlikowski, S. Ang, P. Weill, H. C. Krcmar and J. I. DeGross (Eds.), Brisbane, Australia, 2000, pp. 58-69.
Fischer, G., Scharff, E. and Ye, Y. "Fostering Social Creativity By Increasing Social Capital," in Social Capital and Information Technology, M. Huysman and V. Wulf (Eds.), MIT Press, 2002, pp. 355-399.
Fornell, C. and Larcker, D. "Evaluating Structural Equation Models With Unobservable Variables And Measurement Error," Journal of Marketing Research, (18:1), 1981, pp. 39-50.
Franke, N. and Shah, S. "How Communities Support Innovative Activities: An Exploration Of Assistance And Sharing Among End-Users," Research Policy, (32:1), 2003, pp. 157-178.
Freshmeat "Freshmeat.Net: Statistics And Top 20," 2006 (available online at http://freshmeat.net/stats/; accessed April 2006).
Gacek, C. and Arief, B. "The Many Meanings Of Open Source," IEEE Software, (21:1), 2004, pp. 33-40.
Galbraith, J. R. "Organization Design: An Information Processing View," Interfaces, (4:3), 1974, pp. 28-36.
Gallupe, R. B., Dennis, A. R., Cooper, W. H., Valacich, J. S., Bastianutti, L. M. and Nunamaker, J.F.J. "Electronic Brainstorming And Group Size," The Academy of Management Journal, (35:2), 1992, pp. 350-369.
George, J., Easton, G., Nunamaker, J. F. and Northcraft, G. "A Study Of Collaborative Group Work With And Without Computer-Based Support," Information Systems Research, (1:4), 1990, pp. 394-415.
Ghosh, R. "Study On The: Economic Impact Of Open Source Software On Innovation And The Competitiveness Of The Information And Communication Technologies (ICT) Sector In The EU," International Institute of Infonomics, University of Maastricht, 2006 (available online at http://ec.europa.eu/enterprise/ict/policy/doc/2006-11-20-flossimpact.pdf; accessed January 2007).
Ghosh, R. A., Glott, R., Krieger, B. and Robles, G. "Free/Libre And Open Source Software: Survey And Study (A.K.A. Floss Survey)," International Institute of Infonomics, University of Maastricht, 2002 (available online at http://www.infonomics.nl/FLOSS/report/; accessed June 2003).
Glaeser, E. L., Laibson, D. and Sacerdote, B. "An Economic Approach To Social Capital," The Economic Journal, (112:483), 2002, pp. 437-458.
Grewal, R., Lilien, G. L. and Mallapragada, G. "Location, Location, Location: How Network Embeddedness Affects Project Success In Open Source Systems," Management Science, (52:7), 2006, pp. 1043-1056.
Gulati, R. "Does Familiarity Breed Trust? The Implications Of Repeated Ties For Contractual Choices In Alliances," Academy of Management Journal, (38:1), 1995, pp. 85-112.
Hardin, G. "The Tragedy Of The Commons," Science, (162:3859), 1968, pp. 1243-1248.
Hardin, R. Collective Action, Johns Hopkins University Press, Baltimore, Maryland, 1982.
Hars, A. and Ou, S. "Working For Free? Motivations Of Participating In Open Source Projects," in Thirty-Fourth Hawaii International Conference On System Sciences, J. Nunamaker and R. Sprague (Eds.), Wailea Maui, Hawaii, 2001, p. (none).
Hartwick, J. and Barki, H. "Explaining The Role Of User Participation In Information System Use," Management Science, (40:4), 1994, pp. 440-465.
Hawkins, R. E. "The Economics Of Free Software For A Competitive Firm," Netnomics, (6:2), 2004, pp. 103-117.
Hertel, G., Niedner, S. and Herrmann, S. "Motivation Of Software Developers In Open Source Projects: An Internet-Based Survey Of Contributors To The Linux Kernel," Research Policy, (32:7), 2003, pp. 1159-1177.
Jarvenpaa, S., Knoll, K. and Leidner, D.E. "Is Anybody Out There? Antecedents Of Trust In Global Virtual Teams," Journal of Management Information Systems, (14:4), 1998, pp. 29-64.
Jones, M. C. and Harrison, A.W. "IS Project Team Performance: An Empirical Assessment," Information and Management, (31:2), 1996, pp. 57-65.
Kanter, R. M. "Commitment And Social Organization: A Study Of Commitment Mechanisms In Utopian Communities," American Sociological Review, (33:4), 1968, pp. 499-517.
Kaufer, D. S. and Carley, K.M. Communication At A Distance: The Influence Of Print On Sociocultural Organization And Change, Lawrence Erlbaum, Hillsdale, NJ, 1993.
Keil, M., Tan, B., Wei, K., Saarinen, T., Tuunainen, V. and Wassenaar, A. "A Cross-Cultural Study On Escalation Of Commitment Behavior In Software Projects," MIS Quarterly, (24:2), 2000, pp. 299-325.
Keizer, G. "Openoffice.Org Suite Updates," 2005 (available online at http://www.informationweek.com/story/showArticle.jhtml?articleID=175007533; accessed April 2006).
Kingstone, S. "Brazil Adopts Open-Source Software," 2005 (available online at http://news.bbc.co.uk/1/hi/business/4602325.stm; accessed April 2006).
Klimas, S., Klimas, P. and Klimas, M. "Linux Benefit: For The Undecided," 2003 (available online at http://linux.about.com/od/embedded/l/blnewbie_0toc.htm; accessed April 2006).
Knight, J. "Does Linux Really Need A "Killer App" To Succeed?," 2004 (available online at http://trends.newsforge.com/article.pl?sid=04/12/20/1715209&tid=37&tid=2&tid=132&tid=29; accessed May 2005).
Kuan, J. "Open Source Software As Lead User’s Make Or Buy Decision: A Study Of Open And Closed Source Quality," in Open Source Software: Economics, Law And Policy, J. Cremer, J. Lerner and J. Tirole (Eds.), Toulouse, France, 2002, p. (none).
Kuan, J. W. "Open Source Software As Consumer Integration Into Production," Unpublished Paper, Stanford Institute for Economic Policy Research, Irvine, CA, 2001.
Kuan, J. W. "Is Open Source Software Better Than Closed Source Software? Using Bug-Fix Rates To Compare Software Quality," Unpublished Paper, Stanford Institute for Economic Policy Research; Software Industry Center, Carnegie Mellon University, Irvine, CA, 2004.
Kuk, G. "Strategic Interaction And Knowledge Sharing In The KDE Developer Mailing List," Management Science, (52:7), 2006, pp. 1031-1042.
Lakhani, K. R. and von Hippel, E. "How Open Source Software Works: "Free" User-To-User Assistance," Research Policy, (32:6), 2003, pp. 923-943.
Lakhani, K. R., Wolf, B., Bates, J. and DiBona, C. "The Boston Consulting Group/OSDN Hacker Survey," Boston Consulting Group, 2002 (available online at http://www.osdn.com/bcg/BCGHACKERSURVEY-0.73.pdf; accessed June 2003).
Lerner, J. and Tirole, J. "Some Simple Economics Of Open Source," The Journal of Industrial Economics, (50:2), 2002, p. 197.
Levine, J. M. and Moreland, R.L. "Progress In Small Group Research," Annual Review of Psychology, (41:none), 1990, pp. 585-634.
Locke, E. A. "Some Reservations About Social Capital," The Academy of Management Review, (24:1), 1999, pp. 8-9.
Markus, M. L. and Connolly, T. "Why CSCW Applications Fail: Problems In The Adoption Of Interdependent Work Tools," in Proceedings Of The Conference For Computer Supported Work, Association for Computing Machinery (Ed.), New York, NY, 1990, pp. 371-380.
Markus, M. L., Manville, B. and Agres, C.E. "What Makes A Virtual Organization Work?," Sloan Management Review, (42:1), 2000, pp. 13-26.
McClelland, D. "How Motives, Skills, And Values Determine What People Do," American Journal of Psychology, (40:7), 1985, pp. 812-825.
McDermott, R. "Community Development As A Natural Step," Knowledge Management Review, (3:5), 2000, pp. 16-19.
McPherson, M. "The Size Of Voluntary Organizations," Social Forces, (61:4), 1983, pp. 1045-1064.
Mims, B. "Novell, IBM And HP Unite Efforts To Put Linux On Top," Salt Lake City, 2004.
MIT "MIT Opencourseware," 2006 (available online at http://ocw.mit.edu/index.html; accessed January 2006).
Mockus, A., Fielding, R. T. and Herbsleb, J.D. "Two Case Studies Of Open Source Software Development: Apache And Mozilla," ACM Transactions on Software Engineering Methodology, (11:3), 2002, pp. 309-346.
Moreland, R. L. and Levine, J.M. "Socialization In Small Groups: Temporal Changes In Individual-Group Interactions," in Advances in Experimental Social Psychology, L. Berkowitz (Ed.), Academic Press, 1982, pp. 137-192.
Nahapiet, J. and Ghoshal, S. "Social Capital, Intellectual Capital, And The Organizational Advantage," Academy of Management Review, (23:2), 1998, pp. 242-266.
Netcraft "Netcraft Web Server Survey," 2005 (available online at http://news.netcraft.com/archives/web_server_survey.html; accessed November 2005).
Neus, A. and Scherf, P. "Opening Minds: Cultural Change With The Introduction Of Open-Source Collaboration Methods," IBM Systems Journal, (44:2), 2005, pp. 215-225.
O'Reilly, T. "Lessons From Open-Source Software Development," Communications of the ACM, (42:4), 1999, pp. 32-37.
Olson, M. The Logic Of Collective Action, Harvard University Press, Cambridge, MA, 1965.

OpenOffice.org "OpenOffice.org Statistics: Spreadsheet," 2006 (available online at http://stats.openoffice.org/spreadsheet/index.html; accessed March 2006).
OSI "Open Source Initiative OSI - OSI History: Documents," 2006a (available online at http://opensource.org/docs/history.php; accessed May 2006).
OSI "Open Source Initiative OSI - The Open Source Definition," 2006b (available online at http://opensource.org/docs/definition.php; accessed March 2006).
Ostrom, E. Governing The Commons: The Evolution Of Institutions For Collective Action, Cambridge University Press, Cambridge, MA, 1990.
Powers, M. "Life Cycles And Volunteering," Human Ecology Forum, (26:3), 1998, pp. 3-9.
Rafaeli, S. and LaRose, R.J. "Electronic Bulletin Boards And 'Public Goods' Explanations Of Collaborative Mass Media," Communications Research, (20:2), 1993, pp. 177-197.
Raymond, E. S. The Cathedral And The Bazaar: Musings On Linux And Open Source By An Accidental Revolutionary, O'Reilly and Associates, Inc, Sebastopol, CA, 1999.
Rice, R. E. "Communication Networking In Computer Conferencing Systems: A Longitudinal Study Of Group Roles And System Structure," in Communication Yearbook, M. Burgoon (Ed.), Sage, 1982, pp. 925-944.
Ridings, C. M., Gefen, D. and Arinze, B. "Some Antecedents And Effects Of Trust In Virtual Communities," Journal of Strategic Information Systems, (11:3), 2002, pp. 271-295.
Roberts, J. A., Hann, I. and Slaughter, S.A. "Understanding The Motivations, Participation, And Performance Of Open Source Software Developers: A Longitudinal Study Of The Apache Projects," Management Science, (52:7), 2006, pp. 984-999.
Rotter, J. "Interpersonal Trust, Trustworthiness, And Gullibility," American Psychologist, (35:1), 1980, pp. 1-7.
Sagers, G. W. "The Influence Of Network Governance Factors On Success In Open Source Software Development Projects," in Proceedings Of The Twenty-Fifth International Conference On Information Systems, V. Sambamurthy and R. T. Watson (Eds.), Washington, DC, 2004, pp. 427-438.
Samba Project "What Is Samba?," 2005 (available online at http://us1.samba.org/samba/what_is_samba.html; accessed December 2005).
Samuelson, P. A. "The Pure Theory Of Public Expenditure," Review of Economics and Statistics, (36:4), 1954, pp. 387-390.
Sarrel, M. D. "Top 15 Firefox Extensions," 2005 (available online at http://www.pcmag.com/article2/0,1759,1758849,00.asp; accessed March 2005).
Savannah "Savannah: Statistics," 2006 (available online at http://savannah.gnu.org/stats/; accessed April 2006).
Savvas, A. "Massachusetts Open Source CIO Resigns," 2006 (available online at http://www.computerweekly.com/Articles/2006/01/03/213502/MassachusettsopensourceCIOresigns.htm; accessed January 2006).
Schmidt, D. C. and Porter, A. "Leveraging Open Source Communities To Improve The Quality And Performance Of Open Source Software," in Making Sense Of The Bazaar: Proceedings Of The First Workshop On Open Source Software Engineering, J. Feller, B. Fitzgerald and A. van der Hoek (Eds.), Toronto, Canada, 2001, pp. 52-56.

Seddon, P. B. "A Respecification And Extension Of The DeLone And McLean Model Of IS Success," Information Systems Research, (8:3), 1997, pp. 240-252.
Shah, S. K. "Motivation, Governance, And The Viability Of Hybrid Forms In Open Source Software Development," Management Science, (52:7), 2006, pp. 1000-1014.
Shaw, M. E. Group Dynamics: The Psychology Of Small Group Behavior, McGraw-Hill, New York, NY, 1981.
Shin, S. and Kleiner, B.H. "How To Manage Unpaid Volunteers In Organizations," Management Research News, (26:2-4), 2003, pp. 63-71.
Sourceforge "Anyone Else Having MSN/Messenger Problems," 2005a (available online at http://sourceforge.net/forum/message.php?msg_id=3078208; accessed May 2006).
Sourceforge "Sourceforge.Net: Welcome To Sourceforge.Net," 2005b (available online at http://sourceforge.net; accessed November 2005).
Sourceforge "Sourceforge.Net: G03. Data Preservation Policy, Data Removal Instructions," 2006 (available online at http://sourceforge.net/docman/display_doc.php?docid=14041&group_id=1#project_alt; accessed April 2006).
Stamelos, I., Angelis, L., Oikonomou, A. and Bleris, G.L. "Code Quality Analysis In Open Source Software Development," Information Systems Journal, (12:1), 2002, pp. 43-60.
Stewart, D. "Status Mobility And Status Stability In A Community Of Free Software Developers," in 2003 Annual Meeting Of The Academy Of Management, D. M. Rousseau (Ed.), Seattle, WA, 2003.
Stewart, K. J. and Ammeter, T. "An Exploratory Study Of Factors Influencing The Level Of Vitality And Popularity Of Open Source Projects," in Twenty-Third International Conference On Information Systems, F. Miralles and J. Valor (Eds.), Barcelona, Spain, 2002, pp. 843-847.
Stoecklin-Serino, C. M. "Building Trust: An Examination Of The Impacts Of Brand Equity, Security, And Personalization On Trust Processes." Doctoral Dissertation, Florida State University, Tallahassee, FL, 2005.
Stogdill, R. M. Individual Behavior And Group Achievement, Oxford University Press, New York, NY, 1959.
Tuckman, B. W. and Jensen, M.A.C. "Stages Of Small Group Development Revisited," Group And Organization Studies, (2:4), 1977, pp. 419-427.
Turoff, M. "Computer-Mediated Communication Requirements For Group Support," Journal of Organizational Computing, (1:1), 1991, pp. 85-113.
U.S. Department Of Justice "DOJ Antitrust: Herfindahl-Hirschman Index," 2007 (available online at http://www.usdoj.gov/atr/public/testimony/hhi.htm; accessed January 2007).
USA Today "Microsoft Loses Lucrative Munich Deal To Rival Linux," 2003 (available online at http://www.usatoday.com/tech/news/2003-05-29-linux-munich-choose_x.htm; accessed December 2005).
von Hippel, E. "Lead Users: A Source Of Novel Product Concepts," Management Science, (32:7), 1986, pp. 791-805.
von Krogh, G., Spaeth, S. and Lakhani, K.R. "Community, Joining, And Specialization In Open Source Software Innovation: A Case Study," Research Policy, (32:7), 2003, pp. 1217-1241.

Wasko, M. and Faraj, S. "Why Should I Share? Examining Social Capital And Knowledge Contribution In Electronic Networks Of Practice," Management Information Systems Quarterly, (29:1), 2005, pp. 35-58.
Weber, S. The Success Of Open Source, Harvard University Press, Cambridge, MA, 2004.
Wellman, B. and Wortley, S. "Different Strokes For Different Folks: Community Ties And Social Support," American Journal of Sociology, (96:3), 1990, pp. 558-588.
Wellman, B., Salaff, J., Dimitrova, D., Garton, L., Gulia, M. and Haythornthwaite, C. "Computer Networks As Social Networks: Collaborative Work, Telework, And Virtual Community," Annual Review of Sociology, (22:1), 1996, pp. 213-238.
Wenger, E. Communities Of Practice, Cambridge University Press, Cambridge, UK, 1998.
Werts, C. E., Linn, R. L. and Joreskog, K.G. "Intraclass Reliability Estimates: Testing Structural Assumptions," Educational and Psychological Measurement, (34:1), 1973, pp. 25-33.
Wheeler, D. A. "Why Open Source Software / Free Software (OSS/FS, FLOSS, Or FOSS)? Look At The Numbers!," 2005 (available online at http://www.dwheeler.com/oss_fs_why.html; accessed January 2006).
Williamson, O. E. "Calculativeness, Trust, And Economic Organization," Journal of Law and Economics, (36:1), 1993, pp. 453-486.
Wilson, J. "Volunteering," Annual Review of Sociology, (26), 2000, pp. 215-240.
Wilson, J. and Musick, M. "Who Cares? Toward An Integrated Theory Of Volunteer Work," American Sociological Review, (62:5), 1997, pp. 694-713.
Wittenbaum, G. M. and Stasser, G. "Management Of Information In Small Groups," in What's Social about Social Cognition?: Research on Socially Shared Cognition in Small Groups, J. L. Nye and A. M. Brower (Eds.), Sage, 1996, pp. 261-282.
Wold, H. "Systems Under Indirect Observation Using PLS," in A Second Generation of Multivariate Analysis, C. Fornell (Ed.), Praeger, 1982, pp. 325-347.
Yamagishi, T. and Sato, K. "Motivational Bases Of The Public Goods Problem," Journal of Personality and Social Psychology, (50:1), 1986, pp. 67-73.
Zhao, L. and Elbaum, S. "Quality Assurance Under The Open Source Development Model," Journal Of Systems And Software, (66:1), 2003, pp. 65-75.

BIOGRAPHICAL SKETCH

Glen Sagers is a doctoral candidate at Florida State University. He received his Bachelor of Science degree from Utah State University in 1999 and his Master of Business Administration from Kansas State University in 2002. His research interests include open source software development and resource exchange in online communities. He is currently an assistant professor in the School of Information Technology at Illinois State University.
