
Purdue University Purdue e-Pubs

Open Access Theses Theses and Dissertations

Fall 2014

Impact of license selection on open source software quality

Benjamin J. Cotton, Purdue University

Follow this and additional works at: https://docs.lib.purdue.edu/open_access_theses Part of the Computer Engineering Commons, and the Computer Sciences Commons

Recommended Citation
Cotton, Benjamin J., "Impact of license selection on open source software quality" (2014). Open Access Theses. 314. https://docs.lib.purdue.edu/open_access_theses/314

This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact @purdue.edu for additional information.

PURDUE UNIVERSITY GRADUATE SCHOOL Thesis/Dissertation Acceptance

Benjamin James Cotton

Impact of license selection on open source software quality

Master of Science

Kevin Dittman Jeffrey Brewer

Jeffrey Whitten

To the best of my knowledge and as understood by the student in the Thesis/Dissertation Agreement, Publication Delay, and Certification/Disclaimer (Graduate School Form 32), this thesis/dissertation adheres to the provisions of Purdue University’s “Policy on Integrity in Research” and the use of copyrighted material.

Kevin Dittman Jeffrey Whitten 11/24/2014

IMPACT OF LICENSE SELECTION ON OPEN SOURCE SOFTWARE QUALITY

A Thesis

Submitted to the Faculty

of

Purdue University

by

Benjamin J. Cotton

In Partial Fulfillment of the

Requirements for the Degree

of

Master of Science

December 2014

Purdue University

West Lafayette, Indiana

Dedicated to my wife, Angela, and my daughters, Eleanor and Bridget, whose unconditional love and support made this possible.

ACKNOWLEDGMENTS

No thesis is ever completed without support, advice, and encouragement. I would like to thank the following people for their contributions to my efforts.

Professors Kevin Dittman, Jeffrey Whitten, and Jeffrey Brewer, whose guidance and input as I developed the idea for this work kept me on the right track. Professors Dittman and Whitten also taught several classes that kept me motivated in the pursuit of my degree.

Gerry McCartney, whose creation of the ITaP Scholarship Program inspired me to apply to graduate school in the first place.

Preston Smith, Randy Herban, Carol Song, and Rob Futrick, who were my supervisors at various times through the course of my graduate studies and graciously allowed me time to attend class in the middle of the work day. Similarly, I must acknowledge the coworkers who had to deal with my sporadic absences.

Finally, but not least, members of various open source communities, including the Fedora Documentation team and the Greater Lafayette Open Source Symposium. Their ideas, both related to my thesis and not, have shaped my interest in open source and community collaboration.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS
GLOSSARY
ABSTRACT

CHAPTER 1. INTRODUCTION
  1.1 Statement of the Problem
  1.2 Significance of the Problem
  1.3 Research Question
  1.4 Licenses
    1.4.1 Copyleft
    1.4.2 Permissive
  1.5 Assumptions
  1.6 Limitations
  1.7 Delimitations
  1.8 Summary
CHAPTER 2. REVIEW OF THE LITERATURE
  2.1 Definition of Quality
  2.2 Quality Metrics
    2.2.1 Bug reports
    2.2.2 Selecting metrics
    2.2.3 Static Analysis
  2.3 Technical Debt
    2.3.1 Definition
    2.3.2 Measurement
  2.4 SonarQube
  2.5 Summary
CHAPTER 3. METHODOLOGY
  3.1 Hypotheses
  3.2 Data Collection
    3.2.1 Software Selection
    3.2.2 Metrics Collected
    3.2.3 Collection Environment
  3.3 Analysis Methods
  3.4 Threats to Validity
  3.5 Summary
CHAPTER 4. PRESENTATION OF DATA AND FINDINGS
  4.1 Presentation of the data
  4.2 Analysis of the data
  4.3 Summary
CHAPTER 5. CONCLUSION, DISCUSSION, AND RECOMMENDATIONS
  5.1 Conclusion
  5.2 Future Work
  5.3 Summary
LIST OF REFERENCES

LIST OF TABLES

1.1 The open source licenses included in this study
2.1 Attributes of software quality as defined by Boehm, Brown, and Lipow (1976)
3.1 Software projects included in this study
3.2 The measures used to evaluate projects
4.1 Complexity and technical debt measurements
4.2 Mean technical debt of projects by language and paradigm

LIST OF FIGURES

4.1 Distribution of technical debt for programs in this study
4.2 Distribution of programming languages for programs in this study
4.3 Technical debt of projects analyzed in this study
4.4 Technical debt of Java programs in this study

ABBREVIATIONS

BSD    Berkeley Software Distribution
CDDL   Common Development and Distribution License
GPL    GNU General Public License
FSF    Free Software Foundation
KLOC   thousand lines of code
LGPL   "Lesser" GNU General Public License
LOC    lines of code
ISO    International Standards Organization
MPL    Mozilla Public License
MTBF   mean time between failures
MTTF   mean time to failure
OSI    Open Source Initiative
OSS    open source software
PMI    Project Management Institute
SQALE  software quality assessment based on life-cycle expectations

GLOSSARY free software software under a license that provides the four freedoms defined by the (2013b) open source software under a license that meets the definition given by the Open Source Initiative (n.d.) permissive software under a license that is open source but not free software x

ABSTRACT

Cotton, Benjamin J. M.S., Purdue University, December 2014. Impact of license selection on open source software quality. Major Professor: Kevin C. Dittman.

Open source software plays an important part in the modern world, powering businesses large and small. However, little work has been done to evaluate the quality of open source software. Two different license paradigms exist within the open source world, and this study examines the difference in software quality between them. In this thesis, the author uses technical debt as a measure of software quality. Eighty open source projects (40 from each paradigm) were downloaded from the popular open source hosting site SourceForge. Using complexity, code duplication, comments, and unit test coverage as inputs to the SonarQube technical debt model, each project was evaluated. The technical debt was normalized based on the cyclomatic complexity and the paradigms were compared with the Mann-Whitney test. The results showed a clear difference between the two paradigms. However, the results presented in this thesis are only a starting point. The collected data suggest that the programming language used in a project has an impact on the project's quality. In addition, SonarQube plugins for the popular C and C++ languages were beyond the budget of this work, excluding many projects from consideration. This thesis closes with several suggestions for further avenues of investigation.

CHAPTER 1. INTRODUCTION

This chapter presents the foundation of the study. It begins with a statement of the problem and its significance. The research question is stated in clear terms. Important definitions, including explanations of the license paradigms referenced throughout the study, are provided. Assumptions, limitations, and delimitations applicable to the study are enumerated.

1.1 Statement of the Problem

The development of open source software has grown from the purview of hobbyist programmers into a major source of revenue. In 2012, Red Hat became the first open source company to record a billion dollars in revenue (Babcock, 2012). Red Hat has seen steady growth in revenue in the past decade and reported net revenue above $100 million in 2011 and 2012 (Red Hat, 2013). Other companies such as Oracle also generate revenue from support of their open source offerings. Many large Internet corporations such as Google, Facebook, and Amazon make heavy use of open source software to run their business. Small businesses especially rely on the open source WordPress and MySQL projects for their web presence (Hendrickson, Magoulas, & O'Reilly, 2012).

Researchers (Kuan, 2002; Mockus, Fielding, & Herbsleb, 2002) have investigated the quality of open source projects in comparison to their proprietarily-developed counterparts. Some efforts have been made to determine the effects on quality of practices in open source projects (Capra, Francalanci, & Merlo, 2008). Additional work has examined the role of license selection in developer and user engagement (Subramaniam, Sen, & Nelson, 2009). To date, no one has researched the effect of license selection on software quality.

Because "quality" is a broad term with many possible measures, this study focuses on technical debt as a measure of quality. There are two broad license paradigms in the open source world. One requires authors of derivative works to make their source available under the same license. The other permits derivative works to be relicensed under any terms. This work investigates whether this difference in the legal implications of licenses results in a difference in technical debt.

1.2 Significance of the Problem

As stated previously, open source software is a key factor in the success of many small businesses. In their study, Hendrickson et al. (2012) projected that United States companies who rely on hosted web sites powered by open source software have an annual revenue of approximately one trillion dollars. Although that projection is based on several layers of assumptions, if it is at the right order of magnitude it represents approximately 5-10% of the 2012 gross domestic product of the United States (United States Bureau of Economic Analysis, 2013). This does not include the revenue from businesses that do not use third-party web hosting but nevertheless rely on open source software for operations or revenue.

With open source software playing such a large role in the economy, the quality of the software becomes a critical attribute. A decade ago, software bugs had an annual cost of $60 billion to the United States economy (Newman, 2002). More recently, a bug in stock trading software caused Knight Capital a $440 million loss within a period of minutes (Mackenzie, 2012). These bugs are not necessarily attributable to open source software (in the Knight Capital case, particularly, the software was developed in-house and presumably was built around a closely-guarded algorithm), but they show the impact that software quality can have.

Quality in open source software came to the public eye in April of 2014 when the "Heartbleed" bug was disclosed. This vulnerability in the widely-used OpenSSL project opened the door for attackers to retrieve data from supposedly-protected servers. In addition to the costs of rapid mitigation efforts, Heartbleed was used to steal data on over 4.5 million Community Health Systems patients (Deutscher, 2014).

Many quality control tools and methodologies are available to project managers. Some tools have eliminated entire classes of bugs from software (Siy & Votta, 2001), but quality remains a concern. While methodologies like Six Sigma (American Society for Quality, n.d.) and frameworks like the Capability Maturity Model Integration (CMMI Institute, n.d.) can aid in improving quality, there is generally an investment of time and money required to implement them. For projects that are community-driven without reliable financial backing, such formalized techniques are not easily implemented. If the mere act of selecting the appropriate license can reduce technical debt, there is significant value in being able to make that decision.

Ensuring the quality of a project is one of the six basic functions of a project manager (Brewer & Dittman, 2010, p. 18). Quality is not restricted to the code developed by the project team. Upstream software quality becomes important when it is brought into the project. Defect removal is the most costly aspect of a project (Jones, 2004), so selecting higher-quality starting points can lower the total cost of a project. Projects that are more focused on technology integration than development may also use open source software as an input. Thus, this study touches on two separate areas of the Project Management Body of Knowledge: Quality Management and Procurement Management. Technical debt is advantageous in this regard because it is measured in development days. Thus, a product's technical debt can serve as direct input to the project schedule.

1.3 Research Question

Does a difference exist in the technical debt of software developed under a copyleft license versus software developed under a permissive license?

1.4 Licenses

1.4.1 Copyleft

Copyleft licenses, as typified by the GNU General Public License (GPL), focus on preserving the freedom of users (Williams, 2011). Anyone who receives software licensed under a copyleft license may edit and re-distribute the software, provided that the derivative work is also licensed under the same terms (Carver, 2005). This requirement ensures that software initially released under an open source license cannot be re-licensed under a license that restricts the four freedoms laid out by the Free Software Foundation (2013b). Critics decry the coincident effect of making all software that includes copyleft-licensed code necessarily copyleft, arguing that it is a threat to intellectual property (Mundie, 2011). The legal and political merits of copyleft licenses are beyond the scope of this work.

Raymond (1999) famously noted "given enough eyes, all bugs are shallow." If this is true, it follows that reducing the pool of potential contributors potentially weakens a project's ability to produce quality software. Developers who wish to use a non-copyleft license would avoid projects that use a copyleft license and would not make derivative works of copyleft projects. Thus, employing a copyleft license might reasonably be expected to diminish the quality of a software project. Research has indeed shown that use of copyleft licenses is associated with lower developer interest (Subramaniam et al., 2009).

Conversely, because copyleft licenses compel derivative works to use the same license, any defects fixed downstream are available to all users, including the original developer. Proponents could argue that copyleft promotes quality software by preventing fixes from being "hidden". However, the original developer is under no obligation to accept any bug fixes or enhancements created by downstream projects.

1.4.2 Permissive

Permissive licenses still meet the Open Source Initiative's (n.d.) definition of "open source", but lack the same-license requirement that is the hallmark of copyleft. Anyone who receives software licensed under a permissive license may edit and re-distribute the software under any license they choose (Carver, 2005). This leads to the possibility that software that was formerly open source may become proprietary. For those who view free software as a moral imperative (Williams, 2011), such licenses are unacceptable.

From a practical standpoint, permissive licenses are associated with higher levels of developer interest (Subramaniam et al., 2009). One may expect that this is because permissive licenses maximize developer freedom instead of user freedom. Again using Raymond's (1999) "enough eyes" aphorism, it is reasonable to expect that a broader developer pool will result in higher quality software. However, Subramaniam et al. (2009) found an association between permissive licenses and lower interest from users. One may expect a lower degree of user feedback (e.g., bug reports) as a result, providing fewer opportunities for developers to notice and fix defects. Since derivative works can include fixes not available to the original authors, it can be seen how permissive licenses might hinder the quality of the original software.

1.5 Assumptions

The following assumptions are made for this study:

• Quantitative analysis of code provides a meaningful measure of software quality. Because a user's perception of quality is by nature subjective, any results from a qualitative study may not reflect the view of users.

• The variation in quality between software using different licenses of the same paradigm is negligible.

• The projects selected for analysis are representative of all projects using the same licensing paradigm. It is reasonable to conjecture that licenses within the same paradigm may have different effects on technical debt. However, for the sake of simplicity and increased sample size, this study will assume no variation.

• A statistically significant difference in technical debt will be meaningful to open source projects. Those who see the terms of a license as a moral issue will not likely be swayed by practical arguments.

1.6 Limitations

The following limitations apply to this study:

• Technical debt will be the only measure of software quality to be evaluated.

1.7 Delimitations

The following delimitations apply to this study:

• Only projects that use one of the most popular licenses as given in the Open Source Initiative's (2006) License Proliferation Report will be considered. These licenses are listed in Table 1.1. The paradigm classification for each license is based on comments published by the Free Software Foundation (2013a).

• No distinction will be drawn between versions of the same license, so long as the versions fall into the same licensing paradigm.

• No metrics that rely on bug reports will be used. The quantity of bug reports is likely related to the usage of a project. Usage of open source projects can rarely be measured directly (Crowston, Annabi, & Howison, 2003), making it difficult to account for this effect.

• Only projects written in a programming language for which a free plugin for the SonarQube software platform exists will be considered.

Table 1.1 The open source licenses included in this study.

License                                      Paradigm
Apache License                               Permissive
BSD 2-Clause                                 Permissive
BSD 3-Clause                                 Permissive
Common Development and Distribution License  Copyleft
Eclipse Public License                       Copyleft
GNU "Lesser" General Public License (LGPL)   Copyleft
GNU General Public License (GPL)             Copyleft
MIT (a.k.a. "X11") License                   Permissive
Mozilla Public License                       Copyleft
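Expressed as data, Table 1.1 amounts to a simple lookup from license to paradigm. The following Python mapping is merely a restatement of the table for readers who want to script against the classification; the abbreviated names and the variable name are illustrative choices of this sketch, not part of the study.

    # License-to-paradigm classification, restated from Table 1.1.
    # Abbreviated license names and the variable name are hypothetical.
    LICENSE_PARADIGM = {
        "Apache License": "Permissive",
        "BSD 2-Clause": "Permissive",
        "BSD 3-Clause": "Permissive",
        "CDDL": "Copyleft",
        "Eclipse Public License": "Copyleft",
        "LGPL": "Copyleft",
        "GPL": "Copyleft",
        "MIT License": "Permissive",
        "Mozilla Public License": "Copyleft",
    }

    print(LICENSE_PARADIGM["GPL"])  # Copyleft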

1.8 Summary

This chapter presented an introduction of the research study, beginning with a statement of the problem and its significance. The research question was provided, along with definitions, including explanations of the license paradigms referenced throughout the study. Assumptions, limitations, and delimitations applicable to the study were enumerated.

CHAPTER 2. REVIEW OF THE LITERATURE

This chapter presents a review of literature relevant to this study. It examines the definition of quality and discusses approaches to and difficulties in measuring quality. Finally, the concept of technical debt is presented as well as a review of literature in support of the tool used in this study.

2.1 Definition of Quality

Crosby (1979) defined quality as "conformance to requirements." He rightly notes that inexpensive, commodity goods and high-end, luxury products can both be of high quality if they meet the requirements of the customer. Subsequent definitions, such as those by the International Standards Organization (ISO) and Project Management Institute (PMI), use a similar basis (Brewer & Dittman, 2010). While this definition is very suitable in a theoretical context, Crosby does note that it is dependent on requirements being clearly and fully stated. Open source projects frequently lack formal project management processes, including requirements analysis (Feller, Fitzgerald, et al., 2002). If a project lacks sufficiently defined requirements, its quality cannot be assessed based on those requirements. Crosby's definition makes a good guiding principle for this work, but it cannot serve as a practical definition.

2.2 Quality Metrics

2.2.1 Bug reports

Direct user measures, such as bug reports, seem on the surface to be a reasonable proxy for conformance to requirements. Florac (1992) stated that "the number and frequency of problems and defects associated with a software product are inversely proportional to the quality of the software." However, Crowston et al. (2003) identified several problems with user-driven measures when applied to open source projects. Foremost is the poorly-defined user base. The decentralized nature of distribution for many open source projects makes a census of users nearly impossible. Feedback that is gathered from users tends to be non-representative. Surveys of users are possible if the survey mechanism is included in the software (or if the software includes automated bug reporting). Since we cannot guarantee that the user bases for all software included in this study are identical, use of such mechanisms is questionable.

Mohagheghi, Conradi, and Børretzen (2006) identified further problems with the use of bug reports to evaluate the quality of open source software. Defect reports and enhancement requests are often entered into the same tracking system, which makes analysis based on reports difficult without manual filtering. In addition, some reports do not reference the release number of the software, leaving the applicability to the current release ambiguous. Based on the evidence from literature, it is clear that bug-related metrics are not suitable for evaluating open source software produced by different groups.

2.2.2 Selecting metrics

What kinds of metrics are appropriate? Kaner and Bond (2004) argued that direct metrics are best. Derived measures such as mean time to failure (MTTF) and mean time between failures (MTBF) are too ambiguous. In addition, users will experience software differently based on their skills and their needs. Some metrics appear on the surface to be direct but are hard to define. Examples include lines of code (the meaning of a "line" is vague and language-dependent) and developer effort, which is affected by many factors. Kaner and Bond go on to define a method for metric selection. Metrics should not be selected based on what operations can be performed to calculate them. Instead, metric selection begins with considering what question needs to be answered and the nature of the information required to answer the question. The final step is to select metrics that address the information in the appropriate context.

Schneidewind (1992) agreed that direct measurement of quality factors is best, but conceded that this is not often possible. When quality factors cannot be measured directly, the metrics used must be validated. "Metrics should be evaluated to determine whether they measure what they purport to measure prior to using them." The quality measurement methodology should be based on the perspective of the user, not the developer. Metric validation is done with nonparametric statistical methods because the assumptions regarding distribution are less demanding. He lists six criteria for validating a metric: association, consistency, discriminative power, tracking, predictability, and repeatability.

Boehm et al. (1976) said the goal is to define metrics such that "given an arbitrary program, the metric quantitatively measures the desired attribute and overall software quality is a function of one or more metrics." They defined seven key attributes of software, each composed of several sometimes-overlapping primitives, as listed in Table 2.1. These fifteen primitive attributes form a basis for selecting software quality metrics.

Boehm et al. used the key attributes to answer three questions about software. The first question is how well can the software be used as-is? This is referred to as "as-is utility" and is composed of reliability, efficiency, and human engineering. The second question is how easy is the software to maintain? This is referred to as "maintainability" and is composed of testability, understandability, and modifiability. The final question is can the software still be used in a different environment? This is answered solely by the portability attribute.

Table 2.1
Attributes of software quality as defined by Boehm et al. (1976).

Key attribute      Primitive attributes
Portability        Device independence, Self-containedness
Reliability        Self-containedness, Accuracy, Completeness, Robustness/integrity, Consistency
Efficiency         Accountability, Device efficiency, Accessibility
Human Engineering  Robustness/integrity, Accessibility, Communicativeness
Testability        Accountability, Accessibility, Communicativeness, Self-descriptiveness
Understandability  Consistency, Self-descriptiveness, Structuredness, Conciseness, Legibility
Modifiability      Structuredness, Augmentability

2.2.3 Static Analysis

Florac (1992) defined static analysis of code as an examination that "identifies a problem or defect that is found in a non-operational environment." This is in contrast to analysis of software that is dependent on executing the code. Static analysis may be performed by code reviews, lint programs, or other means.

2.3 Technical Debt

2.3.1 Definition

Cunningham (1992) was the first to use the term "technical debt." He used the analogy of a mortgage loan: "not right code" is used to ship a product more quickly in the same way a mortgage is used to buy a house more quickly. As a mortgage must be paid with interest, so too must the "not right code" be paid with rework. The greater the technical debt incurred upfront, the greater the payment required in the future. In effect, technical debt may be understood as selling quality to purchase a shorter initial development time.

Although Cunningham is credited with introducing the term to the literature, it is easy to trace the roots of the idea back further. Crosby (1979) wrote "every penny you don't spend on doing things wrong, over, or instead becomes half a penny right on the bottom line." He called the expense of scrap, rework, testing, inspection, et cetera the "cost of quality." Crosby's arguments for saving actual cash can be taken as an argument in favor of reducing technical debt.

Brown et al. (2010) refined Cunningham's metaphor by introducing several components of technical debt: principal, interest probability, and interest amount. Principal is the cost of eliminating debt by refactoring or other means. Interest probability represents the probability that a particular form of technical debt will manifest in a visible way. The third and final component is interest amount, which is the additional cost of fixing the debt, perhaps because the defect was discovered by the customer. They also note that non-code artifacts such as design documents and testing plans can contribute to the technical debt of a project. Brown et al. conclude that technical debt is not necessarily a problem. It only challenges a project when the debt becomes too high. The definition of "how high" remains open because there were no tested methods of assigning a present value to debt at the time their paper was published.

Gat and Ebert (2012) agree that technical debt is a useful metaphor, but disagree with each other on its application. Gat argued for assigning dollar values to technical debt as a way to effectively communicate the impact to stakeholders. Ebert reiterates the point that not all debt is bad debt, and suggests that the imprecise calculation of debt is a distraction.

2.3.2 Measurement

Nugroho, Visser, and Kuipers (2011) proposed a method for measuring technical debt based on project estimation principles. They define "rework effort" as the product of rework fraction and rebuild value. Rework fraction is simply the portion of the code base that requires rework in order to pay off technical debt. Rebuild value is defined as an estimate of effort required to complete the necessary rework tasks. Debt interest is estimated by the maintenance effort, which is calculated by dividing the product of the maintenance factor and the rebuild value by the quality factor. The maintenance factor is an estimation of the amount of code changes required due to maintenance, and the quality factor is a value based on the quality level of the existing code.

Eisenberg (2012) developed another method after rejecting the built-in technical debt calculator provided by the analysis tool (Sonar, now called "SonarQube") he used. This tool is discussed further later in this chapter. Eisenberg identified six statically analyzed metrics that contribute to technical debt. The first is the amount of duplicate code. Duplicate code requires multiple edits for a single change and thus leads to additional maintenance effort. The second metric is rules compliance, where rules are industry and program-specific coding standards. Deviance from these standards diminishes code readability and maintainability. The third metric is interface comments. Undocumented program interfaces can lead to inadvertent defects and make the code more difficult to understand. The fourth metric is the density of general comments. Package interdependence is the fifth metric. Highly interdependent packages are harder to maintain. The final static metric is method/class complexity. The more complex a method or class is, the more difficult it is to understand and maintain. Eisenberg then calculated debt based on the estimated effort required to remediate each occurrence. The sum of the effort can be converted to a labor cost based on actual or representative developer rates. While Eisenberg's method is relatively simple, he does not provide a well-explored reason for rejecting the included calculation tool. As a result, evaluating the validity of his method is difficult.

Conley and Sproull (2009) had suggestions for some of the measurements. They noted that modularity should be measured using packages instead of files or classes. They additionally used McCabe's (1976) cyclomatic complexity, a standard measure in software engineering, to evaluate the complexity of code modules.
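The arithmetic behind the Nugroho et al. (2011) estimates is compact enough to sketch in code. The following Python fragment is an illustration only; the function names are mine, and the example inputs are invented values standing in for what a real assessment would produce.

    # Illustrative sketch of the technical debt estimates described by
    # Nugroho, Visser, and Kuipers (2011). Names and example values are
    # hypothetical, not taken from their paper or from this study.

    def rework_effort(rework_fraction, rebuild_value):
        """Debt principal: rework effort = rework fraction x rebuild value."""
        return rework_fraction * rebuild_value

    def maintenance_effort(maintenance_factor, rebuild_value, quality_factor):
        """Debt interest: (maintenance factor x rebuild value) / quality factor."""
        return (maintenance_factor * rebuild_value) / quality_factor

    # Assumed example: 15% of a 200 person-month system needs rework, 10% of
    # the code changes per year due to maintenance, at quality factor 0.8.
    print(rework_effort(0.15, 200.0))            # 30.0 person-months of principal
    print(maintenance_effort(0.10, 200.0, 0.8))  # 25.0 person-months of interest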

2.4 SonarQube

SonarQube (2013) is an open source static analysis tool designed to support multiple languages and metrics. SonarQube comes with a technical debt plugin, which Eisenberg (2012), as noted above, found unsuitable. He still used SonarQube to perform the measurements for his technical debt measurement methodology. An additional plugin with a more refined technical debt model is also available for a fee. This plugin is based on the SQALE (software quality assessment based on life-cycle expectations) method developed by Hegeman (2011).

Since its initial release, SonarQube has been employed in several scholarly works. Plosch, Gruber, Korner, and Saft (2010) noted that regular use of static analysis requires "measures that can be retrieved automatically" due to the effort required for manual evaluation. SonarQube matches their "vision of tool-support for continuous quality management and can be integrated into the development process." SonarQube provides trend-based visualization but does not give guidelines on how to deal with the results. For the purposes of this study, that is not a concern. Haderer, Khomh, and Antoniol (2010) called SonarQube one of the leading static analysis tools. They found that Sonar does not integrate well into automated testing workflows, but again that is outside the scope of this study's concern. Motherudin and Tong (2011) used SonarQube in the development of their dashboard. They recommended that SonarQube be incorporated by projects into their own project management dashboards.

2.5 Summary

This chapter provided a review of the relevant literature. The next chapter provides the framework and methodology used in this study.

CHAPTER 3. METHODOLOGY

This chapter provides the framework and methodology used in the research study. In this chapter, I present the hypotheses and the environment used for data collection. The process for metric collection and analysis is explained. This chapter concludes with a discussion of the threats to validity.

3.1 Hypotheses

The hypotheses for this study are:

H0: No difference in technical debt exists between copyleft-licensed and permissive-licensed software.

Ha: A difference in technical debt exists between copyleft-licensed and permissive-licensed software.

3.2 Data Collection

3.2.1 Software Selection

Software packages analyzed for this study come from the open source hosting site SourceForge.com. The packages were selected by performing a search of the "Most Popular" projects for the licenses listed in Table 1.1. The selected packages and their license paradigm are listed in Table 3.1. All packages were downloaded on March 29, 2014. The most recent available release was downloaded; in-development branches were not considered except when a stable release was not available.

Table 3.1: Software projects included in this study

Software package                          Version   License Paradigm
PDFMerge                                  1.22      Permissive
Adminer                                             Permissive
OpenGTS                                   2.5.3     Permissive
Json-lib                                  2.4       Permissive
Joda-Time                                 2.3.2     Permissive
Sahi                                      20130429  Permissive
FlacSquisher                              1.2.1     Permissive
OWASP Zed Attack Proxy                    2.2.2     Permissive
VietOCR                                   3.5       Permissive
Ganglia (gmetad-python)                   3.6.0     Permissive
PMD                                       5.1.0     Permissive
CoCEd                                               Permissive
DrJava                                              Permissive
JSch                                      0.1.51    Permissive
HyperSQL Database Engine                  2.3.2     Permissive
ISPConfig Hosting Control Panel           3.0.5.3   Permissive
FreeMarker                                2.3.20    Permissive
lwjgl                                     2.9.1     Permissive
dom4j                                     1.6.1     Permissive
WikidPad                                  2.2       Permissive
DSpace                                    4.1       Permissive
KoLmafia                                  16.2      Permissive
PyFFI                                     2.1.11    Permissive
Geotag                                    0.092     Permissive
ksar                                      5.0.6     Permissive
PyUSB                                     1.0.0b1   Permissive
FreeTTS                                   1.2.2     Permissive
SciPy                                     0.14.0    Permissive
NagiosQL                                  3.2.0     Permissive
SimpleCV                                  1.3       Permissive
Ming                                      0.4.2     Permissive
SCons                                     2.3.1     Permissive
PyTZ                                      2006p     Permissive
Simple HTML DOM Parser                    1.5       Permissive
picard                                    1.1.3     Permissive
Open Source Point of Sale                 2.2       Permissive
pyparsing                                 2.0.1     Permissive
phpseclib                                 0.3.6     Permissive
magmi                                     0.7.18    Permissive
HybridAuth                                2.1.2     Permissive
SAP NetWeaver Server Adapter for Eclipse  0.7.2     Copyleft
Div/er                                              Copyleft
ShellEd                                   2.0.3     Copyleft
PyDev                                     3.4.1     Copyleft
RSSOwl                                    2.2.1     Copyleft
EclipseFP                                 2.5.6     Copyleft
jautodoc                                  1.11.0    Copyleft
DocFetcher                                          Copyleft
Mondrian                                  3.6.1     Copyleft
Robocode                                  1.9.1.0   Copyleft
Eclipse Checkstyle Plug-in                5.7.0     Copyleft
TikiOne Steam Cleaner                     2.3.1     Copyleft
TuxGuitar                                 1.2       Copyleft
VASSAL                                    3.2.11    Copyleft
SQuirreL SQL                                        Copyleft
HattrickOrganizer                         r2501     Copyleft
jTDS                                      1.3.1     Copyleft
JFreeChart                                1.017     Copyleft
SoapUI                                    4.6.4     Copyleft
JasperReports Library                     5.5.1     Copyleft
SweetHome3D                               4.3       Copyleft
phpMyAdmin                                4.1.12    Copyleft
Weka                                                Copyleft
FreeMind                                  1.0.0     Copyleft
Angry IP Scanner                                    Copyleft
Vuze                                      5300      Copyleft
gns3                                      0.8.6     Copyleft
SABnzbd+                                  0.7.17    Copyleft
OpenLP                                    2.0.4     Copyleft
TaskCoach                                 1.4.0     Copyleft
eXtplorer                                 2.1.5     Copyleft
The Bug Genie                             3.2.6     Copyleft
jpcap                                     0.01.16   Copyleft
HAPI                                      2.2       Copyleft
RPy                                       2.3.1     Copyleft
dcm4che                                   2.18.0    Copyleft
SMC                                       6.3.0     Copyleft

3.2.2 Metrics Collected

This study collected technical debt as measured by the SQALE technical debt plugin for the SonarQube software analysis system. Measurements were collected for each of the software packages listed in Table 3.1. The values fed into the SQALE model are given in Table 3.2. Thresholds are drawn from Eisenberg (2012). Absent a well-established methodology in the scholarly literature, Eisenberg's approach is straightforward enough for use here. Each programming language uses a language-specific plugin, so some measures are not available in all languages.

Table 3.2 The measures used to evaluate projects.

Measure                        Python  C#   Java  JavaScript  PHP
Class complexity               60      60   60    60          60
File complexity                60      60   60    60          N/A
Duplicate blocks               yes     yes  yes   yes         yes
FIXME/TODO comments            yes     yes  yes   yes         FIXME only
Comment density                25      25   25    25          25
Branch coverage in unit tests  80      80   80    80          80
Line coverage in unit tests    80      80   80    80          80
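For readers who wish to reproduce the collection step, a minimal SonarQube project descriptor of the kind used with the stand-alone analyzer might look like the following sketch. It is not the configuration used in this study: the project key and source path are placeholders (the name and version are FlacSquisher's row from Table 3.1), and the thresholds in Table 3.2 live in the SonarQube quality profile rather than in this file.

    # sonar-project.properties -- illustrative sketch only. The keys are
    # SonarQube's standard project identification settings; the values
    # are placeholders, not the study's actual configuration.
    sonar.projectKey=org.example:flacsquisher
    sonar.projectName=FlacSquisher
    sonar.projectVersion=1.2.1
    sonar.sources=src
    sonar.language=cs
    sonar.sourceEncoding=UTF-8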

The size of a project reasonably impacts the potential for the accumulation of technical debt. It therefore makes sense to normalize the technical debt reported by SonarQube. Two measures of size are readily available from SonarQube analysis: lines of code (LOC) and cyclomatic complexity. The line count for a section of code is a strict count of the non-blank, non-comment lines. As such, it is subject to the coding style of the developer. Cyclomatic complexity, as introduced by McCabe

(1976), is based on the structure of the code elements and is therefore consistent regardless of visual style. For this study, the technical debt of a project is reported as days per thousand units of complexity.
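As an illustration of this normalization, the following Python sketch (the function name is mine) computes days of debt per thousand units of complexity, using KoLmafia's measurements from Table 4.1 as the worked example.

    def normalized_debt(debt_days, complexity):
        """Technical debt in days per thousand units of cyclomatic complexity."""
        return debt_days / complexity * 1000.0

    # KoLmafia (Table 4.1): complexity 51519, technical debt 456.2 days.
    print(round(normalized_debt(456.2, 51519), 1))  # -> 8.9 days per 1,000 units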

3.2.3 Collection Environment

SonarQube was installed on a dedicated virtual machine to prevent any interference from regular use. In order to facilitate checkpointing in case of data loss or corruption, all downloaded code was saved to a virtual disk image. The performance characteristics of the virtual machine are not relevant since static analysis is performed.

3.3 Analysis Methods

The distributions of technical debt for the two license paradigms were compared using the Mann-Whitney test (Mann & Whitney, 1947). This non-parametric test was selected due to its minimal assumptions about the distribution of the data. Arcuri and Briand (2011) recommend this test for testing the differences in two groups. Due to the relative novelty of this study, a significance of 0.10 was chosen for the test.
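In SciPy terms, the comparison amounts to the short sketch below. The sample values are invented placeholders, not the study's data; the real inputs are the 80 normalized debt values behind Table 4.1.

    from scipy.stats import mannwhitneyu

    # Hypothetical normalized-debt samples (days per thousand units of
    # complexity); placeholders standing in for the study's two groups.
    copyleft = [8.6, 9.8, 12.2, 15.3, 7.5, 26.2]
    permissive = [5.1, 10.6, 2.2, 4.1, 8.0, 1.6]

    # Two-sided Mann-Whitney test ('alternative' requires SciPy >= 0.17).
    u_stat, p_value = mannwhitneyu(copyleft, permissive, alternative="two-sided")
    print(u_stat, p_value)

    # The study rejects the null hypothesis when p_value < 0.10.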

3.4 Threats to Validity

The following threats to validity have been identified:

• The programming language of a project may have an impact on quality due to the increased presence of testing tools (Zhao & Elbaum, 2000). If a language is overrepresented in the sample, the results may or may not be meaningful.

• The sample is not random. Random selection of projects is impossible because there is no definitive list of all projects, and hosting sites do not provide a random option in search features.

• Similarly, the representativeness of the sample projects cannot be proven.

• Not all projects are hosted on SourceForge. The SourceForge user community may or may not represent open source projects in general.

3.5 Summary

This chapter has presented the methodology used in this study, including setup and statistical tests.

CHAPTER 4. PRESENTATION OF DATA AND FINDINGS

4.1 Presentation of the data

This chapter presents the results of the data collection as described in Chapter 3. Table 4.1 presents the measured technical debt (in days) and cyclomatic complexity. The distribution of normalized debt is shown in Figure 4.1.

Table 4.1: Complexity and technical debt measurements

Software package            Paradigm    Language  Complexity  Technical Debt (days)
Adminer                     Permissive  PHP       15          0.9
Angry IP Scanner            Copyleft    Java      1739        14.9
CoCEd                       Permissive  C#        1929        93.7
dcm4chee                    Copyleft    Java      19839       195.4
Div/er                      Copyleft    Java      6692        47.8
DocFetcher                  Copyleft    Java      7440        66.1
dom4j                       Permissive  Java      4818        24.6
DrJava                      Permissive  Java      41126       436.1
DSpace                      Permissive  Java      37030       333.5
Eclipse Checkstyle Plug-in  Copyleft    Java      3495        22.1
EclipseFP                   Copyleft    Java      14971       130.5
eXtplorer                   Copyleft    PHP       3633        8.8
FlacSquisher                Permissive  C#        356         31.1
FreeMarker                  Permissive  Java      10119       58.8
FreeMind                    Copyleft    Java      14169       130
FreeTTS                     Permissive  Java      4584        18.7
Ganglia (gmetad-python)     Permissive  Python    555         2.1
Geotag                      Permissive  Java      11071       62
gns3                        Copyleft    Python    10116       581.1
GURPS                       Copyleft    Java      5608        55.6
HAPI                        Copyleft    Java      9650        80.1
HattrickOrganizer           Copyleft    Java      18732       229.1
HybridAuth                  Permissive  PHP       831         0.7
HyperSQL Database Engine    Permissive  Java      46519       356.6
ISPConfig                   Permissive  PHP       11089       18.1
JasperReports Library       Copyleft    Java      39694       605.5
jautodoc                    Copyleft    Java      3783        23.7
JFreeChart                  Copyleft    Java      22259       116.4
Joda-Time                   Permissive  Java      8405        18.8
jpcap                       Copyleft    Java      1047        8.6
JSch                        Permissive  Java      3788        30.2
Json-lib                    Permissive  Java      2928        10.8
jTDS                        Copyleft    Java      6942        52.3
jVi                         Copyleft    Java      9497        34.2
KoLmafia                    Permissive  Java      51519       456.2
ksar                        Permissive  Java      1094        14.8
lwjgl                       Permissive  Java      13097       99
magmi                       Permissive  PHP       922         0.9
Ming                        Permissive  Python    2099        17
Mondrian                    Copyleft    Java      34918       294.9
NagiosQL                    Permissive  PHP       2183        2.8
OpenGTS                     Permissive  Java      44580       207.7
OpenLP                      Copyleft    Python    8807        231
Open Source Point of Sale   Permissive  PHP       5597        5.2
OWASP Zed Attack Proxy      Permissive  Java      26194       280.2
PDFMerge                    Permissive  C#        11085       271.4
Pentaho Platform            Copyleft    Java      24986       232.6
phpMyAdmin                  Copyleft    PHP       13456       22.5
phpseclib                   Permissive  PHP       2993        4.8
picard                      Permissive  Java      16071       66
PMD                         Permissive  Java      12584       108.5
PyDev                       Copyleft    Java      6021        42.8
PyFFI                       Permissive  Python    5671        37
pyparsing                   Permissive  Python    2693        22.6
PyTZ                        Permissive  Python    240         17531.6
PyUSB                       Permissive  Python    525         4.3
Robocode                    Copyleft    Java      10476       80.7
RPy                         Copyleft    Python    1642        15.6
RSSOwl                      Copyleft    Java      25670       368.9
SABnzbd+                    Copyleft    Python    11948       71.8
Sahi                        Permissive  Java      3121        19.6
SAP Netweaver Plugins       Copyleft    Java      1111        12.4
SciPy                       Permissive  Python    28024       345.1
SCons                       Permissive  Python    13089       79.1
ShellEd                     Copyleft    Java      488         6.1
SimpleCV                    Permissive  Python    2970        46.9
Simple HTML DOM Parser      Permissive  PHP       244         0.6
SMC                         Copyleft    Java      3195        322.7
SoapUI                      Copyleft    Java      40039       555.2
SQuirreL SQL                Copyleft    Java      37584       433.1
SweetHome3D                 Copyleft    Java      17044       132.3
TaskCoach                   Copyleft    Python    17595       145.6
The Bug Genie               Copyleft    PHP       12644       15.3
TikiOne Steam Cleaner       Copyleft    Java      408         5.3
TuxGuitar                   Copyleft    Java      16866       263.5
VASSAL                      Copyleft    Java      36007       306.3
VietOCR                     Permissive  Java      1581        16.5
Vuze                        Copyleft    Java      117597      1255.4
Weka                        Copyleft    Java      58903       456.6
WikidPad                    Permissive  Python    27708       152.6

For both licensing paradigms, Java was the dominant language. However, many of the most popular (as listed on SourceForge) copyleft programs are written in C or C++. Since a free SonarQube plugin for those languages was not available at the time of this study, those programs could not be included. Figure 4.2 shows the distribution of programming languages for both licensing paradigms.

Figure 4.1. Distribution of technical debt for programs in this study


4.2 Analysis of the data

A visual inspection of the normalized technical debt values for the two paradigms, as seen in Figure 4.3, suggests we should see a difference between them. Using the Mann-Whitney test confirms the visual analysis. The normalized technical debt in copyleft-licensed programs is higher than in permissively-licensed programs (U = 531, p = .010). Thus, the null hypothesis is rejected.

Figure 4.2. Distribution of programming languages for programs in this study

As Zhao and Elbaum (2000) noted, the language used in a project can have an impact on the resulting quality. Due to the predominance of Java in this study, the test was re-run with only the Java projects. The Mann-Whitney test does not require the sample sizes to be equal, but it does lose power with greater inequality (Mann & Whitney, 1947). In this case, although the visual analysis (as shown in Figure 4.4) suggests copyleft Java programs have lower technical debt, the statistical results are inconclusive (U = 258, p = .370).

Indeed, averaging the normalized technical debt by language shows a higher mean debt for copyleft Java programs. The same is true for Python and not PHP, but the total count of the projects in those languages is small enough to render statistical tests meaningless. Nonetheless, it reinforces the findings of Zhao and Elbaum (2000) and offers avenues for further exploration. Table 4.2 lists the mean debt by language and paradigm.

Figure 4.3. Technical debt of projects analyzed in this study

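The cell values in Table 4.2 are, in effect, per-group means of the normalized debt. A sketch of that grouping follows, with a few hypothetical rows standing in for the full data set behind Table 4.1:

    from collections import defaultdict

    # (language, paradigm, normalized debt) -- hypothetical rows, not the
    # study's 80 measurements.
    rows = [
        ("Java", "Permissive", 10.6), ("Java", "Copyleft", 9.8),
        ("PHP", "Permissive", 1.6), ("PHP", "Copyleft", 1.2),
        ("Python", "Permissive", 8.4), ("Python", "Copyleft", 26.2),
    ]

    groups = defaultdict(list)
    for language, paradigm, debt in rows:
        groups[(language, paradigm)].append(debt)

    # Mean normalized debt per (language, paradigm) cell, as in Table 4.2.
    for key in sorted(groups):
        values = groups[key]
        print(key, round(sum(values) / len(values), 1))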

4.3 Summary

This chapter presented the data collected during this study. It also presented the results of statistical tests. The null hypothesis was rejected based on the results of the tests.

Figure 4.4. Technical debt of Java programs in this study

Table 4.2 Mean normalized technical debt (days per thousand units of complexity) of projects by language and paradigm.

Language  Permissive  Copyleft
C#        53.5        (none)
Java      7.2         1.8
PHP       8.7         1.8
Python    8.3         21.5

CHAPTER 5. CONCLUSION, DISCUSSION, AND RECOMMENDATIONS

5.1 Conclusion

In conclusion, we have established that a difference exists in the technical debt of copyleft- and permissively-licensed software projects. Permissive licenses correspond to lower technical debt. Because permissive licenses have higher levels of developer engagement (Subramaniam et al., 2009), more developer involvement may be understood to correspond to lower levels of technical debt.

Absent other considerations, the results of this study suggest that a project manager looking to develop or incorporate open source software should look toward permissive licenses. This is a simple method for improving the likely overall quality of the project. By using higher-quality inputs, the project's quality management efforts can be used to address other areas.

It is important to remind the reader that this study is novel. While published literature on technical debt exists, much of the discussion is qualitative rather than quantitative. Furthermore, comparison of the license paradigms within open source software development appears to be a largely unexplored field. Literature discussing open source software quality is largely confined to comparison against proprietary programs.

Due to the novelty of this study, the reader must be careful to not overextend the conclusions. The p value of the Mann-Whitney test was small enough that the null hypothesis would have been rejected with a less conservative significance. Nonetheless, this study is hardly a definitive statement on the matter. The results presented here serve to indicate that further study is warranted.

5.2 Future Work

Because this study is so novel, a great deal of work remains to be done. First, a larger study that involves projects written in C and C++ should be conducted. These languages are widely used in software projects; TIOBE.com's August 2014 index has C as the most popular language and C++ as the 4th most popular language, and both languages have averaged in the top 5 for over two decades (Software, 2014). Inclusion of these languages would give additional credibility to the results.

Furthermore, enlarging the sample to enable single-language comparisons would eliminate any language-specific effects. As Zhao and Elbaum (2000) observed, and this study's results confirmed, quality is not consistent across languages. A set of studies, each focused on one language, would provide more robust results.

Knowing that developer and user engagement differ based on the license paradigm (Subramaniam et al., 2009), further study of the communities around projects is warranted. Is the observed difference in technical debt due to the effects of the license, or are coincident governance factors responsible? While some studies have touched on governance, further study is warranted. In particular, a good taxonomy is needed to serve as a basis for study of governance within projects.

Another avenue for further work is to improve technical debt models. Mapping technical debt to other quality measures will help validate the use of technical debt and allow for consistent measurement across projects. In addition, more rigorous technical debt models will allow project managers and developers to guide decisions to reduce the accumulation of technical debt.

5.3 Summary

I have measured and compared the technical debt of software projects in order to evaluate the relative quality of two open source license paradigms. The results of the study showed that permissively-licensed projects are of higher quality than copyleft-licensed projects. Possibilities for future work were also suggested.

LIST OF REFERENCES

American Society for Quality. (n.d.). Six sigma. Retrieved February 16, 2013, from: http://asq.org/learn-about-quality/six-sigma/overview/overview.

Arcuri, A., & Briand, L. (2011). A practical guide for using statistical tests to assess randomized algorithms in software engineering. In Software Engineering (ICSE), 2011 33rd International Conference on (pp. 1–10).

Babcock, C. (2012, March). Red Hat: First $1 billion open source company. Retrieved February 16, 2013, from: http://www.informationweek.com/development/open-source/red-hat-first-1-billion-open-source-comp/232700454.

Boehm, B. W., Brown, J. R., & Lipow, M. (1976). Quantitative evaluation of software quality. In Proceedings of the 2nd international conference on Software engineering (pp. 592–605).

Brewer, J. L., & Dittman, K. C. (2010). Methods of IT project management. Prentice Hall.

Brown, N., Cai, Y., Guo, Y., Kazman, R., Kim, M., Kruchten, P., ... others (2010). Managing technical debt in software-reliant systems. In Proceedings of the FSE/SDP workshop on Future of software engineering research (pp. 47–52).

Capra, E., Francalanci, C., & Merlo, F. (2008). An empirical study on the relationship between software design quality, development effort and governance in open source projects. Software Engineering, IEEE Transactions on, 34(6), 765–782.

Carver, B. W. (2005). Share and share alike: Understanding and enforcing open source and free software licenses. Berkeley Tech. LJ, 20, 443.

CMMI Institute. (n.d.). Frequently Asked Questions. Retrieved February 16, 2013, from: http://cmmiinstitute.com/cmmi-getting-started/frequently-asked-questions/.

Conley, C. A., & Sproull, L. (2009). Easier said than done: an empirical investigation of software design and quality in open source software development. In System Sciences, 2009. HICSS'09. 42nd Hawaii International Conference on (pp. 1–10).

Crosby, P. B. (1979). Quality is free: The art of making quality certain (Vol. 94). McGraw-Hill New York.

Crowston, K., Annabi, H., & Howison, J. (2003, December). Defining open source software project success. In Proceedings of the 24th international conference on information systems (pp. 327–340).

Cunningham, W. (1992). The WyCash portfolio management system. In ACM SIGPLAN OOPS Messenger (Vol. 4, pp. 29–30).

Deutscher, M. (2014, August). Millions of patients' data hacked in "first confirmed" Heartbleed heist. Retrieved August 25, 2014 from http://siliconangle.com/blog/2014/08/25/millions-of-patients-data-hacked-in-first-confirmed-heartbleed-heist/.

Eisenberg, R. J. (2012). A threshold based approach to technical debt. ACM SIGSOFT Software Engineering Notes, 37(2), 1–6.

Feller, J., Fitzgerald, B., et al. (2002). Understanding open source software development. Addison-Wesley London.

Florac, W. A. (1992). Software quality measurement: A framework for counting problems and defects (Tech. Rep.). DTIC Document.

Free Software Foundation. (2013a). Various licenses and comments about them. Retrieved February 17, 2013, from: http://www.gnu.org/licenses/license-list.html.

Free Software Foundation. (2013b). What is free software? Retrieved February 16, 2013, from: http://www.gnu.org/philosophy/free-sw.html.

Gat, I., & Ebert, C. (2012). Point counterpoint. Software, IEEE, 29(6), 52–55.

Haderer, N., Khomh, F., & Antoniol, G. (2010). SQUANER: A framework for monitoring the quality of software systems. In Software Maintenance (ICSM), 2010 IEEE International Conference on (pp. 1–4).

Hegeman, E. (2011). InfoSupport-on the quality of quality models.

Hendrickson, M., Magoulas, R., & O'Reilly, T. (2012). Economic impact of open source on small business: A case study. O'Reilly Media.

Jones, C. (2004). Software project management practices: Failure versus success. CrossTalk: The Journal of Defense Software Engineering, 5–9.

Kaner, C., & Bond, W. P. (2004). Software engineering metrics: What do they measure and how do we know? methodology, 8, 6.

Kuan, J. (2002). Open source software as lead user's make or buy decision: a study of open and closed source quality. Stanford Institute for Economic Policy Research, Stanford University.

Mackenzie, M. (2012, August). Regulators seek curbs on trading bugs.

Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. The annals of mathematical statistics, 18(1), 50–60.

McCabe, T. J. (1976). A complexity measure. Software Engineering, IEEE Transactions on, (4), 308–320.

Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology (TOSEM), 11(3), 309–346.

Mohagheghi, P., Conradi, R., & Børretzen, J. A. (2006). Revisiting the problem of using problem reports for quality assessment. In Proceedings of the 2006 international workshop on Software quality (pp. 45–50).

Motherudin, F., & Tong, W. W. (2011). Evaluation of code quality best practices into dashboard. In International Workshop on CMMI based Software Process Improvement in Small and Medium Sized Enterprises (p. 17).

Mundie, C. (2011, December). Speech transcript. Retrieved February 16, 2013, from: http://www.microsoft.com/en-us/news/exec/craig/05-03sharedsource.aspx.

Newman, M. (2002). Software errors cost US economy $59.5 billion annually. NIST Assesses Technical Needs of Industry to Improve Software-Testing.

Nugroho, A., Visser, J., & Kuipers, T. (2011). An empirical model of technical debt and interest. In Proceedings of the 2nd Workshop on Managing Technical Debt (pp. 1–8).

Open Source Initiative. (n.d.). The open source definition. Retrieved February 16, 2013, from: http://opensource.org/docs/osd.

Plosch, R., Gruber, H., Korner, C., & Saft, M. (2010). A method for continuous code quality management using static analysis. In Quality of Information and Communications Technology (QUATIC), 2010 Seventh International Conference on the (pp. 370–375).

Raymond, E. (1999). The cathedral and the bazaar. Knowledge, Technology & Policy, 12(3), 23–49.

Red Hat. (2013). Annual reports. Retrieved February 17, 2013, from http://investors.redhat.com/annuals.cfm.

Schneidewind, N. F. (1992). Methodology for validating software metrics. Software Engineering, IEEE Transactions on, 18(5), 410–422.

Siy, H., & Votta, L. (2001). Does the modern code inspection have value? In Proceedings of the IEEE international Conference on Software Maintenance (ICSM'01) (p. 281).

Software, T. (2014, August). TIOBE Index for August 2014. Retrieved August 31, 2014 from http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html.

SonarQube. (2013). Sonar. Retrieved from: http://www.sonarsource.org.

Subramaniam, C., Sen, R., & Nelson, M. L. (2009). Determinants of open source software project success: A longitudinal study. Decision Support Systems, 46(2), 576–585.

United States Bureau of Economic Analysis. (2013). Selected national income and product account tables. Retrieved from: http://www.bea.gov/national/txt/dpga.txt.

Williams, S. (2011). Free as in freedom: Richard Stallman's crusade for free software. O'Reilly Media.

Zhao, L., & Elbaum, S. (2000). A survey on quality related activities in open source. ACM SIGSOFT Software Engineering Notes, 25(3), 54–57.