Monzur Murshed
Total Page:16
File Type:pdf, Size:1020Kb
An investigation of software vulnerabilities in open source software projects using data from publicly-available online sources S. M. Monzur Murshed A thesis submitted to the Faculty of Graduate and Postdoctoral Affairs in partial fulfillment of the requirements for the degree of Master of Applied Science in Technology Innovation Management Carleton University Ottawa, Ontario Copyright © 2017 S. M. Monzur Murshed An investigation of software vulnerabilities in open source software projects using data from publicly-available sources Copyright © 2017 S. M. Monzur Murshed _______________________________________________________________________________________________________________________________________________ Abstract Software vulnerabilities is an active area of research, but little is known about how publicly-observable properties of open source software projects and developer communities relate to the time taken to discover and fix vulnerabilities in the projects’ software. This thesis examines that relationship using data harvested from online sources about a sample of 60 open source content management system (CMS) projects and 1268 vulnerabilities affecting the software produced by those projects. Combining project release histories with metrics from two online databases provided reliable proxy dates for vulnerability introduction and fix, but not discovery. Higher commit density (a proxy for project activity) was associated with shorter time of exposure. The lifecycle model, data collection workflow, and software scripts will enable researchers to replicate and extend this analysis, and the evidence-based recommendations provided here will enable improvements to the coverage, quality, access, and integration of online sources for project and vulnerability metrics. ______________________________________________________________________ ii An investigation of software vulnerabilities in open source software projects using data from publicly-available sources Copyright © 2017 S. M. Monzur Murshed _______________________________________________________________________________________________________________________________________________ Acknowledgements I would like to acknowledge my thesis supervisor, Professor Steven Muegge, for his patient and tireless guidance. I would like to thank my daughter, Muntaha, and my wife, Gulshan, for their tolerance. ______________________________________________________________________ iii An investigation of software vulnerabilities in open source software projects using data from publicly-available sources Copyright © 2017 S. M. Monzur Murshed _______________________________________________________________________________________________________________________________________________ Table of contents Abstract................................................................................................................................ii Acknowledgements............................................................................................................iii Table of contents.................................................................................................................iv List of figures......................................................................................................................vi List of tables......................................................................................................................vii Glossary of terms................................................................................................................ix 1 Introduction.......................................................................................................................1 1.1 Objective....................................................................................................................2 1.2 Deliverables...............................................................................................................2 1.3 Relevance...................................................................................................................2 1.4 Contribution...............................................................................................................4 1.5 Overview of method..................................................................................................5 1.6 Organization of the document....................................................................................5 2 Literature review...............................................................................................................7 2.1 Software vulnerabilities.............................................................................................8 2.2 Open source software...............................................................................................17 2.3 Vulnerability databases and standards.....................................................................21 2.4 Summary and synthesis of key findings from the literature....................................24 3 Publicly-available data sets.............................................................................................27 3.1 CVE Details.............................................................................................................27 3.2 Black Duck Open Hub.............................................................................................36 4 Conceptual framework....................................................................................................41 4.1 Software vulnerability ability lifecycle....................................................................41 4.2 Application case: The Apache HTTP Server Project..............................................44 5 Hypotheses......................................................................................................................48 5.1 Contributors (H1).....................................................................................................49 5.2 Commit activity (H2)...............................................................................................50 5.3 Comments (H3)........................................................................................................51 5.4 Code churn (H4)......................................................................................................52 6 Research design and method...........................................................................................53 6.1 Identify assertions from the literature......................................................................54 6.2 Investigate data sets.................................................................................................54 6.3 Develop conceptual framework...............................................................................54 6.4 Develop hypotheses.................................................................................................55 6.5 Select samples..........................................................................................................55 6.6 Inspect small sample................................................................................................57 6.7 Specify variables and measures for full sample.......................................................57 ______________________________________________________________________ iv An investigation of software vulnerabilities in open source software projects using data from publicly-available sources Copyright © 2017 S. M. Monzur Murshed _______________________________________________________________________________________________________________________________________________ 6.8 Collect data for full sample......................................................................................60 6.9 Statistical analysis of full sample.............................................................................61 7 Results.............................................................................................................................63 7.1 Sample of open source software projects.................................................................63 7.2 Manual inspection of small sample..........................................................................67 7.3 Statistical analysis of full sample.............................................................................70 7.3.1 Project metrics..................................................................................................70 7.3.2 Univariate displays............................................................................................77 7.3.3 Bivariate displays..............................................................................................79 7.3.4 Summary statistics and correlation...................................................................82 7.3.5 Regression results (models 1 and 2).................................................................83 7.3.6 Regression diagnostics (model 1).....................................................................85 7.3.7 Time-averaging.................................................................................................87 7.3.8 Transformations................................................................................................89 7.3.9 Effect size........................................................................................................100 8 Discussion.....................................................................................................................104