2019 State of the Software Supply Chain The 5th annual report on global open source software development
presented by in partnership with supported by Table of Contents
Introduction...... 3 CHAPTER 4: Exemplary Dev Teams...... 26 4.1 The Enterprise Continues to Accelerate...... 27 Infographic...... 4 4.2 Analysis of 12,000 Large Enterprises...... 27
CHAPTER 1: Global Supply of Open Source...... 5 4.3 Component Releases Make Up 85% of a Modern Application...... 28 1.1 Supply of Open Source is Massive...... 6 4.4 Characteristics of Exemplary 1.2 Supply of Open Source is Expanding Rapidly...... 7 Development Teams...... 29 1.3 Suppliers, Components and Releases...... 7 4.5 Rewards for Exemplary Development Teams...... 34
CHAPTER 2: Global Demand for Open Source...... 8 CHAPTER 5: The Changing Landscape ...... 35 2.1 Accelerating Demand for 5.1 Deming Emphasizes Building Quality In...... 36 Open Source Libraries...... 9 5.2 Tracing Vulnerable Component Release 2.2 Automated Pipelines and Downloads Across Software Supply Chains...... 37 DevOps Are Key Drivers...... 10 5.3 Adversaries Increasingly Target Open Source Components...... 38 CHAPTER 3: Exemplary Project Teams ...... 11 5.4 Government and Industry Apply New 3.1 Research Goals...... 12 Standards to Secure Software Development...... 42 3.2 Time to Remediate Vulnerabilities...... 13 Sources...... 45 3.3 Time to Update Dependencies...... 15 Appendix A...... 46 3.4 Stale Dependencies...... 16 Acknowledgments...... 46 3.5 ...... Exploring the Link Between MTTR and MTTU 16 About the Analysis...... 46 3.6 Hypothesis Testing...... 18 Appendix B 47 3.7 Finding Different Behavioral Groups...... 22 ...... 3.8 Guidance for Open Source Project Appendix C...... 49 Owners and Contributors...... 24 3.9 Guidance for Enterprise Appendix D...... 50 Development Teams...... 24 Introduction
Now in its fifth year, Sonatype's annual State of the The 2019 State of the Software Supply Chain Report Software Supply Chain Report examines the rapidly blends a broad set of public and proprietary data with expanding supply and continued exponential growth survey results, expert research, and analysis to reveal in consumption of open source components. Our the following: research also reveals best practices exhibited by ⊲⊲75% growth in supply of open source component exemplary open source software projects and exem- releases over the past two years (Chapter 1) plary commercial application development teams. ⊲⊲68% year over year growth in download requests This year, for the first time, we’ve collaborated with from the Central Repository to 146 billion (Chapter 2) research partners Gene Kim from IT Revolution ⊲⊲18x faster median time to update dependencies for and Dr. Stephen Magill, Principal Scientist at Galois exemplary open source components (Chapter 3) and CEO of MuseDev, to objectively examine and ⊲⊲55% reduction in the use of vulnerable open source empirically document, release patterns and hygiene component releases within managed software practices across 36,000 open source project teams supply chains (Chapter 4) and 3.7 million open source releases. ⊲⊲71% increase in confirmed or suspected open Separately, we observed 12,000 commercial source related breaches since 2014 (Chapter 5) engineering teams to document their consumption of open source and third party libraries. We also At Sonatype, we’ve long been active contributors to, conducted two surveys, with a combined participation and maintainers of, numerous open source efforts, of over 6,200 development professionals, to better including Apache Maven and Nexus Repository understand the current state of DevSecOps and Manager. Since 2005, we’ve curated and operated software supply chain management practices. We the Central Repository, which last year alone serviced compared teams with, and without, automated open 146 billion download requests for component releases source governance capabilities (in an attempt) to from developers around the world. Commercially, our reveal the baseline benefit of building applications interest lies in helping developers and organizations that utilize higher quality open source components. accelerate innovation and minimize risk by continu- ously sourcing third-party libraries from the highest quality open source projects.
Together with our partners, we are proud to share this research. We hope that you find it valuable.
3 Exemplary Dev Teams experience a
Exemplary 55% reduction Projects are in the use of vulnerable OSS component 18x releases in automated Exemplary at updating environments Projects are faster dependencies 21,448 pg. 34 new open source 3.4x pg. 13 releases per day faster pg. 6 at remediating Exemplary Dev Teams are known vulnerabilities The top 5% of 10x more projects remediate pg. 29 to schedule security vulnerabilities likely dependency within 21 days pg. 29 updates
pg. 14 Exemplary Dev Teams are 313,000 12x more likely average enterprise to have automated tools to downloads of OSS manage open source dependencies pg. 31 components per year pg. 42 Secure Dev Practices are pg. 27 PCI introduces 9.3x more new likely to proactively standards remove pg. 31 for development troublesome teams using open dependencies 146B source components Java download requests in 2018, represented a 51% of pg. 36 JavaScript 68% components have a year over year growth pg. 38 known security vulnerability pg. 9 1 in 10 pg. 37 71% increase downloads in open source related of Java breaches over the component releases have past five years known vulnerabilities CHAPTER 1 Global Supply of Open Source A World of Infinite Choice 1.1 Supply of Open components. These new versions not only offer enhanced features, but also provide improved Source is Massive performance, bug fixes, and security patches.3 There are now more than 3.7 million unique Java open source software component releases in the KEY POINT Central Repository, 800,000 unique JavaScript 1.2 Supply of Open Source packages in npm, 1.2 million unique Python com- is Expanding Rapidly ⊲⊲On average, developers had ponent releases housed in the PyPI repository, and access to more than 21,448 Sonatype’s study across several open source 1.6 million .NET component releases in the NuGet new open source component component ecosystems reveals number of Gallery.1 There are also more than 2.2 million releases every day, since the releases housed within public repositories beginning of 2018. containerized applications housed in Docker Hub increased from 16.6 million to 28.4 million from — up from 900,000 the previous year.2 January 2018 through today. On average, devel- This massive supply of software parts is rapidly opers had access to more than 21,448 new open and organically expanding due to constant source component releases every day, since the innovations and regular versioning of existing beginning of 2018.
FIG. 1A OSS Component Growth from 2017 – 2019 4M OSS Component Growth from 2017-2019 +81% % 3.5M 75 3M average growth +213% over two years. 25K +21% 2.5M 2017 2019 20K CHAPTER 1: GLOBAL SUPPLY OF OPEN SOURCE OF OPEN SOURCE SUPPLY CHAPTER 1: GLOBAL 2M 15K +79% 10K 1.5M 5K +48% 0 1M Go crates.io +109%
.5M +76% +21% +21% +213%
Go crates.io RubyGems Packagist npm PyPI NuGet Java
2019 STATE OF THE SOFTWARE SUPPLY CHAIN REPORT 6 While Java components continue to dominate by significant questions for organizations wanting to Fueling Rapid Innovation Consumption of open sheer number of component releases available, better manage their software supply chains: source is so vast that most organizations cannot package types like npm and crates.io demon- identify how many components are entering ⊲⊲How often do projects publish new versions? strated the highest growth rates. npm packages into their software supply chains, how those experienced 109% growth and now total 836,000 ⊲⊲Do certain projects release updates more components are flowing through development component releases. Newcomer crates.io (Rust) frequently? lifecycles, the relative quality and security of those increased 213% during the period and now offers ⊲⊲Do other projects release updates less components, or which components exist within more than 25,000 component releases (see frequently? production applications. FIGURE 1A).4 ⊲⊲What are the implications? According to International Data Corporation (IDC), Open source growth is robust across numerous ⊲⊲Who are the best component suppliers? “there were 22.30 million software developers in ecosystems, but npm has grown particularly fast the world at the outset of 2018. IDC estimated that due to JavaScript’s emergence as a universal This year, State of the Software Supply Chain 11.65 million are full-time developers, 6.35 million web application programming language. For researchers set out to answer these questions and are part-time developers, and 4.30 million are JavaScript developers looking for a library, tool, or many more. nonprofessional developers.”6 In 2018, developers adapter, odds are very good that someone else around the world consumed hundreds of billions has already created it and published it to npm of open source software component releases. where it can easily be borrowed. According to 1.3 Suppliers, Components the stewards of the npm repository, “Every week and Releases roughly 160 people publish their first package in To match terminology in the previous State of 5 the registry.” the Software Supply Chain reports and provide consistency across research findings, we will New component versions are released for a vari- follow the Maven terminology using the following ety of reasons, including supporting new function- definitions: ality, fixing defects, or supporting a new API. New releases may also address non-feature-related ⊲⊲A supplier is a Maven Group (e.g., org.apache. concerns, such as improving performance, adding httpcomponents) or updating their dependencies, or remediating CHAPTER 1: GLOBAL SUPPLY OF OPEN SOURCE OF OPEN SOURCE SUPPLY CHAPTER 1: GLOBAL security vulnerabilities. ⊲⊲A component is a Maven Group and Artifact (e.g., org.apache.httpcomponents, httpclient) Although brand new projects are constantly being ⊲⊲A component release is a specific Maven created and introduced, growth in open source Group, Artifact and Version (e.g., org.apache. supply is driven mostly by new versions of existing httpcomponents, httpclient, 4.5.6). For other projects being published. While the growth ecosystems, we sometimes refer to releases as in supply fuels rapid innovation, it does pose packages.
2019 STATE OF THE SOFTWARE SUPPLY CHAIN REPORT 7 CHAPTER 2 Global Demand for Open Source Fueling Rapid Innovation 2.1 Accelerating Demand for Open Source Libraries The growing demand for innovation has accel- erated implementations of automated software development pipelines while also driving open source consumption to new heights across all major ecosystems. 1 0 Number of Download 1 Requests for Java Component 2.1.1 Demand for Java FIG.Releases 2A Number 2012 of – Download2018 In 2018, the aggregate number of download Requests for Java Component requests for Java component releases from the Releases 2012 – 2018 Central Repository grew 68% year over year to 146 12 billion. With an estimated 12 million Java developers around the world, this equates to 12,166 per person.7
2.1.2 Demand for JavaScript 100 While growth in demand for Java components is remarkable, a view into JavaScript package downloads demonstrates even greater growth in developer demand. In 2018, average weekly npm package downloads increased from approximately 3.5 billion to 10 billion — an increase of 185%.8 To add further context, there are an estimated 9.7 million JavaScript developers in the world — mean- ing the average JavaScript developer is sourcing 1,030 packages per week or 53,608 packages per 9 0 CHAPTER 2: GLOBAL DEMAND FOR OPEN SOURCE DEMAND FOR OPEN SOURCE CHAPTER 2: GLOBAL year.
2.1.3 Others Other public repositories have experienced similar levels of developer demand. For example, down- loads of RubyGems packages have grown from 2 8 billion to 33 billion over the past three years.10
NuGet package downloads have increased from BILLIONS 756 million in 2015 to an annual run rate of 16.2 billion in 2019.11 0 2012 201 201 201 201 201 201
2019 STATE OF THE SOFTWARE SUPPLY CHAIN REPORT 9 A view into JavaScript component downloads demonstrates even greater growth in developer demand. In 2018, average weekly npm package downloads increased from approximately 3.5 billion to 10 billion — an increase of 185%.
JavaScript Package Downloads, 12 2.2 Automated Pipelines and Rolling Weekly Average 2013 – 2019 DevOps Are Key Drivers FIG.SOURCE: 2B JavaScript NPM INC., LAURIEPackage VOSS Downloads, @SELDO Exponential growth in the consumption of open Rolling Weekly Average 2013 – 2019 source component releases and containers is SOURCE: NPM INC., LAURIE VOSS (@SELDO) 10 a proxy for the adoption of automated software development tools and DevOps pipelines. Automated tooling can generate hundreds or thousands of download requests per build.
In the context of software supply chain manage- ment, each download equates to a procurement effort by development teams. Each open source software component release is chosen from an OSS project that acts as a supplier to developers who assemble tens, hundreds, and sometimes thousands of component releases into a finished application. CHAPTER 2: GLOBAL DEMAND FOR OPEN SOURCE DEMAND FOR OPEN SOURCE CHAPTER 2: GLOBAL
2 BILLIONS 0 201 201 201 201 201 201 2019
2019 STATE OF THE SOFTWARE SUPPLY CHAIN REPORT 10 CHAPTER 3 Exemplary Project Teams Open Source Projects Are Not Created Equal 3.1 Research Goals FIG. 3A ConstructingConstructing the the Study Study Dataset Dataset (N (N = =36,203) 36,203) We wanted to better understand the health and habits of the open source component ecosystem, N = 266,170 and how software developers choose which OSS Components were published in components to use in their own projects. To do The Central Repository. this, we studied all the Java artifacts stored in The N = 168,231 Central Repository (often referred to as “Maven Components had at least two version Central”). At the time of this writing, The Central releases in the last five years. Repository has over 266,000 unique components with over 3.7 million component releases. N = 101,252 Components were part of the “open source software supply chain” The research set out to answer: (e.g., they are or they have a dependency). ⊲⊲Do differences exist in how effectively OSS N = 100,643 projects update their dependencies and fix Components follow the Maven standard for versioning guidance. vulnerabilities? Are there exemplary components (e.g., correct use of numeric version strings, components separated by dots.) that do this better than others? N = 76,795 ⊲⊲Are exemplary components more widely-used Components have dependencies satisfying than “non-exemplary” components? all of the above. ⊲⊲What factors correlate with exemplary N = 36,203 components? Components have updated a dependency at least once. ⊲⊲What advice can be offered to producers of OSS components and the developers that consume them?
For the purpose of this study, we restricted our Stale Dependencies (fewer is better): the average percentage of out-of-date component depen- CHAPTER 3: EXEMPLARY PROJECT TEAMS TEAMS PROJECT CHAPTER 3: EXEMPLARY analysis to components that met six qualifying criteria (see FIGURE 3A), resulting in a data set dencies (i.e., a newer version has been released) of 36,203 components. Given this data set, we present when the component has a new release. examined a number of attributes to measure Release Period (shorter is better): average time responsiveness to security vulnerabilities and in days each component version spends as the determine what properties exemplary component “current” release. A shorter average release period project teams shared. In addition to the security equates to more frequent releases. relevant attributes — described in the next section — the primary attributes tracked were: Popularity: the average number of downloads per day from The Central Repository. Number of Dependencies: the maximum count of dependencies for any given component across all SCM Commits per Month: average number of versions in the study period, as measured by the commits to the GitHub repo per month — a measure dependencies in the Maven pom.xml file. of development activity (i.e., developer velocity).
2019 STATE OF THE SOFTWARE SUPPLY CHAIN REPORT 12 Developer Team Size: as measured by the updates as security relevant if there was a known average number of unique developers committing vulnerability targeting the old version at the time each month. the new version was released. We then measure Presence of Continuous Integration (CI): as mean time to remediate (MTTR) as the average measured by the detection of any CI-related time required for a component to adopt security configuration files in the source code repository relevant updates to dependencies. (e.g., Travis, Jenkins, CircleCI, etc.). Note that the TTR clock starts when a component Support Type: support for the component comes version fixing a vulnerability is released. Specifically, from an open source foundation, a commercial the TTR clock is not started when the vulnerability organization, or is not officially supported by any is reported or published; early efforts used vulnera- organization (e.g., a personal project). bility publish times from various sources as the TTR starting time led to many problems. For example, To gather these attributes, we augmented the data responsible disclosure often delays the publishing KEY POINTS as follows: of the vulnerability until developers have released 12 ⊲⊲CHAOSS: using the perceval utility to gather a fixed component version, or vulnerabilities may ⊲⊲TTR is the time required for a GitHub commit data, we gathered the number have CVE numbers associated with the date they component team to remediate any of commits per month for twelve months, as well were reserved rather than published. These factors security vulnerabilities reported as the number of unique developers committing make it difficult to obtain precise dates of when against their dependencies. during each month. component teams were advised of vulnerabilities. 13 ⊲⊲For developers wanting to use more ⊲⊲Libraries.io: using this dataset, we were able On the other hand, data on which components fixed to gather the number of GitHub stars, forks, and secure components, those exhibit- a known vulnerability with a new version release is pull requests. ing faster MTTR are desirable. widespread and much more reliable. ⊲⊲Sonatype Nexus IQ Server:14 using Sonatype ⊲⊲In the study, 47% of components — data, we are able to gather vulnerability informa- For developers wanting to use more secure after releasing a new version — had tion and a derived component popularity score components, those exhibiting faster MTTR are a vulnerability discovered in one of (based on how frequently components are seen its dependencies, during the period desirable. FIGURE 3B (we cropped the x-axis at the by the Nexus IQ repository scanning service). in which that version was current. CHAPTER 3: EXEMPLARY PROJECT TEAMS TEAMS PROJECT CHAPTER 3: EXEMPLARY 95th percentile) shows a histogram of MTTR across all components, demonstrating the long tail of 3.2 Time to Remediate components that apply security updates very slowly, Vulnerabilities often years after they are released. We observed: ⊲⊲In the study, 47% of components — after releasing The first outcome metric we focused on was time a new version — had a vulnerability discovered to remediate (TTR). TTR is the time required for a in one of its dependencies, during the period in component team to remediate any security vul- which that version was current. (N = 36,203) nerabilities reported against their dependencies. For a vulnerable dependency, remediation occurs ⊲⊲The median TTR was 180 days (similar to num- bers reported in previous versions of the State of when it is upgraded to a new version that fixes the the Software Supply Chain Report). The top 5% vulnerability. We combined data on component of components remediated vulnerabilities within updates with data on vulnerabilities, and tagged 21 days.
2019 STATE OF THE SOFTWARE SUPPLY CHAIN REPORT 13 Mean and Median Time to Remediate Vulnerabilities FIG.(cumulative 3B Mean percentage) and Median Time to Remediate Vulnerabilities (cumulative percentage) SOURCE: 2019 STATE OF THE SOFTWARE SUPPLY CHAIN REPORT 100 2
90 1 0 Median 95% 0 TTR is percentile 180 days occurs at 1,302 days 0 (3.5 years) Mean 0 TTR is 326 days 0