Automated Identification of Security Issues from Commit Messages and Bug Reports


Yaqin Zhou, SourceClear, Inc., [email protected]
Asankhaya Sharma, SourceClear, Inc., [email protected]

ABSTRACT

The number of vulnerabilities in open source libraries is increasing rapidly. However, the majority of them do not go through public disclosure. These unidentified vulnerabilities put developers’ products at risk of being hacked, since developers increasingly rely on open source libraries to assemble and build software quickly. To find unidentified vulnerabilities in open source libraries and secure modern software development, we describe an efficient automatic vulnerability identification system geared towards tracking large-scale projects in real time using natural language processing and machine learning techniques. Built upon the latent information underlying commit messages and bug reports in open source projects using GitHub, JIRA, and Bugzilla, our K-fold stacking classifier achieves promising results on vulnerability identification. Compared to the state-of-the-art SVM-based classifier in prior work on vulnerability identification in commit messages, we improve precision by 54.55% while maintaining the same recall rate. For bug reports, we achieve a much higher precision of 0.70 and recall rate of 0.71 compared to existing work. Moreover, running the trained model in production at SourceClear for over 3 months has yielded 0.83 precision and 0.74 recall, and detected 349 hidden vulnerabilities, demonstrating the effectiveness and generality of the proposed approach.

CCS CONCEPTS

• Security and privacy → Software security engineering;

KEYWORDS

vulnerability identification, machine learning, bug report, commit

ACM Reference format:
Yaqin Zhou and Asankhaya Sharma. 2017. Automated Identification of Security Issues from Commit Messages and Bug Reports. In Proceedings of 2017 11th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, Paderborn, Germany, September 4-8, 2017 (ESEC/FSE’17), 6 pages. https://doi.org/10.1145/3106237.3117771

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ESEC/FSE’17, September 4-8, 2017, Paderborn, Germany
© 2017 Association for Computing Machinery.
ACM ISBN 978-1-4503-5105-8/17/09...$15.00
https://doi.org/10.1145/3106237.3117771

1 INTRODUCTION

To aid software development, companies typically adopt issue-tracking and source control management systems, such as GitHub, JIRA, or Bugzilla. In fact, as of April 2017, GitHub reports having almost 20 million users and 57 million repositories. According to Atlassian, JIRA is used by over 75,000 companies. These tools are very popular for open source projects, and are essential to modern software development. Developers work on reported issues in these systems, then commit corresponding code changes to GitHub (or other source code hosting platforms, e.g. SVN or BitBucket). Bug fixes and new features are frequently merged into a central repository, which is then automatically built, tested, and prepared for a release to production, as part of the DevOps practices of continuous integration (CI) and continuous delivery (CD).

There is no doubt that the CI/CD pipeline helps improve developer productivity, allowing developers to address bugs more quickly. However, a significant number of security-related issues and vulnerabilities are patched silently, without public disclosure, due to the focus on fast release cycles as well as the lack of manpower and expertise. According to statistics collected at SourceClear [2], 53% of vulnerabilities in open source libraries are not disclosed publicly with CVEs. For Go and JavaScript language libraries, the percentage is as high as 78% and 91% respectively.

In today’s world of agile software development, developers are increasingly relying on and extending free open source libraries to get things done quickly. Many fail to even keep track of which open source libraries they use, not to mention the hidden vulnerabilities that are silently patched in the libraries. So even if the vulnerability is fixed in source code as seen by the commit message or bug report, there may be users of the library who are not made aware of the issue and continue using the old version. Unaware of these hidden vulnerabilities, they are putting their products at risk of being hacked. Thus, it is important to locate such silent fixes as they can be used to help developers decide about component updates.

Motivated to find the unidentified vulnerabilities in open source libraries and secure modern software development, in this paper we describe an automatic vulnerability identification system that tracks a large number (up to 10s of thousands) of open source libraries in real time at low cost.

Many techniques, such as static and dynamic analysis, are employed to flag potentially dangerous code as candidate vulnerabilities, but they are not appropriate for tracking existing unknown vulnerabilities on a large scale, at low cost (due to false positives). Firstly, these tools can only support one or two specific languages and certain patterns of vulnerabilities well. For example, the static analysis tool FlawFinder [19] is limited to finding buffer overflow risks, race conditions, and potential shell meta-character usage in C/C++. Moreover, most of these approaches operate on an entire software project and deliver long lists of potentially unsafe code with extremely high false positive rates (e.g., 99% as shown in [14]). They require a considerable amount of manual review and thus are not suitable for tracking projects at large scale.

For most software systems, bugs are tracked via issue trackers and code changes are merged in the form of commits to source control repositories. Therefore, it is convenient to check these basic artifacts (a new bug report or commit) of software development to detect vulnerabilities in real time. Besides source code changes, bug reports and commit messages contain rich contextual information expressed in natural language that is often enough for a security researcher to determine if the underlying artifact relates to a vulnerability or not.

Building on these insights, we automate the above identification process via natural language processing and machine learning techniques. From GitHub, JIRA, and Bugzilla, we collected a wide range of security-related commits and bug reports for more than 5000 projects, created between Jan. 2012 and Feb. 2017, to build commit and bug report datasets with ground truth. The collected data, however, is highly imbalanced: vulnerability-related commits and bug reports make up less than 10%. Hence, it is challenging to train classifiers to identify vulnerabilities with good performance. To address this, we design a probability-based K-fold stacking algorithm that ensembles multiple individual classifiers and flexibly balances between precision and recall.

We summarize our major contributions below.

• We present an automatic vulnerability identification system that flags vulnerability-related commits and bug reports using machine learning techniques. Our approach can identify various vulnerabilities regardless of programming languages at low cost and large scale. Our proposed K-fold stacking model for commits outperforms the state-of-the-art SVM model by 54.55% in precision.

[...] the underlying issues due to imprecision of analysis. For example, RATS does not find Cross-Site Scripting (XSS) or SQL Injection vulnerabilities. Moreover, when applied to real world projects, these tools raise massive false positives that are hard to reduce.

Dynamic analysis analyzes the source code by executing it on real inputs. Basic dynamic analysis (or testing) tools search for vulnerabilities by trying a wide range of possible inputs. There are also dynamic taint analysis tools that do taint tracking at runtime [1, 3, 13]. For example, PHP Aspis does dynamic taint analysis to identify XSS and SQL injection vulnerabilities [13]. ZAP [1], a popular testing tool in industry, finds certain types of vulnerabilities in web applications. It requires users to define scan policies before scanning and perform manual penetration testing after scanning.

Symbolic execution [8] is a technique that exercises various code paths through a target system. Instead of running the target system with concrete input values like dynamic analysis, a symbolic execution engine replaces the inputs with symbolic variables which initially could be anything, and then runs the target system. Cadar et al. [5] present KLEE, an open-source symbolic execution tool to analyze programs and automatically generate system input sets that achieve high levels of code coverage. However, it requires manual annotation and modification of the source code. Also, like most symbolic execution tools, its runtime grows exponentially with the number of paths in the program, which leads to path explosion.

Machine learning. Besides the above techniques that focus exclusively on the raw source code, machine-learning techniques provide an alternative to assist vulnerability detection by mining context and semantic information beyond source code. Among these works, some [11, 16, 17, 20] focus on detecting vulnerabilities combined with program analysis. For instance, Shar et al. [17] focus on SQL injection and cross-site scripting while Sahoo et al. [16] investigate malicious URL detection.
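
The probability-based K-fold stacking idea named in the contributions can be illustrated with a minimal sketch. This is not the authors' implementation: the two toy base learners (train_naive_bayes, train_keyword_rate), the logistic-regression meta-learner, the sample messages, and the 0.5 threshold are all assumptions chosen only to keep the example self-contained. What it shows is the general pattern: each base learner produces out-of-fold probabilities, a meta-learner is trained on those probabilities, and a decision threshold on the final probability trades precision against recall.

```python
import math
from collections import Counter

def train_naive_bayes(texts, labels):
    """Hypothetical base learner 1: tiny multinomial Naive Bayes (Laplace smoothing)."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter(labels)
    for text, y in zip(texts, labels):
        counts[y].update(text.lower().split())
    vocab_size = len(set(counts[0]) | set(counts[1]))

    def predict_proba(text):
        logp = {}
        for y in (0, 1):
            total = sum(counts[y].values()) + vocab_size + 1
            logp[y] = math.log((docs[y] + 1) / (len(labels) + 2))
            for word in text.lower().split():
                logp[y] += math.log((counts[y][word] + 1) / total)
        top = max(logp.values())
        e0, e1 = math.exp(logp[0] - top), math.exp(logp[1] - top)
        return e1 / (e0 + e1)  # P(vulnerability-related | text)
    return predict_proba

def train_keyword_rate(texts, labels):
    """Hypothetical base learner 2: share of tokens seen more often in positive messages."""
    pos, neg = Counter(), Counter()
    for text, y in zip(texts, labels):
        (pos if y else neg).update(set(text.lower().split()))
    risky = {w for w in pos if pos[w] > neg[w]}

    def predict_proba(text):
        tokens = text.lower().split()
        return sum(t in risky for t in tokens) / len(tokens) if tokens else 0.0
    return predict_proba

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, epochs=500, lr=0.5):
    """Meta-learner (an assumption): logistic regression fitted by plain SGD."""
    w = [0.0] * (len(rows[0]) + 1)  # w[0] is the bias term
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) - y
            w[0] -= lr * err
            for j, xi in enumerate(x):
                w[j + 1] -= lr * err * xi
    return w

def kfold_stack(texts, labels, trainers, k=3):
    """Train the meta-learner on out-of-fold probabilities from each base learner."""
    n = len(texts)
    meta = [[0.0] * len(trainers) for _ in range(n)]
    for fold in range(k):
        held = [i for i in range(n) if i % k == fold]
        rest = [i for i in range(n) if i % k != fold]
        for j, train in enumerate(trainers):
            model = train([texts[i] for i in rest], [labels[i] for i in rest])
            for i in held:
                meta[i][j] = model(texts[i])  # prediction on unseen data only
    w = fit_logistic(meta, labels)
    base = [train(texts, labels) for train in trainers]  # refit on all data

    def predict_proba(text):
        feats = [m(text) for m in base]
        return sigmoid(w[0] + sum(wi * f for wi, f in zip(w[1:], feats)))
    return predict_proba

# Toy commit messages; 1 = vulnerability-related, 0 = unrelated (made-up data).
messages = [
    "fix xss in comment form", "update readme", "prevent sql injection in login",
    "bump version number", "fix buffer overflow in parser", "refactor build script",
    "sanitize user input to stop xss", "add unit tests", "fix typo in docs",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 0]

model = kfold_stack(messages, labels, [train_naive_bayes, train_keyword_rate], k=3)
score = model("fix xss injection in search form")
flagged = score >= 0.5  # raise the threshold for precision, lower it for recall
```

Because the final output is a probability rather than a hard label, the same trained model can be operated at different thresholds: a lower threshold flags more artifacts for review (favoring recall), while a higher one flags fewer (favoring precision).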