
Fuzz Testing ROI Framework

Executive Summary

Fuzz testing maximizes testing with a fraction of time, cost, and resources. Get insight into the type of value you can expect from your fuzzer.

A review of software security investments reveals that a majority of spending is in application testing solutions, such as static analysis, software composition analysis, and scanners. These conventional testing approaches, however, test known or common attack patterns, only addressing CVEs or CWEs. But what about the unknown vulnerabilities -- the weaknesses attackers often exploit?

Fuzz testing is a technique where malformed inputs are sent to an application in hopes of triggering anomalous behavior. Anomalous behavior is usually a sign of an underlying vulnerability -- typically a zero-day. Fuzz testing is a proven technique that maximizes defect detection with the least amount of time and resources. As a result, it not only saves organizations time and money, it also frees scarce technical resources from manual, mundane tasks and allows them to focus on strategic initiatives that require true expertise.

This framework is a model for evaluating the economic return of investing in fuzz testing or other comparable solutions. Organizations can also use this framework to help predict which fuzz testing solutions will offer the most value based on organizational needs.

Solutions

In this framework, we will evaluate four comparable security testing techniques.

• Manual Penetration Testing
Penetration testing, also known as pentesting or ethical hacking, refers to the practice of a person or persons simulating attacks against software to identify weaknesses that could lead to exploits. This is offered as a service where application security experts leverage both manual and dynamic testing solutions to conduct testing for a determined amount of time.

• Protocol Fuzzers
Protocol fuzzing is a dynamic application security testing solution for negative testing. Think of protocol fuzzers and all other fuzzers as a team of penetration testers in machine form. Unlike a service, you can scale the number of fuzzers up and down based on demand. Protocol fuzzers rely on a pre-defined library of grammar or file format test cases. These pre-built test suites are crafted by a team of engineers and tailored for specific applications in specific environments.

• Bootstrapped Continuous Fuzzing
Bootstrapped continuous fuzzing refers to the practice of internally developing your own continuous fuzzer utilizing open-source fuzzers such as AFL. AFL is a coverage-guided generational fuzzer (also known as guided fuzzing) and, unlike protocol fuzzers, it has almost AI-like capabilities built into its engine. When it sends a test case to a target, it is able to monitor the target’s reaction. It then takes in the feedback to influence the next set of test cases it generates on-the-fly. Over time, the test cases come increasingly closer to probing at the application's weaknesses. Additionally, this means all test cases it crafts are custom generated for your applications.

• ForAllSecure Mayhem
Mayhem is an advanced fuzz testing technique that combines guided fuzzing with symbolic execution, a patented technology from 10 years of academic research at Carnegie Mellon University. You get all the capabilities of a coverage-guided generational fuzzer, plus more! Like AFL, Mayhem is able to intelligently generate custom test cases on-the-fly. Unlike AFL, Mayhem is able to overcome technical inefficiencies of guided fuzzing thanks to the ingenuity of symbolic execution. While guided fuzzers are intelligent, they are not perfect. Guided fuzzers rely on heuristics for generating their inputs. While this is sufficient for the short term, it can impact your ROI in the long run. This is where symbolic execution comes in, tracing logical pathways through the executable code and therefore offering far greater code coverage. Mayhem produces a win-win situation for its users.

Want to learn more? Download the What is Advanced Fuzz Testing whitepaper or Mayhem solution brief.
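The feedback loop described for guided fuzzers can be illustrated with a toy sketch. This is not how AFL or Mayhem is implemented; the `target` function, its branch labels, and the single-byte `mutate` strategy are all hypothetical stand-ins chosen to keep the example small. It only shows the idea that inputs reaching new code are kept and used to generate later test cases.

```python
import random


def target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the branch labels it executed.

    Raises ValueError to simulate a crash for inputs starting with b"FUZ".
    """
    covered = {"start"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                raise ValueError("simulated crash")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, max_iters: int, rng: random.Random):
    """Return (crashing input or None, corpus of inputs that found new coverage)."""
    corpus = [seed]
    seen = set(target(seed))  # assumes the seed itself does not crash
    for _ in range(max_iters):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            coverage = target(candidate)
        except ValueError:
            return candidate, corpus
        if not coverage <= seen:  # feedback: this input reached a new branch
            seen |= coverage
            corpus.append(candidate)
    return None, corpus


crash, corpus = fuzz(b"AAAA", 200_000, random.Random(0))
print("crashing input:", crash, "corpus size:", len(corpus))
```

A purely random generator would have to guess all three bytes at once; keeping inputs that reach new branches lets the search advance one byte at a time, which is the intuition behind test cases coming closer to the application's weaknesses over time.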

©2020 ForAllSecure

Product Operations

We’ll start our analysis by addressing a contentious and multilayered topic: product licensing. Product licensing is an obvious cost, but it is a common misconception that it is the largest cost. Below is a detailed walkthrough of product cost for each solution.

Manual Penetration Testing Operation Costs

Penetration testing has no direct product license or operational cost. However, we urge readers to consider how service costs can impact your organization’s budget.

Recurring service costs are considered an operational expense (OpEx), while annual product licenses are considered a capital expense (CapEx). Depending on your organization, acquiring OpEx budget may be more challenging than acquiring CapEx budget. The availability of OpEx budget is unpredictable, hinging on company performance or quarterly financial reporting timelines. As a result, you will have to reflect on whether security testing is something you would consider a luxury or a necessity.

Protocol Fuzzing Operation Costs

Protocol fuzzers charge on a per-protocol basis. Our market research revealed that vendors offer roughly 32 protocols and file formats in a “standard” offering for decent, mid-level fuzzing.

A critical consideration for those evaluating protocol fuzzers is whether your tool of choice supports your desired protocol or file format.

Remember: Engineers from these vendors must manually build the library of test suites based on RFCs. Therefore, test suites for newer or uncommon protocols, such as 5G or Zigbee, are either unavailable or immature. Organizations that choose to build their own test suite may find it more costly, or even impossible, due to the lack of technical expertise in the talent market.

Bootstrapped Continuous Fuzzing

Bootstrapping fuzzing is an alluring alternative, because open-source fuzzers, such as AFL, are available free of charge. However, free is never free. Security engineers with ClusterFuzz and OSS-Fuzz have disclosed that while it is possible to bootstrap and operate these high-performance fuzzers in production, people often underestimate the complexity of standing up such solutions. Their comment echoes what we’ve observed in the market as well. Customers have told us that one of the biggest oversights they made was not thinking ahead to the ongoing maintenance cost of such a complicated product.

Several brave ForAllSecure customers have attempted to bootstrap their own continuous fuzzing solutions. Some were successful in developing a minimum viable product (MVP) that was deployed into their organization. It even gained internal buy-in and traction. Ultimately, they transitioned to ForAllSecure Mayhem because they realized that they had become a development organization for their bootstrapped fuzzing solution -- deploying bug fixes and building new features on an ongoing basis. Eventually, maintenance became a distraction from the larger application security vision for the department.

ForAllSecure Mayhem

ForAllSecure Mayhem is priced based on two factors: tier and number of cores. The appropriate tier for you will be determined by the features that you seek. The number of cores you use will be determined by the scale and speed you’d like out of your analysis engine. In short, the more computing power you place behind the fuzzing engine, the more effective your analysis runs will be.

Vulnerability Assessment

Now that you’ve made a purchase decision, what value can you expect during the vulnerability assessment process?

Vulnerability management is described as the “cyclical practice of identifying, classifying, remediating, and mitigating vulnerabilities” in software. There are eight procedural steps for a single vulnerability assessment. The sections below walk through the effort involved in conducting each step with each solution.

Vulnerability Management Cycle

PREWORK
• Determine scope of program
• Define roles and responsibilities
• Select vulnerability assessment tools
• Create and refine policy and SLAs
• Identify asset context sources

ASSESS
• Identify assets
• Scan
• Report

PRIORITIZE
• Assign value
• Gauge exposure
• Add context

ACT
• Remediate
• Mitigate
• Accept risk

RE-ASSESS
• Rescan
• Validate

IMPROVE
• Eliminate underlying issues
• Evolve process and SLA
• Evaluate metrics

STEP 1: Attack Surface Analysis

When testing applications for the first time, attack surface analysis is the first and most critical task.

• Manual Penetration Testing
Unlike in-house tools, services require attack surface analysis in every service engagement -- meaning you will be paying the cost of doing attack surface analysis each time, regardless of whether the app has been tested before.

• Protocol Fuzzing, Bootstrapped Continuous Fuzzing, and ForAllSecure Mayhem
While the labor involved in attack surface analysis can be astonishing, it is a one-time, up-front cost per application. Automated solutions leverage existing configurations and attack surface analysis for future assessments.

STEP 2: Vulnerability Analysis

Vulnerability analysis is the process of reducing security risks in applications. The purpose is to uncover security flaws within a target. The output of a vulnerability assessment is a list of defects suspected to have caused the software-under-test (SUT) to behave unexpectedly. The vulnerability analysis step primarily focuses on configuration and running of the fuzzing solution(s).

• Manual Penetration Testing
Penetration testing consultants use a number of tools to support their vulnerability assessment. Though you may not realize it, you are paying for installation and configuration of these tools in the form of person hours. Thus, installation and configuration costs are still applicable in a services offering.

• Protocol Fuzzing
Mature protocol fuzzers are plug-and-play, requiring negligible cost in configuration and assessment efforts.

• Bootstrapped Continuous Fuzzing
Bootstrapped solutions aren’t as fully-baked as commercial offerings, requiring ongoing experimentation and fine-tuning for adequate results. You may even need to write a harness to get started. Harnesses give your fuzzer an entry point into the target. A harness also provides your fuzzer with guidance by setting boundaries on where to test. As you may already suspect, you will need to get help here from developers who understand the structure of the application well. Expect to spend a significant amount of time in this step should you choose to move forward with a bootstrapped continuous fuzzer.

• ForAllSecure Mayhem
As a part of the configuration step, Mayhem requires software to be packaged for ingestion. Luckily, packaging is a quick process that can be done in one quick command at the CLI. The purpose of packaging is to bypass many of the frustrations around harnessing. For more information about packaging, see this blog.
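To make the idea of a harness concrete, here is a minimal sketch. It assumes a hypothetical application function, `parse_record`, as the code under test; neither it nor the `harness` signature comes from any particular fuzzer. The harness is the entry point the fuzzer calls with each generated input, and it sets the testing boundary by filtering out inputs the application would never receive and ignoring documented rejections.

```python
def parse_record(text: str) -> dict:
    """Hypothetical application code under test: parses 'key=value' text."""
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise ValueError("expected key=value")
    return {key: value}


def harness(data: bytes) -> None:
    """Entry point a fuzzer would call once per generated input."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Boundary: assume the parser only ever receives valid UTF-8 text,
        # so undecodable inputs are out of scope.
        return
    try:
        parse_record(text)
    except ValueError:
        # A documented rejection of malformed input, not a defect.
        pass
    # Any other exception propagates and would be recorded as a finding.


harness(b"name=value")
harness(b"\xff\xfe")
harness(b"no separator")
```

Deciding which exceptions count as expected behavior is exactly the kind of judgment that requires developers who understand the application's structure.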

STEP 3: Validation

One of the most delicate subjects in vulnerability analysis is false-positives. No application security testing technique is immune. Whether an organization uses SAST, DAST, or IAST, results are reviewed by a human and false-positives are manually sifted out. The validation process varies significantly depending on the chosen method.

• Manual Penetration Testing
Before sharing their assessments with customers, consultants validate findings from their solutions to sift out false-positives. Though you may not have to go through the manual validation process, you are certainly paying for it.

• Protocol Fuzzing
Protocol fuzzers’ findings aren’t confirmed to be exploitable. Identifying the exact issue that was triggered is particularly challenging because all protocol fuzzers take a black-box approach -- meaning they have no insight into the software, only what they are putting in and what is coming out of it. Think of it like diagnosing a vehicle issue, but only having insight into what grade of gas you pump into your car. In addition, protocol fuzzers do not restart the target each time a test case is sent. Therefore, you do not know whether it was the last test case that triggered the issue, or the combination of test cases that were sent. With very few engineers familiar with fuzzing, it takes a reverse-engineering expert to interpret results, pinpoint issues, and verify defects.

• Bootstrapped Continuous Fuzzing
Similar to protocol fuzzers, bootstrapped fuzzers also require expertise for results interpretation, issue identification, and defect verification. Unlike protocol fuzzers, guided fuzzers restart the target with each submission, so you know the exact test case that triggered the vulnerability. Additionally, guided fuzzers take a grey-box approach, offering more insight, such as the line of code affected or system level information, quickening the validation process. Bear in mind that misconfiguration of guided fuzzers can lead to false-positives.

• ForAllSecure Mayhem
Mayhem automatically validates each of its findings by replaying the test case three times. When reproducibility is confirmed in each attempt, the defect is reported. This approach ensures Mayhem has zero false-positives, allowing security teams to bypass this step altogether.

STEP 4: Triage

Simply put, it is ineffective for security and development teams to fix every defect. There are instances where the cost of fixing a defect and missing time-to-market outweighs the potential risk of being exploited. Within the vulnerability management practice, it is common to bucket confirmed defects into three general categories: critical defects requiring fixes before release, defects requiring fixes post release, and backlogged defects to be fixed “when time allows”.

Fuzzing solutions provide informative results that outline the system level impact of an exploit, quickening the triaging process. The level of detail provided by each solution is comparable. On the other hand, manual penetration testing will cost slightly more, as cross correlation will need to be done across the various tools they use.

STEPS 5 AND 6: Report and Fix

Depending on the political landscape of an organization, reporting can be a time-consuming and stressful process. Developers may have been burned by false-positive defects slowing down their productivity and releases. These experiences can make collaboration across teams tense. Security teams are often forced into a defensive position, where they must negotiate their fix “mandates”.

• Manual Penetration Testing
Negotiating fixes via manual penetration testing results will likely require the largest lift among all options. Penetration testers are only commissioned to uncover vulnerabilities. Therefore, they do not necessarily have a vested interest in having the findings fixed. To get maximum value in this step, ensure that your penetration testers will be “showing their work” in the form of a demonstration or by sharing the test case that triggered the vulnerability. While it’s appealing to have the problems outsourced and taken care of, one of the downsides of this approach is that you do not have access to all test cases, only a select few.

• Protocol Fuzzing and Bootstrapped Continuous Fuzzing
Negotiating the fix of vulnerabilities with these two fuzzing solutions also requires a lift, though not as much as penetration testing. The results that are shared by these fuzzers are crafted with the security audience in mind. Security teams will have to interpret, translate, and explain the impact of the vulnerabilities to the developer. The great news is that security teams will have access to the test cases, allowing them to conduct full investigations.

• ForAllSecure Mayhem
When defects are reported, Mayhem delivers a test case, which proves the validity of the defect. Therefore, Mayhem allows defect remediation to become a fire-and-forget handoff between security and development. Mayhem reports issues with both the security and development audience in mind. In some cases, Mayhem is able to decipher the exact line of code impacted.

STEPS 7 AND 8: Test and Deploy

Once the defect is remediated, the fix must be tested and verified before deployment. Testing a fix is straightforward if remediation packages include the test case that triggered the defect.
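As a rough illustration of why a saved test case makes verifying a fix straightforward, the sketch below replays saved inputs against a build before and after a fix. Everything here is hypothetical: `read_field_buggy` and `read_field_fixed` stand in for two versions of an application function, and `regression_failures` is an illustrative helper, not part of any fuzzer.

```python
from typing import Callable, Iterable, List


def regression_failures(target: Callable[[bytes], object],
                        saved_cases: Iterable[bytes]) -> List[bytes]:
    """Return the saved test cases that still make the target raise."""
    failures = []
    for case in saved_cases:
        try:
            target(case)
        except Exception:
            failures.append(case)
    return failures


def read_field_buggy(data: bytes) -> bytes:
    """Hypothetical pre-fix code: reads a length-prefixed field."""
    length = data[0]  # raises IndexError on empty input
    return data[1:1 + length]


def read_field_fixed(data: bytes) -> bytes:
    """Hypothetical post-fix code: handles empty input explicitly."""
    if not data:
        return b""
    return data[1:1 + data[0]]


# The empty input is the test case that originally triggered the defect.
saved = [b"", b"\x02hi"]
print(regression_failures(read_field_buggy, saved))
print(regression_failures(read_field_fixed, saved))
```

An empty result from the post-fix build is the evidence that the fix holds for every input known to have caused trouble.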

A recent interview with fuzzing experts Billy Rios from QED Secure Solutions, Jared DeMott from VDA Labs, and Chris Clark from Synopsys revealed that the quality of fuzzing results is highly dependent on a well-built tool that takes an intelligent and strategic approach to testing. This means that configuring your solutions correctly can make all the difference. “When fuzzing is done well, it’s a powerful technique in your application security toolbox,” they shared.

Breaking Down the Product Benefits

The sections below outline the intangible values each solution delivers as cited by customers. Product justifications often focus on quantitative data. However, we find qualitative data to be equally critical for ensuring a full 360 degree examination of a selected technology’s impact across an entire organization.

Regression Testing

Vulnerability analysis rarely ends with a single assessment. When defects are uncovered and fixed, the same set of security tests must be performed once again to validate fixes -- also known as regression testing. Ownership over application test suites is a driving purchasing requirement for some organizations, especially for those who are maturing their application security processes.

• Manual Penetration Testing
While manual pentesting services offload the work of conducting security testing in-house, any test suites generated as a part of the service become the consulting organization's proprietary information. Therefore, clients are required to book additional assessments for validating fixes.

• Protocol Fuzzing
Because protocol fuzzers license per test suite, users have access to the test suite for regression testing as well. The test suites are pre-built by the vendor, so they often encompass common or known attack patterns that assume a one-size-fits-all approach. These test suites are not custom to your application.

• Bootstrapped Continuous Fuzzing and ForAllSecure Mayhem
Both bootstrapped continuous fuzzing and Mayhem autonomously generate test cases on-the-fly, based on feedback from the SUT. Test cases can be saved for future regression testing, and in Mayhem’s case, regression testing is done continuously as a part of the CI/CD pipeline. Users may observe a noticeable difference in quality between the test suites of bootstrapped continuous fuzzing and Mayhem. The quality of results -- defects found as well as test suite -- from open source fuzzers is largely dependent on implementation. More often than not, fluency in the technical workings of fuzzing is required for a fruitful outcome from these open source solutions.

Code Coverage

The quality of analysis has thus far been overlooked. Code coverage is a critical factor in results quality. However, because DAST does not require access to source code, these solutions have little understanding of their coverage. Below is a typical graph of new defects found over time:

[Figure: code coverage, total defects found, and defects found plotted against time, with the release point marked]

Most DASTs fail to offer continuous ROI due to the pesticide paradox. The pesticide paradox states that if the same tests are repeated over and over again, eventually the same test cases will no longer find new bugs. It is a misconception that no reported bugs indicates the software under test is secure. More often than not, it indicates defects have clustered in limited sections of the software, creating hotspots. Below is a pesticide-immune graph of new defects over time:

[Figure: code coverage, total defects found, and defects found plotted against time, with new edges and the release point marked]

• Protocol Fuzzing
Protocol fuzzers take a systematic approach to delivering test cases -- meaning they can be thorough. However, this approach is the greatest victim of the pesticide paradox. They automate testing of the same areas of code, centralizing the defects found within an application. Users cite that the ROI of these solutions decreases over time, forcing them to consider other complementary fuzzers for testing variety.

• Manual Penetration Testing and Bootstrapped Continuous Fuzzing
Manual penetration testing and continuous fuzzing offer the testing variety that users of protocol fuzzing seek. However, their randomized approach has its benefits and drawbacks. Without the methodical approach of protocol fuzzers, they easily miss subtler defects.

• ForAllSecure Mayhem
Mayhem leverages strengths from each of the aforementioned methods. Mayhem unifies the randomized approach of guided fuzzing with the systematic approach of symbolic execution. Symbolic execution ensures thorough analysis, finding deep defects other solutions miss. It continuously identifies and breaks through new areas of software, maximizing its code coverage and preventing hotspots where defects cluster.

Want to learn more about the value of code coverage within the fuzzing cycle? Read this blog on, “Beginning Fuzz Cycle Automation: Improving Testing and Fuzz Development with Coverage Analysis”.

Scale

The challenge with negative testing is that it aims to tackle the “infinite space” problem. There are an infinite number of ways software can be misused. While negative testing is vital, it is tedious and boring, requiring extensive time, resources, and cost. Thus, automation is a significant feature that deeply influences the effectiveness and scalability of a solution.

• Manual Penetration Testing
A human-in-the-loop approach limits scale. Humans are imperfect beings with emotional, mental, and physical needs. Overwork and boredom lead to inconsistencies, oversight, and demoralization, impacting result quality -- another limit to scalability. As organizations mature in their application security program, they opt to discontinue their penetration testing services for a solution they can run in-house.

• Protocol Fuzzers
Protocol suite licenses are for consecutive use only. Concurrent use will require the purchase of additional test suite licenses -- not to mention hardware and real estate to house the hardware. Protocol fuzzers run against systems, not software. This presents challenges when scaling horizontally. There is no easy or economical way to replicate systems. In order to scale horizontally, organizations must buy multiple copies of the same system, multiplying costs.

• Bootstrapped Continuous Fuzzing
Standing up an MVP solution is manageable. However, as application security programs mature, organizations require greater automation for scale. Requirements become increasingly complex and difficult to manage. Security engineers of the ClusterFuzz and OSS-Fuzz team have disclosed that even with their padded budgets and world-class experts, it took Google years to achieve full automation. For a long-term solution that grows with your organization and application security program, interviewees recommended a vendor solution.

• ForAllSecure Mayhem
Google considers “sufficient” fuzzing to be 1 CPU year. Mayhem saves test cases, allowing users to not only run regression testing continuously, quickly, and effortlessly, but also pick up exactly where they left off in their last run. Depending on the number of cores utilized, Mayhem can scale up or down based on an organization’s testing needs. Features such as concurrency also allow organizations to test multiple applications at once.
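To put the 1 CPU year figure in perspective, the arithmetic below converts a CPU-time budget into wall-clock time for a given core count. It is a back-of-the-envelope sketch that assumes fuzzing throughput scales linearly with cores, which real deployments only approximate; the function name and the core counts are illustrative.

```python
def wall_clock_days(cpu_years: float, cores: int) -> float:
    """Days of wall-clock time needed to spend `cpu_years` across `cores`.

    Assumes perfectly linear scaling and a 365-day year.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cpu_years * 365 / cores


for cores in (1, 5, 73):
    print(cores, "cores:", wall_clock_days(1, cores), "days")
```

Under that assumption, a single core needs a full year to reach the budget, while 73 cores reach it in five days, which is why core count figures so heavily in both pricing and scale discussions.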

CI/CD Integration

Fuzzing is most effective when it is integrated as a part of the developer pipeline. However, traditional fuzzers, although they have a quicker time to fuzz, are notorious for their inability to integrate into DevOps pipelines -- their largest limitation. As software testing gets pushed further right in the SDLC, remediation becomes increasingly expensive and time-to-market is delayed. In the long run, this can affect an organization's productivity and overall appsec cost.

• Manual Penetration Testing and Protocol Fuzzers
Manual pentesting and protocol fuzzing typically occur in post-development phases, such as QA. These solutions are excellent for right-of-ship testing. However, when they are forced into CI/CD pipelines, they can be costly and even impossible to incorporate in the developer workflow.

• Bootstrapped Continuous Fuzzing
Modern open source fuzzers can be integrated as a part of the development lifecycle. However, they weren't built with enterprise use cases in mind. They require customization and specialization from security experts and academics. Mike Walker, Senior Director of Microsoft Research NExT Special Projects, observes, “typically the future of technology is already here, it’s just unevenly distributed.” He shares that the fuzzing technique perfectly fits this generalization. He recounts how many view fuzzing as “black magic”, leaving many organizations scratching their heads about how they’ll ever be able to bring this advanced technique into their organization. Many of the complexities around bootstrapping continuous fuzzing have unfortunately fed into this very myth.

• ForAllSecure Mayhem
Commercialized modern fuzzers, such as Mayhem, seek to make continuous fuzzing widely available and lower the technical barrier to entry. Mayhem places a framework around the entire fuzzing process with features around automation, triaging, scriptability, and integrations.

Cost of Doing Nothing

Some may argue that there are “no savings to worry” about if an investment is not made to begin with, reasoning that they’ve been “just fine so far”. In this section, we argue the cost of doing nothing.

If we were to put a simple dollar amount to it, the cost of a security incident could be upwards of $4 million. A 2019 Ponemon study revealed that the cost of a data breach is $4.88 million.

However, there’s more to cost than just a number. In recent years, there has been a global imperative for organizations to take better care of protecting customers -- whether it’s their data or their safety. GDPR is a prominent example. Standards aside, the world is also becoming increasingly aware of vendor negligence when it comes to security. Customers are demanding more out of their vendors. From digital ransoms in the healthcare industry to defective software on airplanes, there are several high-profile security incidents today where the largest cost wasn’t only from the checkbook.

• Lost customers
In September 2017, Equifax disclosed a data breach. Nefarious actors stole customer data, including names, social security numbers, birthdates, and home addresses. Equity research firm Baird estimated that at least 143 million customers were affected. The breach was made possible by an unpatched vulnerability in a popular open source server framework, Apache Struts. In a May 2019 financial earnings call, Equifax disclosed that the cybersecurity incident cost the organization $1.4 billion in incident response and an overhaul of their technology and data security program. This estimate does not include legal costs. Equifax’s Buzz Score -- an indication of how negative or positive people feel about a brand -- fell 33 points in the first 10 days after the hack was publicized.

• Damaged reputation
During 2013’s peak holiday shopping months, popular retailer Target was breached -- 40 million customer credit card accounts, and up to 110 million sets of personal information such as email addresses and phone numbers, were stolen. Target is still reeling from the aftermath of its breach. The breach was made possible through stolen third-party HVAC credentials, allowing unauthorized access across the Target network, including their POS systems. Thus far, the breach has cost the retailer $61 million. In 2013, Target had a Buzz Score of 20.7. The year following the data breach, Target’s Buzz Score dropped to a shocking 9.4. The retailer has spent the last 5 years recovering its image. While they’ve managed to win back the trust of their loyal customer base, their Buzz Score in 2018 clocked in at 17.3 -- 3.4 points lower than before its breach.

• Threatened public safety
In July 2015, two renowned security researchers, Charlie Miller and Chris Valasek, demonstrated in a video the remote hacking of a then newly-released Jeep Cherokee. Miller and Valasek leveraged a zero-day exploit in the vehicle’s internet-connected Uconnect system to gain access to the car’s CAN bus. Initially, the hack was received with amusement and humor, with the windshield wipers violently and unexpectedly swishing back and forth. It was entertainment for both the viewers and the driver, until the transmission and brakes were disabled, the vehicle coasting to a stop in a ditch alongside a St. Louis, Missouri, highway at rush hour. Consumers grew concerned over their safety. Since the hack demonstration, 1.4 million Fiat Chrysler vehicles have been recalled and fixed to ensure the safety of Fiat Chrysler passengers. The demonstration was a cornerstone for cybersecurity. Adequate cybersecurity measures in automobiles grew to become a market-driven demand.

Conclusion

Fuzz testing helps organizations mitigate software security risks effectively and economically. Depending on your organizational goals, you may find that one fuzzer is able to deliver more value to your organization than another. Organizations are encouraged to leverage this framework to understand the overall value they may extract from their fuzzer of choice before purchase.

For a demonstration of Mayhem, contact ForAllSecure at [email protected].

Securing the world’s most critical software.

ForAllSecure was founded on the mission to make software secure. Utilizing patented technology from a decade of research at Carnegie Mellon University, ForAllSecure delivers assisted intelligence software security testing. Fortune 1000 companies in aerospace, automotive, and high-tech partner with ForAllSecure for scalable, advanced security testing that keeps pace with increasing development speeds and deployment frequencies. DARPA deemed ForAllSecure the winner in the 2016 Cyber Grand Challenge, and MIT Technology Review named ForAllSecure in the 50 Smartest Companies 2017 list. Efficiently and effectively secure mission critical software with ForAllSecure.

For more information, visit www.forallsecure.com

To learn more, contact [email protected]

©ForAllSecure 2020. All rights reserved.