Parmesan: Sanitizer-Guided Greybox Fuzzing
Total Page:16
File Type:pdf, Size:1020Kb
ParmeSan: Sanitizer-guided Greybox Fuzzing Sebastian Österlund, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida, Vrije Universiteit Amsterdam https://www.usenix.org/conference/usenixsecurity20/presentation/osterlund This paper is included in the Proceedings of the 29th USENIX Security Symposium. August 12–14, 2020 978-1-939133-17-5 Open access to the Proceedings of the 29th USENIX Security Symposium is sponsored by USENIX. ParmeSan: Sanitizer-guided Greybox Fuzzing Sebastian Österlund Kaveh Razavi Herbert Bos Cristiano Giuffrida Vrije Universiteit Vrije Universiteit Vrije Universiteit Vrije Universiteit Amsterdam Amsterdam Amsterdam Amsterdam Abstract to address this problem by steering the program towards lo- cations that are more likely to be affected by bugs [20, 23] One of the key questions when fuzzing is where to look for (e.g., newly written or patched code, and API boundaries), vulnerabilities. Coverage-guided fuzzers indiscriminately but as a result, they underapproximate overall bug coverage. optimize for covering as much code as possible given that bug coverage often correlates with code coverage. Since We make a key observation that it is possible to detect code coverage overapproximates bug coverage, this ap- many bugs at runtime using knowledge from compiler san- proach is less than ideal and may lead to non-trivial time- itizers—error detection frameworks that insert checks for a to-exposure (TTE) of bugs. Directed fuzzers try to address wide range of possible bugs (e.g., out-of-bounds accesses or this problem by directing the fuzzer to a basic block with a integer overflows) in the target program. Existing fuzzers potential vulnerability. This approach can greatly reduce the often use sanitizers mainly to improve bug detection and TTE for a specific bug, but such special-purpose fuzzers can triaging [38]. Our intuition is that we can leverage them then greatly underapproximate overall bug coverage. even more by improving our approximation of bug cover- age in a target program. By applying directed fuzzing to In this paper, we present sanitizer-guided fuzzing, a new actively guide the fuzzing process towards triggering san- design point in this space that specifically optimizes for bug itizer checks, we can trigger the same bugs as coverage- coverage. For this purpose, we make the key observation that guided fuzzers while requiring less code coverage, result- while the instrumentation performed by existing software ing in a lower time-to-exposure (TTE) of bugs. Moreover, sanitizers are regularly used for detecting fuzzer-induced er- since compilers such as LLVM [25] ship with a number ror conditions, they can further serve as a generic and effec- of sanitizers with different detection capabilities, we can tive mechanism to identify interesting basic blocks for guid- steer the fuzzer either towards specific classes of bugs and ing fuzzers. We present the design and implementation of behavior or general classes of errors, simply by selecting ParmeSan, a new sanitizer-guided fuzzer that builds on this the appropriate sanitizers. For instance, TySan [14] checks observation. We show that ParmeSan greatly reduces the can guide fuzzing towards very specific bugs (e.g., type TTE of real-world bugs, and finds bugs 37% faster than ex- confusion)—mimicking directed fuzzing but with implic- isting state-of-the-art coverage-based fuzzers (Angora) and itly specified targets—while ASan’s [37] pervasive checks 288% faster than directed fuzzers (AFLGo), while still cov- can guide fuzzing towards more general classes of memory ering the same set of bugs. errors—mimicking coverage-guided fuzzing. In this paper, we develop this insight to build ParmeSan, 1 Introduction the first sanitizer-guided fuzzer. ParmeSan relies on off-the- shelf sanitizer checks to automatically maximize bug cover- Fuzzing is a common technique for automatically discov- age for the target class of bugs. This allows ParmeSan to ering bugs in programs. In finding bugs, many fuzzers try find bugs such as memory errors more efficiently and with to cover as much code as possible in a given period of lower TTE than existing solutions. Like coverage-guided time [9, 36, 47]. The main intuition is that code coverage is fuzzers, ParmeSan does not limit itself to specific APIs or ar- strongly correlated with bug coverage. Unfortunately, code eas of the code, but rather aims to find these bugs, wherever coverage is a huge overapproximation of bug coverage which they are. Unlike coverage-guided fuzzers, however, it does means that a large amount of fuzzing time is spent covering not do so by blindly covering all basic blocks in the pro- many uninteresting code paths in the hope of getting lucky gram. Instead, directing the exploration to execution paths with a few that have bugs. Recent directed fuzzers [4, 8] try that matter—having the greatest chance of triggering bugs in USENIX Association 29th USENIX Security Symposium 2289 the shortest time. On the other side of the spectrum we have whitebox To design and implement ParmeSan, we address a num- fuzzing [6,21], using heavyweight analysis, such as symbolic ber of challenges. First, we need a way to automatically ex- execution to generate inputs that triggers bugs, rather than tract interesting targets from a given sanitizer. ParmeSan ad- blindly testing a large number of inputs. In practice, white- dresses this challenge by comparing a sanitizer-instrumented box fuzzing suffers from scalability or compatibility issues version of a program against the baseline to locate the sani- (e.g., no support for symbolic execution in libraries/system tizer checks in a blackbox fashion and using pruning heuris- calls) in real-world programs. tics to weed out uninteresting checks (less likely to contain To date, the most scalable and practical approach to bugs). Second, we need a way to automatically construct a fuzzing has been greybox fuzzing, which provides a mid- precise (interprocedural) control-flow graph (CFG) to direct dle ground between blackbox and whitebox fuzzing. By fuzzing to the targets. Static CFG construction approaches using the same scalable approach as blackbox fuzzing, but are imprecise by nature [4] and, while sufficient for exist- with lightweight heuristics to better mutate the input, grey- ing special-purpose direct fuzzers [4, 8], are unsuitable to box techniques yield scalable and effective fuzzing in prac- reach the many checks placed by sanitizers all over the pro- tice [5, 7,17, 30]. gram. ParmeSan addresses this challenge by using an ef- The best known coverage-guided greybox fuzzer is Amer- ficient and precise dynamically constructed CFG. Finally, ican Fuzzy Lop (AFL) [47], which uses execution tracing we need a way to design a fuzzer on top of these building information to mutate the input. Some fuzzers, such as blocks. ParmeSan addresses this challenge by using a two- Angora [9] and VUzzer [36], rely on dynamic data-flow stage directed fuzzing strategy, where the fuzzer interleaves analysis (DFA) to quickly generate inputs that trigger new two stages (fuzzing for CFG construction with fuzzing for branches in the program, with the goal of increasing code the target points) and exploits synergies between the two. coverage. While coverage-guided fuzzing might be a good For example, since data-flow analysis (DFA) is required for overall strategy, finding deep bugs might take a long time the first CFG construction stage, we use the available DFA with this strategy. Directed fuzzers try to overcome this lim- information to also speed up the second bug-finding stage. itation by steering the fuzzing towards certain points in the DFA-based fuzzing not only helps find new code, similar to target program. state-of-the-art coverage-guided fuzzers [9, 36], but can also efficiently flip sanitizer checks and trigger bugs. In this paper we present the following contributions: 2.2 Directed fuzzing • We demonstrate a generic way of finding interesting Directed fuzzing has been applied to steering fuzzing to- fuzzing targets by relying on existing compiler sanitizer wards possible vulnerable locations in programs [7, 13, 18, passes. 19, 41, 45]. The intuition is that by directing fuzzing to- • wards certain interesting points in the program, the fuzzer We demonstrate a dynamic approach to build a precise can find specific bugs faster than coverage-guided fuzzers. control-flow graph used to steer the input towards our Traditional directed fuzzing solutions make use of symbolic targets. execution, which, as mentioned earlier, suffers from scala- • We implement ParmeSan, the first sanitizer-guided bility and compatibility limitations. fuzzer using a two-stage directed fuzzing strategy to ef- AFLGo [4] introduces the notion of Directed Greybox ficiently reach all the interesting targets. Fuzzing (DGF), which brings the scalability of greybox fuzzing to directed fuzzing. There are two main problems • We evaluate ParmeSan, showing that our approach finds with DGFs. The first problem is finding interesting targets. the same bugs as state-of-the-art coverage-guided and One possibility is to use specialized static analysis tools to directed fuzzers in less time. find possible dangerous points in programs [13, 16]. These To foster further research, our ParmeSan prototype is tools, however, are often specific to the bugs and program- open source and available at https://github.com/vusec/ ming languages used. Other approaches use auxiliary meta- parmesan . data to gather interesting targets. AFLGo, for example, sug- gests directing fuzzing towards changes made in the appli- 2 Background cation code (based on git commit logs). While an interest- ing heuristic for incremental fuzzing, it does not answer the 2.1 Fuzzing strategy question when fuzzing an application for the first time or in scenarios without a well-structured commit log. The sec- In its most naive form blackbox fuzzing randomly generates ond problem is distance calculation to the interesting targets inputs, hoping to trigger bugs (through crashes or other er- to guide the DGF.