Ct-Fuzz: Fuzzing for Timing Leaks
Total Page:16
File Type:pdf, Size:1020Kb
ct-fuzz: Fuzzing for Timing Leaks Shaobo He Michael Emmi Gabriela Ciocarlie University of Utah, USA SRI International, USA SRI International, USA [email protected] [email protected] [email protected] Abstract—Testing-based methodologies like fuzzing are they are generally hardware-dependent. Furthermore, while able to analyze complex software which is not amenable to compiler optimizations are obliged to respect traditional traditional formal approaches like verication, model check- safety properties, they are not designed to respect two-safety ing, and abstract interpretation. Despite enormous success at exposing countless security vulnerabilities in many popular properties, and can thus generate insecure machine code. software projects, applications of testing-based approaches In this work we extend fuzzing to two-safety properties. have mainly targeted checking traditional safety properties In particular, we present ct-fuzz, 2 which enables the precise like memory safety. While unquestionably important, this discovery of timing-based information leaks while retaining class of properties does not precisely characterize other self-composition important security aspects such as information leakage, the ecacy of a-fuzz. Via [6], we reduce e.g., through side channels. testing two-safety properties to testing traditional safety In this work we extend testing-based software analysis properties by program transformation, eectively enabling methodologies to two-safety properties, which enables the the application of any fuzzer. Our implementation tackles precise discovery of information leaks in complex software. three basic challenges. First, the program under test must In particular, we present the ct-fuzz tool, which lends coverage-guided greybox fuzzers the ability to detect two- eciently simulate execution-pairs of the original; to avoid safety property violations. Our approach is capable of ex- overhead, each execution’s address space is copy-on-write posing violations to any two-safety property expressed as shared. Second, structured input-pairs must be derived from equality between two program traces. Empirically, we demon- random fuzzer-provided input; leaks are only witnessed strate that ct-fuzz swiftly reveals timing leaks in popular when inputs dier solely by secret content. Third, leakage- cryptographic implementations. inducing actions must be monitored; leaks are witnessed I. Introduction when, e.g., control ow or memory-access traces diverge, Security is a primary concern for software systems, where given inputs diering solely by secret content. errors like out-of-bounds memory accesses and inexhaustive Our implementation focuses on timing leaks related to input validation are responsible for dangerous and costly program control-ow and cpu cache; the name ct-fuzz refers incidents. Accordingly, many mechanisms exist for protecting to the constant-time property, asserting that the control-ow systems against common vulnerabilities like memory-safety er- and accessed memory locations of two executions diering rors and input injection. Among the most eective automated solely by secrets are identical. Besides constant-time, we approaches is coverage-based greybox fuzzing [1] popularized also apply ct-fuzz to ner-grained cache models to validate by a-fuzz,1 which has uncovered many critical vulnerabilities. secure yet non-constant-time programs which ensure identical Its ecacy is due to broad applicability and direct feedback: cache timing despite potentially-divergent memory-access afl’s genetic input-generation algorithm uses lightweight patterns, e.g., by preloading all accessed memory into the instrumentation to determine vulnerability-witnessing inputs. cache. While our implementation focuses on side-channel It works without user interaction and even source code, and leakage related to timing, it is extensible to other forms of sidesteps the computational and methodological bottlenecks leakage, e.g., power or electro-magnetic radiation, by further imposed by traditional program analysis techniques. extending the instrumentation mechanism to record other Fuzzers like a-fuzz target traditional temporal safety aspects of program behavior. In principle, ct-fuzz could be properties [2], i.e., properties concerning individual system extended to expose violations to any two-safety property executions. Important security aspects like secure information expressible as equality between two program traces. ow [3] are two-safety properties [4], i.e., properties concerning In the remainder we describe several aspects of the ct-fuzz pairs of executions, and cannot be precisely characterized as tool. Sections II–III cover foundations and implementation traditional safety properties [5]. Secure information ow is concerns in testing two-safety properties. Sections IV–V particularly insidious given the potential for side channels, cover ct-fuzz’s software architecture and basic functionality. through which adversaries may infer privileged information Sections VI–VII describe case studies and evaluation, demon- about the data accessed by a program, e.g., by correlating strating the application of ct-fuzz to many cryptographic execution-time dierences with control-ow dierences. Side- libraries. We outline future work in Section VIII. channels are dicult for programmers to reason about since 2ct-fuzz is on GitHub: https://github.com/michael-emmi/ct-fuzz, as are ex- 1American Fuzzy Lop: http://lcamtuf.coredump.cx/a/ periments: https://github.com/shaobo-he/ct-fuzz-artifact/releases/tag/icst2020. Coverage-Guided Greybox Fuzzer Fuzz Trace Monitor Input Crash? P II. Theoretical Foundations Self-Composition of P In this section we characterize ct-fuzz’s foundations, in- Monitor Input Trace Coverage-Guided cluding secure information ow, its reduction to traditional P Greybox Fuzzer safety properties, and coverage-guided greybox fuzzing. Input Split Diff? Crash? Fuzz Trace A. Secure Information Flow Monitor Monitor Input Trace Input Crash? We consider an abstract notion of secure information P P ow [3] over pairs of executions. Two executions e1 and e2 are f-equivalent with respect to some function f when Fig. 1. Self-composition simulates two Fig. 2. Coverage-guided fuzzers f(e1) = f(e2). We model program secrets by a declassication executions of a given program. generate inputs and report crashes. function D mapping each execution e to its declassied con- Self-Composition of P Monitor tent D(e), and say D-equivalent executions are comparable. Input Trace Intuitively, this equivalence relates executions whose input transition counts. From the perspective ofP this work, the and output values dier solely by secret content, e.g., by the salient aspects of a-fuzz areInput its broadSplit applicabilityDiff? Crash? and value of a secret key. Similarly, we model attacker capabilities eciency: it works without user interaction and source code, Monitor observation function O e and avoids prohibitive process-creationInput overheads byTrace sharing by an mapping each execution P to its observable content O(e), and say two O-equivalent executions’ address spaces in a copy-on-write fashion. These executions are indistinguishable. Intuitively, this equivalence features enable the rapid exploration of program executions relates executions which a given observer cannot dierentiate, for arbitrarily-complex software. e.g., because of negligible timing dierences. In practice, we distinguish timing variations by observing divergences in III. Design and Implementation Concerns control-ow decisions and accessed memory locations. Applying self-composition (§II-B) to coverage-guided grey- Denition 1. A program is secure when every pair of compara- box fuzzers (§II-C) poses three basic challenges: eciently ble executions are indistinguishable, for given declassication simulating execution pairs; deriving structured-input pairs and observation functions. from fuzzer-provided inputs (fuzz); and detecting leakage. B. Reducing Security to Safety A. Simulating Execution Pairs Self-composition [6] is program transformation reducing Self-composition (§II-B) dictates that two identical copies secure information ow to traditional safety properties. Here of a program execute in isolation. In the context of testing, we say a given program is safe when it cannot crash. This achieving isolation includes separating the address spaces simple notion of safety can capture violations to any temporal of each simulated execution so that the side-eects of one safety property [2] given adequate program instrumentation. are invisible to the other. On the one hand, the existing For instance, llvm’s thread, address, and memory sanitizers approach of duplicating procedures and variables is eective signal crashes upon thread- and memory-safety violations, for symbolic analyses like ct-verif [7]. However, actually and uses of uninitialized memory, respectively executing such duplicated programs on their target platforms The self-composition sc(P ) of program P is a program can alter the originals’ behavior signicantly, e.g., due to (Fig. 1) which executes two copies of P in isolation, i.e., exe- platform-dependent behavior. On the other hand, executing cution of each copy is independent of the other, which: copies in separate processes undermines the ecacy of fuzzers • halts when the copies’ executions are incomparable, and like a-fuzz which avoid process-creation overhead. • crashes when the copies’ executions are both comparable and distinguishable. B. Deriving Structured