A Journey from Symbolic Execution to Smart Fuzzing and Beyond Koushik Sen EECS Department University of California, Berkeley

Automated Test Generation: A Journey from Symbolic Execution to Smart Fuzzing and Beyond
Koushik Sen
EECS Department, University of California, Berkeley
https://people.eecs.berkeley.edu/~ksen/

Programs are still written by humans, and will be written by humans

To Err is Human
• Software Bugs

Programs Have Bugs

Why Program Testing?
• Programmer familiarity
• Concrete input for debugging
• No false positives
• Easy regression

Why Automated Testing?

Automated Testing Hits the Mainstream

Goals of Automated Testing
• Assumption: A program with optional assertions
• Goal: Automatically generate test inputs
  – Get “good” code coverage
  – Find “most” assertion violations
  – Find crashes
  – Find security vulnerabilities

Approaches to Test Generation
• Symbolic execution
• Fuzz testing
• Hybrid
• Human-guidance
• AI guidance
• Many more ...

Symbolic Execution and Concolic Testing
• Symbolic Execution
  – Java PathFinder, KLEE, S2E, Veritesting
• Concolic Testing
  – Combine concrete execution and symbolic execution
  – DART, CUTE, CREST, ConBol, Apollo, Jalangi, CATG
• Concrete + Symbolic = Concolic

    void testme (int x, int y) {
        int z = 2 * y;
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

    Step                 Path constraint           Symbolic state         Concrete state
    On entry             true                      x, y, z = undef        x = 0, y = 0
    After z = 2 * y      true                      x, y, z = 2y           x = 0, y = 0, z = 0
    At if (z == x)       2y = x                    x, y, z = 2y           x = 0, y = 0, z = 0
    At if (x > y+10)     2y = x /\ x > y + 10      x, y, z = 2y           x = 0, y = 0, z = 0
    Solve                2y = x /\ x > y + 10      test inputs: x = 22, y = 11

Concolic Testing in Practice
• Led to the development of several industrial and academic automated testing and security tools
  – Projects at Intel, Google, MathWorks, NTT, SalesForce
  – PEX, SAGE, and YOGI at Microsoft
  – Apollo at IBM, and ConBol and Jalangi at Samsung
  – BitBlaze, jFuzz, Oasis, and SmartFuzz in academia

Coverage is Low

Why Is Coverage Low?
✗ Expensive to explore each path (i.e. input)
✗ Astronomical # of paths
✗ Explores a small fraction of paths
But finds complex logical bugs

Fuzz Testing

Fuzzing in One Slide
1. The fuzzer randomly generates inputs (for example, H@5^23#t.f).
2. The program is run on those inputs.
Simplest form: ./Program < /dev/random

Mutational Fuzzing in One Slide
1. Start from seed input(s), which form the set of interesting inputs.
2. Pick an input.
3. Mutate the input (for example, <!ATTLIST becomes <!BTTLIST).
4. Run the program on the mutated inputs.

Mutational Fuzzers
• Radamsa
• Zzuf

Feedback-Directed Fuzzing

Feedback-directed Fuzzing 101
1. Start from seed inputs, which form the set of interesting inputs.
2. Pick an input.
3. Mutate the input.
4. Run the program on the mutated inputs and collect feedback.
5. Interesting? Yes: add the input to the set. No: discard the input.

Feedback
• Coverage
• Execution length
• Well-formed input
• ...

Interesting?
• New coverage?
• Longer execution?
• Valid input?
• ...

Lots of choices:
1. Which input to pick?
2. How to mutate an input?
3. How many mutants to generate?
4. What kind of feedback?
5. How to decide if an input is interesting?
Resolved using heuristics over a period of 10 years

Fuzzers:
• AFL
• AFLFast
• Libfuzzer
• Angora
• VUzzer
• Steelix
• AFLGo
• AFLSmart
• Nautilus
• FairFuzz
• PerfFuzz
• JQF/Zest
• FuzzFactory
• RLCheck

What Bugs
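The testme walkthrough from the symbolic execution slides can be reproduced with a small sketch. It runs an instrumented copy of testme on concrete inputs, records which branches were taken, negates the last recorded branch, and asks a solver for inputs that follow the new path. Everything beyond testme itself is an assumption made for illustration: the names Trace, eval_cond, run_testme, solve, and concolic_testme are hypothetical, the bounded exhaustive search is only a stand-in for the constraint solver a real concolic tool would call, and flipping only the last branch is a simplification of the search strategies such tools use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BRANCHES 2

/* One recorded branch: which condition was evaluated and its outcome. */
typedef struct { int cond_id; bool taken; } Branch;
typedef struct { Branch b[MAX_BRANCHES]; size_t len; bool error; } Trace;

/* Branch conditions of testme written over the inputs x and y
 * (the "symbolic" view: z has been replaced by 2y). */
static bool eval_cond(int cond_id, int x, int y) {
    switch (cond_id) {
    case 0:  return 2 * y == x;   /* if (z == x)   */
    default: return x > y + 10;   /* if (x > y+10) */
    }
}

/* Instrumented testme: runs concretely and logs every branch outcome.
 * Inputs are assumed small enough that 2 * y does not overflow. */
static Trace run_testme(int x, int y) {
    Trace t = { .len = 0, .error = false };
    int z = 2 * y;
    bool c0 = (z == x);
    t.b[t.len++] = (Branch){ 0, c0 };
    if (c0) {
        bool c1 = (x > y + 10);
        t.b[t.len++] = (Branch){ 1, c1 };
        if (c1) t.error = true;   /* ERROR reached */
    }
    return t;
}

/* Stand-in solver (an assumption, not a real SMT solver): search a small
 * box for inputs that repeat branches 0..flip-1 and negate branch flip. */
static bool solve(const Trace *t, size_t flip, int *x_out, int *y_out) {
    assert(flip < t->len);
    for (int x = -64; x <= 64; x++) {
        for (int y = -64; y <= 64; y++) {
            bool ok = true;
            for (size_t i = 0; i <= flip && ok; i++) {
                bool want = (i == flip) ? !t->b[i].taken : t->b[i].taken;
                ok = (eval_cond(t->b[i].cond_id, x, y) == want);
            }
            if (ok) { *x_out = x; *y_out = y; return true; }
        }
    }
    return false;
}

/* Concolic loop: start at (0, 0), as on the slides, and keep flipping the
 * last branch of the previous run. Returns true once ERROR is reached,
 * leaving the triggering inputs in *x and *y. */
bool concolic_testme(int *x, int *y, int max_iters) {
    *x = 0;
    *y = 0;
    for (int i = 0; i < max_iters; i++) {
        Trace t = run_testme(*x, *y);
        if (t.error) return true;
        if (!solve(&t, t.len - 1, x, y)) return false;
    }
    return false;
}
```

Calling concolic_testme(&x, &y, 4) first runs testme on (0, 0), which takes the outer branch but not the inner one. Negating the inner branch gives the constraint 2y = x /\ x > y + 10, and the search returns x = 22, y = 11, the same test inputs shown on the slides.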

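The feedback-directed fuzzing loop from the slides can be sketched in a few dozen lines. The sketch below fixes the five choices in the simplest way: pick an input uniformly at random, mutate one random byte, generate one mutant per pick, use branch coverage as feedback, and call an input interesting if it covers a branch not seen before. The program under test (target, which "crashes" only on the four bytes FUZZ), the function names, the fixed input length, and the xorshift32 generator are all assumptions chosen for illustration. None of this is how AFL or any other fuzzer listed on the slides is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define INPUT_LEN  4
#define NUM_EDGES  5
#define MAX_CORPUS 16

/* Hypothetical program under test: returns true ("crash") only for the
 * input "FUZZ". cov[i] is set when branch point i is reached. */
static bool target(const uint8_t in[INPUT_LEN], bool cov[NUM_EDGES]) {
    cov[0] = true;
    if (in[0] != 'F') return false;
    cov[1] = true;
    if (in[1] != 'U') return false;
    cov[2] = true;
    if (in[2] != 'Z') return false;
    cov[3] = true;
    if (in[3] != 'Z') return false;
    cov[4] = true;
    return true;
}

/* Small deterministic generator (xorshift32) so that runs are repeatable. */
static uint32_t next_rand(uint32_t *state) {
    uint32_t s = *state;
    s ^= s << 13;
    s ^= s >> 17;
    s ^= s << 5;
    return *state = s;
}

/* Coverage-guided mutational loop. Returns the number of mutants executed
 * before the crash was found (0 if the seed itself crashes), or -1 if the
 * budget ran out. The crashing input is copied to crash_out. */
long fuzz(const uint8_t seed[INPUT_LEN], long max_execs,
          uint8_t crash_out[INPUT_LEN]) {
    assert(seed != NULL && crash_out != NULL);

    uint8_t corpus[MAX_CORPUS][INPUT_LEN];   /* the "interesting inputs" */
    size_t corpus_len = 1;
    bool seen[NUM_EDGES] = { false };        /* coverage observed so far */
    uint32_t rng = 2463534242u;

    memcpy(corpus[0], seed, INPUT_LEN);
    if (target(seed, seen)) {                /* record the seed's coverage */
        memcpy(crash_out, seed, INPUT_LEN);
        return 0;
    }

    for (long n = 1; n <= max_execs; n++) {
        uint32_t r = next_rand(&rng);

        /* Pick an input, then mutate one byte of a copy of it. */
        uint8_t input[INPUT_LEN];
        memcpy(input, corpus[(r >> 16) % corpus_len], INPUT_LEN);
        input[(r >> 8) % INPUT_LEN] = (uint8_t)(r & 0xff);

        /* Run the program on the mutant and collect feedback. */
        bool cov[NUM_EDGES] = { false };
        if (target(input, cov)) {
            memcpy(crash_out, input, INPUT_LEN);
            return n;
        }

        /* Interesting? Keep the mutant only if it reached a new branch. */
        bool is_new = false;
        for (size_t i = 0; i < NUM_EDGES; i++) {
            if (cov[i] && !seen[i]) { seen[i] = true; is_new = true; }
        }
        if (is_new && corpus_len < MAX_CORPUS) {
            memcpy(corpus[corpus_len++], input, INPUT_LEN);
        }
    }
    return -1;
}
```

Starting from the seed AAAA, each newly matched byte is saved and built upon, so the loop needs on the order of ten thousand executions in expectation to reach FUZZ. Drawing all four bytes uniformly at random, as in ./Program < /dev/random, would hit FUZZ with probability 1 in 2^32 per try.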