Similarity in Binary Executables
Total Page:16
File Type:pdf, Size:1020Kb
Similarity in Binary Executables Yaniv David Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 Similarity in Binary Executables Research Thesis Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Yaniv David Submitted to the Senate of the Technion — Israel Institute of Technology Hadar 5780 Haifa March 2020 Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 This research was carried out under the supervision of Prof. Eran Yahav, in the Faculty of Computer Science. Some results in this thesis have been published as articles by the author and research collaborators in conferences and journals during the course of the author’s doctoral research period, the most up-to-date versions of which being: Yaniv David, Nimrod Partush, and Eran Yahav. Statistical similarity of binaries. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’16, pages 266–280, New York, NY, USA, 2016. ACM. Yaniv David, Nimrod Partush, and Eran Yahav. Similarity of binaries through re-optimization. In Pro- ceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, pages 79–94, New York, NY, USA, 2017. ACM. Yaniv David, Nimrod Partush, and Eran Yahav. Firmup: Precise static detection of common vulnerabilities in firmware. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’18, pages 392–404, New York, NY, USA, 2018. ACM. Yaniv David and Eran Yahav. Tracelet-based code search in executables. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, pages 349–360, New York, NY, USA, 2014. ACM. ACKNOWLEDGEMENTS To Vered and Yael. The Technion’s funding of this research is hereby acknowledged. Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 Contents List of Figures Abstract 1 1 Introduction3 1.1 The Importance of Analyzing Binary Code....................3 1.2 Binary Code Analysis Using Search.......................4 1.3 Binary Code Search as An Information Retrieval Problem............5 1.4 Evaluating Similarity Measures..........................7 1.5 This Thesis....................................8 1.5.1 Tracelet-Based Code Search: Searching for Patched Code.......8 1.5.2 Statistical Similarity of Binaries: Tool-Chains and Configuration Obliv- ious Search................................9 1.5.3 Similarity through re-Optimization: Cross-Architecture Accelerated Search................................... 10 1.5.4 Applying Our Method To a Real-World Problem: Finding Common Vulnerabilities in Firmware........................ 10 1.5.5 Leaving the Source Code Behind: Similarity by Neural Name Prediction 11 2 Tracelet-Based Code Search in Executables 13 2.1 Overview..................................... 13 2.1.1 Motivating Example........................... 13 2.1.2 Similarity using Tracelets........................ 15 2.2 Preliminaries................................... 19 2.3 Tracelet-Based Matching............................. 20 2.3.1 Preprocessing: Dealing with Compilation Side-Effects......... 20 2.3.2 Comparing Binary Code Procedures................... 21 2.3.3 Calculating Tracelet Match Score.................... 23 2.3.4 Using the Rewrite Engine to Bridge the Gap............... 26 2.4 Evaluation..................................... 28 2.4.1 Test-bed Structure............................ 28 2.4.2 Prototype Implementation........................ 30 Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 2.4.3 Test Configuration............................ 30 2.5 Results....................................... 31 2.5.1 Comparing Different Configurations................... 31 2.5.2 Using the Rewrite Engine to Improve Tracelet Match Rate....... 33 2.5.3 Performance............................... 34 2.6 Limitations.................................... 34 3 Statistical Similarity of Binaries 37 3.1 Background: Similarity By Compisition..................... 37 3.2 Overview..................................... 39 3.3 Strand-Based Similarity by Composition..................... 42 3.3.1 Similarity By Composition........................ 42 3.3.2 Procedure Decomposition to Strands................... 43 3.3.3 Statistical Evidence............................ 44 3.3.4 Local and Global Evidence Scores.................... 45 3.4 Semantic Strand Similarity............................ 46 3.4.1 Similarity Semantics........................... 46 3.4.2 Encoding Similarity as a Program Verifier Query............ 47 3.5 Evaluation..................................... 49 3.5.1 Test-Bed Creation and Prototype Implementation............ 49 3.5.2 Using Vulnerable Code as Queries.................... 50 3.5.3 Testing Different Aspects of the Problem Separately.......... 50 3.5.4 Enabling the Use of Powerful Semantic Tools.............. 51 3.6 Results....................................... 53 3.6.1 Finding Heartbleed .......................... 53 3.6.2 Decomposing Our Method into Sub-methods.............. 53 3.6.3 Comparison of TRACY and Esh ...................... 54 3.6.4 Evaluating BinDiff ........................... 55 3.6.5 Pairwise Comparison........................... 55 3.6.6 Limitations................................ 57 4 Similarity of Binaries through re-Optimization 59 4.1 Overview..................................... 59 4.1.1 The Challenge in Establishing Binary Similarity............ 59 4.1.2 Representing Procedures in a Canonicalized & Normalized Form... 61 4.2 Algorithm..................................... 62 4.2.1 Representing Binary Procedures with Canonicalized Normalized Strands 62 4.2.2 Scalable Binary Similarity Search.................... 65 4.3 Evaluation..................................... 67 4.3.1 Creating a Corpus to Evaluate the Different Problem Vectors...... 67 4.3.2 Evaluation Setup............................. 69 Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 4.3.3 GitZ as a Scalable Vulnerability Search Tool.............. 69 4.3.4 All vs. All Comparison.......................... 70 4.3.5 Comparing Components of the Solution................. 74 4.3.6 Understanding the Global Context Effect................ 74 4.3.7 Limitations................................ 75 5 FirmUp: Precise Static Detection of Common Vulnerabilities in Firmware 77 5.1 Overview..................................... 77 5.1.1 Pairwise Procedure Similarity: The syntactic gap............ 77 5.1.2 Efficient Partial Executable Matching.................. 79 5.2 Representing Firmware Executables....................... 80 5.3 Binary Similarity as a Back-and-Forth Game.................. 83 5.3.1 Game Motivation............................. 83 5.3.2 Game Algorithm............................. 84 5.3.3 Graph-based Approaches......................... 86 5.4 Evaluation..................................... 86 5.4.1 Corpus Creation............................. 86 5.4.2 Finding Undiscovered common vulnerabilities and exposures (CVEs) in Firmware................................ 87 5.4.3 Evaluating Precision Against Previous Work.............. 89 6 Similarity by Neural Name prediction 95 6.1 Overview..................................... 95 6.2 Background - Seq2seq Models.......................... 98 6.2.1 LSTM Encoder-Decoder Models..................... 99 6.2.2 Attention Models............................. 100 6.2.3 Transformers............................... 100 6.3 Representing Binary Procedures for Name Prediction.............. 101 6.4 Model....................................... 105 6.4.1 Call Site Encoder............................. 105 6.4.2 Encoding Call Site Sequences with LSTMs............... 105 6.4.3 Encoding Call Site Sequences with Transformers............ 106 6.4.4 Encoding Call Site Sequences with Graph Neural Networks...... 106 6.5 Evaluation..................................... 106 6.5.1 Experimental Setup............................ 106 6.5.2 Results.................................. 108 6.5.3 Ablation Study.............................. 109 6.5.4 Qualitative Evaluation.......................... 110 7 Related Work 113 8 Conclusion 117 Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 Additional Examples of Procedure Name Predictions 119 Hebrew Abstracti Technion - Computer Science Department - Ph.D. Thesis PHD-2020-04 - 2020 List of Figures 1.1 A ROC plot ([Tow18]) showing how the area under (the) curve (AUC) is measured. 8 2.1 doCommand1 and its corresponding control flow graph (CFG) G1........ 14 2.2 doCommand2 and its corresponding CFG G2................... 15 2.3 A pair of 3-tracelets: (a) based on blocks (1,3,5) from Figure 2.1(b), and (b) on blocks (1’,3’,5’) from Figure 2.2(b). Note that jump instructions are omitted from the tracelet, and that empty lines were added to the original tracelet to align similar instructions............................. 16 2.4 A successful rewriting attempt with cost calculation............... 17 2.5 A full alignment and rewrite for the basic blocks 3 and 3’ of fig. 2.1(b) and fig. 2.2(b)...................................... 17 2.6 Simplified grammar for x86 assembly....................... 19 2.7 A’s members are dissimilar to B’s. We will