Automated Analysis of ARM Binaries Using the Low-Level Virtual Machine Compiler Framework
Total Page:16
File Type:pdf, Size:1020Kb
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by AFTI Scholar (Air Force Institute of Technology) Air Force Institute of Technology AFIT Scholar Theses and Dissertations Student Graduate Works 3-11-2011 Automated Analysis of ARM Binaries using the Low-Level Virtual Machine Compiler Framework Jeffrey B. Scott Follow this and additional works at: https://scholar.afit.edu/etd Part of the Hardware Systems Commons Recommended Citation Scott, Jeffrey B., "Automated Analysis of ARM Binaries using the Low-Level Virtual Machine Compiler Framework" (2011). Theses and Dissertations. 1427. https://scholar.afit.edu/etd/1427 This Thesis is brought to you for free and open access by the Student Graduate Works at AFIT Scholar. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of AFIT Scholar. For more information, please contact [email protected]. AUTOMATED ANALYSIS OF ARM BINARIES USING THE LOW- LEVEL VIRTUAL MACHINE COMPILER FRAMEWORK THESIS Jeffrey B. Scott, Captain, USAF AFIT/GCO/ENG/11-14 ABCDBAE DFDD AIR FORCE INSTITUTE OF TECHNOLOGY B !A APPROVED FOR PUBLIC RELEASE; DISTRIBUTION IS UNLIMITED. The views expressed in this thesis are those of the author and do not reflect the official policy or position of the United States Air Force, the Department of Defense, or the United States Government. This material is declared a work of the U.S. Government and is not subject to copyright protection in the United States. AFIT/GCO/ENG/11-14 AUTOMATED ANALYSIS OF ARM BINARIES USING THE LOW-LEVEL VIRTUAL MACHINE COMPILER FRAMEWORK THESIS Presented to the Faculty Department of Electrical and Computer Engineering Graduate School of Engineering and Management Air Force Institute of Technology Air University Air Education and Training Command In Partial Fulfillment of the Requirements for the Degree of Master of Science Jeffrey B. Scott, B.S.E.E. Captain, USAF March 2011 APPROVED FOR PUBLIC RELEASE; DISTRIBUTION IS UNLIMITED. AFIT/GCO/ENG/11-14 Abstract Binary program analysis is a critical capability for offensive and defensive operations in Cyberspace. However, many current techniques are ineffective or time- consuming and few tools can analyze code compiled for embedded processors such as those used in network interface cards, control systems and mobile phones. This research designs and implements a binary analysis system, called the Architecture-independent Binary Abstracting Code Analysis System (ABACAS), which reverses the normal program compilation process, lifting binary machine code to the Low-Level Virtual Machine (LLVM) compiler’s intermediate representation, thereby enabling existing security-related analyses to be applied to binary programs. The prototype targets ARM binaries but can be extended to support other architectures. Several programs are translated from ARM binaries and analyzed with existing analysis tools. Programs lifted from ARM binaries are an average of 3.73 times larger than the same programs compiled from a high-level language (HLL). Analysis results are equivalent regardless of whether the HLL source or ARM binary version of the program is submitted to the system, confirming the hypothesis that LLVM is effective for binary analysis. iv Acknowledgements First I would like to thank my research advisor, Dr. Rusty Baldwin, for giving me the freedom to pursue research suited to my interests and skills, for reining me in when my plans became too extravagant and for always being willing to give his time and expertise to answer questions and provide useful and detailed feedback. I also want to thank my committee members for their help: Dr. Mullins, for teaching me how to identify weaknesses in a system by examining it from an attacker’s perspective, and Mr. Kimball for his stimulating discussions, for pushing me to investigate static program analysis and for encouraging me to “learn everything there is to know about ARM.” Finally, this research would not have been possible without the support of my wife, who handled things marvelously at home and cared for me and our children while I completed this research. Her love, faith and understanding have been a constant source of motivation and strength for me. v Table of Contents Table of Contents ........................................................................................................................... vi List of Figures ................................................................................................................................ ix List of Tables .................................................................................................................................. xi I. Introduction ................................................................................................................................ 1 1.1 Problem Background ....................................................................................................... 2 1.2 Research Goals ................................................................................................................ 4 1.3 Document Outline ........................................................................................................... 5 II. Literature Review ..................................................................................................................... 6 2.1 Taxonomy of Mobile Device Vulnerabilities .................................................................. 6 2.1.1 Scenario ................................................................................................................... 7 2.1.2 Policy ....................................................................................................................... 7 2.1.3 Design ...................................................................................................................... 9 2.1.4 Implementation ...................................................................................................... 10 2.1.5 Configuration ......................................................................................................... 11 2.2 Security-oriented Program Analysis .............................................................................. 11 2.2.1 Black Box Analysis ............................................................................................... 11 2.2.2 White/Grey Box Analysis ...................................................................................... 12 2.2.3 Dynamic Approaches ............................................................................................ 12 2.2.4 Static Approaches .................................................................................................. 15 2.3 Compiler Overview ....................................................................................................... 18 2.3.1 The Low-Level Virtual Machine ........................................................................... 20 2.3.2 Mid-end Program Analysis and Transformation ................................................... 21 2.3.3 Back-end Code Generation .................................................................................... 25 2.4 ARM Architecture Overview ........................................................................................ 26 2.4.1 System Programming ............................................................................................ 26 2.4.2 Memory Architecture ............................................................................................ 28 2.4.3 Instruction Set Architecture ................................................................................... 29 2.5 Summary ....................................................................................................................... 32 III. System Design and Implementation ........................................................................................ 33 3.1 System Overview........................................................................................................... 33 vi 3.2 Front-end Architecture .................................................................................................. 34 3.2.1 Object File Parser .................................................................................................. 34 3.2.2 Disassembler.......................................................................................................... 35 3.2.3 Assembly Parser .................................................................................................... 35 3.2.4 LLVM IR Code Generator .................................................................................... 35 3.3 Prototype Implementation ............................................................................................. 36 3.3.1 Object File Parser .................................................................................................. 36 3.3.2 Disassembler.......................................................................................................... 37 3.3.3 Assembly Parser .................................................................................................... 37 3.3.4 LLVM IR Code Generator .................................................................................... 41 IV. Experimental Methodology ..................................................................................................... 48 4.1 Approach ......................................................................................................................