Automating Mobile Device File Format Analysis Richard A
Total Page:16
File Type:pdf, Size:1020Kb
Air Force Institute of Technology AFIT Scholar Theses and Dissertations Student Graduate Works 8-10-2018 Automating Mobile Device File Format Analysis Richard A. Dill Follow this and additional works at: https://scholar.afit.edu/etd Part of the Computer Sciences Commons, and the Optics Commons Recommended Citation Dill, Richard A., "Automating Mobile Device File Format Analysis" (2018). Theses and Dissertations. 1916. https://scholar.afit.edu/etd/1916 This Dissertation is brought to you for free and open access by the Student Graduate Works at AFIT Scholar. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of AFIT Scholar. For more information, please contact [email protected]. Automating Mobile Device File Format Analysis DISSERTATION Rich Dill, Maj, USAF AFIT-ENG-DS-18-S-008 DEPARTMENT OF THE AIR FORCE AIR UNIVERSITY AIR FORCE INSTITUTE OF TECHNOLOGY Wright-Patterson Air Force Base, Ohio DISTRIBUTION STATEMENT A APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED. The views expressed in this document are those of the author and do not reflect the official policy or position of the United States Air Force, the United States Department of Defense or the United States Government. This material is declared a work of the U.S. Government and is not subject to copyright protection in the United States. AFIT-ENG-DS-18-S-008 AUTOMATING MOBILE DEVICE FILE FORMAT ANALYSIS DISSERTATION Presented to the Faculty Graduate School of Engineering and Management Air Force Institute of Technology Air University Air Education and Training Command in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science Rich Dill, B.S.C.S., M.S.C. Maj, USAF 10 July 2018 DISTRIBUTION STATEMENT A APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED. AFIT-ENG-DS-18-S-008 AUTOMATING MOBILE DEVICE FILE FORMAT ANALYSIS DISSERTATION Rich Dill, B.S.C.S., M.S.C. Maj, USAF Committee Membership: Dr. Gilbert L. Peterson, PhD, Chairman Dr. Douglas D. Hodson, PhD, Member Lt Col Richard S. Seymour, PhD, Member Dr. Jack E. McCrae, PhD, Dean's Representative AFIT-ENG-DS-18-S-008 Abstract Forensic tools assist examiners in extracting evidence from application files from mo- bile devices. If the file format for the file of interest is known, this process is straight- forward, otherwise it requires the examiner to manually reverse engineer the data structures resident in the file. This research presents the Automated Data Structure Slayer (ADSS), which automates the process to reverse engineer unknown file for- mats of Android applications. After statically parsing and preparing an application, ADSS dynamically runs it, injecting hooks at selected methods to uncover the data structures used to store and process data before writing to media. The resultant asso- ciation between application semantics and bytes in a file reveal the structure and file format. ADSS has been successfully evaluated against Uber and Discord, both pop- ular Android applications, and reveals the format used by the respective proprietary application files stored on the filesystem. iv Acknowledgements The Christian shoemaker does his duty not by putting little crosses on the shoes, but by making good shoes, because God is interested in good craftsmanship. { Martin Luther Thanks to the many people who helped make completing this dissertation possible. My wife and kids who let me exchange time with family to conduct research; my adviser (Dr. Peterson) for challenging my research and writing; my fellow students (Camdon Cady, Chris Wayne, and David King) who let me brainstorm ideas with them; the technical experts (Ben Gruverand Ole Andr V. Ravns) for answering my questions about Smali and Frida, respectively. Rich Dill v Table of Contents Page Abstract . iv Acknowledgements . .v List of Figures . viii List of Tables . .x List of Abbreviations . xi I. Introduction . .1 1.1 Motivation . .1 1.2 Methodology. .3 1.3 Summary . .5 II. Background . .6 2.1 Android Primer . .6 History . .7 Architecture . .8 Internals . .8 2.2 How Applications Work . 10 Architecture . 11 Execution . 16 Memory Allocation . 25 Filesystem Interaction . 28 2.3 Reverse Engineering . 29 Static Techniques. 29 Dynamic Techniques . 31 III. Related Work . 34 3.1 Static Analysis of Binary Executables . 34 3.2 Virtual Machine Data Flow Analysis . 36 3.3 Native Data Flow Analysis . 38 3.4 Integrating Native and Virtual Machine Dataflow Analysis Techniques . 42 3.5 Summary . 43 vi Page IV. System Design and Implementation . 44 4.1 ADSS Overview . 44 4.2 Static Phase . 46 4.3 Dynamic Phase . 48 4.4 Summary . 56 V. Evaluation . 57 5.1 Experimental Design . 57 5.2 Uber ......................................................... 58 5.3 Discord . 69 5.4 Summary . 82 VI. Conclusion and Future Work . 83 6.1 Expanding the Research . 84 Multiple Platforms . 84 Wide Range of Infrastructure . 85 Encrypted Files . 85 Application Security Analysis . 86 Appendix A. ADSS Final Hook List for Uber . 87 Appendix B. ADSS Final Hook List for Discord . 93 Appendix C. Parser for Uber . 98 Appendix D. Parser for Discord . 104 Bibliography . 114 vii List of Figures Figure Page 1. The architecture of the Unix OS [1]. .8 2. The architecture of Android OS [2]. .9 3. Life of an APK [3]. 11 4. APK creation. 11 5. Android APK hierarcy [4]. 12 6. Android APK installation. 13 7. ELF during linking and execution views [5]. 14 8. Application executing. 16 9. Building with a shared library [6]. 17 10. Load time linking of a dynamic library [6]. 18 11. Native application running in memory [7]. 20 12. Kernel structures associated with a process [8]. 21 13. Process segments mapped to memory [8]. 22 14. High Level view of Linux virtual memory [9]. 24 15. ARM AARCH64 Special Registers [10]. ..