Automating Mobile Device File Format Analysis Richard A

Total Page:16

File Type:pdf, Size:1020Kb

Automating Mobile Device File Format Analysis Richard A Air Force Institute of Technology AFIT Scholar Theses and Dissertations Student Graduate Works 8-10-2018 Automating Mobile Device File Format Analysis Richard A. Dill Follow this and additional works at: https://scholar.afit.edu/etd Part of the Computer Sciences Commons, and the Optics Commons Recommended Citation Dill, Richard A., "Automating Mobile Device File Format Analysis" (2018). Theses and Dissertations. 1916. https://scholar.afit.edu/etd/1916 This Dissertation is brought to you for free and open access by the Student Graduate Works at AFIT Scholar. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of AFIT Scholar. For more information, please contact [email protected]. Automating Mobile Device File Format Analysis DISSERTATION Rich Dill, Maj, USAF AFIT-ENG-DS-18-S-008 DEPARTMENT OF THE AIR FORCE AIR UNIVERSITY AIR FORCE INSTITUTE OF TECHNOLOGY Wright-Patterson Air Force Base, Ohio DISTRIBUTION STATEMENT A APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED. The views expressed in this document are those of the author and do not reflect the official policy or position of the United States Air Force, the United States Department of Defense or the United States Government. This material is declared a work of the U.S. Government and is not subject to copyright protection in the United States. AFIT-ENG-DS-18-S-008 AUTOMATING MOBILE DEVICE FILE FORMAT ANALYSIS DISSERTATION Presented to the Faculty Graduate School of Engineering and Management Air Force Institute of Technology Air University Air Education and Training Command in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science Rich Dill, B.S.C.S., M.S.C. Maj, USAF 10 July 2018 DISTRIBUTION STATEMENT A APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED. AFIT-ENG-DS-18-S-008 AUTOMATING MOBILE DEVICE FILE FORMAT ANALYSIS DISSERTATION Rich Dill, B.S.C.S., M.S.C. Maj, USAF Committee Membership: Dr. Gilbert L. Peterson, PhD, Chairman Dr. Douglas D. Hodson, PhD, Member Lt Col Richard S. Seymour, PhD, Member Dr. Jack E. McCrae, PhD, Dean's Representative AFIT-ENG-DS-18-S-008 Abstract Forensic tools assist examiners in extracting evidence from application files from mo- bile devices. If the file format for the file of interest is known, this process is straight- forward, otherwise it requires the examiner to manually reverse engineer the data structures resident in the file. This research presents the Automated Data Structure Slayer (ADSS), which automates the process to reverse engineer unknown file for- mats of Android applications. After statically parsing and preparing an application, ADSS dynamically runs it, injecting hooks at selected methods to uncover the data structures used to store and process data before writing to media. The resultant asso- ciation between application semantics and bytes in a file reveal the structure and file format. ADSS has been successfully evaluated against Uber and Discord, both pop- ular Android applications, and reveals the format used by the respective proprietary application files stored on the filesystem. iv Acknowledgements The Christian shoemaker does his duty not by putting little crosses on the shoes, but by making good shoes, because God is interested in good craftsmanship. { Martin Luther Thanks to the many people who helped make completing this dissertation possible. My wife and kids who let me exchange time with family to conduct research; my adviser (Dr. Peterson) for challenging my research and writing; my fellow students (Camdon Cady, Chris Wayne, and David King) who let me brainstorm ideas with them; the technical experts (Ben Gruverand Ole Andr V. Ravns) for answering my questions about Smali and Frida, respectively. Rich Dill v Table of Contents Page Abstract . iv Acknowledgements . .v List of Figures . viii List of Tables . .x List of Abbreviations . xi I. Introduction . .1 1.1 Motivation . .1 1.2 Methodology. .3 1.3 Summary . .5 II. Background . .6 2.1 Android Primer . .6 History . .7 Architecture . .8 Internals . .8 2.2 How Applications Work . 10 Architecture . 11 Execution . 16 Memory Allocation . 25 Filesystem Interaction . 28 2.3 Reverse Engineering . 29 Static Techniques. 29 Dynamic Techniques . 31 III. Related Work . 34 3.1 Static Analysis of Binary Executables . 34 3.2 Virtual Machine Data Flow Analysis . 36 3.3 Native Data Flow Analysis . 38 3.4 Integrating Native and Virtual Machine Dataflow Analysis Techniques . 42 3.5 Summary . 43 vi Page IV. System Design and Implementation . 44 4.1 ADSS Overview . 44 4.2 Static Phase . 46 4.3 Dynamic Phase . 48 4.4 Summary . 56 V. Evaluation . 57 5.1 Experimental Design . 57 5.2 Uber ......................................................... 58 5.3 Discord . 69 5.4 Summary . 82 VI. Conclusion and Future Work . 83 6.1 Expanding the Research . 84 Multiple Platforms . 84 Wide Range of Infrastructure . 85 Encrypted Files . 85 Application Security Analysis . 86 Appendix A. ADSS Final Hook List for Uber . 87 Appendix B. ADSS Final Hook List for Discord . 93 Appendix C. Parser for Uber . 98 Appendix D. Parser for Discord . 104 Bibliography . 114 vii List of Figures Figure Page 1. The architecture of the Unix OS [1]. .8 2. The architecture of Android OS [2]. .9 3. Life of an APK [3]. 11 4. APK creation. 11 5. Android APK hierarcy [4]. 12 6. Android APK installation. 13 7. ELF during linking and execution views [5]. 14 8. Application executing. 16 9. Building with a shared library [6]. 17 10. Load time linking of a dynamic library [6]. 18 11. Native application running in memory [7]. 20 12. Kernel structures associated with a process [8]. 21 13. Process segments mapped to memory [8]. 22 14. High Level view of Linux virtual memory [9]. 24 15. ARM AARCH64 Special Registers [10]. ..
Recommended publications
  • FIT to WORK: IMPROVING the SECURITY of MONITORED EMPLOYEES' HEALTH DATA Elizabeth A. Brown INTRODUCTION Imagine Coming to Work
    FIT TO WORK: IMPROVING THE SECURITY OF MONITORED EMPLOYEES' HEALTH DATA Elizabeth A. Brown1 INTRODUCTION Imagine coming to work one day and finding that your employer has given everyone in the company a wearable FitBit health monitor, free of charge. You pop the FitBit on, grateful for another bit of help in managing the health concerns that nag at you persistently but which never quite rise to the top of your priority list. At your next performance review, your supervisor expresses concern about your anxiety levels. Although your work output is slightly off, she notes, there has been a correlation in your lack of sleep and exercise, and she suspects you are depressed. You wonder how your employer might know these things, whether or not they are true, and then you remember the FitBit. Your supervisor then tells you that the promotion you had wanted is going to a colleague who is “better equipped to handle the demands of the job.” You interview for another job, and are asked to provide the password to the HealthDrive account that centralizes the fitness data all the apps on your iPhone collect about you during the day. Similar scenarios are playing out now in workplaces across the country, and will do so more frequently as the personal health sensor market and employee monitoring trends continue to grow. Employers are making key decisions based on employees’ biometric data, collected from specialized devices like a FitBit or the health-related apps installed on mobile phones. BP, for example, adjusts its employees’ health care premiums depending on how much physical activity their wearable FitBit devices monitor – devices that BP provides to thousands of employees, their spouses, and retirees for free.
    [Show full text]
  • Copy on Write Based File Systems Performance Analysis and Implementation
    Copy On Write Based File Systems Performance Analysis And Implementation Sakis Kasampalis Kongens Lyngby 2010 IMM-MSC-2010-63 Technical University of Denmark Department Of Informatics Building 321, DK-2800 Kongens Lyngby, Denmark Phone +45 45253351, Fax +45 45882673 [email protected] www.imm.dtu.dk Abstract In this work I am focusing on Copy On Write based file systems. Copy On Write is used on modern file systems for providing (1) metadata and data consistency using transactional semantics, (2) cheap and instant backups using snapshots and clones. This thesis is divided into two main parts. The first part focuses on the design and performance of Copy On Write based file systems. Recent efforts aiming at creating a Copy On Write based file system are ZFS, Btrfs, ext3cow, Hammer, and LLFS. My work focuses only on ZFS and Btrfs, since they support the most advanced features. The main goals of ZFS and Btrfs are to offer a scalable, fault tolerant, and easy to administrate file system. I evaluate the performance and scalability of ZFS and Btrfs. The evaluation includes studying their design and testing their performance and scalability against a set of recommended file system benchmarks. Most computers are already based on multi-core and multiple processor architec- tures. Because of that, the need for using concurrent programming models has increased. Transactions can be very helpful for supporting concurrent program- ming models, which ensure that system updates are consistent. Unfortunately, the majority of operating systems and file systems either do not support trans- actions at all, or they simply do not expose them to the users.
    [Show full text]
  • The Linux Kernel Module Programming Guide
    The Linux Kernel Module Programming Guide Peter Jay Salzman Michael Burian Ori Pomerantz Copyright © 2001 Peter Jay Salzman 2007−05−18 ver 2.6.4 The Linux Kernel Module Programming Guide is a free book; you may reproduce and/or modify it under the terms of the Open Software License, version 1.1. You can obtain a copy of this license at http://opensource.org/licenses/osl.php. This book is distributed in the hope it will be useful, but without any warranty, without even the implied warranty of merchantability or fitness for a particular purpose. The author encourages wide distribution of this book for personal or commercial use, provided the above copyright notice remains intact and the method adheres to the provisions of the Open Software License. In summary, you may copy and distribute this book free of charge or for a profit. No explicit permission is required from the author for reproduction of this book in any medium, physical or electronic. Derivative works and translations of this document must be placed under the Open Software License, and the original copyright notice must remain intact. If you have contributed new material to this book, you must make the material and source code available for your revisions. Please make revisions and updates available directly to the document maintainer, Peter Jay Salzman <[email protected]>. This will allow for the merging of updates and provide consistent revisions to the Linux community. If you publish or distribute this book commercially, donations, royalties, and/or printed copies are greatly appreciated by the author and the Linux Documentation Project (LDP).
    [Show full text]
  • Your Voice Assistant Is Mine: How to Abuse Speakers to Steal Information and Control Your Phone ∗ †
    Your Voice Assistant is Mine: How to Abuse Speakers to Steal Information and Control Your Phone ∗ y Wenrui Diao, Xiangyu Liu, Zhe Zhou, and Kehuan Zhang Department of Information Engineering The Chinese University of Hong Kong {dw013, lx012, zz113, khzhang}@ie.cuhk.edu.hk ABSTRACT General Terms Previous research about sensor based attacks on Android platform Security focused mainly on accessing or controlling over sensitive compo- nents, such as camera, microphone and GPS. These approaches Keywords obtain data from sensors directly and need corresponding sensor invoking permissions. Android Security; Speaker; Voice Assistant; Permission Bypass- This paper presents a novel approach (GVS-Attack) to launch ing; Zero Permission Attack permission bypassing attacks from a zero-permission Android application (VoicEmployer) through the phone speaker. The idea of 1. INTRODUCTION GVS-Attack is to utilize an Android system built-in voice assistant In recent years, smartphones are becoming more and more popu- module – Google Voice Search. With Android Intent mechanism, lar, among which Android OS pushed past 80% market share [32]. VoicEmployer can bring Google Voice Search to foreground, and One attraction of smartphones is that users can install applications then plays prepared audio files (like “call number 1234 5678”) in (apps for short) as their wishes conveniently. But this convenience the background. Google Voice Search can recognize this voice also brings serious problems of malicious application, which have command and perform corresponding operations. With ingenious been noticed by both academic and industry fields. According to design, our GVS-Attack can forge SMS/Email, access privacy Kaspersky’s annual security report [34], Android platform attracted information, transmit sensitive data and achieve remote control a whopping 98.05% of known malware in 2013.
    [Show full text]
  • File Systems and Disk Layout I/O: the Big Picture
    File Systems and Disk Layout I/O: The Big Picture Processor interrupts Cache Memory Bus I/O Bridge Main I/O Bus Memory Disk Graphics Network Controller Controller Interface Disk Disk Graphics Network 1 Rotational Media Track Sector Arm Cylinder Platter Head Access time = seek time + rotational delay + transfer time seek time = 5-15 milliseconds to move the disk arm and settle on a cylinder rotational delay = 8 milliseconds for full rotation at 7200 RPM: average delay = 4 ms transfer time = 1 millisecond for an 8KB block at 8 MB/s Bandwidth utilization is less than 50% for any noncontiguous access at a block grain. Disks and Drivers Disk hardware and driver software provide basic facilities for nonvolatile secondary storage (block devices). 1. OS views the block devices as a collection of volumes. A logical volume may be a partition ofasinglediskora concatenation of multiple physical disks (e.g., RAID). 2. OS accesses each volume as an array of fixed-size sectors. Identify sector (or block) by unique (volumeID, sector ID). Read/write operations DMA data to/from physical memory. 3. Device interrupts OS on I/O completion. ISR wakes up process, updates internal records, etc. 2 Using Disk Storage Typical operating systems use disks in three different ways: 1. System calls allow user programs to access a “raw” disk. Unix: special device file identifies volume directly. Any process that can open thedevicefilecanreadorwriteany specific sector in the disk volume. 2. OS uses disk as backing storage for virtual memory. OS manages volume transparently as an “overflow area” for VM contents that do not “fit” in physical memory.
    [Show full text]
  • Devicelock® DLP 8.3 User Manual
    DeviceLock® DLP 8.3 User Manual © 1996-2020 DeviceLock, Inc. All Rights Reserved. Information in this document is subject to change without notice. No part of this document may be reproduced or transmitted in any form or by any means for any purpose other than the purchaser’s personal use without the prior written permission of DeviceLock, Inc. Trademarks DeviceLock and the DeviceLock logo are registered trademarks of DeviceLock, Inc. All other product names, service marks, and trademarks mentioned herein are trademarks of their respective owners. DeviceLock DLP - User Manual Software version: 8.3 Updated: March 2020 Contents About This Manual . .8 Conventions . 8 DeviceLock Overview . .9 General Information . 9 Managed Access Control . 13 DeviceLock Service for Mac . 17 DeviceLock Content Security Server . 18 How Search Server Works . 18 ContentLock and NetworkLock . 20 ContentLock and NetworkLock Licensing . 24 Basic Security Rules . 25 Installing DeviceLock . .26 System Requirements . 26 Deploying DeviceLock Service for Windows . 30 Interactive Installation . 30 Unattended Installation . 35 Installation via Microsoft Systems Management Server . 36 Installation via DeviceLock Management Console . 36 Installation via DeviceLock Enterprise Manager . 37 Installation via Group Policy . 38 Installation via DeviceLock Enterprise Server . 44 Deploying DeviceLock Service for Mac . 45 Interactive Installation . 45 Command Line Utility . 47 Unattended Installation . 48 Installing Management Consoles . 49 Installing DeviceLock Enterprise Server . 52 Installation Steps . 52 Installing and Accessing DeviceLock WebConsole . 65 Prepare for Installation . 65 Install the DeviceLock WebConsole . 66 Access the DeviceLock WebConsole . 67 Installing DeviceLock Content Security Server . 68 Prepare to Install . 68 Start Installation . 70 Perform Configuration and Complete Installation . 71 DeviceLock Consoles and Tools .
    [Show full text]
  • Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems
    Dangerous Skills: Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems Nan Zhang∗, Xianghang Mi∗, Xuan Fengy∗, XiaoFeng Wang∗, Yuan Tianz and Feng Qian∗ ∗Indiana University, Bloomington Email: fnz3, xmi, xw7, [email protected] yBeijing Key Laboratory of IOT Information Security Technology, Institute of Information Engineering, CAS, China Email: [email protected] zUniversity of Virginia Email: [email protected] Abstract—Virtual personal assistants (VPA) (e.g., Amazon skills by Amazon and actions by Google1) to offer further Alexa and Google Assistant) today mostly rely on the voice helps to the end users, for example, order food, manage bank channel to communicate with their users, which however is accounts and text friends. In the past year, these ecosystems known to be vulnerable, lacking proper authentication (from the user to the VPA). A new authentication challenge, from the VPA are expanding at a breathtaking pace: Amazon claims that service to the user, has emerged with the rapid growth of the VPA already 25,000 skills have been uploaded to its skill market to ecosystem, which allows a third party to publish a function (called support its VPA (including the Alexa service running through skill) for the service and therefore can be exploited to spread Amazon Echo) [1] and Google also has more than one thousand malicious skills to a large audience during their interactions actions available on its market for its Google Home system with smart speakers like Amazon Echo and Google Home. In this paper, we report a study that concludes such remote, large- (powered by Google Assistant).
    [Show full text]
  • The Linux Device File-System
    The Linux Device File-System Richard Gooch EMC Corporation [email protected] Abstract 1 Introduction All Unix systems provide access to hardware via de- vice drivers. These drivers need to provide entry points for user-space applications and system tools to access the hardware. Following the \everything is a file” philosophy of Unix, these entry points are ex- posed in the file name-space, and are called \device The Device File-System (devfs) provides a power- special files” or \device nodes". ful new device management mechanism for Linux. Unlike other existing and proposed device manage- This paper discusses how these device nodes are cre- ment schemes, it is powerful, flexible, scalable and ated and managed in conventional Unix systems and efficient. the limitations this scheme imposes. An alternative mechanism is then presented. It is an alternative to conventional disc-based char- acter and block special devices. Kernel device drivers can register devices by name rather than de- vice numbers, and these device entries will appear in the file-system automatically. 1.1 Device numbers Devfs provides an immediate benefit to system ad- ministrators, as it implements a device naming scheme which is more convenient for large systems Conventional Unix systems have the concept of a (providing a topology-based name-space) and small \device number". Each instance of a driver and systems (via a device-class based name-space) alike. hardware component is assigned a unique device number. Within the kernel, this device number is Device driver authors can benefit from devfs by used to refer to the hardware and driver instance.
    [Show full text]
  • High Velocity Kernel File Systems with Bento
    High Velocity Kernel File Systems with Bento Samantha Miller, Kaiyuan Zhang, Mengqi Chen, and Ryan Jennings, University of Washington; Ang Chen, Rice University; Danyang Zhuo, Duke University; Thomas Anderson, University of Washington https://www.usenix.org/conference/fast21/presentation/miller This paper is included in the Proceedings of the 19th USENIX Conference on File and Storage Technologies. February 23–25, 2021 978-1-939133-20-5 Open access to the Proceedings of the 19th USENIX Conference on File and Storage Technologies is sponsored by USENIX. High Velocity Kernel File Systems with Bento Samantha Miller Kaiyuan Zhang Mengqi Chen Ryan Jennings Ang Chen‡ Danyang Zhuo† Thomas Anderson University of Washington †Duke University ‡Rice University Abstract kernel-level debuggers and kernel testing frameworks makes this worse. The restricted and different kernel programming High development velocity is critical for modern systems. environment also limits the number of trained developers. This is especially true for Linux file systems which are seeing Finally, upgrading a kernel module requires either rebooting increased pressure from new storage devices and new demands the machine or restarting the relevant module, either way on storage systems. However, high velocity Linux kernel rendering the machine unavailable during the upgrade. In the development is challenging due to the ease of introducing cloud setting, this forces kernel upgrades to be batched to meet bugs, the difficulty of testing and debugging, and the lack of cloud-level availability goals. support for redeployment without service disruption. Existing Slow development cycles are a particular problem for file approaches to high-velocity development of file systems for systems.
    [Show full text]
  • Deviceinstaller User Guide
    Device Installer User Guide Part Number 900-325 Revision C 03/18 Table of Contents 1. Overview ...................................................................................................................................... 1 2. Devices ........................................................................................................................................ 2 Choose the Network Adapter for Communication ....................................................................... 2 Search for All Devices on the Network ........................................................................................ 2 Change Views .............................................................................................................................. 2 Add a Device to the List ............................................................................................................... 3 View Device Details ..................................................................................................................... 3 Device Lists ................................................................................................................................. 3 Save the Device List ................................................................................................................ 3 Open the Device List ............................................................................................................... 4 Print the Device List ................................................................................................................
    [Show full text]
  • Oracle® Linux 7 Managing File Systems
    Oracle® Linux 7 Managing File Systems F32760-07 August 2021 Oracle Legal Notices Copyright © 2020, 2021, Oracle and/or its affiliates. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end users are "commercial computer software" or "commercial computer software documentation" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or adaptation of i) Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the license contained in the applicable contract.
    [Show full text]
  • Linux Cdrom Documentation
    Linux Cdrom Documentation The kernel development community Jul 14, 2020 CONTENTS i ii CHAPTER ONE A LINUX CD-ROM STANDARD Author David van Leeuwen <[email protected]> Date 12 March 1999 Updated by Erik Andersen ([email protected]) Updated by Jens Axboe ([email protected]) 1.1 Introduction Linux is probably the Unix-like operating system that supports the widest variety of hardware devices. The reasons for this are presumably • The large list of hardware devices available for the many platforms that Linux now supports (i.e., i386-PCs, Sparc Suns, etc.) • The open design of the operating system, such that anybody can write a driver for Linux. • There is plenty of source code around as examples of how to write a driver. The openness of Linux, and the many different types of available hardware has al- lowed Linux to support many different hardware devices. Unfortunately, the very openness that has allowed Linux to support all these different devices has also allowed the behavior of each device driver to differ significantly from one device to another. This divergence of behavior has been very significant for CD-ROM devices; the way a particular drive reacts to a standard ioctl() call varies greatly from one device driver to another. To avoid making their drivers totally inconsis- tent, the writers of Linux CD-ROM drivers generally created new device drivers by understanding, copying, and then changing an existing one. Unfortunately, this practice did not maintain uniform behavior across all the Linux CD-ROM drivers. This document describes an effort to establish Uniform behavior across all the dif- ferent CD-ROM device drivers for Linux.
    [Show full text]