Dynamic Analyses for Privacy and Performance in Mobile Applications

Mingyuan Xia

Doctor of Philosophy

School of Computer Science

McGill University Montreal, Quebec 2016-08-14

A Thesis Submitted to the Faculty of Graduate Studies and Research in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Copyright 2016 Mingyuan Xia

DEDICATION

To my beloved family

ACKNOWLEDGMENTS

First and foremost, I deeply appreciate my supervisor Dr. Xue Liu for his patience and advice during my graduate study. I am also very fortunate to have had Dr. Laurie Hendren and Dr. David Lie provide their invaluable feedback to improve my thesis work. I want to thank Dr. Zhengwei Qi and Dr. Yi Gao for collaboration in various research projects. At McGill, I would like to thank all members of CPSLab, the staff of the School of Computer Science, and Ron Simpson. I also enjoyed the fun days with friends from MTSA and the SJTU alumni. I appreciate Howard Wang’s great French skills and Nos Thés for brewing the best milk tea. At IBM Almaden, Dr. Pin Zhou and Dr. Mohit Saxena provided the greatest mentorship during my internship. Finally, I want to acknowledge the IBM Ph.D. Fellowship, the McGill Lorne Trottier Fellowship and NSERC for financially supporting my graduate career.

ABSTRACT

Mobile applications (also called apps) have greatly extended and innovated users’ daily tasks. The mobile programming model features event-driven execution, rapidly changing APIs (about three generations per year) and ubiquitous access to users’ personal data. These features enrich app functionalities but also give rise to many new software problems that impact performance or damage user privacy, many of which are not occasional programming mistakes. In this thesis, we systematically study these problems and develop dynamic program analyses to effectively detect, diagnose and fix them.

We start by researching the sensitive data leakage problem in apps. Since mobile apps can access various sensitive user data stored on the device, data leaks have become a great concern for both end users and app market operators. Existing leak detection approaches rely on static analysis, which does not perform well on real-world apps of growing complexity, further limiting their adoption in real usage. We propose AppAudit, which embodies a novel dynamic analysis that can execute part of the app code while tracking the dissemination of sensitive data. AppAudit also has a static analysis to shrink the analysis scope and boost analysis performance. The synergy of the two analyses achieves higher detection accuracy, runs 8.3× faster and uses 90% less memory on real-world Android apps as compared to previous approaches.

Based on the analysis building blocks from AppAudit, we further develop binary instrumentation to profile and improve app performance. We study 115 thousand apps and common performance anti-patterns from existing literature. Based on these understandings, we propose AppInspector, which instruments apps to profile a small set of methods while collecting various app runtime diagnostic data. These profiling data are transformed into a graph structure, from which AppInspector programmatically diagnoses three common performance anti-patterns. We also develop AppSwift based on AppInspector, which transforms app code to automatically fix some performance anti-patterns and improve app performance. Both tools instrument app code automatically. Instrumented apps can run on unmodified Android OSes and are thus readily deployable to existing test environments. With extensive tests on real-world apps, AppInspector uncovers 22 performance issues per app, with detailed analysis results to guide developers in fixing them; AppSwift automatically eliminates about 5 of these issues without any code modification from the app developer. We believe that the analysis methodologies, frameworks and tools developed in this thesis can assist developers in debugging various performance problems and better protecting user privacy.

ABRÉGÉ

Mobile applications (also called apps) have considerably extended and innovated users’ daily tasks. The mobile programming model features event-driven execution, rapidly evolving APIs (about three generations per year) and ubiquitous access to the user’s personal data. These features enrich app functionalities but also give rise to many new software problems that impact performance or damage user privacy, many of which are not occasional programming mistakes. In this thesis, we systematically study these problems and develop dynamic program analyses to effectively detect, diagnose and fix these new problems.

We start by researching the sensitive data leakage problem in apps. Since mobile applications can access various sensitive user data stored on the device, data leaks have become a great concern for end users and app market operators. Existing leak detection methods rely on static analysis, which does not perform well on real-world applications of growing complexity. We propose AppAudit, which embodies a novel dynamic analysis that can execute part of the app code while tracking the dissemination of sensitive data. AppAudit also has a static analysis to shrink the analysis scope and boost analysis performance. The synergy of the two analyses achieves higher detection accuracy, runs 8.3× faster and uses 90% less memory on real-world Android applications compared to previous approaches.

Based on the analysis building blocks of AppAudit, we develop binary instrumentation to profile and improve application performance. We study 115 thousand applications and common performance anti-patterns from existing literature. Based on these understandings, we propose AppInspector, which instruments applications to profile a small set of methods while collecting various runtime diagnostic data. These profiling data are transformed into a graph structure, from which AppInspector diagnoses three common performance anti-patterns. We also develop AppSwift based on AppInspector, which transforms application code to automatically fix some performance anti-patterns and improve application performance. Both tools instrument application code automatically. Instrumented applications can run on unmodified Android operating systems and are therefore easily deployable to existing test environments. With extensive tests on real-world applications, AppInspector uncovers 22 performance issues per application, with detailed analysis results to guide developers in fixing them; AppSwift automatically eliminates about 5 of these issues without any code modification from the application developer. We believe that the analysis methods, frameworks and tools developed in this thesis can help developers debug various performance problems and better protect user privacy.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENTS
ABSTRACT
ABRÉGÉ
LIST OF TABLES
LIST OF FIGURES
1 Introduction
    1.1 Contributions
    1.2 Thesis Organization
2 Background
    2.1 Android System Hierarchy
    2.2 Android Applications
        2.2.1 Code, Manifest, and Resources
        2.2.2 Execution Model and Performance
        2.2.3 Permission and Privacy
3 AppAudit: Analyzing and Detecting Data Leaks
    3.1 The Information Flow Problem Revisited
    3.2 Related Work
        3.2.1 Static Analysis
        3.2.2 Dynamic Analysis
        3.2.3 Compiler Techniques
    3.3 The Synergy of Two Analyses
    3.4 API Usage Analysis
        3.4.1 Call Graph Extensions
        3.4.2 API Usage Analysis
    3.5 Approximated Execution
        3.5.1 Object and Taint Representation
        3.5.2 Basic Execution Flow
        3.5.3 Complete Execution Rules
        3.5.4 Tainting Rules
        3.5.5 Execution Extensions and Optimizations
        3.5.6 Approximation Mode
        3.5.7 False Positive Analysis: Execution Path Validation
        3.5.8 False Negative Analysis: Tainting Validation
        3.5.9 Infinity Avoidance
    3.6 Evaluation
        3.6.1 Implementation
        3.6.2 Evaluation Methodology
        3.6.3 Completeness of Static API Analysis
        3.6.4 Detection Accuracy
        3.6.5 Usability
        3.6.6 Characterization of Data Leaks in Real Apps
4 AppInspector: Programmatically Diagnosing Performance Issues
    4.1 Introduction
    4.2 Performance Issue Characterization
        4.2.1 Lengthy Operation
        4.2.2 Over Asynchrony
        4.2.3 Memory Bloat
    4.3 AppInspector Design
        4.3.1 Bytecode Instrumentation
        4.3.2 Profile Graph
        4.3.3 Diagnosing Performance Issues
        4.3.4 Implementation
    4.4 Evaluation
        4.4.1 Methodology
        4.4.2 Overall Results
        4.4.3 Diagnose Slow App Initialization
        4.4.4 Pinpoint Lengthy API Calls in Slow Event Handlers
        4.4.5 Reveal Inefficient User-defined Functions
        4.4.6 Find Colliding AsyncTasks
        4.4.7 Comparison with Static Analysis
        4.4.8 Overhead Analysis
    4.5 Related Work
5 AppSwift: Automatically Enhancing App UI Performance
    5.1 Introduction
    5.2 Motivation
    5.3 AppSwift Design
        5.3.1 Overview
        5.3.2 Bitmap Cache
        5.3.3 Simplest API transformation
        5.3.4 API-assisted API transformation
        5.3.5 Data-flow assisted API transformation
        5.3.6 Transformation correctness validation and complexity reduction
        5.3.7 Logging and Inspecting
        5.3.8 Implementation
    5.4 Evaluation
        5.4.1 Methodology
        5.4.2 Experiment Results
        5.4.3 Case Study
    5.5 Discussions
        5.5.1 Generalization
        5.5.2 App Rewriting vs. OS Upgrade
        5.5.3 vs. ART
    5.6 Related Work
6 Conclusion and Future Work
Appendix: PATDroid
References

LIST OF TABLES

3–1 Trigger APIs and extended function calls.
3–2 The execution rules. κ is a series of evaluation functions that perform real calculation when values are known. PTS denotes primitive types.
3–3 The SLOCs for different components.
3–4 Evaluation datasets.
3–5 The breakdown of detection accuracy on Android malware genome dataset.
3–6 App auditing use cases and requirements.
3–7 Free apps that spread certain personal information identified by AppAudit. For the “Privacy Policy” column, a “lib” means that the privacy policy does not cover the kind of data spread by advertising libraries.
4–1 A list of studied memory related bug reports.
4–2 Selected apps and evaluation workloads. Category and downloads are collected from Play Store. All apps have a user review score between 4.2 and 4.7 (out of 5.0).
4–3 Performance issues detected by AppInspector.
4–4 The analysis capability of AppInspector (dynamic) vs. PerfChecker (static). ⊆ means the detected problems form a subset of actual problems while ⊇ indicates a superset relation. UDFs denote User-Defined Functions.
5–1 Benchmark image loading on three devices. All three devices load a full-screen-resolution RGB bitmap image.
5–2 Different bitmap origins and their identifiers. “App private” means the image cannot be accessed by other apps. “Read-only” means the image data cannot be mutated. FILEPATH refers to the file path relative to the app root folder. LMT stands for last modified time of the file. RESID stands for a numeric identifier for the resource generated by the compiler.
5–3 Selected apps and the workload for evaluation. All data are collected from Store.

LIST OF FIGURES

2–1 Android architecture.
2–2 The APK file format and the brief Android build process. AndroidManifest.xml* denotes the binary form of this XML file.
2–3 The event-driven execution of an Android app.
2–4 The permissions request dialog when the user installs an app.
3–1 AppAudit use cases: AppAudit protects app developers from using data-leaking 3rd-party libraries; AppAudit helps app markets to detect data-leaking apps uploaded by untrusted app developers; AppAudit helps mobile users to prevent installing problematic apps from untrusted app markets.
3–2 AppAudit architecture and workflow.
3–3 An extended call graph. Each vertex stands for a function. Solid lines represent traditional call relationships and dashed lines stand for extended calls. Grey vertices are the marked suspicious functions. BRs stand for BroadcastReceivers that can receive system events.
3–4 AppAudit approximated executor state machine.
3–5 Four basic control flow structures and their compiled bytecode streams.
3–6 The overall true positives on Android malware genome dataset (99.3%).
3–7 The average analysis time per app for AppAudit and two static analysis tools. Note that FlowDroid only finishes 61% of the samples (due to OutOfMemory exceptions and 10-minute timeout). Its average time only includes successful cases.
3–8 The venues of data leaking.
3–9 The types of leaked data.
4–1 Breakdown of memory objects involved in memory leak reports.
4–2 AppInspector workflow.
4–3 AppInspector instrumentation details. Lightweight Profiling: 1. event handlers, life cycle methods and UI callbacks; 2. asynchronous functions. Complementary Tracing: 3. time-consuming API calls; 4. GC pause time; 5. stack sampling. Tracking Asynchrony: 6. asynchronous calls.
4–4 Generating and visualizing a profile graph.
4–5 A case of colliding asynchronous functions in the time-thread view of a real profile graph. Dashed lines are asynchronous calls.
4–6 The cumulative distribution function of execution time of long-running methods in the UI thread.
5–1 The cumulative distribution function for API occurrence in 115 thousand apps. The top four APIs related to bitmap are selected, used in more than 80% of the apps.
5–2 The architecture of AppSwift. Four shaded components are runtime components of AppSwift, running with the rewritten app.
5–3 AppSwift’s performance enhancements on 30 real-world apps.
5–4 FaceQ main window, showing the avatar picture currently being made and a list of available style pictures. Each style picture is loaded and displayed asynchronously.

Chapter 1 Introduction

In the past ten years, mobile devices have witnessed growth and tremendous success in the consumer electronics market. Smart devices refer to a broad spectrum of portable electronics powered by similar software platforms such as Google Android and Apple iOS. These include wearable devices such as smart watches (<1.8 diagonal inches) and hand-held devices such as smartphones (2.45 to 5.1 diagonal inches), phablets (5.1 to 6.99 inches) and tablets (>7 inches). Modern TVs and cars are also integrated with smart hardware and software. As of 2015, the worldwide shipment of smartphones alone has reached more than 1.4 billion units [81].

Mobile applications, also termed “apps”, are the mobile software that directly interacts with users on mobile platforms. These apps offer a wide range of functionalities, such as location-based services, health monitoring and mobile gaming, which have greatly changed daily life. Mobile users generally obtain apps from app markets. As of 2015, the Google Android app marketplace provides over 1.5 million apps, with cumulative downloads in the tens of billions [40]. As mobile computing becomes ubiquitous and users continue to use more and more apps, app quality becomes the core factor that impacts user satisfaction. App quality has many aspects; in this thesis, privacy and performance problems in apps are the primary study targets.

Detecting Privacy Threats. Mobile devices nowadays store a wide range of personal user data, and apps can use these data to improve service quality (e.g., targeted search results, location-based recommendation, etc.). However, this also attracts attackers who abuse apps to collect sensitive user information for unfair advertising revenue, phishing and other malicious activities. Low-quality apps are also growing quickly, such as adware that contains only ads, greyware that interferes with and hijacks normal apps, and repackaged apps that tamper with popular apps to steal advertising revenue. App market operators as well as mobile users need tools to detect privacy threats in apps and block harmful apps. Meanwhile, various app privacy studies have gradually revealed that some apps and advertising libraries extensively collect and leak sensitive user data [71, 62, 65]. App developers also need tools to understand the potential privacy threats caused by including 3rd-party libraries.

Recently, the research community has been developing static analyses [48, 121, 103] and dynamic approaches [126, 61] to track sensitive information flow and detect data leaks. However, because of the event-driven nature of apps and their rapidly growing code bases, such tools dramatically degrade in detection precision and consume a considerable amount of time (from several minutes to hours per app). We realize that controlling the cost (memory consumption and analysis time) of program analysis is important for making these analyses practical for app markets, mobile users and app developers.

Diagnosing Performance Problems. Mobile apps are like other graphical user interface (GUI) applications: they are interactive applications that constantly receive user inputs and update their UI. The key performance requirement for interactive apps is to stay responsive to user inputs. Method tracing (TraceView [30] in the Android SDK), which instruments apps to time method execution and reveal long-running methods, is the primary tool available for debugging app performance problems. However, the event-driven execution of apps can spawn multiple threads that work asynchronously to serve one user input. Also, the slowness observed in one particular method might be caused by resource contention with another thread or by a particular execution ordering. Thus method tracing often cannot reveal the root causes of performance problems, and extensive tracing also incurs considerable runtime overhead (up to 2.5x slowdown [74]) and produces a large bulk of logs that are not easy to understand. As a result, according to a recent study of GitHub-hosted mobile app projects [110], profiling tools are not effective for diagnosing performance problems, and developers generally choose to manually time certain methods to detect performance problems.

Recent app debugging research aims to develop more effective performance measurement methods [122, 101, 108, 112, 78] for mobile apps. However, some approaches require changes in the operating system, while others are non-trivial to apply to the Android platform, where apps can make reflection calls, use complicated inter-process message passing and use a wide range of APIs. We realize that relying on a specialized OS is not realistic for an ever-growing number of OEM Android devices, and that a practical diagnosing tool must tackle the complexity of the Android platform.

Meanwhile, researchers are developing mechanisms to improve app performance [102, 92, 69, 56]. However, these tools currently rely on app developers to use new APIs or refactor their existing code base. It remains interesting yet challenging to explore methods that can transform app code and apply performance enhancements automatically.

1.1 Contributions

In this thesis, we develop various program analysis techniques and tools to automatically deal with privacy and performance problems in mobile apps. These techniques could be used by app developers to improve app quality, by app market operators to remove low-quality apps and by users to vet app behavior. The contributions can be divided into three parts:

• AppAudit: we develop a dynamic binary code analysis to detect data leaks in Android apps. This dynamic analysis can execute a part of the app and track the dissemination of user data. We combine this analysis with a static analysis to detect leaks more precisely and more efficiently (both in terms of time and memory consumption). This part of the work is published at the IEEE Symposium on Security and Privacy (S&P’15) [115].

• AppInspector is a tool that diagnoses performance problems in Android apps. It reuses most analysis building blocks from AppAudit and extends them to perform static instrumentation on app binary code. AppInspector collects timing and diagnostic data at runtime and analyzes these data to reveal responsiveness problems and their causes. AppInspector only needs to instrument app code and is thus readily deployable to existing test environments. The preliminary results have been published at the Workshop on Power-Aware Computing and Systems (HotPower’13) [116] and the full paper is in submission, under review [117].

• AppSwift is a code transformation tool built on AppInspector. With AppInspector, we find that app UIs comprise many large bitmap images and that many app responsiveness problems are caused by loading these images inefficiently. We develop AppSwift, which automatically transforms app code to fix common code patterns that cause performance problems. We demonstrate that AppSwift can remove a considerable number of UI responsiveness problems by transforming the app code and retrofitting the uses of various image loading APIs. This part of the work is currently in submission, under review [118].

1.2 Thesis Organization

The rest of the thesis begins with a brief introduction of the Android software stack and its app programming model. Then three chapters elaborate on the methods and tools we developed to address privacy leakage and app performance issues. Finally, a summary of my current research results and future work is provided. The appendix briefly introduces an open-source part of our analysis framework that is used across the three tools we built in the thesis.

Chapter 2 Background

Android is an open-source mobile platform developed by multiple companies, led by Google, and first released in September 2008 [2]. Nowadays, Android is a major mobile platform powering hundreds of millions of devices worldwide. Every day, about one million new Android devices are activated. The Android platform supports a diverse range of devices, from small smart watches and TV boxes, to medium-sized smartphones, phablets and tablets, to large connected vehicles. The Play Store, the Android application marketplace operated by Google, has over one million different apps as of 2016. In this chapter, we introduce the Android software stack with a focus on the composition and the execution model of Android applications.

2.1 Android System Hierarchy

Figure 2–1: Android architecture. Layers from top to bottom: Applications (Apps); Application Framework (Views, Managers, Content Providers); Libraries (WebKit, OpenGL, SQLite, libc, …) alongside the Android Runtime (Dalvik VM, ART); Linux (Drivers, Inter-Process Communication).

The Android software stack is divided into four layers, as shown in Figure 2–1.

Linux. At the very bottom of the software stack, Android runs a tailored Linux operating system, with drivers for various on-board hardware components, such as the WiFi module, GPS module, accelerometer, camera, etc.

Android Runtime and Native Libraries. On top of Linux, Android has the Android runtime and native libraries. The Android runtime is a fully functional virtual machine for Java programs, known as the Dalvik VM [20], which executes programs in Dalvik bytecode format. Newer Android versions also provide a runtime called ART [11], which further compiles Dalvik bytecode to native machine code for improved execution performance. ART and Dalvik are compatible. Native libraries include an SQL-style database, fonts, OpenGL libraries and WebKit, which are fundamental to modern applications.

Application Framework. The layer above that is the Android Application Framework, which appears as several jar files dynamically linked to all Android applications. The framework provides Java classes to access various hardware services (e.g., acquiring GPS locations, reading phone state, etc.). It also provides standard Android UI widgets (namely Views) such as buttons and text boxes. This layer also contains a set of designated application component classes, which should be inherited by the application to perform program logic.

Android Applications. Finally, the topmost layer is the Android applications (or apps for short), developed in Java. Android apps are packed into a single distributable file, called the Android Application Package (APK).

2.2 Android Applications

Android apps are the entities that directly interact with mobile users. Apps can provide various complicated functionalities, such as photo editing, web browsing, gaming, etc. Analyzing and improving the performance and security of apps is the main focus of this thesis. This section elaborates further details about apps that are needed to understand the problems within the scope of the thesis.

2.2.1 Code, Manifest, and Resources

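Figure 2–2: The APK file format and the brief Android build process. From the app source code, .java files are compiled by javac into .class files, which dx converts together with .jar libraries into .dex; aapt compiles AndroidManifest.xml into AndroidManifest.xml* and packs .xml, .png and .wav files into res/; the APK (zip) also contains lib/*.so and META-INF/, the latter produced by jarsigner. AndroidManifest.xml* denotes the binary form of this XML file.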

Figure 2–2 summarizes the composition of an APK file and the build process that generates such files. An APK file is a standard zip file, comprising app code (.dex), a manifest file with metadata describing the app (AndroidManifest.xml), binary linkable libraries (lib/*.so), images, sounds and UI design files (res/) and cryptographic signatures (META-INF/).
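Because an APK is an ordinary zip archive, its composition can be inspected with any zip library. The following is a minimal sketch in Python, not part of the thesis tooling. It builds a toy archive in memory and groups its entries by the roles listed above. The helper names (classify_apk_entries, build_toy_apk) and the entry names other than classes.dex and AndroidManifest.xml are hypothetical and chosen only for illustration.

```python
import io
import zipfile


def classify_apk_entries(apk_bytes):
    """Group the entries of an APK (a zip archive) by the roles named in the text."""
    groups = {
        "code": [],
        "manifest": [],
        "native_libs": [],
        "resources": [],
        "signatures": [],
        "other": [],
    }
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            if name.endswith(".dex"):
                groups["code"].append(name)
            elif name == "AndroidManifest.xml":
                groups["manifest"].append(name)
            elif name.startswith("lib/") and name.endswith(".so"):
                groups["native_libs"].append(name)
            elif name.startswith("res/"):
                groups["resources"].append(name)
            elif name.startswith("META-INF/"):
                groups["signatures"].append(name)
            else:
                groups["other"].append(name)
    return groups


def build_toy_apk():
    """Create an in-memory zip that mimics the APK layout (file contents are dummies)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        for name in (
            "classes.dex",
            "AndroidManifest.xml",
            "lib/armeabi/libdemo.so",   # hypothetical native library
            "res/drawable/icon.png",    # hypothetical image resource
            "META-INF/CERT.RSA",        # hypothetical signature file
        ):
            apk.writestr(name, b"placeholder")
    return buffer.getvalue()


toy_groups = classify_apk_entries(build_toy_apk())
for role, names in toy_groups.items():
    print(role, names)
```

To inspect a real APK, the bytes read from the file on disk can be passed in place of the toy archive.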

Dalvik bytecode (DEX). The app Java source code files are first compiled with the standard JDK javac compiler. Then the dx utility of the Android build tools converts multiple class files (with Oracle Java bytecode) and libraries into one single DEX file with Dalvik bytecode. Dalvik bytecode [20] is register-based, which makes it more compact in space. The single .dex file contains all classes of the app and its 3rd-party libraries. Tools such as smali [72] and PATDroid [27] are essential for analyzing app bytecode.

AndroidManifest File. The AndroidManifest file is an XML file containing various app metadata. For example, an app is not a Java program with a single entry point. Instead, the app developer should inherit designated classes (i.e., app components) in the application framework to implement app logic. The names of the inherited app components are provided in the manifest file so that the Android system knows how to properly launch an app.
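To illustrate how component names are declared, here is a minimal sketch that lists the components in a manifest given in its plain-text source form. It is only an illustration under stated assumptions: the sample manifest, its package name and its class names are made up, and the manifest inside a built APK is stored in binary form, so it would need to be decoded first. The function name list_components is also hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by the android: attribute prefix in manifests.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
# The four kinds of app components that can be declared under <application>.
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")

# Hypothetical plain-text manifest, written only for this example.
SAMPLE_MANIFEST = """\
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
  <application>
    <activity android:name=".MainActivity"/>
    <service android:name=".SyncService"/>
    <receiver android:name="com.example.demo.BootReceiver"/>
  </application>
</manifest>
"""


def list_components(manifest_text):
    """Return a dict mapping each component kind to its declared class names."""
    root = ET.fromstring(manifest_text)
    package = root.get("package", "")
    components = {tag: [] for tag in COMPONENT_TAGS}
    application = root.find("application")
    if application is None:
        return components
    for element in application:
        if element.tag not in components:
            continue
        name = element.get("{%s}name" % ANDROID_NS, "")
        # A name starting with "." is relative to the manifest's package.
        if name.startswith("."):
            name = package + name
        components[element.tag].append(name)
    return components


components = list_components(SAMPLE_MANIFEST)
for kind, names in components.items():
    print(kind, names)
```

Each listed class is a potential entry point that the system may instantiate, which is why an analysis of an app typically starts from the manifest rather than from a single main method.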

Resources and Layout XML. Android also allows app developers to pack images, sounds, animations and UI design files (namely layout XML files) in an app. These multimedia files (so-called Resources) are compiled by the aapt utility of the Android build tools, and the app can access them via framework-provided APIs. The XML snippet below shows a simple UI design with one text box and a button. The aapt tool has a range of built-in UI widgets and pre-defined properties for each widget.