PROFILEDROID: MULTI-LAYER PROFILING OF ANDROID APPLICATIONS

XUETAO WEI LORENZO GOMEZ PROFESSOR IULIAN NEAMTIU PROFESSOR MICHALIS FALOUTSOS

UNIVERSITY OF CALIFORNIA, RIVERSIDE

WE DEPEND ON SMARTPHONES MORE AND MORE

source:defenseindustrydaily.com “FDA approves Mobisante’s US Army CSDA initiative smartphone (Connecting Soldiers to Digital Applications) ultrasound app” to replace handheld radio + BLUE FORCE [Feb 2011] source:mobilehealthnews.com tracker + portable GPS + video feed ROVER ANDROID IS A POPULAR SMARTPHONE PLATFORM

Operating system share of smartphone sales (US)

850,000 Android phones activated every day [Google letter to investors, April 2012]

BUT WE DON’T UNDERSTAND APP BEHAVIOR

source:washingtonpost.com THE ANDROID APP MARKET

IS A JUNGLE Will this app drain my battery?

Will this app leak my photos? Which radio is best for me?

Will this app tell my friends that I’m a moron? FIRST STEP TO MAINTAINING THE JUNGLE

Provide a low-cost method to profile the behavior of an app

Given a few short executions: • What did the app do? • How does the app use resources? • What entities does it communicate with? • What was the app supposed to do? • Where there conflicts? Why? BENEFICIAL TO

Application developers • Assess performance and security implications • Make better use of resources

End users • Enhance user control and improve experience • Push developers to make better use of resources

ANDROID APPS

Written in Java, compiled into Dalvik VM bytecode

Packaged as name.apk • Signed with developer’s private key • Essentially a .zip file containing: • .dex bytecode file (similar to .class) • “Manifest” file (XML): permissions Permission model • Permissions last FOREVER! • Shown before install • All or nothing

Permissions alone aren’t enough to describe app behavior

DYNAMIC ANALYSIS WITH PROFILEDROID

Android Device Desktop/laptop

Profiling Monitoring

Android Debugging Bridge • Goal Multi-level profiling based on static and dynamic application analysis MONITORING AND ANALYSIS WITH PROFILEDROID PROFILING FRAMEWORK: MONITORING Trace File

Trace Trace File File

Capture 3 user traces, 5 minutes per app PROFILING FRAMEWORK: MONITORING

Playback Trace File

Logs

Playback original trace and collect logs A QUICK REPLAY DEMO

PROFILING FRAMEWORK: MONITORING

Logs Trace File

Logs

Repeat playback 10 times per user (5 in morning and 5 at night, per app) Total of 30 runs of each app to build profile PROFILING FRAMEWORK: MULTI-LAYER ANALYSIS

Application Static

Application Framework User Logs Libraries, Android Runtime OS Kernel Network Android software stack

What metrics can be used to capture app behavior? SELECTED APPS Category App Social Facebook Games Angry Birds, Angry Birds$$ Music & Audio Pandora, Shazam, Shazam$$ Media & Video Youtube Shopping Amazon Travel Gasbuddy Health & Fitness Instant Heart Rate, Instant Heart Rate$$ Communication Dolphin browser Sports ESPN Reference Dictionary.com, Dictionary.com$$ total 27 apps: 19 free , 8 paid

Wide range of apps, spanning many categories Popular apps with >1,000,000 installs STATIC LAYER Source: manifest & bytecode decompilation • Permissions (shown at install) • Internet • Location (GPS or network) Application Static • Phone • … Application Framework User • Intents (not shown at install) • Resource use Libraries, Android Runtime without permission via deputy apps OS Linux Kernel Network STATIC LAYER ANALYSIS RESULTS

App Internet GPS Camera Mic Bluetooth Telephony Facebook p p i* p Dictionary.com p i i Instant Heart Rate p p i i Shazam p p p Total (out of 27) 27 9 6 4 3 5

p = use via permissions i = use via intents (deputy apps)

*for version originally tested March 2012 USER LAYER

Source: logcat, /dev/input/event • Input devices and events • Touchscreen Application Static • Physical Buttons • Accelerometer Application Framework User • Compass Libraries, Android Runtime • Light proximity sensor OS … Linux Kernel Network USER LAYER ANALYSIS RESULTS LAYER

Source: strace • System call categories • Network sockets Application Static • File system User • VM & IPC Application Framework • Enforces isolation Libraries, Android Runtime • Overhead: scheduling, idling, IPC OS Linux Kernel Network OPERATING SYSTEM LAYER RESULTS

App Intensity Filesystem Network VM & IPC Misc (syscalls/sec) (%) (%) (%) (%) Tiny Flashlight 436 1 1 77 21 Facebook 1,031 4 3 72 21 Amazon 693 1 6 77 16 InstHeartRate 944 8 2 75 15 NETWORK LAYER Source: tcpdump (packets and content) App traffic • Origin (app's website) • CDN and Cloud Application Static • Google User • 3rd party: ads & tracking Application Framework

Libraries, Android Runtime OS Linux Kernel Network NETWORK LAYER RESULTS

App Intensity In/out Origin CDN+ Google 3rd HTTP/ ratio Cloud party HTTPS (bytes/ (%) (%) (%) (%) split sec) (%) Tiny Flashlight 134 2.49 - - 99 - 100/- AdvTaskKiller 26 0.94 - - 100 - 92/8 AdvTaskKiller$$ ------Facebook 4,606 1.45 68 32 - - 23/77 Amazon 7,758 8.17 95 5 - - 99/1 InstHeartRate 575 2.39 - 4 86 10 86/14 InstHeartRate$$ 6 0.31 - 9 90 1 20/80 APPLICATION THUMBNAILS

High usage Medium usage Low usage READING BETWEEN THE LINES

Free apps are not as free as we might think • 50—100% higher system call intensity • Dramatically higher network traffic (usually ads&tracking)

 Bad for your dataplan, your battery life, and your privacy

VM-based isolation comes at a cost • 64—87% of system calls are due to VM and IPC READING BETWEEN THE LINES

Apps talk to many servers spread across many top-level domains • AngryBirds$$: 4 domains, AngryBirds free: 8 domains • Weatherbug: 13 domains, Shazam: 13 domains

Most network traffic is not encrypted Google traffic is predominant • Except for Amazon and Facebook which have 0 (zero) Google traffic

FUTURE WORK

• Expand study to include more apps • User profiles • Study the variance across users • Fully automate process • Profiler as an app to run on the device • Provide summary of usage on close

QUESTIONS?