Tackling Runtime-Based Obfuscation in Android with TIRO
Michelle Y. Wong and David Lie
University of Toronto

Abstract

Obfuscation is used in malware to hide malicious activity from manual or automatic program analysis. On the Android platform, malware has had a history of using obfuscation techniques such as Java reflection, code packing and value encryption. However, more recent malware has turned to employing obfuscation that subverts the integrity of the Android runtime (ART or Dalvik), a technique we call runtime-based obfuscation. Once subverted, the runtime no longer follows the normally expected rules of code execution and method invocation, raising the difficulty of deobfuscating and analyzing malware that use these techniques.

In this work, we propose TIRO, a deobfuscation framework for Android using an approach of Target-Instrument-Run-Observe. TIRO provides a unified framework that can deobfuscate malware that use a combination of traditional obfuscation and newer runtime-based obfuscation techniques. We evaluate and use TIRO on a dataset of modern Android malware samples and find that TIRO can automatically detect and reverse language-based and runtime-based obfuscation. We also evaluate TIRO on a corpus of 2000 malware samples from VirusTotal and find that runtime-based obfuscation techniques are present in 80% of the samples, demonstrating that runtime-based obfuscation is a significant tool employed by Android malware authors today.

1 Introduction

There are currently an estimated 2.8 million applications on the Google Play store, with thousands being added and many more existing applications being updated daily. A large market with many users naturally draws attackers who create and distribute malicious applications (i.e. malware) for fun and profit. While dynamic analyses [10, 27, 28, 34] can be used to detect and analyze malware, anti-malware tools often use static analysis as well for efficiency and greater code coverage [1, 2, 12]. As a result, malware authors have increasingly turned to obfuscation to hide their actions and confuse both static and dynamic analysis tools. The presence of obfuscation does not indicate malicious intent in and of itself, as many legitimate applications employ code obfuscation to protect intellectual property. However, because of its prevalence among malware, it is crucial that malware analyzers have the ability to deobfuscate Android applications in order to determine if an application is indeed malicious or not.

There exist a variety of obfuscation techniques on the Android platform. Many common techniques, such as Java reflection, value encryption, dynamically decrypting and loading code, and calling native methods have been identified and discussed in the literature [11, 22, 26]. These techniques have a common property in that they exploit facilities provided by the Java programming language, which is the main development language for Android applications, and thus we call these language-based obfuscation techniques. In contrast, malware authors may eschew Java and execute entirely in native code, obfuscating with techniques seen in x86 malware [3, 8, 17, 20, 24]. We call this technique full-native code obfuscation.

In this paper, we identify a third option: obfuscation techniques that subvert ART, the Android RunTime, which we call runtime-based obfuscation techniques. These techniques subtly alter the way method invocations are resolved and code is executed. Runtime-based obfuscation has advantages over both language-based and full-native code obfuscation. While language-based obfuscation techniques have to occur immediately before the obfuscated code is called, runtime-based obfuscation techniques can occur in one place and alter code execution in a seemingly unrelated part of the application. This significantly raises the difficulty of deobfuscating code, as code execution no longer follows expected conventions and analysis can no longer be performed piecemeal on an application, but must examine the entire application as a whole. Compared to full-native code obfuscation, runtime-based obfuscation allows a malware developer to still use the convenient Java-based API libraries provided by the framework. Malware that use native code obfuscation will either have to use language- or runtime-based obfuscation to hide its Android API use, or risk compatibility loss if it tries to access APIs directly. Our study of obfuscated malware suggests that authors almost universally employ language- and runtime-based methods to hide their use of Android APIs in Java.

To study both language- and runtime-based obfuscation in Android malware, we propose TIRO, a tool that can handle both types of obfuscation techniques within a single deobfuscation framework. TIRO is an acronym for the automated approach taken to defeat obfuscation: Target-Instrument-Run-Observe. TIRO first analyzes the application code to target locations where obfuscation may occur, and applies instrumentation either in the application or runtime to monitor for obfuscation and collect run-time information. TIRO then runs the application with specially generated inputs that will trigger the instrumentation. Finally, TIRO observes the results of running the instrumented application to determine whether obfuscation occurred and if so, produce the deobfuscated code. TIRO performs these steps iteratively until it can no longer detect any new obfuscation. This iterative mechanism enables it to work on a variety of obfuscated applications and techniques.

TIRO's hybrid static-dynamic design is rooted in an integration with IntelliDroid [31], which implements targeted dynamic execution for Android applications. TIRO uses this targeting to drive its dynamic analysis to locations of obfuscation, saving it from having to execute unrelated parts of the application. However, IntelliDroid uses static analysis and is susceptible to language-based and runtime-based obfuscation, which can make its analysis incomplete. By using an iterative design that feeds dynamic information back into static analysis for deobfuscation, TIRO can incrementally increase the completeness of this targeting, which further improves its deobfuscation capabilities. In this synergistic combination, IntelliDroid improves TIRO's efficiency by targeting its dynamic analysis toward obfuscation code and TIRO improves IntelliDroid's completeness by incorporating deobfuscated information back into its targeting. Successive iterations allow each to refine the results of the other.

We make three main contributions in this paper:

1. We identify and describe a family of runtime-based obfuscation techniques in ART, including DEX file hooking, class modification, ArtMethod hooking, method entry-point hooking and instruction hooking/overwriting.

2. We present the design and implementation of TIRO, a framework for Android-based deobfuscation that can handle both language-based and runtime-based obfuscation techniques.

3. We evaluate TIRO on a corpus of 34 modern malware samples provided by the Android Malware team at Google. We also run TIRO on 2000 obfuscated malware samples downloaded from VirusTotal to measure the prevalence of various runtime-based obfuscation techniques in the wild and find that 80% use a form of runtime-based obfuscation.

We begin by providing background on the Android runtime and classical language-based obfuscation techniques in Section 2. We then introduce and explain runtime-based obfuscation techniques in Section 3. We present TIRO, a deobfuscation framework that can handle both language- and runtime-based obfuscation in Section 4 and provide implementation details in Section 5. We present an analysis of obfuscated Android malware in Section 6 and show how TIRO can deobfuscate these applications. We analyze our findings and our limitations in Section 7. Related work is discussed in Section 8. Finally, we conclude in Section 9.

2 Background

Android applications are implemented in Java, compiled into DEX bytecode, and executed in either the Dalvik Virtual Machine or the Android Runtime (ART).¹ The Dalvik VM, used in Android versions prior to 4.4, interprets the DEX bytecode and uses just-in-time (JIT) compilation for frequently executed code segments. ART, a separate runtime introduced in Android 4.4 and set as the default in Android 5.0, adds ahead-of-time (AOT) compilation (using the dex2oat tool) to a DEX interpreter. Starting in Android 7.0, ART also includes profile-based smart compilation that uses a mixture of interpretation, JIT, and AOT compilation to boost application performance.

¹ https://source.android.com/devices/tech/dalvik/

We briefly discuss traditional language-based obfuscation and full-native code obfuscation techniques:

Reflection. Java provides the ability to dynamically instantiate and invoke methods using reflection. Because the target of reflected method invocations is only known at run-time, this frustrates static analysis and can make the targets of these calls unresolvable (e.g. by using an encrypted string), thus hiding call edges and data accesses.

Value encryption. Key values and strings in an application can be encrypted so they are not visible to static analysis. When executed, code in the application decrypts the values, allowing the application to use the plain text at run-time. Value encryption is often combined with reflection to hide the names of classes or methods targeted by reflected calls.

Dynamic loading. Code located outside the main application package (APK) can be executed through dynamic code loading. This is often used in packed applications, where the hidden code is stored as an encrypted binary file within the APK package and decrypted when the application is launched.

[...] Java code. As a result, applications that use native code obfuscation still need obfuscation for Java code if they want to be able to make Android API calls reliably.

3 Runtime-based obfuscation

Before we describe runtime-based obfuscation, we first describe how code is loaded and executed in the ART runtime. Figure 1 illustrates three major steps in loading