Execution Enhanced Static Detection of Android Privacy Leakage Hidden by Dynamic Class Loading

Execution Enhanced Static Detection of Android Privacy Leakage Hidden by Dynamic Class Loading Yufei Yangy, Wenbo Luoy, Yu Peix, Minxue Pany∗, Tian Zhangy∗ yState Key Laboratory for Novel Software Technology, Nanjing University, China xDepartment of Computing, The Hong Kong Polytechnic University, Hong Kong, China [email protected], [email protected], [email protected], {mxp,ztluck}@nju.edu.cn Abstract—Mobile apps often need to collect and/or access DCL is a feature supported by the Android platform, which sensitive user information to fulfill their purposes, but they may enables apps to extend their behaviors at runtime by using also leak such information either intentionally or accidentally, a class loader to load classes in an explicit fashion—in this causing financial and/or emotional damages to users. In the past few years, researchers have developed various techniques to paper, we refer to classes that are implicitly loaded during app detect privacy leakage in mobile apps, however, such detection execution as internal and those explicitly loaded through DCL remains a challenging task when privacy leakage is implemented as external. While DCL brings great flexibility to Android via dynamic class loading (DCL). app development and has been used widely in developing 2 In this work, we propose the DL technique that enhances frameworks and plug-ins, Google recommended to use it with static analysis with dynamic app execution to effectively detect privacy leakage implemented via DCL in Android apps. To caution [16], since classes from sources that are not verified evaluate DL2, we construct a benchmark of 88 subject apps “might be modified to include malicious behavior”. In fact, a with 2578 injected privacy leaks and apply DL2 to the apps. recent study [17] showed that DCL had been used by malwares DL2 was able to detect 1073, or 42%, of the leaks, significantly to evade the security checks in vetting systems like Google outperforming existing state-of-the-art privacy leakage detection Bouncer. tools. Based on whether the source or the sink of sensitive Keywords-Privacy Leakage Detection, Dynamic Class Loading, information is located in external classes, privacy leakage can Taint Analysis, Constraint Solving be implemented via DCL in one of the following four schemes: (i) sensitive information is retrieved in internal code but leaked I. INTRODUCTION in external code; (ii) sensitive information is retrieved in exter- In the past few years, Android has become the most popular nal code but leaked in internal code; (iii) sensitive information mobile operating system, taking over 80% of the market is both retrieved and leaked in external code; (iv) sensitive share [1], and the number of mobile apps targeting the Android information is both retrieved and leaked in internal code. platform has grown rapidly. Meanwhile, according to a recent Existing techniques based on taint analysis do not perform study [2], more and more Android apps need to collect well in discovering leaks implementing schemes (i)–(iii): Since and/or access information like device id, location information, external classes are statically not part of the app under analysis, SMS messages, and contacts, to fulfill their purposes. While tools based on static analysis only have limited power in such apps greatly facilitate our work and daily life, serious detecting those leaks [3]–[8], [18], [19]; Since most external concerns have been raised about the risks of them leaking that classes are only loaded and executed along certain program information. paths, the chance is often low for those paths to be exercised As one of the security mechanisms, the permission-based during dynamic analysis [9]–[14], [20]–[24]; The DyDroid framework employed on the Android platform can be used to technique [25] conducts a privacy tracking analysis by com- ensure that the apps’ access to certain resource/information bining static and dynamic analyses. It, however, handles only is subject to the user’s approval. The framework, however, is privacy leaks with both the source and the sink of sensitive too coarse-grained in that it is only concerned with whether, information within external classes and cannot detect leaks but not how, an app uses the information. To complement implementing schemes (i) and (ii) listed above. In this work, the protection provided by such coarse-grained mechanism, we refer to privacy leaks implementing schemes (i)–(iii) as various approaches to privacy leakage detection have been being hidden by DCL. Leaks implementing scheme (iv) are proposed and many of them [3]–[14] are based on taint not hidden by DCL, since a conservative static taint analyzer analysis [15]. always assuming the external code to propagate taint data can Certain features of the development language and the exe- still detect the leaks. cution environment of mobile apps, however, have added to To effectively detect privacy leakage hidden by DCL in the difficulties in detecting privacy leakage. One important Android apps, we propose the DL2 technique that enhances example of such features is dynamic class loading (DCL). static analysis with dynamic app execution. Given an Android app, DL2 first applies static analysis to find paths in the app *Corresponding author. that lead to invocations to methods from external classes and gather the corresponding path conditions on program variables. 1 public class SmsReceiver extends BroadcastReceiver{ Next, the path conditions are solved by a constraint solver and 2 ... the solutions are used to drive the dynamic execution of the 3 public void onReceive(...){ app, during which DL2 retrieves the external classes loaded 4 ... 5 String clsName = ..., methodName = ...; and records information about the methods from those classes 6 SmsMessage msg = SmsMessage.createFromPdu(pdus[x]); 2 that are invoked. Finally, DL applies static analysis to both the 7 String phoneNbr = msg.getOriginatingAddress(); internal and external code and combines the results to detect 8 if(msg.getMessageBody().contains(KEYWORD)){ privacy leaks hidden by DCL. 9 DexClassLoader loader=new DexClassLoader(...); To evaluate the effectiveness and efficiency of DL2, we 10 try { 11 Class cls = (Class)loader.loadClass(clsName); constructed a benchmark of 88 subject apps based on 25 12 Object instance = cls.newInstance(); Android apps collected from the Google Play store and the 13 cls.getMethod(methodName, Object.class) 1 F-droid repository . In total, 2578 privacy leaks were injected 14 .invoke(instance, phoneNbr); into the subject apps under the guidance of a privacy leak 15 } catch (Exception e) { ... } model for auditing anti-malware tools [26]. DL2 was able 16 } 17 } to detect 1073, or 42%, of the injected privacy leaks in the 18 ... benchmark, which is 163%, and 200%, more than the state- 19 } of-the-art privacy leak detection tool TaintDroid and DyDroid, Listing 1: The code example of DCL respectively. On average, it took DL2 18.8 seconds to detect one privacy leak. 23 public class Sender{ We make the following contributions in this work: 24 void sendMessage(String phoneNbr){ 2 • We develop the DL technique that enhances static anal- 25 SmsManager smsManager = SmsManager.getDefault(); ysis with dynamic app execution to effectively detect 26 smsManager.sendMultipartTextMessage(phoneNbr, ...); privacy leakage hidden by DCL, and implement the 27 ... 28 } technique into an automated tool also named DL2; 29 ... • We construct a benchmark of 88 subject apps with 2578 30 } injected privacy leaks implemented via DCL; 2 Listing 2: The method invoked by DCL • We experimentally evaluate DL on the subject apps from the benchmark, and compare the performance of DL2 with that of TaintDroid and DyDroid on the same subjects. during program execution. Class SmsReceiver is registered as a The remainder of this paper is organized as follows: Section BroadcastReceiver for incoming text messages and its method II presents a simple example demonstrating how dynamic onReceive is invoked upon receiving any message. In case class loading can be used to hide privacy leakage; Section III the body of a message contains KEYWORD (Line 8), the method explains in detail how DL2 works step by step; Section IV constructs a class loader (Line 9), loads a class named clsName describes the experiments we conducted to evaluate the per- (Line 11), creates an instance of the class (Line 12), and then formance of DL2 and reports on the experimental results; calls a method with the name methodName on the new instance Section V reviews recent work related to privacy leak detection using the message sender’s phone number as the argument for Android apps; Section VI concludes the paper. (Lines 13 and 14). II. DCL AND PRIVACY LEAKAGE Ultimately, whether such behavior causes privacy leakage In this section, we use a small example from the VirusShare2 depends on if the sender’s phone number is considered sen- repository to introduce dynamic class loading on the Android sitive, which class with the name clsName gets loaded, and platform and to demonstrate how DCL can be used to leak how the phone number is utilized in clsName.methodName. sensitive user information. In this work, we adopt a flexible design and allow users Dynamic class loading (DCL) is a mechanism that allows to decide which behaviors are considered causing privacy software systems to decide which classes to load during leakage. Particularly, a user may specify a list of source APIs runtime, and it enables applications to be compiled sepa- and a list of sink

Load more