Dissertation Boosting Static Security Analysis of Android Apps Through Code Instrumentation
Total Page:16
File Type:pdf, Size:1020Kb
PhD-FSTC-2016-52 The Faculty of Science, Technology and Communication Dissertation Defense held on the 29th November 2016 in Luxembourg to obtain the degree of Docteur de l’Université du Luxembourg en Informatique by Li Li Born on 13th February 1989 in ChongQing (China) Boosting Static Security Analysis of Android Apps through Code Instrumentation Dissertation Defense Committee Dr. Yves Le Traon, Dissertation Supervisor Professor, University of Luxembourg, Luxembourg Dr. Lionel Briand, Chairman Professor, University of Luxembourg, Luxembourg Dr. Tegawendé Bissyandé, Vice Chairman Research Associate, University of Luxembourg, Luxembourg Dr. Xiangyu Zhang Professor, Purdue University, USA Dr. Michael Backes Professor, Saarland University, Germany Dr. Jacques Klein Senior Research Scientist, University of Luxembourg, Luxembourg Abstract Within a few years, Android has been established as a leading platform in the mobile market with over one billion monthly active Android users. To serve these users, the official market, Google Play, hosts around 2 million apps which have penetrated into a variety of user activities and have played an essential role in their daily life. However, this penetration has also opened doors for malicious apps, presenting big threats that can lead to severe damages. To alleviate the security threats posed by Android apps, the literature has proposed a large body of works which propose static and dynamic approaches for identifying and managing security issues in the mobile ecosystem. Static analysis in particular, which does not require to actually execute code of Android apps, has been used extensively for market-scale analysis. In order to have a better understanding on how static analysis is applied, we conduct a systematic literature review (SLR) of related researches for Android. We studied influential research papers published in the last five years (from 2011 to 2015). Our in-depth examination on those papers reveals, among other findings, that static analysis is largely performed to uncover security and privacy issues. The SLR also highlights that no single work has been proposed to tackle all the challenges for static analysis of Android apps. Existing approaches indeed fail to yield sound results in various analysis cases, given the different specificities of Android programming. Our objective is thus to reduce the analysis complexity of Android apps in a way that existing approaches can also succeed on their failed cases. To this end, we propose to instrument the app code for transforming a given hard problem to an easily-resolvable one (e.g., reducing an inter-app analysis problem to an intra-app analysis problem). As a result, our code instrumentation boosts existing static analyzers in a non-invasive manner (i.e., no need to modify those analyzers). In this dissertation, we apply code instrumentation to solve three well-known challenges of static analy- sis of Android apps, allowing existing static security analyses to 1) be inter-component communication (ICC) aware; 2) be reflection aware; and 3) cut out common libraries. • ICC is a challenge for static analysis. Indeed, the ICC mechanism is driven at the framework level rather than the app level, leaving it invisible to app-targeted static analyzers. As a consequence, static analyzers can only build an incomplete control-flow graph (CFG) which prevents a sound analysis. To support ICC-aware analysis, we devise an approach called IccTA, which instruments app code by adding glue code that directly connects components using traditional Java class access mechanism (e.g., explicit new instantiation of target components). • Reflection is a challenge for static analysis as well, because it also confuses the analysis context. To support reflection-aware analysis, we provide DroidRA, a tool-based approach, which instruments Android apps to explicitly replace reflective calls with their corresponding traditional Java calls. The mapping from reflective calls to traditional Java calls is inferred through a solver, where the resolution of reflective calls is reduced to a composite constant propagation problem. • Libraries are pervasively used in Android apps. On the one hand, their presence increases time/memory consumption of static analysis. On the other hand, they may lead to false positives and false negatives for static approaches (e.g., clone detection and machine learning-based i malware detection). To mitigate this, we propose to instrument Android apps to cut out a set of automatically identified common libraries from the app code, so as to improve static analyzer’s performance in terms of time/memory as well as accuracy. To sum up, in this dissertation, we leverage code instrumentation to boost existing static analyzers, allowing them to yield more sound results and to perform quicker analyses. Thanks to the afore- mentioned approaches, we are now able to automatically identify malicious apps. However, it is still unknown how malicious payloads are introduced into those malicious apps. As a perspective for our future research, we conduct a thorough dissection on piggybacked apps (whose malicious payloads are easily identifiable) in the end of this dissertation, in an attempt to understand how malicious apps are actually built. ii Acknowledgements This dissertation would not have been possible without the support of many people who in one way or another have contributed and extended their precious knowledge and experience in my PhD studies. It is my pleasure to express my gratitude to them. First of all, I would like to express my deepest thanks to my professor Yves Le Traon who has given me this great opportunity to come across continents to pursue my doctoral degree. He has always trusted and supported me with his great kindness throughout my whole PhD journey. Second, I am equally grateful to my daily supervisors, Dr. Jacques Klein and Dr. Tegawendé Bissyandé, for their patience, valuable advice and flawless guidance. They have taught me how to perform research, how to write technical papers and how to conduct fascinating presentations. Their dedicated guidance has made my PhD journey a fruitful and fulfilling experience. I am very happy for the friendship we have built up during the years. Third, I am particularly grateful to Dr. Alexandre Bartel who has introduced me into the world of static analysis of Android apps. Since then, working in this field is just joyful for me. I would like to extend my thanks to all my co-authors including Prof. Eric Bodden, Prof. Patrick McDaniel, Prof. David Lo, Dr. Lorenzo Cavallaro, Dr. Damien Octeau, Steven Arzt, Siegfried Rasthofer, Jabier Martinez, Daoyuan Li, etc. for their helpful discussions and collaborations. I would like to thank all the members of my PhD defense committee, including chairman Prof. Lionel Briand, external reviewers Prof. Xiangyu Zhang and Prof. Michael Backes, my supervisor Prof. Yves Le Traon, and my daily advisers Dr. Jacques Klein, Dr. Tegawendé Bissyandé. It is my great honer to have them in my defense committee and I appreciate very much their efforts to examine my dissertation and evaluate my PhD work. I would like to also express my great thanks to all the friends that I have made in the tiny Grand Duchy of Luxembourg for the memorable moments that we have had. More specifically, I would like to thank all the team members of SerVal at SnT for the great coffee breaks and interesting discussions. Finally, and more personally, I would like to express my deepest appreciation to my dear wife Fengmei Chen, for her everlasting support, encouragement, and love. Also, I can never overemphasize the importance of my parents in shaping who I am and giving me the gift of education. Without their support, this dissertation would not have been possible. Li LI University of Luxembourg November 2016 iii Contents List of figures ix List of tables xiii Contents xiv 1 Introduction 1 1.1 Motivation for Static Security Analysis..........................2 1.2 Challenges of Static Security Analysis..........................3 1.2.1 Android-specific Challenges............................3 1.2.2 Java-inherited Challenges.............................4 1.3 Contributions........................................5 1.4 Roadmap..........................................7 2 Background 9 2.1 Android........................................... 10 2.1.1 App Structure................................... 11 2.1.2 Inter-Component Communication (ICC)..................... 13 2.1.3 Reflection...................................... 14 2.2 Static Analysis....................................... 15 2.2.1 Call Graph (CG) Construction.......................... 15 2.2.2 Control-Flow Graph (CFG) Construction.................... 16 2.2.3 Data-Flow Analysis................................ 17 2.3 Code Instrumentation................................... 18 3 An SLR on Static Analysis of Android Apps 21 3.1 Introduction......................................... 22 3.2 Methodology for the SLR................................. 23 3.2.1 Research Questions................................. 24 3.2.2 Search Strategy................................... 25 3.2.3 Exclusion Criteria................................. 27 3.2.4 Backward Snowballing............................... 28 3.2.5 Primary publications selection.......................... 28 3.3 Data Extraction...................................... 30 3.4 Summary of Findings................................... 31 3.4.1 Purposes of the Analyses............................. 31 3.4.2 Form and Extent of Analysis..........................