Representing String Computations as Graphs for Classifying Malware Justin Del Vecchio Steven Y. Ko Lukasz Ziarek
[email protected] [email protected] [email protected] University at Buffalo University at Buffalo University at Buffalo Buffalo, New York Buffalo, New York Buffalo, New York ABSTRACT 1 INTRODUCTION Android applications rely heavily on strings for sensitive operations Android is one of the most popular platforms for mobile computing like reflection, access to system resources, URL connections, data- and Android’s official app store, Google Play, contains more than2.9 base access, among others. Thus, insight into application behavior million unique applications (apps) as of December 2018. However, can be gained through not only an analysis of what strings an ap- this popularity has also resulted in a dramatic increase in malicious plication creates but also the structure of the computation used to apps targeting Android devices. As of 2019, security experts [1] have create theses strings, and in what manner are these strings used. In identified over 31 million unique, malicious mobile apps. To combat this paper we introduce a static analysis of Android applications to this increase in malware the security community has developed discover strings, how they are created, and their usage. The output many static analysis tools for identifying malicious Android apps. of our static analysis contains all of this information in the form of By identifying malware statically, these tools can offer enhanced a graph which we call a string computation. We leverage the results security for end users who download and use Android apps. Many to classify individual application behavior with respect to malicious different types of analyses have proven to be useful such as data and or benign intent.