Modeling, Analyzing and Optimizing Android System As Complex Networks


Die Li, Yuan Dong*, Huan Luo, Tingyu Jiang, Shengyuan Wang, Yu Chen
Department of Computer Science and Technology
Tsinghua University, Beijing 100084, China
*Email: [email protected]

Abstract: Due to the rise of mobile devices, there is an increasing interest in optimizing the mobile operating system for both performance and code size. However, the complexity of the mobile operating system makes these optimizations more challenging. This paper presents the extended call graph (ECG) for modeling the access relations of functions and data objects of all the native code of Android except that in the Linux kernel. We identified that the ECG of Android is a typical complex network. It exhibits the basic features of scale-free topology and small-world structure. Based on these results, this paper found and eliminated the “isolated” vertices and subgraphs (without indegree and outdegree) in the graph to build a better connected and centralized graph. The largest connected subgraph before optimization contains only 87.0% of all the vertices, while the largest connected subgraph after optimization contains 99.9% of all the vertices. Thus, most of the dead code in libraries and framework is eliminated automatically. This method is applied to Android-x86 on the ASUS Eee PC. It achieves about 26.7% code size reduction in total and up to 1.3% speedup in floating-point computation. It can also be adopted to model, analyze and optimize other mobile operating systems developed in C/C++ and assembly languages. This work not only provides a foundation for optimizing the Android operating system for both performance and code size, but also helps us to understand and develop complex software systems more efficiently.

I. INTRODUCTION

With the rise of mobile devices, such as smartphones and pads, there is an increasing need to optimize the mobile operating system for both performance and code size. However, the complexity of a modern operating system like Android[1] makes these optimizations more challenging. Android is the most popular operating system for mobile devices, accounting for about 50% of the smartphone market share in the 4th quarter of 2011[2].

Android is composed of a number of complex software systems. As shown in Figure 1, it adopts a layered architecture[3]. All applications are written in the Java programming language (sometimes with the Java native interface, i.e. JNI[4], to call native code). The application framework includes both Java and C/C++ code. It provides services, such as the activity manager and window manager, for applications. The libraries and Android runtime are mainly implemented in C/C++. Underneath all of the above is the Linux kernel.

Fig. 1. Android architecture. (Layers, top to bottom: Applications; Application Framework; Libraries, including Surface manager, Media framework, SQLite, OpenGL | ES, FreeType, WebKit, SGL, SSL and libc (Bionic), alongside the Android runtime with Core libraries and Dalvik VM; Linux Kernel.)

The total number of physical source lines of code of Android-x86 3.2.2 (version 20120215) is 22,697k. It is developed in Java, C/C++ and other programming languages. Table I shows the code line statistics of different components.

TABLE I
CODE LINE STATISTICS OF ANDROID-X86 3.2.2 (IN THOUSANDS OF LINES, GENERATED BY SLOCCOUNT[5]).

                   C/C++    Asm    Java   Others    Total
Applications         182      0     525        1      708
App framework        783      7     638        4    1,432
Android runtime      282     53     538        7      880
Libraries          8,073    127     568      467    9,234
Linux kernel       9,549    241       0       27    9,817
Others               346    0.1     257       22      626
Total             19,215    428   2,526      528   22,697

The system contains a lot of dead code that is completely unreachable, because the source code comes from different developers, and some of it from open source communities whose main platforms are not mobile systems. Android uses its own C library, called “Bionic”, composed partly from the BSD C library combined with Android original code[6]. On the x86 architecture, the size of a stripped Bionic library is only 26% of that of a stripped GNU C Library[7]. It achieves the small code size by not providing unnecessary interfaces. For example, many inter-process communication interfaces are not implemented because Android has its own “intent” service[6].

Therefore, it would be valuable to eliminate the dead code from all the native code of Android except that in the Linux kernel (there are about 9,507k lines of C/C++ and assembly, as shown in the shadowed cells in Table I). But there are two fundamental challenges that must be addressed first.

Challenge #1: How to model the complex Android system? With a sound theoretical basis and a variety of applications, complex networks offer a perfect nonlinear abstraction when analyzing applications to real-world problems[8]. Extensive research[9], [10] has confirmed that class collaboration graphs and static call graphs of large-scale software such as eMule, Openvrml, MySQL, and the Linux Kernel display the scale-free topology and small-world properties of complex networks, which have also been found in networks built from inter-package dependencies in Linux distributions[11], [12]. Most research on networks of open-source software focuses on a single piece of software or on package-level relations. None of it handles function-level relations in multilingual complex systems such as Android.

Challenge #2: How to identify and eliminate the dead code in Android? Dead code is common in real-world software[13]. Most modern compilers build a control flow graph to eliminate dead code in a function. Some compilers perform a similar operation on the compilation-unit level to remove functions and objects that are local to a compilation unit but never used. More sophisticated compilers can perform inter-procedural or link-time optimizations[14][15][16] to remove unused functions and objects on the program level, which usually involves a call graph for the whole program. However, these existing methods cannot solve the problem we face here.

Our Contributions. This paper proposes an optimization method based on modeling and analyzing the Android system as complex networks. Building upon previous work in open source software analysis and link-time optimization, this paper makes the following contributions:

• Our work presents the extended call graph (ECG) for modeling the relations of functions and data objects in relocatable files¹. Then we build the extended call graph of all the native code (C/C++/Asm) of Android except that in the Linux kernel.
• We identified that the ECG of Android is a typical complex network. It exhibits the basic features of scale-free topology and small-world structure. We also found that there are many “isolated” vertices and subgraphs (without indegree and outdegree) in the graph. This means that there are plenty of vertices that have no interactions with the rest of the graph.
• Based on these results, this paper eliminates the “isolated” vertices to build a better connected graph. The largest connected subgraph of the original graph contains only 87.0% of all the vertices, while that of the graph after optimization contains 99.9% of all the vertices. Thus, most of the dead functions and inaccessible variables in libraries and framework are eliminated automatically.

¹ Relocatable files are commonly called “object files.” Because we directly manipulate ELF files, we follow the ELF specification[17] to use “relocatable files” or “relocatable object files”, while “object files” includes relocatable, executable and shared object files.

To the best of our knowledge, this is the first work to construct and analyze the complete extended call graph for functions and data objects of the Android system, and then refine the graph to perform system-wide dead code elimination. This method is applied to Android-x86 3.2.2. It achieves about 26.7% code size reduction in total and up to 1.3% speedup in floating-point computation.

The rest of this paper is organized as follows: we first present an informal overview of our approach to analyzing and optimizing the Android system (Sec II). Then we define, construct and analyze the extended call graph of the Android system as complex networks (Sec III). Based on these results, we present a system-wide binary rewrite optimization to eliminate dead code (Sec IV), and show the evaluation on the real system (Sec V). Finally we discuss related work (Sec VI), draw conclusions and discuss future work (Sec VII).

II. OVERVIEW

Figure 2 outlines the steps of our analysis and optimizations. In the default build process of a system with multiple binaries, each binary (executable or shared library) is compiled and linked separately, and the compiler and linker only use information from the source files for one binary. In a complex system such as Android, binaries produced in this manner are usually not optimal. For example, a shared library may contain externally visible symbols that are never actually used.

[Fig. 2: flow diagram of the method. Recoverable labels: Source files, Compile, Relocatable object files, Build graph, Extended call graph, Analyze graph, Optimize.]

In our method, we construct a whole-system extended call graph (ECG) by extracting information from relocatable object files. The ECG is a directed graph. [...] are actually unreachable and never used. In order to identify them more accurately, we define the significant subgraph of the ECG, which consists of vertices that are actually reachable and must be retained. Vertices not in the significant subgraph can be eliminated from the final binaries. The significant subgraph can be found with a straightforward graph search algorithm. We rewrite the relocatable files to mark sections not in the significant subgraph in a way that enables the linker to discard them at link time.

Finally, we evaluate the size, correctness and performance of the optimized system and compare them with those of the original system.
