Extracting Library-Based Object-Oriented Applications

Extracting Library-Based Object-Oriented Applications Peter F. Sweeney Frank Tip IBM Thomas J. Watson Research Center IBM Thomas J. Watson Research Center P.O. Box 704, Yorktown Heights, NY 10598, USA P.O. Box 704, Yorktown Heights, NY 10598, USA pfs@ us.ibm.com [email protected] ABSTRACT s Modem objec~oriented applications typically rely on one or more independently developed class libraries. In an increasingly popular model of software distribution, software is developed in one computing environment and de- With the advent of virtual machine technology~ library ployed in other environments by transfer over the internet. code is amenable to the same analyses as application code, because the same representation is used in each Extraction tools perform a static whole-program analysis to determine unused functionality in applications in order case. When an application is distributed separately from the libraries it depends upon, an extraction tool to reduce the time required to download applications. We have identified a number of scenarios where extraction tools needs to be aware of the boundary between the two. require information beyond what can be inferred through • Different kinds of software distributions (e.g.,complete static analysis: software distributions other than complete applications, web-based applications that execute in applications, the use of reflection, and situations where an the context of a browser, and extensible frameworks) application uses separately developed class libraries. This have different sets of entry points, and require the paper explores these issues, and introduces a modular spec- application extractor to make different assumptions ~cation language for expressing the information required about the deployment environment. In fact, the same for extraction. We implemented this language in the con- unit of software may even play different roles, depend- text of Jax, an industrial-strength application extractor for ing on the deployment scenario. Jaw, and present a small ease study in which different extraction scenarios are applied to a commercially available s The use of dynamic features such as reflection I poses library-based application. additional problems for extraction tools, because a static analysis alone is incapable of detern~inlng the program constructs that are used, and hence the program con- 1. INTRODUCTION structs that can be removed. In an increasingly popular software distribution model, software is developed in one computing environment, and • There are also some interesting interactions between deployed in other environments by transfer over the inter- the above issues. For example, consider a situation net. Because the time required to transfer an application is where an application A is to be distributed together generally proportional to the transferred number of bytes, with an independently developed class library L in it becomes important to make applications as small as pos- which reflection is used. In general, the use of re- sible. Application extractor8 are tools that reduce appli- flection in L may depend on the features in L that cation size by determining unused functionality that can be are used by A. We will discuss how this observation removed from the application without affecting program be- affects extraction. havior. Each of these issues requires information that cannot be ob- Previously~ extractors have been designed primarily with complete applications in mind. Such whole-application ex- tained using static analysis alone, and has to be provided to the extraction tool by the user. This paper explores the tractors require one to specify an application's entry point(s), above issues in detail, and provides a uniform solution in and rely on a static whole-program analysis to determine the form of a small, modular specification language MEL functionality that can be removed without affecting program (Modular Extraction Language) for prodding the informa- behavior. However, the extraction of software distributions other than complete applications raises several issues: tion required to extract various kinds of programs. MEL's features are essentially language-independent, with the ex- ception of some Java-speci~c syntax used to refer to program Permission to make digital or hard copies of all or part of this work for constructs such as classes, methods and fields. In order to personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advan- validate our approach, we implemented MEL in the context tage and that copies bear this notice and the full citation on the first page. of Jax, an industrial-strength application extractor for Jav~ To copy otherwise, to republish, to post on servers or to developed at IBM Research [18]. We discuss how several of redistribute to listS, requires prior specific permission and/or a fee. SIGSOFT 2000 (FSE-8) 11/00 San Diego, CA, USA IFor convenience, we will henceforth use the term 'Yel]ec- © 2000 ACM ISBN 1-58113-205-010010011._$5.00 tion" to refer to all mechanisms for loading and accessing program constructs by specifying their name as a string value, and for examining program structure. 98 the program transformations and optimizations performed for the extraction of A itself. If we want to accommodate by Jax are adapted to take into account MP.L scripts, and scenarios where v obtains a different version s of L, then the present a small case study in which different extraction sce- extractor should derived Aez t from A without making as- narios are applied to a commercially available library-based sumptions about the specific version of L that happens to Java application. be available in a's development environment. The standard The remainder of this paper is organized as follows. In Jav~ libraries are an obvious example of this situation. Section 2, we present the requirements on extraction tools We will now investigate the issues related to the use of in the presence of class library usage. Section 3 introduces reflection. In essence, reflection allows one to access a pro- a specification language for defining the extraction of vari- gram construct by specifying a run-time string value that ous kinds of library-based applications. Section 4 presents represents the construct's name, and to examine the struc- a mechanism for translating specifications to a small set of ture of the classes used in a program. Such features are assertions. Section 5 discusses an implementation of MEL, problematic for extraction tools because, in general, a static and reports on a small case study. Section 6 summarizes analysis cannot determine which program constructs are ac- related work, and Section 7 presents conclusions and direc- cessed using reflection, and should therefore not be removed tions for future work. or transformed. Thus, extractors require additional information from the user that specifies which program constructs 2. REQUIREMENTS are accessed using reflection. In our experience, determining the program constructs that may be accessed using reflec- In this section, we analyze a number of frequently occur- tion is a fairly easy task for a programmer familiar with the ring distribution scenarios, and determine what information code. However, it can be quite difficult to determine how is required by extraction tools beyond what can be obtained reflection is used in third-party libraries, especially if the through static analysis, source code for these libraries is unavailable. In the example of Figure 1, the extraction of ALezt from A and L by 2.1 Distribution scenarios application vendor a requires additional information about Figure I shows several distribution scenarios that may oc- the use of reflection in L. This can be difficult to determine cur in the presence of: a library vendor I responsible for cre- from distribution Lez t alone, because it does not contain the ating and distributing a class library L, an application ven- source code for the library. To complicate matters further, dor a responsible for creating and distributing an L-based the set of program constructs in L accessed using reflection application A, and two users, u and v, of application A. may depend on the features in L that are used by A. In It is reasonable to assume that library vendor l will want general, different L-based applications may cause different to make library L ms small as possible, in order to reduce usage of reflection within L. Our solution to these problems the download times experienced by customers, but also to (discussed in detail below) will he to have library vendor reduce the load of the server from which the library is down- l distribute a script along with Lez t that contains the in- loaded. Hence, I creates an eztracted version Lez t of L, and formation required to extract any L-based application. Our distributes Lez t instead of L. Clearly, Lez t should offer the scripts allow I to specify that a program construct is only ac- same functionality as L, but size-reducing optimizations can cessed using reflection under certain conditions (e.g., when still be applied to parts of L that are not exposed to users. a certain method is reachable). Application vendor a presumably downloads Lez t for use We have only discussed a few example distribution sce- during development of application A. When application A narios. Other likely scenarios include: is ready for distribution, there are two options, depending on whether or not a user already has the prerequisite library • Extracting a library together with multiple applica- L installed. Figure I shows a user u who does not have (the tions that use it. correct version of) L. Assuming that u does not expect to download or create other L-based applications, it is desir- • Extracting a library in the context of another library able for u to download a distribution ALez t that comprises that uses it.

Extracting Library-Based Object-Oriented Applications

Opencobol Manual for Opencobol 2.0

Static Typing Where Possible, Dynamic Typing When Needed: the End of the Cold War Between Programming Languages

Dynamic C++ Classes a Lightweight Mechanism to Update Code in a Running Program

Reprogramming Embedded Systems at Run-Time

Advanced Development of Certified OS Kernels Prof

Specialising Dynamic Techniques for Implementing the Ruby Programming Language

Linkers and Loaders Do?

Using the Go Programming Language in Practice

An Evil Copy: How the Loader Betrays You

Towards a Dynamic Object Model Within Unix Processes

Pynamic Summary

Installation for Z/OS