Inferring Aliasing and Encapsulation Properties for Java

Inferring Aliasing and Encapsulation Properties for Java Kin-Keung Ma Jeffrey S. Foster University of Maryland, College Park University of Maryland, College Park [email protected] [email protected] Abstract 1. Introduction There are many proposals for language techniques to con- Understanding and controlling aliasing is a fundamental part trol aliasing and encapsulation in object oriented programs, of building robust software systems. In recent years, re- typically based on notions of object ownership and pointer searchers have proposed many systems that reason about uniqueness. Most of these systems require extensive manual various aliasing properties in programs, including unique- annotations, and thus there is little experience with these ness [1, 2, 7, 8, 15, 28, 31] and ownership [2, 3, 6, 11, 12, 16, properties in large, existing Java code bases. To remedy 18, 24, 30]. Unique objects are those referred to by only one this situation, we present Uno, a novel static analysis for pointer, and thus are guaranteed unaliased with any other ob- automatically inferring ownership, uniqueness, and other jects in the system. Owned objects are encapsulated inside of aliasing and encapsulation properties in Java. Our analysis their owner, and hence cannot be directly accessed by other requires no annotations, and combines an intraprocedural components. points-to analysis with an interprocedural, demand-driven Many of these systems include static checking of these predicate resolution algorithm. We have applied Uno to a va- properties in Java-like source code, but usually require that riety of Java applications and found that some aliasing prop- the programmer manually add extensive annotations. More- erties, such as temporarily lending a reference to a method, over, while the properties modeled seem quite useful, it is are common, while others, in particular field and argument unclear how often they occur in existing programs. To date, ownership, are relatively uncommon. As a result, we believe experience with using such systems on large software appli- that Uno can be a valuable tool for discovering and under- cations has either been with coarse analysis [16, 30] or via standing aliasing and encapsulation in Java programs. case studies [2]. In this paper, we present a novel tool called Uno1 that Categories and Subject Descriptors D.1.5 [Programming fills this gap. Uno takes as input unannotated Java source Techniques]: Object-oriented Programming; D.2.11 [Soft- code and infers uniqueness of method arguments and results; ware Engineering]: Software Architectures—Information lending (temporary aliasing) of method arguments and re- hiding; D.3.2 [Programming Languages]: Language Class- ceiver objects; and ownership and non-escaping of param- ifications—Object-oriented languages; F.3.2 [Logics and eters and fields. These properties capture key aliasing and Meanings of Programs]: Semantics of Programming Lang- encapsulation behavior, and can give important insight into uages—Program analysis Java code. For example, a programmer might use Uno to check that a factory method always returns a unique object General Terms Languages, Measurement as expected, or that a proxied object is owned by its proxy, which therefore controls all access to it. Keywords Uno, Java, ownership, uniqueness, lending, en- Uno performs inference using a novel two-phase algo- capsulation, aliasing, ownership inference, uniqueness infer- rithm. The first phase is an intraprocedural (within one func- ence tion) may-alias analysis that computes local points-to information. Our alias analysis is mostly standard, but uses an interesting mix of flow-sensitive and flow-insensitive information. The second phase is a demand-driven interprocedural analysis that computes a set of mutually-recursive Permission to make digital or hard copies of all or part of this work for personal or predicates. For example, for each method m, Uno deter- classroom use is granted without fee provided that copies are not made or distributed mines whether the predicate UNIQRET(m) holds, meaning for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute that m always returns a unique object when it is called. If to lists, requires prior specific permission and/or a fee. m returns its ith argument, then UNIQRET(m) holds only OOPSLA’07, October 21–25, 2007, Montreal,´ Quebec,´ Canada. Copyright c 2007 ACM 978-1-59593-786-5/07/0010. $5.00. 1 Uniqueness aNd Ownership, http://www.cs.umd.edu/projects/PL/uno if UNIQPAR(m; i) holds, meaning m is always called with 1 interface Subject f a unique ith argument. Uno incorporates several other inter- 2 void setData(int d); dependent predicates that capture the aspects of aliasing and 3 g encapsulation mentioned above. 4 class ConcreteSubject implements Subject f 5 private int data; We have applied Uno to more than one million lines of 6 void setData(int d) f data = d; g Java code, including SPEC benchmarks, the DaCapo bench- 7 g marks [5], and larger programs found on SourceForge. Our 8 class Factory f goal was to demonstrate the utility of Uno, and to discover 9 public Subject getSubject() f // returns unique 10 Subject r = new ConcreteSubject(); how often the ownership and encapsulation properties it in- 11 Subject s = r ; fers actually occur in Java programs. We found that, on av- 12 return r ; erage across our benchmarks, the monomorphic ownership 13 g 14 g inferred by Uno holds for 16% of the private fields and only 2.7% of the arguments of called constructors. Somewhat sur- (a) Uniqueness of method return prisingly, Uno infers that more than 30% of all methods (constructors not included) that do not return a primitive return a unique value, and approximately 50% of all non- 15 class Proxy implements Subject f 16 private Subject s; // owned by this primitive method parameters are lent (i.e., only temporarily 17 public Proxy(Subject s) f aliased by the method and not captured). Our results show 18 this.s = s; that programmers do control aliasing and encapsulation in 19 g some ways suggested in the literature but less so in oth- 20 public void setData(int d) f 21 s.setData(d∗d); ers, modulo the precision of Uno’s sound but conservative 22 g analysis. To our knowledge, Uno is the first ownership and 23 g uniqueness inference tool that has been demonstrated on a 24 class Main f wide variety of Java applications. 25 public void main(Factory f) f 26 Subject t = f .getSubject(); In summary, the contributions of this paper are: 27 t .setData(1); // uses t directly • 28 Proxy proxy = new Proxy(t); // proxy owns t We describe a flow-sensitive, intraprocedural points-to 29 proxy.setData(2); // t now used through proxy analysis algorithm tuned to compute the information 30 g needed for evaluating Uno’s predicates. (Section 3) 31 g • We present a novel interprocedural algorithm that infers a (b) Ownership of method argument range of aliasing and encapsulation properties. Our analysis is structured as a set of mutually-recursive predi- Figure 1. Source code example cates. The algorithm is demand-driven, so that only the predicates and points-to information necessary to answer a query are actually computed. (Section 4) • We describe our implementation, Uno, and apply it to a are created by getSubject (lines 9–12). Notice that when number of benchmarks. Uno finds that some aliasing and getSubject returns, the only other pointers to its result are r encapsulation properties such as lending of arguments and s, both of which are dead at the method exit. Thus, Uno occur often, and other properties, such as monomorphic concludes the return value of getSubject is always unique. ownership, occur rarely. As a result, we believe that Uno Knowing a return value is unique can be useful because is a valuable tool for discovering and understanding alias- uniqueness typically implies the returned value is “fresh.” ing and encapsulation in Java. (Section 5) This is particularly helpful in this example, when we are calling a factory method rather than a constructor, which is 2. Overview at least guaranteed to allocate a new object. Uno also checks We begin our presentation by illustrating Uno’s core notions uniqueness of constructor return values—a pathological constructor that stores this in a field of another object would vi- of uniqueness and ownership for methods and constructors, olate uniqueness—and we found that all constructors in our and by describing the key predicates Uno computes to per- form inference. experiments return a unique value. In Section 4, we formally define a predicate UNIQRET(m) Uniqueness We say that a pointer is unique if it is the that describes the necessary conditions for method m to re- only reference to the object it points to. Uniqueness is a turn a value that is unique when the method exits. Uno’s very useful pointer property because its strong notion of non- inference algorithm is specified in terms of this and other aliasing permits modular reasoning [1, 2, 7, 8, 15, 28, 31]. predicates, and for a given input program and selection of Figure 1(a) illustrates one kind of uniqueness Uno infers. predicates, Uno reports whether those predicates hold for In this example, instances of ConcreteSubject (lines 4–7) the methods and constructors of interest. Ownership Uno’s notion of ownership is based on the flex- holds, then the ith argument may become owned by the ible alias protection framework of Noble, Potter, Vitek, and receiver object after the call. Clarke [12, 24]. Their system uses a notion called represen- Examining the Proxy class further, Uno observes that tation containment, in which if object o contains or owns on line 18, s is stored in a private field.

Load more