Comp ositional Pointer and Escap e Analysis for Java Programs
John Whaley and Martin Rinard
Lab oratory for Computer Science
Massachusetts Institute of Technology
Cambridge, MA 02139
fjwhaley, [email protected]
Abstract the analysis result b ecoming more precise as more of the
program is analyzed. At every stage in the analysis, the
This pap er presents a combined p ointer and escap e analy-
algorithm can distinguish where it do es and do es not have
sis algorithm for Java programs. The algorithm is based on
complete information.
the abstraction of p oints-to escap e graphs, whichcharacter-
ize how lo cal variables and elds in ob jects refer to other
1.1 Analysis Uses
ob jects. Each p oints-to escap e graph also contains escap e
Java presents a clean and simple memory mo del: concep-
information, which characterizes how ob jects allo cated in
tually, all ob jects are allo cated in a garbage-collected heap.
one region of the program can escap e to b e accessed by an-
While useful to the programmer, this mo del comes with a
other region. The algorithm is designed to analyze arbitrary
cost. In many cases it would b e more ecient to allo cate
regions of complete or incomplete programs, obtaining com-
ob jects on the stack, eliminating the dynamic memory man-
plete information for ob jects that do not escap e the analyzed
agementoverhead for that ob ject. A similar situation holds
regions.
for the Java synchronization mo del. Conceptually, every
Wehave develop ed an implementation that uses the es-
Java ob ject comes with a lo ck. Each synchronized metho d
cap e information to eliminate synchronization for ob jects
ensures that it executes atomically by acquiring and releas-
that are accessed by only one thread and to allo cate ob jects
ing the lo ck in its receiver ob ject. But the lo ckoverhead is
on the stack instead of in the heap. Our exp erimental results
wasted when only one thread accesses the ob ject | the lo cks
are encouraging. We were able to analyze programs tens
are required only when there is a p ossibility that multiple
of thousands of lines long. For our b enchmark programs,
threads may attempt to access the ob ject simultaneously.
our algorithms enable the elimination of b etween 24 and
In this pap er we discuss the use of our analysis results to
67 of the synchronization op erations. They also enable the
eliminate unnecessary synchronization and to enable stack
stack allo cation of b etween 22 and 95 of the ob jects.
allo cation of ob jects. The basic idea is to use the analy-
sis information to determine when ob jects do not escap e
1 Intro duction
threads and metho ds. If an ob ject do es not escap e from its
allo cating thread to another thread, the compiler can trans-
This pap er presents a combined p ointer and escap e anal-
form the program to eliminate synchronization op erations
ysis algorithm for Java programs and programs written in
on that ob ject. If an ob ject do es not escap e a metho d, the
similar ob ject-oriented languages. The algorithm is based
compiler can transform the program to allo cate it in that
on the abstraction of p oints-to escap e graphs, whichchar-
metho d's activation record instead of in the heap. Our ex-
acterize how lo cal variables and elds in ob jects refer to
p erimental results show that the algorithms can eliminate a
other ob jects. Each p oints-to escap e graph also contains es-
signi cantnumb er of heap allo cations and synchronization
cap e information, whichcharacterizes how ob jects allo cated
op erations.
in one region of the program can escap e to b e accessed by
another region.
1.2 Analysis Prop erties
A key concept underlying this abstraction is the goal
of representing interactions between analyzed and unana-
Our analysis has several imp ortant prop erties:
lyzed regions of the program. Points-to escap e graphs make
a clean distinction between ob jects and references created
It is an interpro cedural analysis. It is designed to com-
within an analyzed region and those created in the rest of
bine analysis results from multiple metho ds to obtain
the program. They therefore enable a exible analysis that
precise p oints-to and escap e information.
is capable of analyzing arbitrary parts of the program, with
It is a comp ositional analysis. It is designed to analyze
each metho d once to pro duce a single parameterized
analysis result that can b e sp ecialized for use at all of
1
the call sites that mayinvoke the metho d.
1
Recursive metho ds require an iterative algorithm that may ana-
lyze metho ds multiple times to reach a xed p oint.
b e a ected by unanalyzed regions of the program. Our al- It is a partial program analysis in two senses. First, it
gorithm is designed to analyze arbitrary regions of complete is designed to analyze each metho d indep endently of
or incomplete programs, obtaining complete information for its callers. Second, it is capable of analyzing a metho d
ob jects that do not escap e the analyzed regions. without analyzing all of the metho ds that it invokes,
with the analysis result b ecoming more precise as more
of the invoked metho ds are analyzed.
1.4 Contributions
Because the analysis is comp ositional and can analyze
This pap er makes the following contributions:
pieces of the program indep endently of their callers and
Analysis Algorithms: It presents a new combined
callees, it is esp ecially appropriate for dynamically loaded
p ointer and escap e analysis algorithm. The algorithm
programs. Our current implementation runs in a dynamic
is comp ositional and is designed to deliver useful in-
compiler and analyzes libraries indep endently of applica-
formation without analyzing the entire program.
tions. When the compiler loads an application, it retrieves
the previously computed analysis results for the libraries and
Analysis Approach: It presents an analysis approach
uses these results in the analysis of the application.
that explicitly di erentiates between references and
ob jects from analyzed and unanalyzed regions of the
1.3 Basic Approach
program. This approach allows the algorithm to cap-
ture the interactions between these regions, and to
The analysis is based on an abstraction we call p oints-to
clearly identify where it do es and do es not have com-
escap e graphs. The no des of this abstraction represent ob-
plete information ab out the extracted relationships.
jects; the edges represent references b etween ob jects. The
abstraction also contains information ab out which ob jects
Analysis Uses: It presents two optimizations en-
escap e to other metho ds or threads. For example, an ob-
abled by the extracted p oints-to and escap e informa-
ject escap es if it is returned to an unanalyzed region of the
tion: synchronization elimination and stack allo cation.
program or passed as a parameter to an unanalyzed metho d.
For each metho d, the analysis pro duces a p oints-to es-
Exp erimental Results: It presents exp erimental re-
cap e graph that characterizes the p oints-to relationships cre-
sults from a prototyp e implementation of the algo-
ated within the metho d and the interaction of the metho d
rithms. These results show that the algorithms can
with the p oints-to relationships created by the rest of the
eliminate a signi cant amount of synchronization op-
program. The analysis represents these interactions, in part,
erations and heap allo cations.
by maintaining a distinction b etween two kinds of edges: in-
The remainder of the pap er is organized as follows. Sec-
side edges, which represent references created inside the cur-
tion 2 presents examples that illustrate how the analysis
rently analyzed region, and outside edges, which represent
works. Section 3 describ es how the analysis represents the
references created outside the currently analyzed region.
program and the analysis ob jects that the algorithm uses.
Similarly, there are two kinds of no des: inside nodes,
Section 4 presents the intrapro cedural analysis, and Sec-
which represent ob jects created inside the currently ana-
tion 5 presents the interpro cedural analysis. In Section 6 we
lyzed region and accessed via inside edges, and outside nodes,
present an abstraction relation that characterizes the corre-
which represent ob jects that are either created outside the
sp ondence b etween p oints-to escap e graphs and the ob jects
currently analyzed region or accessed via outside edges. Out-
and references that the program creates as it runs. Sec-
side edges representinteractions in which the analyzed re-
tion 7 presents the synchronization elimination and stack
gion reads a reference created in an unanalyzed region. In-
allo cation transformations. Section 8 presents the exp eri-
side edges from outside no des or no des reachable from out-
mental results from our implementation. Section 9 discusses
side no des representinteractions in which the analyzed re-
related work; we conclude in Section 10.
gion creates a reference that the unanalyzed region may
read.
Edges b etween inside no des represent references created
2 Example
within the currently analyzed region. If the inside no des
do not escap e, they have no interaction with the rest of
In this section we present several examples that illustrate
the program and their edges in the p oints-to escap e graph
how our analysis works.
completely characterize the p oints-to relationships b etween
the ob jects represented by these no des. Because p oints-to
2.1 Return Values
escap e graphs record the interactions of metho ds with their
Figure 1 presents a complex number arithmetic example.
callers, the analysis can recover complete information ab out
The add metho d adds two complex numb ers, storing the
ob jects that escap e the currently analyzed metho d but are
result in a newly allo cated complex numb er ob ject and re-
recaptured in the caller.
turning the new ob ject. The multiply metho d op erates sim-
Our analysis is comp ositional | it analyzes each metho d
ilarly, but multiplies the numb ers instead of adding them.
indep endently of its callers. Unlike all other p ointer and es-
Because each metho d returns the new ob ject, the new ob ject
cap e analysis algorithms that we know of, our algorithm is
escap es from the metho d.
also capable of analyzing a metho d indep endently of meth-
The multiplyAdd metho d multiplies its two arguments,
o ds that it mayinvoke. When the algorithm skips the anal-
then returns the sum of the pro duct and the receiver. In this
ysis of a p otentially invoked metho d, it records that all
case, only the add result escap es | the temp orary result
ob jects passed as parameters to the invoked but not an-
from multiply is inaccessible outside of the multiplyAdd
alyzed metho d escap e from the scop e of the analysis. This
metho d. It is therefore p ossible to generate a sp ecialized
escap e information allows the algorithm to clearly identify
version of the multiply metho d that exp ects the storage for
b oth those ob jects for which it has complete p oints-to infor-
the result ob ject to b e allo cated on the stack of its caller.
mation and those ob jects whose p oints-to relationships may
It would then generate the result into the stack-allo cated
public class Server extends Thread { class complex {
ServerSocket serverSocket; double x,y;
int duplicateConnections; complexdouble a, double b { x = a; y = b; }
complex multiplycomplex a {
ServerServerSocket s { complex product =
serverSocket = s; new complexx*a.x - y*a.y,x*a.y + y*a.x;
duplicateConnections = 0; returnproduct;
} }
public void run { complex addcomplex a {
try { complex sum = new complexx+a.x,y+a.y;
Vector connectorList = new Vector; return sum;
while true { }
Socket clientSocket = complex multiplyAddcomplex a, complex b {
serverSocket.accept; complex product = a.multiplyb;
new ServerHelperclientSocket.sta rt ; complex sum = this.addproduct;
InetAddress addr = returnsum;
clientSocket.getInetAddress ; }
if connectorList.indexOfaddr < 0 }
connectorList.addElementad dr;
else duplicateConnections++;
Figure 1: Complex Numb er Example
}
} catch IOException e { }
}
}
Figure 3: Server Example
the program with no des that corresp ond to the ob ject cre-
ation site. So, for example, product p oints to the no de that
represents ob jects created at the ob ject creation site inside
multiply . Because this no de represents ob jects created in-
side the analyzed region, it is an inside no de.
The no de that product p oints to is accessible only via
the lo cal variable product . We therefore say that this no de
is captured. As so on as the multiplyAdd metho d returns, all
of the ob jects that this no de represents will b e inaccessible.
It is therefore legal to tie the lifetime of the ob ject to the
lifetime of the metho d invo cation, and allo cate the ob ject
on the activation record of the multiplyAdd metho d.
Figure 2: Analysis Result for multiplyAdd
The multiplyAdd metho d returns the ob ject that sum
p oints to. Even though the ob ject escap es this metho d, it
may b e recaptured by a metho d ab ove multiplyAdd in the
ob ject rather than dynamically allo cating a new complex
call chain. The analysis maintains enough information to
ob ject to hold the result.
recognize such situations.
Figure 2 presents the analysis result for the multiplyAdd
metho d. The no des in this graph represent variables and
2.2 Thread-Private Objects
ob jects; the edges represent references. The parameter vari-
ables a and b p oint to no des that represent the parame-
The next example illustrates how the analysis can detect
ter ob jects; the variable this p oints to a no de that rep-
thread-private ob jects, or ob jects that are accessed by only
resents the receiver. Because these ob jects are created out-
one thread. The goal is to eliminate synchronization op-
side the metho d, the corresp onding no des are outside no des.
erations on thread-private ob jects. The co de in Figure 3
Note that the analysis assumes that the parameters are not
implements a simple server. The server waits for incoming
aliased. If the analysis result is used at a metho d invo-
connection requests on the serverSocket . Whenever it gets
cation site where the parameters are aliased, the mapping
a request, it creates a new ServerHelper thread to service
algorithm presented in Section 5 merges the two parameter
the new connection. The server counts the numb er of du-
no des b efore it uses the analysis result. This mechanism
plicate connections | i.e., the numb er of times it connects
ensures that the algorithm can analyze the program under
to a client that it has previously connected to.
the assumption that di erent parameters and ob ject elds
To correctly compute this count, the server maintains a
are not aliased, but use the analysis results in contexts con-
list of the Internet addresses of clients that it has connected
taining aliases.
to. Whenever it receives a new request, the server checks
The product and sum variables p oint to no des that rep-
the list to determine if it has previously connected to the
resent the new complex numb er ob jects allo cated, resp ec-
client. If so, it increments duplicateConnections , whichis
tively, in the multiply and add metho ds. The algorithm
the count of the numb er of duplicate connections that the
represents ob jects allo cated within the analyzed region of
server has established. If not, it inserts the Internet address
of the client into the list so that it can detect any future
duplicate connections from this client.
In this example, the server uses a Vector to hold the
class multisetElement {
list of Internet addresses. This class is thread safe | its
Object element;
metho ds are guaranteed to execute atomically even when
int count;
concurrently invoked by di erent threads. Thread safetyis
multisetElement next;
an imp ortant prop erty. Without it, multithreaded programs
can fail in very subtle ways b ecause of data races, or unantic-
multisetElementObject e, multisetElement n {
ipated interactions b etween threads that concurrently access
count = 1;
the same ob ject.
element = e;
The problem with atomic op erations is the overhead from
next = n;
the synchronization op erations that make the metho ds ex-
}
ecute atomically. In our example, the standard implemen-
synchronized boolean checkObject e {
tations of indexOf and addElement acquire and release the
if element.equalse {
lo ck in their receiver ob ject. This overhead is esp ecially
count++;
regrettable when, as in our example, the Vector is a thread-
returntrue;
private ob ject.
} else return false;
The analysis of the run metho d in our example pro ceeds
}
as follows. The algorithm rst analyzes the invoked metho ds
synchronized multisetElement insertObject e {
to obtain a p oints-to escap e graph for each metho d. Each
multisetElement m = this;
graph summarizes how the metho d a ects the p oints-to re-
while m != null {
lationships b etween ob jects and how the ob jects it creates
if m.checke return this;
and accesses escap e to other threads or to the caller. In
m = m.next;
our example, the new clientSocket ob ject escap es to the
}
ServerHelper thread. The indexOf and addElement meth-
return new multisetElemente, this;
ods maychange the ob jects that the receiver p oints to, but
}
they do not change the escap e status of the receiver.
}
Based on this information, and on its analysis of the run
metho d, the analysis determines that the connectorList
class multiset {
ob ject is captured within the run metho d, and is accessible
multisetElement elements;
only to the run metho d's thread. The synchronization in the
multiset {
addElement and indexOf metho ds is therefore redundant.
elements = null;
The compiler can generate sp ecialized, synchronization-free
}
versions of these metho ds. The generated co de for the run
synchronized void addElementObject e {
metho d invokes these sp ecialized versions, and the compu-
if elements == null
tation executes without synchronization overhead for the
elements = new multisetElemente,null;
Vector ob ject.
else elements = elements.inserte;
It is also p ossible to address the problem of synchro-
}
nization overhead for thread-private ob jects by providing
}
the programmer with two implementations of each class: a
thread-safe implementation with synchronization used for
ob jects shared b etween threads, and a thread-unsafe imple-
Figure 4: Multiset Example
mentation with no synchronization used for thread-private
ob jects. The JDK 1.2 collections API takes this approach.
The problem with this approach is that it complicates the
API and burdens the programmer with the resp onsibility
of determining which ob jects are thread safe and which are
not. The last problem can b ecome esp ecially severe for pro-
grammers maintaining multithreaded programs. If a change
makes a previously thread-private ob ject accessible to mul-
tiple threads, the programmer must search the program to
manually replace the thread-unsafe version with the thread-
safe version.
2.3 Recursive Data Structures
We next present an example that illustrates how our algo-
rithm deals with recursive data structures. Let us assume
that in the previous example, the server would like to count
the numb er of connections from each client, instead of sim-
ply counting the numb er of duplicate connections. In this
case, the server could use the multiset class in Figure 4,
which implements a multiset as a list of elements. Each ele-
ment has a count of the numb er of times it is present in the
Figure 5: Analysis Result for insert
multiset.
If the server used a multiset instead of a Vector , the
A copy statement l = v. analysis of the run metho d in Figure 3 would still recognize
the multiset ob ject as thread-private. In this section, we
A load statement l = l :f .
1 2
fo cus on the treatmentof recursive data structures in the
analysis of the insert metho d for multisetElement s.
A store statement l :f = l .
1 2
Figure 5 presents the analysis result at the end of the
insert metho d. The this variable p oints to the receiver
A global load statement l = cl:f .
no de, which represents the receiver ob ject, while e p oints to
A global store statement cl :f = l.
a parameter no de, which represents the parameter ob ject.
Both no des are outside no des. The other no des represent
A return statement return l, which identi es the re-
ob jects created or accessed during the execution of insert .
turn value l of the metho d.
The next edge from the receiver no de p oints to a no de that
represents all ob jects referenced by the m.next eld at the
An ob ject creation site of the form l = new cl . There
load statement m = m.next , which loads the reference from
are two kinds of ob ject creation sites:
m 's next eld backinto m. The edge is an outside edge, b e-
cause it p oints to a no de that represents an ob ject allo cated
{ If cl inherits from the class Thread, then the ob-
outside the scop e of the analysis of insert , and it p oints
ject creation site is also a thread creation site.
to an outside no de. In particular, it p oints to a load node,
{ If cl do es not inherit from the class Thread, then
b ecause there is one of these no des for each load statement
the ob ject creation site is not a thread creation
in the program.
site.
There is also an outside edge from the load no de back
to itself. As the lo op in the insert metho d walks down
A metho d invo cation site of the form
the list, it traverses the references in the next elds of the
l = l :op l ;:: :;l . Each metho d invo cation site
0 1
k
no des. Because all of these references are loaded at the same
corresp onds to a metho d invo cation site m 2 M .
statement, the outside edges that represent the references
The control ow graph for each metho d op starts with the
p oint to the same outside no de, and there is cycle in the
enter statement enter and ends with an exit statement
op
graph. This mechanism ensures that the analysis terminates
exit .
op
for programs that manipulate recursive data structures.
The analysis represents the control ow relationships b e-
The remaining no de represents the new multisetElement
tween statements as follows: predst is the set of state-
ob ject that insert may allo cate to hold the new multiset
ments that may execute immediately b efore st, and succst
entry. This no de has an inside edge from the next eld
is the set of statements that may execute immediately after
to the receiver no de, and an inside edge from the element
st . There are two program p oints for each statement st,
eld to the parameter no de. These edges represent refer-
the program p oint st immediately b efore st executes, and
ences to these no des created during the execution of insert .
the program p oint st immediately after st executes.
The interpro cedural analysis uses call graph informa-
3 Analysis Algorithm
tion to compute sets of metho ds that may be invoked at
metho d invo cation sites. For each metho d invo cation site
The algorithm analyzes the program at the granularity of
m, callees m is the set of metho ds that m may invoke.
metho ds. The analysis of each metho d incorp orates infor-
Given a metho d op, callers op is the set of metho d invo ca-
mation from the metho d and from some subset of the meth-
tion sites that mayinvoke op . The current implementation
o ds that it invokes. The combination of the currently ana-
obtains this call graph information using a variant of class
lyzed metho d and all of the analyzed metho ds that it invokes
hierarchy analysis [13 ], but the algorithm can use any con-
is called the current analysis scope.
servative approximation to the actual call graph generated
when the program runs.
3.1 Program Representation
The algorithm represents the program using the following
3.2 Object Representation
analysis ob jects. There is a set l 2 L of lo cal variables
The analysis represents the ob jects that the program ma-
and a set p 2 P of formal parameter variables. There is
nipulates using a set n 2 N of no des. There are several
one formal parameter variable for each formal parameter
kinds of no des:
of each metho d in the program. Together, the lo cal and
formal parameter variables make up the set v 2 V = L [ P of
There is the set N of inside no des, which represent
I
variables.
ob jects created within the current analysis scop e and
There is also a set cl 2 CL of classes and a set op 2 OP of
accessed via references created within the current anal-
metho ds. Each metho d has a receiver class cl and a formal
ysis scop e. There is one inside no de for each ob ject
parameter list p ;:::; p . We adopt the convention that
0 k
creation site; that inside no de represents all ob jects
parameter p p oints to the receiver ob ject of the metho d.
0
created within the current analysis scop e at that site.
Finally, there is a set f 2 F of ob ject elds. Ob ject
N is the set of thread no des, or inside no des that
T
elds are accessed using syntax of the form v:f. Static class
corresp ond to thread creation sites. Since each thread
variables are accessed using syntax of the form cl:f .
corresp onds to an ob ject that inherits from the class
The algorithm represents the computation of each metho d
Thread , thread creation sites are also ob ject creation
using a control ow graph. The no des of control ow graphs
sites, and N N .
T I
are statements st 2 ST. The algorithm analyzes only those
statements that a ect the p oints-to and escap e information.
There is the set N of outside no des, which repre-
O
We assume the program has b een prepro cessed so that all
sent ob jects created outside the current analysis scop e
statements relevant to the analysis are in one of the following
or accessed via references created outside the current
forms: analysis scop e.
Both O and I are graphs with edges lab eled with a eld There is the set CL of class no des. Conceptually, each
from F . We de ne the following op erations on no des of the class no de represents a statically allo cated ob ject whose
graphs: elds are the static class variables for the corresp ond-
ing class.
0 0
edgesToO; n = fhn ;ni2O g [ fh hn ; f i;ni2 O g
0 0
The set N of outside no des is futher divided into the
O
edgesFromO; n = fhn; n i2O g [ fh hn; fi;n i2 O g
following kinds of no des:
edges O; n = edgesToO; n [ edgesFromO; n
0 0
O n = fn :hn; n i2O g
There is a set N of parameter no des. There is one
P
0 0
O n; f = fn :hhn; fi;n i2O g
parameter no de for each formal parameter in the pro-
gram. Each parameter no de represents the ob ject that
The following op eration removes a set S of no des from a
its parameter p oints to during the execution of the
0 0 0 0
p oints-to escap e graph. hO ;I ;e ;r i = removeS; hO ; I ; e; r i,
analyzed metho d. When the analysis starts, each of
where
0
the metho d's formal parameter variables p oints to its
O = O [fedges O; n:n 2 S g
0
corresp onding parameter no de. The receiver ob ject is
I = I [fedges I; n:n 2 S g
treated as the rst parameter of each metho d.
en if n 62 S
0
e n =
; otherwise
There is a set N of global no des. There is one global
G
0
r = r S
no de for each static class variable cl:f in the program.
Each global no de represents the ob jects that its static
class variable may p oint to during the execution of
3.4 Reachability and Escap ed No des
the current metho d. When the analysis of the cur-
We identify two di erent kinds of escap e information, with
rent metho d starts, each accessed static class variable
the distinction based on when an ob ject b ecomes accessi-
p oints to its global no de.
ble outside the current metho d. If an ob ject was created
There is a set N of load no des. There is one load
outside the current analysis scop e or was accessed via a ref- L
no de for each load statement in the program. When
erence created outside the current analysis scop e, the ob ject
the load statement executes, it loads a value from a
is already accessible to other parts of the program. In this
eld in an ob ject. If the loaded value is a reference, the
case wesay that the ob ject has escaped or is escaped.
analysis must represent the ob ject that the reference
If an ob ject has not escap ed but will b e returned by the
p oints to. Each load no de represents all of the ob jects
metho d to its caller, wesay that the ob ject wil l escape.
whose references are loaded by the corresp onding load
An ob ject has directly escap ed the current metho d in all
statement.
of the following situations:
There is a set N of return no des. There is one return
A reference to the ob ject was passed as a parameter R
no de for each metho d invo cation site in the program.
into the current metho d.
Return no des are used to represent the return values
A reference to the ob ject was written into a static class
of invo cations of unanalyzed metho ds.
variable.
The analysis represents each array with a single no de.
A reference to the ob ject was passed as a parameter to
This no de has a eld elements , which represents all of the
an invoked metho d, and there is no information ab out
elements of the array. Because the p oints-to information for
what the invoked metho d did with the ob ject.
all of the array elements is merged into this eld, the analysis
do es not make a distinction b etween di erent elements of the
The ob ject is a thread ob ject.
same array.
An ob ject has escap ed if it is reachable via some sequence of
ob ject references from a directly escap ed ob ject. An ob ject 3.3 Points-To Escap e Graphs
is captured if it has not escap ed.
A p oints-to escap e graph is a quadruple of the form hO ; I ; e; r i,
Given our abstraction of p oints-to escap e graphs, we can
where
formalize the concepts of reachability and escap ed no des as
0
follows. Anoden is directly reachable from a no de n in a
O N F N [ N is a set of outside edges.
L G
0 0 0
graph O if n = n , hn; n i2 O or 9f:hhn; fi;n i2 O . Anode
Outside edges represent references created outside the
0 0
n is reachable from a no de n in O if n = n or there exists a
current analysis scop e, either by the caller, by a thread
0
sequence n n such that n = n , n = n , and for all
0 0
k k
running in parallel with the current thread, or byan
0 i i+1 i unanalyzed invoked metho d. 0 anoden is reachable from a set of no des S in O if there 0 exists a no de n 2 S such that n is reachable from n in O . I V [ N F N is a set of inside edges. Inside We de ne reachable O ; S; nif n is reachable from S in O . edges represent references created inside the current Given a p oints-to escap e graph hO ; I ; e; r i,anoden di- analysis scop e. rectly escap es if n 2 N [ N [ N [ CL or en 6= ;. We P R T M e : N ! 2 is an escap e function that records the de ne escap edhO ; I ; e; r i;nif n is reachable in O [ I from set of unanalyzed metho d invo cation sites that a no de a no de that directly escap es, and capturedhO ; I ; e; r i;nif escap es down into. not escap edhO ; I ; e; r i;n. r N is a return set that represents the set of ob- jects that may b e returned by the currently analyzed metho d. then generating additional inside edges from l to all of the 4 Intrapro cedural Analysis no des that v p oints to. The algorithm uses a data ow analysis to generate a p oints- Kill = edges I; l to escap e graph at each p oint in the metho d. The analysis I Gen = fl gI v of each metho d starts with the construction of the p oints- I 0 I = I Kill [ Gen to escap e graph hO ;I ;e ;r i for the rst statementinthe I I 0 0 0 0 metho d. Given a metho d op with formal parameter list p ;:: :;p and accessed static class variables cl :f ;:::; cl :f , 1 1 j j 0 k 4.2 Load Statements the initial p oints-to escap e graph hO ;I ;e ;r i is de ned as 0 0 0 0 A load statement of the form l = l :f makes l p ointtothe 1 2 1 follows. ob ject that l :f p oints to. The analysis mo dels this change 2 In I , each parameter p p oints to its corresp onding 0 by constructing a set S of no des that represent all of the i . The formal parameter p p oints parameter no de n p ob jects to which l :f may p oint, then generating additional 0 2 i to the receiver ob ject of the metho d; the no de n p inside edges from l to every no de in this set. 1 0 represents the receiver ob ject in the analysis of the All no des accessible via inside edges from l :f should 2 metho d. clearly b e in S . But if l p oints to an escap ed no de, other 2 I = fhp ;n i:0 i k g 0 p parts of the program such as the caller or threads executing i i in parallel with the current thread can access the referenced In O , each accessed static class variable cl :f p oints 0 i i ob ject and store values in its elds. In particular, the value to its corresp onding global no de n . cl :f in l :f may have b een written by the caller or a thread i i 2 running in parallel with the current thread | in other words, O = fhhcl ; f i;n i:1 i j g 0 i i cl :f i i l :f may contain a reference created outside of the current 2 analysis scop e. The analysis uses an outside edge to mo del For all n, e n= ;. this reference. The outside edge p oints to the load no de for 0 the load statement, which is the outside no de that represents r = ;. 0 the ob jects that the reference may p oint to. The analysis must therefore consider two cases: the case Note that the algorithm analyzes the metho d under the as- when l do es not p ointto an escap ed no de, and the case 2 sumption that the parameters and accessed static class vari- when l do es p oint to an escap ed no de. The algorithm 2 ables all p oint to di erent ob jects. If the metho d maybe determines which case applies by computing S , the set of E invoked in a calling context in which some of these p ointers escap ed no des to which l p oints. S is the set of no des 2 I p oint to the same ob ject, this ob ject will b e represented by accessible via inside edges from l :f . 2 multiple no des during the analysis of the metho d. In this case, the mapping op eration describ ed b elow in Section 5 S = fn 2 I l :escap ed hO ; I ; e; r i;n g E 2 2 2 will merge the corresp onding outside ob jects when it maps S = [fI n ; f:n 2 I l g I 2 2 2 the nal analysis result for the metho d into the calling con- text at the metho d invo cation site. Because the mapping If S = ; i.e., l do es not p oint to an escap ed no de, E 2 retains all of the edges from the merged ob jects, it conser- S = S and the transfer function simply kills all edges from I vatively mo dels the actual e ect of the metho d. l , then generates inside edges from l to all of the no des 1 1 Once it has nished constructing the initial p oints-to es- in S . cap e graph, the analysis continues by propagating p oints- Kill = edges I; l I 1 to escap e graphs through the statements of the metho d's Gen = fl gS I 1 0 0 0 0 0 control ow graph. The transfer function hO ;I ;e ;r i = I = I Kill [ Gen I I [[ st ]] hO ; I ; e; r i de nes the e ect of each statement st on If S 6= ; i.e., l p oints to at least one escap ed no de, S = the current p oints-to escap e graph. Most of the statements E 2 S [fng, where n is the load no de for the load statement. In rst kill a set of inside edges, then generate additional inside I addition to killing all edges from l , then generating inside and outside edges. In this case, the transfer function has the 1 edges from l to all of the no des in S , the transfer function following general form: 1 also generates outside edges from the escap ed no des to n. 0 I = I Kill [ Gen I I 0 Kill = edges I; l O = O [ Gen I 1 O Gen = fl gS I 1 0 Figure 6 graphically presents the rules that determine the I = I Kill [ Gen I I sets of generated edges for the di erent kinds of statements. Gen = S ffg fng O E 0 Eachrow in this gure contains four items: a statement, a O = O [ Gen O graphical representation of existing edges, a graphical rep- resentation of the existing edges plus the new edges that the 4.3 Store Statements statement generates, and a set of side conditions. The inter- pretation of eachrow is that whenever the p oints-to escap e A store statement of the form l :f = l nds the ob ject to 1 2 graph contains the existing edges and the side conditions are which l p oints, then makes the f eld of this ob ject p oint 1 satis ed, the transfer function for the statement generates to same ob ject as l . The analysis mo dels the e ect of this 2 the new edges. assignment by nding the set of no des that l p oints to, 1 then generating inside edges from all of these no des to the no des that l p oints to. 2 4.1 Copy Statements A copy statement of the form l = v makes l p oint to the Gen = I l ffg I l I 1 2 0 ob ject that v p oints to. The transfer function up dates I to I = I [ Gen I re ect this change by killing the current set of edges from l , Figure 6: Generated Edges for Basic Statements 4.9 Control-Flow Join Points 4.4 Global Load Statements To analyze a statement, the algorithm rst computes the A global load statement of the form l = cl:f makes l p oint join of the p oints-to escap e graphs owing into the statement to the same ob ject that cl :f p oints to. The analysis mo dels from all of its predecessors. It then applies the transfer this change by killing all edges from l and generating inside function to obtain a new p oints-to escap e graph at the p oint edges from l to all of the no des that cl :f p oints to. after the statement. The join op eration is de ned as follows. Kill = edgesI; l I hO ;I ;e ;r ithO ;I ;e ;r i = hO ; I ; e; r i, where O = O [ 1 1 1 1 2 2 2 2 1 Gen = flgO [ I cl; f I O , I = I [ I , 8n 2 N:en=e n [ e n, and r = r [ r . 0 2 1 2 1 2 1 2 I = I Kill [ Gen I I The corresp onding partial order for p oints-to escap e graphs is hO ;I ;e ;r ivhO ;I ;e ;r i if O O , I I , 8n 2 1 1 1 1 2 2 2 2 1 2 1 2 4.5 Global Store Statements N:e n e n, and r r . The b ottom p oints-to escap e 1 2 1 2 graph is h;; ;;e ; ;i, where e n=; for all n. ? ? A global store statementof the form cl:f = l makes the The analysis of each metho d pro duces analysis results static class variable cl :f p oint to the same ob ject as l. The st and st b efore and after each statement st in the analysis mo dels this change by generating inside edges from metho d's control ow graph. The analysis result satis es cl :f to all of the no des that l p oints to. the following equations: Gen = fhcl; fig I l I 0 I = I [ Gen I enter = hO ;I ;e ;r i op 0 0 0 0 0 0 st = tf st :st 2 predstg st = [[ st ]] st 4.6 Object Creation Sites An ob ject creation site of the form l = new cl allo cates a The nal analysis result of metho d op is the analysis result new ob ject and makes l p oint to the ob ject. The analysis at the program p oint after the exit no de, i.e., exit . op represents all ob jects allo cated at a sp eci c creation site As describ ed b elow in Section 5.3, the analysis solves these with the creation site's inside no de n. The transfer function equations using a standard worklist algorithm. mo dels the e ect of the statementby killing all edges from l , then generating an inside edge from l to n. 4.10 Strong Up dates Kill = edges I; l I Our analysis mo dels the execution of statements that up date Gen = fhl ;nig I 0 memory lo cations by adding edges to the no des that repre- I = I Kill [ Gen I I sent the up dated lo cations. There are two p ossible kinds of up dates: weak up dates, which leave the existing edges in 4.7 Return Statements place, and strong up dates, which remove the existing edges. Because strong up dates leave fewer edges in the p oints-to es- A return statement return l sp eci es the return value for cap e graph, they may pro duce more precise analysis results. the metho d. The immediate successor of each return state- In the analysis presented so far, up dates to lo cal variables ment is the exit statement of the metho d. The analysis are strong and all other up dates are weak. mo dels the e ect of the return statementby up dating r to The analysis can legally p erform a strong up date when- include all of the no des that l p oints to. ever the up dated no de is captured and represents exactly 0 r = I l one up dated memory lo cation when the program runs. For a store statementof the form l :f = l , this condition is 1 2 4.8 Exit Statements satis ed whenever l p oints to a single captured no de n, 1 n represents a single ob ject, and f represents a single lo- The transfer function for an exit statement exit pro duces op cation in that ob ject. The last condition is satis ed unless the nal analysis result for the metho d. This result is used, f = elements , the sp ecial eld identi er for array elements at all call sites that may invoke the metho d, to compute discussed in Section 3.2. the e ect of the metho d on the analysis of its caller. The primary activity of the transfer function is to remove infor- 4.10.1 Strong Up dates for Singular No des and Fields mation from the p oints-to escap e graph that should not b e visible to the caller. The algorithm rst computes the set The no de n is singular if it represents one ob ject. This is the S of no des that are not reachable from the static class vari- case, for example, if n is an inside no de that corresp onds to a ables, the parameters, or the return values. It then removes statement executed at most once within the current analysis these no des from the p oints-to escap e graph. scop e. The eld f is singular if it represents a single lo cation within an ob ject. Given these de nitions, we can provide S = fn 2 N:not reachable O [ I; CL [ P [ r;ng 0 0 0 0 0 the following de nition of I for the transfer function of a hO ;I ;e ;r i = removeS; hO ; I ; e; r i store statement l :f = l . With this de nition, the transfer 1 2 The analysis uses the reachability information at metho d function for store statements p erforms strong up dates. exit p oints to b ound ob ject lifetimes. If an inside no de is not reachable from the static class variables, the parameters, edgesFromn; f if I l =fng; n; f singular ; 1 the return values, or a thread ob ject, then all of the ob jects and capturedhI ; O ; e; r i;n Kill = I it represents i.e., all of the ob jects allo cated at the corre- ; otherwise sp onding ob ject allo cation site within the current analysis Gen = I l ff g I l I 1 2 0 scop e are inaccessible b oth to the metho d's caller and to I = I Kill [ Gen I I other threads. When the metho d returns, these ob jects are inaccessible to the rest of the computation. It is therefore legal to allo cate these ob jects in the activation record of the metho d and to access the ob jects without synchronization. The old graph is the p oints-to escap e graph hO ; I ; e; r i 4.10.2 Summary No des at the p oint b efore the metho d invo cation site. It is p ossible to extend the ob ject representation so that each inside no de would represent only the last ob ject allo cated at The incoming graph is the nal analysis result the corresp onding allo cation site. All other no des allo cated hO ;I ;e ;r i = exit for the invoked metho d. R R R R op at this site would b e represented by a summary no de. With The mapping algorithm pro duces the new graph this approach, the analysis would p erform strong up dates for hO ;I ;e ;r i = mapm; hO ; I ; e; r i; op. M M M M all inside no des, and weak up dates only for summary no des. Similar approaches have b een prop osed for intrapro cedural shap e analysis algorithms [24 , 12 ]. 5.2.1 Overview Conceptually, the algorithm rst builds an initial mapping 4.11 An Optimization for Static Class Variables N : N ! 2 from the no des of the incoming graph to the no des of the old graph. For each outside no de n in the in- In the absence of information ab out how parallel threads ac- coming graph, n is the set of no des from the old graph cess static class variables, the extracted analysis information that n represents during the analysis of the invoked metho d imp oses no limit on the lifetimes or p oints-to relationships op . This initial value of maps global no des, parameter of ob jects reachable from these variables. The implemented no des, and load no des to their corresp onding no des in the compiler therefore adopts a more compact representation new graph. It builds the mapping for load no des by tracing for the p oints-to relationships involving no des that repre- correlated paths in the old graph and the incoming graph. sent these ob jects. Instead of recording the sp eci c set of Each path in the old graph consists of a sequence of inside static class variables that may p oint to a no de, the analysis edges; the corresp onding path in the incoming graph con- simply records that at least one do es. This representation sists of the sequence of outside edges that represent the cor- reduces the size of the p oints-to escap e graph without re- resp onding inside edges during the analysis of the metho d. ducing the amount of useful information. The algorithm then builds the new p oints-to escap e graph hO ;I ;e ;r i = mapm; hO ; I ; e; r i; op. The algorithm M M M M 5 Interpro cedural Analysis starts by initializing the new graph to the old graph hO ; I ; e; r i. It then uses the mapping to map no des and edges from the At each metho d invo cation site, the analysis has the option incoming graph into the new graph. The mapp ed no des rep- of either skipping the site or analyzing the site. If it analyzes resent ob jects allo cated or accessed by the invoked metho d. the site, it collects the nal analysis results from all of the The mapp ed edges represent references that the invoked p otentially invoked metho ds, maps each analysis result into metho d created or read. During the mapping pro cess, edges the p oints-to escap e graph from the program p oint b efore are mapp ed from pairs of no des in the incoming graph to the metho d invo cation site, then merges the mapp ed results corresp onding pairs of no des in the new graph. In this case to derive the p oints-to escap e graph at the p oint after the wesay that the no des in the new graph acquire the edges metho d invo cation site. If the analysis skips a metho d invo- from the no des in the incoming graph. cation site, it marks all of the parameters as escaping down As part of the mapping pro cess, the algorithm extends into the site. so that if the algorithm maps a no de n from the incoming graph into the new graph, n 2 n. In this case we say 5.1 Skipp ed Metho d Invo cation Sites that n is present in the new graph. We next describ e the roles that the di erent kinds of The transfer function for a skipp ed metho d invo cation site is no des in the incoming graph play in the analysis, and discuss de ned as follows. Given a skipp ed metho d invo cation site how these roles a ect each no de's presence in the new graph, m of the form l = l :op l ;:::;l with return no de n 0 1 R k its e ect on the mapping , and the edges that are mapp ed and a current p oints-to escap e graph hO ; I ; e; r i, the p oints- 0 0 into the new graph. to escap e graph hO; I ;e;r i = [[ m]] hO ; I ; e; r i after the site is de ned as follows: n 2 N : If n is a parameter no de, it represents the P 0 I = I edges I; l [fhl;n ig R no des that the corresp onding actual parameter p oints en [fmg if 90 i k:n 2 I l i 0 to. These no des should acquire n's edges, but n itself e n = en otherwise should not b e present in the new graph. n is the set of no des in the current graph that the The return no de n is an outside no de used to represent the R corresp onding actual parameter p oints to. return value of the invoked metho d. n 2 N : If n is a global no de, it represents the ob jects G 5.2 Analyzed Metho d Invo cation Sites that the corresp onding static class variable p oints to. These no des should acquire n's edges, and n should b e Given an analyzed metho d invo cation site m of the form present in the new graph. Note that n is also escap ed l = l :op l ; :::;l and a current p oints-to escap e graph 0 1 k 0 0 0 0 in the the new graph. hO ; I ; e; r i, the new p oints-to escap e graph hO ;I ;e ;r i = [[ m ]] hO ; I ; e; r i after the site is de ned as follows: n is the set of no des in the current graph that the 0 0 0 0 corresp onding static class variable p oints to. Note that hO ;I ;e ;r i = tfmapm; hO ; I ; e; r i; op:op 2 callees mg n 2 n. We next present the mapping algorithm for metho d in- n 2 N :Ifn is a load no de, all of the no des that it rep- L vo cation sites. We assume a metho d invo cation site m of the resents should acquire its edges. If n may representan form l = l :op l ;: ::;l and an invoked metho d op with 0 1 k ob ject created outside the analysis scop e of the caller formal parameter list p ;:: :;p and accessed static class 0 k or an ob ject accessed via a reference created outside variables cl :f ;: ::;cl :f . There are three p oints-to es- 1 1 j j cap e graphs involved in the algorithm: mappings and edges. Figure 9 graphically presents the gen- that scop e, it and its edges should also b e mapp ed into erated mappings and edges for some of the crucial rules. the new graph. Each row in this gure contains four items: an inference The basic idea is that n should b e present in the new rule, a graphical representation of existing edges and map- graph only if an escap ed no de would p oint to it | pings, a graphical representation of the existing edges and there should b e no outside edges from captured no des. mappings plus the new edges and mappings that the rule The analysis determines if n should b e presentby ex- generates, and a set of side conditions. The no des on the amining all of the no des in the incoming graph that b ottom row of each graphical representation are from the p ointton. There are two cases: incoming graph, while the no des on the top row are from 0 the new graph. The interpretation of eachrow is that if the { If at least one of these no des say n is present existing edges and mappings are present in the incoming and escap ed in the new graph, n should also b e graph and new graph, and the side conditions are true, then present, and there should b e an outside edge in 0 the inference rule generates the new edges and mappings. the new graph from n to n. { If at least one of these no des is mapp ed to an es- 0 0 i k; n 2 I l i cap ed no de say n from the old graph, n should 1 n 2 n p b e present, and there should b e an outside edge i 0 in the new graph from n to n. 1 i j 2 cl 2 cl i i In b oth cases, n is escap ed in the new graph. n includes the set of no des in the old graph that n rep- hhn ; fi;n i2 O ; hhn ; fi;n i2I; 1 2 R 3 4 resented during the analysis of the metho d. And n 2 n 2 n ;n 62 N 3 3 1 1 I nifn is present in the new graph. n 2 n 4 2 n 2 N : If n is an inside no de, it and its inside edges I should b e present in the new graph if the ob jects that Figure 7: Initial Rules for it represents are reachable in the caller. The analysis determines if n should b e presentby examining all of O O I edges l I M M 4 the no des in the incoming graph that p oint to it. As en e n r r M M for load no des, there are two cases: hhn ; fi;n i2 I 0 1 2 R { If at least one of these no des say n is present 5 n ffg n I 1 2 M in the new graph, n should also b e present, and 0 there should b e an edge in the new graph from n hhn ; fi;n i2 I ;n 2 n ;n 2 N [ N 1 2 R 1 2 I R 6 to n. n 2 n 2 2 { If at least one of these no des represents a no de 0 hhn ; fi;n i2O ;n 2 n ; 1 2 R 1 say n in the old graph, n should be present, escap ed hO ;I ;e ;r i;n 7 M M M M and there should be an edge in the new graph 0 hhn; fi;n i2 O ;n 2 n 2 M 2 2 from n to n. n 2 r \ N [ N R I R A primary di erence between load no des and inside 8 n 2 n no des is that load no des are present in the new graph only if they are escap ed in that graph. Inside no des 0 e n 6= ;;n 2 n R 9 are present if they are reachable, and can be either 0 m 2 e n M escap ed or captured in the new graph. [fn:n 2 r g I l 10 R M If an inside no de n is present in the new graph, n= fng. Otherwise n=;. Figure 8: Rules for and hO ;I ;e ;r i M M M M n 2 N : If n is a return no de these no des represent R return values from invoked but unanalyzed metho ds, the conditions are the same as for inside no des. 5.2.3 Initial Rules 5.2.2 Constraints for the Mapping Algorithm The initial set of contraints, Rules 1 through 3, set up the initial mapping so that all outside no des in the incoming Figures 7 and 8 presenta formal sp eci cation of the con- graph are mapp ed to the no des in the old graph that they straints that the mapping and the new p oints-to escap e represent during the analysis of the invoked metho d op. graph hO ;I ;e ;r i must satisfy. The constraints are M M M M sp eci ed as a set of inference rules that the nal mapping Rule 1 maps each parameter no de n to the no des p 2 i and p oints-to escap e graph must satisfy. I l in the old graph that it represents during the i Conceptually, many of these inference rules start with an 3 analysis of the metho d. existing mapping and set of edges, then generate additional 2 Rule 2, along with Rule 3, ensures that each accessed Each inference rule is written in the standard form global no de is mapp ed to the no des in the old graph c ;:::;c 1 k 0 0 3 ;:::;c c Recall that the metho d invo cation site has actual parameters j 1 l ;:: :;l , the metho d has formal parameters p ;:::;p , and n 0 p k 0 k i which imp oses the constraint that if all of c ;:::;c are true, then 1 k is the parameter no de for p . 0 0 i all of c ; :::;c must b e true. 1 j Figure 9: Generated Edges and Mappings for Inference Rules 4 that it represents during the analysis of the metho d. inside edges into the new graph take place inside mapNo de, it is resp onsible for ensuring that the new graph contains Rule 3 maps outside no des to the no des that they rep- the right inside edges. resent during the analysis of the metho d, matching The control structure of the algorithm is based on three outside edges in the incoming graph with corresp ond- worklists. Eachworklist corresp onds to one of the rules that ing inside edges in the old graph. The constraint starts map no des from the incoming graph to the new graph. Each with a no de n from the incoming graph that already 1 worklist entry contains a no de n from the new graph and maps to a no de n in the old graph. It then matches 3 an edge hhn ; fi;n i from the incoming graph. All worklist 1 2 edges from these two no des to nd a load no de n from 2 entries have the prop erty that n 2 n . 1 the incoming graph that representsanoden in the 4 When the algorithm pro cesses the worklist entry,itchecks new graph during the analysis of the metho d. The to see if it should up date the mapping for n . The details 2 constraint maps n to n . 2 4 of the check dep end on the sp eci c worklist. Rules 1 through 3 map outside no des only. Rule 6, how- The worklist W corresp onds to Rule 3, which matches E ever, may map an inside no de into the new graph. It is outside edges from the incoming graph against inside imp ortant that Rule 3 do es not match outside edges from edges in the old graph. All edges in this worklist are these inside no des against inside edges in the old graph. The outside edges. To pro cess an entry from this worklist, reason for this restriction is that inside no des represent ob- the algorithm matches the edge hhn ; fi;n i from the 1 2 jects that were allo cated inside the current analysis scop e. worklist against all corresp onding inside edges These ob jects did not exist when the analyzed metho d was hhn; fi;n i in the old graph. For each corresp onding 4 invoked. Any outside edges from inside no des therefore rep- edge it maps n to n . 2 4 resent references created either by threads running in paral- The worklist W corresp onds to Rule 6, which maps lel with the current analysis scop e, or by unanalyzed meth- I inside no des into the new graph. All edges in this ods invoked by the current analysis scop e. In either case worklist are inside edges. To pro cess an entry from the reference did not exist when the metho d was invoked. this worklist, the algorithm extracts the no de n that Because all inside edges in the old graph existed when the 2 the worklist edge p oints to, and maps n into the new analyzed metho d was invoked, the outside edge in the in- 2 graph. coming graph did not represent these edges. The worklist W corresp onds to Rule 7, which maps O 5.2.4 Remaining Rules outside no des into the new graph. All edges in this worklist are outside edges. To pro cess an entry from Rule 4 initializes the new graph to include the old graph. this worklist, the algorithm inserts a corresp onding Rule 5 maps inside edges from the incoming graph into the outside edge hhn; f i;n i into the new graph and maps 2 new graph. There is an inside edge b etween two no des in n into the new graph. 2 the new graph if there is an inside edge b etween their two corresp onding no des in the incoming graph. There is a slight complication in the algorithm. As the Rules 6 through 8 map no des from the incoming graph algorithm executes, it p erio dically maps inside edges from into the new graph. Rule 6 maps an inside or return no de the incoming graph into the new graph. Whenever a no de into the new graph if it is reachable from a no de that is n is mapp ed to n, the algorithm maps each inside edge 1 already mapp ed into the new graph. Rule 7 maps a load hhn ; fi;n i from the incoming graph into the new graph. 1 2 no de into the new graph if it is reachable from an escap ed This mapping pro cess inserts a corresp onding inside edge no de that is already mapp ed into the new graph. Finally, from n to each no de that n maps to; i.e., to eachnodein 2 Rule 8 maps an inside or return no de into the new graph if n . The algorithm must ensure that when it completes, 2 it is returned by the metho d. there is one such edge for each no de in the nal set of no des Rule 9 ensures that all no des passed into unanalyzed n . But when the algorithm rst maps hhn ; fi;n i into 2 1 2 metho ds are marked appropriately in the escap e function. the new graph, n may not b e complete. In this case, 2 Rule 10 makes l the lo cal variable that is assigned to the the algorithm will eventually map n to more no des, in- 2 value that the metho d returns p oint to the no des that rep- creasing the set of no des in n . There should b e edges 2 resent the return value of the metho d. from n to all of the no des in the nal n , not just to 2 those no de that were presentin n when the algorithm 2 5.2.5 Constraint Solution Algorithm mapp ed hhn ; fi;n i into the new graph. 1 2 The algorithm ensures that all of these edges are present Figures 10 and 11 present an algorithm for solving the con- in the nal graph by building a set n of delayed actions. 2 straint system in Figures 7 and 8. The algorithm directly Each delayed action consists of a no de in the new graph and re ects the structure of the inference rules. At each step it a eld in that no de. Whenever the no de n is mapp ed to a 2 detects an inference rule antecedent that b ecomes true, then 0 0 new no de n i.e., the algorithm sets n =n [fn g, 2 2 takes action to ensure that the consequent is also true. the algorithm establishes a new inside edge for each delayed The mapNo den ;n pro cedure is invoked whenever the 1 action. The new edge go es from the no de in the action to algorithm maps a no de n from the incoming graph to a 1 0 the newly mapp ed no de n . These edges ensure that the no de n in the new graph. It rst up dates the escap e func- nal set of inside edges satis es the constraints. tion e n to re ect any new escap e information. It then M maps new inside edges b oth to and from n to re ect the 5.3 Global Fixed-Point Analysis Algorithm corresp onding inside edges from the incoming graph. Rule 5 is the only constraint that maps inside edges from the in- Figure 12 presents the global xed-p oint algorithm that the coming graph into the new graph. Because all insertions of compiler uses to solve the combined intrapro cedural and in- 4 terpro cedural data ow equations. It maintains a worklist of Recall that the metho d accesses static class variables cl :f ; :::;cl :f , and that n is the global no de for cl :f . p ending statements. At each step, it removes a statement 0 0 j j i i cl :f i i Initialize analysis results for all st 2 ST do mapNo den ;n st = st=h;; ;;e ; ;i 1 ? if hn ;ni62D then for all op 2 OP do 1 n =n [fng enter =hO ;I ;e ;r i 1 1 op 0 0 0 0 D = D [fhn ;nig Initialize the worklist 1 Update e n to satisfy Rule 9 W = fenter :op 2 OPg M op if e n 6= ; then e n= e n [fmg while W 6= ;do R 1 M M Update I to satisfy Rule 5 Remove a statement from worklist M I = I [ n fng W = W fst g M M 1 for all hhn ; fi;n i2 I do Process the statement 1 2 R 0 0 I = I [fhn; fig n st =tf st :st 2 predstg M M 2 n = n [fhn; f ig st =[[ st ]] st 2 2 Update worklists for Rules 3,6 and 7 if stchanged then W = W [ fngedgesFromO ;n Put potential ly a ected statements E E R 1 if n = n then onto worklist 1 W = W [ fngedgesFromI ;n W = W [ succst I I R 1 W = W [ fngedgesFromO ;n if st exit then O O R 1 op W = W [ callers op Figure 10: Pro cedure for Mapping One No de to Another Figure 12: Fixed-Point Analysis Algorithm Initialize worklists W = W = W = ; E I O Initialize new graph to satisfy Rule 4 and up dates the analysis results b efore and after the state- hO ;I ;e ;r i = hO; I edgesl ;e;ri M M M M ment. If the analysis result after the statement changed, Map parameter nodes to satisfy Rule 1 it inserts all of its successors or for exit no des, all of the for 0 i k do callers of its metho d into the worklist. for all n 2 I l do mapNo den ;n i p The order in which the algorithm analyzes metho ds can i Map classes to satisfy Rule 2 have a signi cant impact on the analysis. For non-recursive for 1 i j do mapNo decl ; cl metho ds, a b ottom-up analysis of the program yields the full i i Map return values to satisfy Rule 8 result with one analysis p er metho d. For recursive metho ds, for all n 2 r \ N [ N do mapNo den; n the analysis results must b e iteratively recomputed within R I R done = false each strongly connected comp onent of the call graph us- while not done do ing the current b est result until the analysis reaches a xed done = true p oint. if cho ose hn ; hhn ; fi;n ii 2 W such that It is p ossible to extend the algorithm so that it initially 3 1 2 E n 62 N then skips the analysis of metho d invo cation sites. If the analysis 1 I W = W fhn ; hhn ; fi;n iig result is not precise enough, it can incrementally increase E E 3 1 2 Map nodes to satisfy Rule 3 the precision by analyzing metho d invo cation sites that it for all hhn ; f i;n i2I do mapNo den ;n originally skipp ed. The algorithm will then propagate the 3 4 2 4 done = false new, more precise result to up date the analysis results at else if cho ose hn; hhn ; fi;n ii 2 W such that a ected program p oints. 1 2 I n 2 N [ N then 2 I R W = W fhn; hhn ; fi;n iig I I 1 2 6 Abstraction Relation Map inside node to satisfy Rule 6 mapNo den ;n 2 2 In this section, wecharacterize the corresp ondence b etween done = false p oints-to escap e graphs and the ob jects and references cre- else if cho ose hn; hhn ; fi;n ii 2 W such that 1 2 0 ated during the execution of the program. Akey prop erty escap ed hO ;I ;e ;r i;n then M M M M of this corresp ondence is that a single concrete ob ject in W = W fhn; hhn ; fi;n iig O O 1 2 the execution of the program may b e represented bymulti- Map outside edge and outside node ple no des in the p oints-to escap e graph. We therefore state to satisfy Rule 7 the prop erties that characterize the corresp ondence using an O = O [ fhhn; f i;n ig M M 2 abstraction relation, which relates each ob ject to all of the mapNo den ;n 2 2 no des that represent it. done = false As the program executes, it creates a set of concrete Update I l to satisfy Rule 10 M ob jects o 2 C and a set of references r 2 R V C [ I l =[fn:n 2 r g M R CL F C [ C F Cbetween ob jects. At each p oint in the execution of the program, it is p ossible to de ne the following sets of references and ob jects: Figure 11: Constraint Solution Algorithm for and R is the set of references created by the current ex- C hO ;I ;e ;r i M M M M ecution of the current metho d and all of the analyzed metho ds that it invokes. These prop erties ensure that captured ob jects are reach- R is the set of references read by the current exe- R able only via paths that start with the lo cal variables. cution of the current metho d and all of the analyzed If an ob ject is captured at a metho d exit p oint, it will metho ds that it invokes. therefore b ecome inaccessible as so on as the metho d C is the set of ob jects reachable from the lo cal vari- R returns. ables, static class variables, and parameters by follow- The p oints-to information in the p oints-to escap e graph ing references in R [ R . C R completely characterizes the references between ob- R = R \ C F C is the set of inside refer- I C R R jects represented by captured no des: ences. These are the references represented by the set { capturedhO ; I ; e; r i;n ; capturedhO ; I ; e; r i;n ; of inside edges in the analysis. 1 2 n 2 o ;n 2 o and hhn ; fi;n i62 I implies 1 1 2 2 1 2 R =R \ C F C R is the set of outside O R R R I hho ; fi;o i62 R 1 2 references. These are the references represented by the set of outside edges in the analysis. 7 Optimizations It is always p ossible to construct an abstraction relation For optimization purp oses, the two most imp ortant prop er- C N between the ob jects and the no des in the p oints- ties are that the absence of edges b etween captured no des to escap e graph hO ; I ; e; r i at the current program p oint. guarantees the absence of references b etween the represented This relation relates each ob ject to all of the no des in the ob jects, and that captured ob jects are inaccessible when the p oints-to escap e graph that represent the ob ject during the metho d returns. In this section we discuss two transforma- analysis of the metho d. The abstraction relation has all of tions, stack allo cation of ob jects and synchronization elimi- the prop erties describ ed b elow. nation, that the escap e information enables. Reachable ob jects are represented by their allo cation sites. If o was created at an ob ject creation site within 7.1 Synchronization Elimination the current execution of the current metho d or ana- If an ob ject do es not escap e a thread, it is legal to remove lyzed metho ds that it invokes, and o is reachable i.e. the lo ck acquire and release op erations from synchronized o 2 C , n 2 o, where n is the ob ject creation site's R metho ds that execute on that ob ject. The b ene t of this inside no de. transformation is the elimination of synchronization over- 5 Each ob ject is represented by at most one inside no de: head To apply the transformation, the compiler generates a { n ;n 2 o and n ;n 2 N implies n = n 1 2 1 2 I 1 2 sp ecialized, synchronization-free version of each synchronized metho d that may execute on a captured ob ject. At each call All outside references have a corresp onding outside site that invokes the metho d only on captured ob jects, the edge in the p oints-to escap e graph: compiler generates co de that invokes the synchronization- { hv ;oi2 R implies O v \ o 6= ; free version. O An issue that the compiler must deal with is the dis- { hhcl ; fi;oi2 R implies O cl ; f \ o 6= ; O tance in the call graph b etween the metho d that captures { hho ; fi;o i2 R implies 1 2 O the ob ject and the synchronized metho d. In addition to o ffg o \ O 6= ; 1 2 removing synchronization op erations from the synchronized metho d, the compiler must also generate sp ecialized ver- All inside references have a corresp onding inside edge sions of all metho ds in the call chain from the capturing in the p oints-to escap e graph: metho d to the metho d containing the call site that invokes the synchronization-free version. Our sp ecialization p olicy { hv ;oi2 R implies I v \ o 6= ; I imp oses no limit on the numb er of sp ecialized metho ds | { hhcl ; fi;oi2 R implies I cl; f \ o 6= ; I it applies the transformation whenever p ossible. { hho ; fi;o i2 R implies 1 2 I A nal problem is nding the paths in the call graph that o ffg o \ I 6= ; 1 2 go from captured ob jects to synchronized metho ds. The compiler solves this problem by augmenting the analysis to If an ob ject is represented bya captured no de, it is record, for each no de in the p oints-to escap e graph, paths in represented by only that no de: the call graph that lead to synchronized op erations on the ob jects represented by that no de. For captured ob jects, the { n 2 o and capturedhO ; I ; e; r i;n implies compiler can use this information to trace a path from the o= fng captured ob ject down to the call sites that invoke synchro- Given this prop erty,we de ne that an ob ject is cap- nized metho ds on that ob ject. tured if it is represented by a captured no de. All ref- erences to captured ob jects are either from lo cal vari- 7.2 Stack Allo cation of Objects ables or from other captured ob jects: If an ob ject do es not escap e a metho d, it can b e allo cated { n 2 o, capturedhO ; I ; e; r i;n, and hv ;oi2R on the stack instead of in the heap. One b ene t of this implies v 2 L transformation is the elimination of garbage collection over- head. Instead of b eing pro cessed by the collector, the ob ject { n 2 o , capturedhO ; I ; e; r i;n , and 2 2 2 hho ; fi;o i 2 R implies 9n 2 N:o = fn g 5 1 2 1 1 1 To preserve the semantics of the Java memory mo del on machines and capturedhO ; I ; e; r i;n 1 that implementweak memory consistency mo dels, the compiler may need to insert memory barriers at the original lo ck acquire and release sites when it removes the acquire and release constructs. will b e implicitly collected when the metho d returns and the This approach provides excellent supp ort for dynamically stack rolls back. Another b ene t is b etter memory lo cality loaded programs. It allows the compiler to analyze a large, b ecause the stack frame will tend to b e resident in the cache. commonly used package such as the Java Class Libraries Finally, if the compiler can precisely compute the lo cation once, then reuse the analyze results every time a program is of the ob ject on the stack, it can generate co de that accesses loaded that uses the package. It also supp orts the delivery the ob ject elds directly in the stack frame. of preanalyzed packages. Instead of requiring the analysis The largest p otential drawback of stack allo cation is that to b e p erformed when the package is rst loaded into a cus- it may increase memory consumption by extending the life- tomer's virtual machine, a vendor could p erform the analysis time of ob jects allo cated on the stack. Because this prob- as part of the release pro cess, then ship the analysis results lem may b e esp ecially acute for allo cation sites that create along with the co de. a statically unb ounded numb er of ob jects on the stack, our When the Jalap eno ~ compiler generates co de for a dynam- implementation allo cates ob jects on the stack only if the al- ically loaded class, it uses the information in the analysis le lo cation site will b e executed at most once p er invo cation to apply the stack allo cation and synchronization elimina- 6 of the al locating method, or the metho d that contains the tion optimizations describ ed ab ove. One unusual feature ob ject allo cation site. of the system is that the analysis is applied to the compiler In the simplest case, a captured ob ject do es not escap e and virtual machine during the b o otstrap pro cess. The en- its allo cating metho d. In this case, the compiler can sim- tire system, including the dynamic compiler and the virtual ply replace the normal ob ject allo cation instruction with a machine as well as the applications, therefore executes with the optimizations applied. sp ecial instruction that allo cates the ob ject on the heap. If an ob ject escap es its allo cating metho d, but is recap- tured within a caller direct or indirect, it may b e p ossi- 8.2 Benchmark Set ble to allo cate the ob ject on the stack of the metho d that 7 Our b enchmark set includes four programs. Javac is the captures the ob ject. In addition to requiring that stack- 8 standard Java compiler. Javacup isayacc-like parser gen- allo cated ob jects come from allo cation sites that are exe- 9 erator for Java. Javalex is a lexical analyzer generator for cuted at most once p er invo cation of the allo cating metho d, Java. Pb ob is a transaction-pro cessing b enchmark designed the compiler also requires that the allo cating metho d b e in- to quantify the p erformance of simple transactional server voked at most once for each call site in the captured metho d workloads written in Java. that leads to the allo cating metho d. Wechose these applications in part b ecause they are all If the allo cation site satis es this restriction, the compiler complete programs in regular use. We therefore exp ect them generates a sp ecialized version of the allo cating metho d. In- to exhibit realistic ob ject creation and access patterns. They stead of allo cating the ob ject itself, the sp ecialized version were also chosen to test the scalability of our analysis. With takes, as a parameter, a p ointer to preallo cated space in the the asso ciated and analyzed class libraries, all three pro- capturing metho d's activation record. The allo cation site grams are on the order of tens of thousands of lines of co de, is transformed to initialize the ob ject at the given lo cation, and many existing p ointer analysis algorithms are impracti- rather than to allo cate it in the heap. As for the synchro- cal for programs of this size. Figure 13 presents the number nization elimination transformation describ ed ab ove, the com- of lines of co de and the total sizes of the class les for these piler must generate sp ecialized versions of all metho ds on applications. JVM is the Jalap eno ~ virtual machine. the call chain from the capturing metho d to the allo cating metho d. For this transformation to interop erate correctly with Lines of Class File the rest of the system, the garbage collector must recog- Benchmark Co de Size bytes nize stack-allo cated ob jects. The recognition mechanism is javac - 87,801 simple | it examines the address of the ob ject to deter- javacup 10,574 157,057 mine if it is allo cated on a stack or in the heap. During 8,159 95,229 javalex a collection, the collector still scans stack-allo cated ob jects pb ob 18,370 253,752 normally. But it do es not attempt to move or collect stack- JVM 70,536 456,494 allo cated ob jects. 8 Exp erimental Results Figure 13: Benchmark Characteristics Wehave implemented a combined p ointer and escap e analy- We staged the analysis of the applications as follows. sis based on the algorithm describ ed in Sections 4 and 5. We We rst analyzed the virtual machine and the standard li- implemented the analysis in the compiler for the Jalap eno ~ braries, writing the analysis results out to the appropriate JVM [8 ], a Java virtual machine written in Java with a few les. We then analyzed the application co de, reusing the unsafe extensions for p erforming low-level system op erations analysis results from the standard libraries. such as explicit memory management and p ointer manipu- lation. 8.1 Compiler Structure The analysis is implemented as a separate phase of the Jalap eno ~ dynamic compiler, which op erates on the Jalap eno ~ intermediate representation. To analyze a class, the algo- rithm loads the class, converts its metho ds into the interme- diate representation, then analyzes the metho ds. The nal analysis results for the metho ds are written out to a le. Number of Number of Number of Number of Synchronization Synchronization Heap-Allo cated Heap-Allo cated Op erations Op erations Ob jects Ob jects Before After Before After Benchmark Optimization Optimization Benchmark Optimization Optimization javac 4,583,013 2,925,653 javac 2,845,485 2,033,525 1,004,409 330,952 javacup 1,913,594 1,495,141 javacup javalex 2,038,945 1,058,456 javalex 4,389,452 214,803 pb ob 205,198,941 155,764,374 pb ob 56,708,490 39,751,260 Figure 14: Number of Synchronization Op erations Before Figure 16: Numb er of Heap-Allo cated Ob jects Before and and After Optimization After Optimization 100% 100% 80% 80% 60% 60% 40% 40% 20% 20% 0% 0% javac javacup javalex pbob javac javacup javalex pbob Figure 15: Percentage of Eliminated Synchronization Op er- Figure 17: Percentage of Ob jects Allo cated on the Stack ations Size of Size of Heap-Allo cated Heap-Allo cated 8.3 Synchronization Elimination Ob jects Ob jects Figure 14 presents the dynamic numb er of synchronization Before After op erations for all of the applications b oth b efore and af- Benchmark Optimization Optimization ter the synchronization elimination optimization. Figure 15 javac 81,573,448 60,843,884 plots the p ercentage of eliminated synchronization op era- javacup 54,089,120 43,038,640 tions. The analysis achieves reasonable results, eliminating 107,347,648 8,887,836 javalex between 24 to 64 of the synchronization op erations. pb ob 25,950,641,024 18,046,027,560 8.4 Stack Allo cation Figure 18: Total Size of Heap-Allo cated Ob jects Before and Figure 16 presents the dynamic number of heap allo cated After Optimization bytes ob jects for all of the applications b oth b efore and after the stack allo cation optimization. Figure 17 plots the p ercent- age of ob jects allo cated on the stack. Once again, the anal- hieves reasonable results, allo cating b etween 22 and ysis ac 100% 95 of the ob jects on the stack. 18 presents the total size of the heap allo cated Figure 80% ob jects for all of the applications b oth b efore and after the k allo cation optimization. Figure 17 plots the p ercent- stac 60% age of memory allo cated on the stack instead of in the heap. e amount of ob ject memory allo cated on the stack The relativ 40% is closely correlated with the numb er of ob jects allo cated on k. the stac 20% 6 The currentversion of the Jalap eno ~ JVM do es not supp ort stack allo cation. All ob jects are therefore allo cated on the heap. The com- 0% piler do es, however, generate all of the sp ecialized metho ds necessary k allo cation. The compiler fully implements the syn- to supp ort stac javac javacup javalex pbob chronization elimination optimization. 7 See the package sun.to ols.javac in the standard Java distribution. 8 Available at www.cs.princeton.edu/ app el/mo dern/java/CUP/ 9 Figure 19: Percentage of Ob ject Memory Allo cated on the Available at www.cs.princeton.edu/ app el/mo dern/java/JLex/ Stack 9 Related Work 9.1.2 Pointer Analysis for Multithreaded Programs Rugina and Rinard recently develop ed a ow-sensitive p ointer In this section, we discuss several areas of related work: analysis algorithm for multithreaded programs [23 ]. The key p ointer analysis, escap e analysis, and synchronization op- complication in this algorithm is characterizing the interac- timizations. tion b etween multiple threads and how this interaction af- fects the p oints-to relationship. The analysis presented in 9.1 Pointer Analysis this pap er simply sidesteps this issue. It generates precise re- sults only for ob jects that do not escap e. If an ob ject escap es Pointer analysis for sequential programs is a relatively ma- to another thread, our algorithm is not designed to main- ture eld. Flow-insensitive analyses, as the name suggests, tain precise information ab out its p oints-to relationships. It do not take statement ordering into account, and typically is imp ortant to realize that this is a key design decision that use some form of constraint-based analysis to pro duce a allows us to obtain go o d escap e analysis results with what single p oints-to graph that is valid across the entire pro- is essentially a lo cal analysis. gram [2, 27 , 26 ]. Flow-insensitive analyses extend trivially Corb ett describ es a shap e analysis algorithm for multi- from sequential programs to multithreaded programs. Be- threaded Java programs [12 ]. The algorithm can be used cause they are insensitive to the statement order, they triv- to nd ob jects that are accessible to a single thread or ob- ially mo del all of the interleavings of the parallel executions. jects that are accessed by at most one thread at any given time. The goal is to use this information to reduce the size 9.1.1 Flow-Sensitive Analyses of the state space explored by a mo del checker. The algo- Flow-sensitive analyses take the statement ordering into ac- rithm is intrapro cedural, and do es not address the problem count, typically using a data ow analysis to pro duce a p oints- of analyzing programs with multiple pro cedures. to graph or set of alias pairs for each program p oint [25 , 22 , 28 , 18 , 10 , 20 ]. One approach analyzes the program in a top- 9.2 Escap e Analysis down fashion starting from the main pro cedure, reanalyzing There has b een a fair amount of work on escap e analysis each p otentially invoked pro cedure in each new calling con- in the context of functional languages [4 , 3, 29 , 14, 15 , 5, text [28 , 18 ]. This approach exp oses the compiler to the 19 ]. The implementations of functional languages create p ossibility of sp ending signi cant amounts of time reanalyz- many ob jects for example, cons cells and closures implic- ing pro cedures. It also commits the compiler to an analysis itly. These ob jects are usually allo cated in the heap and of the entire program and is impractical for situations in reclaimed later by the garbage collector. It is often p ossible which the entire program is not available. to use a lifetime or escap e analysis to deduce b ounds on the Another approach analyzes the program in a b ottom-up lifetimes of these dynamically created ob jects, and to p er- fashion, extracting a single analysis result for each pro ce- form optimizations to improve their memory management. dure. The analysis reuses the result at each call site that Deutsch [14 ] describ es a lifetime and sharing analysis for mayinvoke the pro cedure. Sathyanathan and Lam present higher-order functional languages. His analysis rst trans- an algorithm for programs with non-recursive p ointer data lates a higher-order functional program into a sequence of typ es [25 ]. This algorithm supp orts strong up dates and ex- op erations in a low-level op erational mo del, then p erforms tracts enough information to obtain fully context-sensitive an analysis on the translated program to determine the life- results in a comp ositional way. Chatterjee, Ryder, and times of dynamically created ob jects. The analysis is a Landi describ e an approach that extracts multiple analysis whole-program analysis. Park and Goldb erg [3 ] also describ e results; each result is indexed under the calling context for an escap e analysis for higher-order functional languages. whichitisvalid [9 ]. Our approach extracts a single analysis Their analysis is less precise than Deutsch's. It is, how- result for calling contexts with no aliases, and merges no des ever, conceptually simpler and more ecient. Their main for calling contexts with aliases. The disadvantage of this contribution was to extend escap e analysis to include lists. approach is that it may pro duce less precise results than Deutsch [15 ] later presented an analysis that extracts the approaches that maintain information for multiple calling same information but runs in almost linear time. Blanchet [5] contexts. The advantage is that it leads to a simpler algo- extended this algorithm to work in the presence of imp era- rithm and smaller analysis results. tive features and p olymorphism. He also provides a correct- Many p ointer and shap e analysis algorithms are based ness pro of and some exp erimental results. on the abstraction of p oints-to graphs [24 , 18 , 28]. The Baker [4 ] describ es an novel approach to higher-order no des in these graphs typically represent memory lo cations, escap e analysis of functional languages based on the typ e ob jects, or variables, and the edges represent references b e- inference uni cation technique. The analysis provides es- tween the represented entities. A standard technique is to cap e information for lists only. Hannan also describ es a use invisible variables [20 , 18 , 28] to parameterize the anal- typ e-based analysis in [19 ]. He uses annotated typ es to de- ysis result so that it can b e used at di erent call sites. Like scrib e the escap e information. He only gives inference rules outside ob jects, invisible variables represent entities from and no algorithm to compute annotated typ es. outside the analysis context. But the relationships b etween invisible variables are typically determined by the aliasing relationships in the calling context or by the typ e system. 9.3 Synchronization Optimizations In our approach, the relationships b etween outside ob jects Diniz and Rinard [16 , 17 ] describ e several algorithms for p er- are determined by the structure of the analyzed metho d | forming synchronization optimizations in parallel programs. each load statement corresp onds to a single outside ob ject. The basic idea is to drivedown the lo cking overhead by coa- This approach leads to a simple and ecient treatmentof lescing multiple critical sections that acquire and release the recursive data structures. Another di erence between the same lo ckmultiple times into a single critical section that approach presented in this pap er and other previously pre- acquires and releases the lo ck only once. When p ossible, the sented approaches is the explicit distinction b etween inside algorithm also coarsens the lo ck granularityby using lo cks and outside edges. algorithm can distinguish where it do es and do es not have in enclosing ob jects to synchronize op erations on nested ob- complete information. As develop ers continue to moveto jects. Plevyak and Chien describ e similar algorithms for a mo del of dynamically loaded, comp onent-based software, reducing synchronization overhead in sequential executions we b elieve this general approach will b ecome increasingly of concurrent ob ject-oriented programs [21 ]. relevant and comp elling. Several research groups have recently develop ed synchro- nization optimization techniques for Java programs. Aldrich, Chamb ers, Sirer, and Eggers describ e several techniques for Acknowledgements reducing synchronization overhead, including synchroniza- tion elimination for thread-private ob jects and several opti- The authors would liketoacknowledge discussions with Radu mizations that eliminate synchronization from nested moni- Rugina and Darko Marinov regarding p ointer and escap e tor calls [1 ]. Blanchet describ es a pure escap e analysis based analysis. The second author would liketoacknowledge dis- on an abstraction of a typ e-based analysis [6 ]. The imple- cussions with Mo oly Sagiv regarding p ointer, escap e, and mentation uses the results to eliminate synchronization for shap e analysis. The rst author would liketoacknowledge thread-private ob jects and to allo cate captured ob jects on Jong-Deok Choi for intro ducing him to the concept of using the stack. Bogda and Ho elzle describ e a ow-insensitive es- escap e information for synchronization elimination, while he cap e analysis based on global set inclusion constraints [7 ]. was an IBM employee in the summer of 1998. The authors The implementation uses the results to eliminate synchro- would like to thank the develop ers of the IBM Jalap eno ~ vir- nization for thread-private ob jects. A limitation is that the tual machine and the MIT Flex pro ject for providing the analysis is not designed to nd captured ob jects that are compiler infrastructure in which this research was imple- reachable via paths with more than two references. mented, and the anonymous reviewers for their comments. Choi, Gupta, Serrano, Sreedhar, and Midki presenta comp ositional data ow analysis for computing reachability References information [11 ]. The analysis results are used for synchro- nization elimination and stack allo cation of ob jects. Like [1] J. Aldrich, C. Chamb ers, E. Sirer, and S. Eggers. Static the analysis presented in this pap er, it uses an extension analyses for eliminating unnecessary synchronization of p oints-to graphs with abstract no des that may represent from java programs. In Proceedings of the 6th Inter- multiple ob jects. It do es not distinguish b etween inside and national Static Analysis Symposium, Septemb er 1999. outside edges, but do es contain an optimization, deferred edges, that is designed to improve the eciency of the anal- [2] Lars Ole Andersen. Program Analysis and Specializa- ysis. The approach classi es ob jects as globally escaping, tion for the C Programming Language. PhD thesis, escaping via an argument, and not escaping. Because the DIKU, University of Cop enhagen, May 1994. primary goal was to compute escap e information, the anal- [3] B. Goldb erg and Y. Park. Escap e analysis on lists. In ysis collapses globally escaping subgraphs into a single no de Proceedings of the SIGPLAN '92 Conference on Pro- instead of maintaining the extracted p oints-to information. gram Language Design and Implementation, pages 116{ Our analysis retains this information, in part b ecause we 127, July 1992. anticipate further thread interaction analyses for example, extensions of existing p ointer analysis algorithms for multi- [4] H. Baker. Unifying and conquer garbage, up dating, threaded programs [23 ] that will use this information. aliasing ... in functional languages. In Proceedings of the ACM Conference on Lisp and Functional Program- 10 Conclusion ming, pages 218{226, 1990. This pap er presents a new combined p ointer and escap e [5] B. Blanchet. Escap e analysis: Correctness pro of, imple- analysis algorithm for ob ject-oriented programs. The al- mentation and exp erimental results. In Proceedings of gorithm is designed to analyze arbitrary parts of complete the 25th Annual ACM Symposium on the Principles of or incomplete programs, obtaining complete information for Programming Languages,Paris, France, January 1998. ob jects that do not escap e the analyzed parts. ACM, ACM, New York. Wehave implemented the algorithm in the IBM Jalap eno ~ [6] B. Blanchet. Escap e analysis for ob ject oriented lan- virtual machine, and applied the analysis information to guages. application to java. In Proceedings of the 14th two optimization problems: synchronization elimination and Annual Conference on Object-Oriented Programming stack allo cation of ob jects. For our b enchmark programs, Systems, Languages and Applications, Denver, CO, our algorithms enable the stack allo cation of b etween 22 Novemb er 1999. and 95 of the ob jects. They also enable the elimination of between 24 and 67 of the synchronization op erations. [7] J. Bo dga and U. Ho elzle. Removing unnecessary syn- In the long run, we b elieve the most imp ortant concept chronization in java. In Proceedings of the 14th Annual in this researchmay turn out to b e designing analysis algo- Conference on Object-Oriented Programming Systems, rithms from the p ersp ective of extracting and representing Languages and Applications, Denver, CO, November interactions b etween analyzed and unanalyzed regions of the 1999. program. This concept leads to representations, such as the p oints-to escap e graphs presented in this pap er, that makea [8] M. Burke, J. Choi, S. Fink, D. Grove, M. Hind, clean distinction b etween information created lo cally within V. Sarkar, M. Serrano, V. Sreedhar, H. Srinivasan, and an analyzed region and information created in the rest of J. Whaley. The jalap eno ~ dynamic optimizing compiler the program outside the analyzed region. for java. In Proceedings of the ACM SIGPLAN 1999 These representations supp ort exible analyses that are Java Grande Conference, June 1999. capable of analyzing arbitrary parts of the program, with the analysis result b ecoming more precise as more of the program is analyzed. At every stage of the analysis, the [21] J. Plevyak, X. Zhang, and A. Chien. Obtaining sequen- [9] R. Chatterjee, B. Ryder, and W. Landi. Relevant tial eciency for concurrent ob ject-oriented languages. context inference. In Proceedings of the 26th Annual In Proceedings of the 22nd Annual ACM Symposium on ACM Symposium on the Principles of Programming the Principles of Programming Languages, San Fran- Languages, San Antonio, TX, January 1999. cisco, CA, January 1995. ACM, New York. [10] J. Choi, M. Burke, and P. Carini. Ecient ow-sensitive interpro cedural computation of p ointer- [22] E. Ruf. Context-insensitive alias analysis reconsidered. induced aliases and side e ects. In ConferenceRecordof In Proceedings of the SIGPLAN '95 ConferenceonPro- the Twentieth Annual Symposium on Principles of Pro- gram Language Design and Implementation, La Jolla, gramming Languages, Charleston, SC, January 1993. CA, June 1995. ACM. [23] R. Rugina and M. Rinard. Pointer analysis for mul- [11] J. Choi, M. Gupta, M. Serrano, V. Sreedhar, and tithreaded programs. In Proceedings of the SIGPLAN S. Midki . Escap e analysis for java. In Proceedings '99 Conference on Program Language Design and Im- of the 14th Annual Conference on Object-OrientedPro- plementation,Atlanta, GA, May 1999. gramming Systems, Languages and Applications, Den- [24] M. Sagiv, T. Reps, and R. Wilhelm. Solving shap e- ver, CO, Novemb er 1999. analysis problems in languages with destructive up dat- [12] J. Corb ett. Using shap e analysis to reduce nite-state ing. ACM Transactions on Programming Languages mo dels of concurrentjava programs. In Proceedings of and Systems, 201:1{50, January 1998. the International Symposium on Software Testing and Analysis, March 1998. [25] P. Sathyanathan and M. Lam. Context-sensitive in- terpro cedural p ointer analysis in the presence of dy- [13] J. Dean, D. Grove, and C. Chamb ers. Optimization namic aliasing. In Proceedings of the Ninth Workshop of ob ject-oriented programs using static class hierarchy on Languages and Compilers for Paral lel Computing, analysis. In Proceedings of the 9th European Confer- San Jose, CA, August 1996. Springer-Verlag. ence on Object-Oriented Programming, Aarhus, Den- mark, August 1995. [26] M. Shapiro and S. Horwitz. Fast and accurate ow- insensitive p oints-to analysis. In Proceedings of the 24th [14] A. Deutsch. On determining lifetime and aliasing of Annual ACM Symposium on the Principles of Program- dynamically allo cated data in higher-order functional ming Languages,Paris, France, January 1997. sp eci cations. In Proceedings of the 17th Annual ACM Symposium on the Principles of Programming Lan- [27] Bjarne Steensgaard. Points-to analysis in almost linear guages, pages 157{168, San Francisco, CA, January time. In Proceedings of the 23rdAnnual ACM Sympo- 1990. ACM, ACM, New York. sium on the Principles of Programming Languages, St. Petersburg Beach, FL, January 1996. [15] A. Deutsch. On the complexity of escap e analysis. In Proceedings of the 24th Annual ACM Symposium on the [28] R. Wilson and M. Lam. Ecient context-sensitive Principles of Programming Languages, Paris, France, p ointer analysis for C programs. In Proceedings of the January 1997. ACM, ACM, New York. SIGPLAN '95 Conference on Program Language De- sign and Implementation, La Jolla, CA, June 1995. [16] P. Diniz and M. Rinard. Lo ck coarsening: Eliminat- ACM, New York. ing lo ckoverhead in automatically parallelized ob ject- based programs. In Proceedings of the Ninth Workshop [29] Y. Tang and P. Jouvelot. Control- ow e ects for escap e on Languages and Compilers for Paral lel Computing, analysis. In Workshop on Static Analysis, pages 313{ pages 285{299, San Jose, CA, August 1996. Springer- 321, Septemb er 1992. Verlag. [17] P. Diniz and M. Rinard. Synchronization transforma- tions for parallel computing. In Proceedings of the 24th Annual ACM Symposium on the Principles of Program- ming Languages, pages 187{200, Paris, France, January 1997. ACM, New York. [18] M. Emami, R. Ghiya, and L. Hendren. Context- sensitiveinterpro cedural p oints-to analysis in the pres- ence of function p ointers. In Proceedings of the SIG- PLAN '94 Conference on Program Language Design and Implementation, pages 242{256, Orlando, FL, June 1994. ACM, New York. [19] J. Hannan. Atyp e-based analysis for blo ck allo cation in functional languages. In Proceedings of the Second International Static Analysis Symposium.ACM, ACM, New York, Septemb er 1995. [20] W. Landi and B. Ryder. A safe approximation algo- rithm for interpro cedural p ointer aliasing. In Proceed- ings of the SIGPLAN '92 ConferenceonProgram Lan- guage Design and Implementation, San Francisco, CA, June 1992.