Comp ositional Pointer and Escap e Analysis for Programs

John Whaley and Martin Rinard

Lab oratory for Computer Science

Massachusetts Institute of Technology

Cambridge, MA 02139

fjwhaley, [email protected]

Abstract the analysis result b ecoming more precise as more of the

program is analyzed. At every stage in the analysis, the

This pap er presents a combined p ointer and escap e analy-

algorithm can distinguish where it do es and do es not have

sis algorithm for Java programs. The algorithm is based on

complete information.

the abstraction of p oints-to escap e graphs, whichcharacter-

ize how lo cal variables and elds in ob jects refer to other

1.1 Analysis Uses

ob jects. Each p oints-to escap e graph also contains escap e

Java presents a clean and simple memory mo del: concep-

information, which characterizes how ob jects allo cated in

tually, all ob jects are allo cated in a garbage-collected heap.

one region of the program can escap e to b e accessed by an-

While useful to the programmer, this mo del comes with a

other region. The algorithm is designed to analyze arbitrary

cost. In many cases it would b e more ecient to allo cate

regions of complete or incomplete programs, obtaining com-

ob jects on the stack, eliminating the dynamic memory man-

plete information for ob jects that do not escap e the analyzed

agementoverhead for that ob ject. A similar situation holds

regions.

for the Java synchronization mo del. Conceptually, every

Wehave develop ed an implementation that uses the es-

Java ob ject comes with a lo ck. Each synchronized metho d

cap e information to eliminate synchronization for ob jects

ensures that it executes atomically by acquiring and releas-

that are accessed by only one and to allo cate ob jects

ing the lo ck in its receiver ob ject. But the lo ckoverhead is

on the stack instead of in the heap. Our exp erimental results

wasted when only one thread accesses the ob ject | the lo cks

are encouraging. We were able to analyze programs tens

are required only when there is a p ossibility that multiple

of thousands of lines long. For our b enchmark programs,

threads may attempt to access the ob ject simultaneously.

our algorithms enable the elimination of b etween 24 and

In this pap er we discuss the use of our analysis results to

67 of the synchronization op erations. They also enable the

eliminate unnecessary synchronization and to enable stack

stack allo cation of b etween 22 and 95 of the ob jects.

allo cation of ob jects. The basic idea is to use the analy-

sis information to determine when ob jects do not escap e

1 Intro duction

threads and metho ds. If an ob ject do es not escap e from its

allo cating thread to another thread, the can trans-

This pap er presents a combined p ointer and escap e anal-

form the program to eliminate synchronization op erations

ysis algorithm for Java programs and programs written in

on that ob ject. If an ob ject do es not escap e a metho d, the

similar ob ject-oriented languages. The algorithm is based

compiler can transform the program to allo cate it in that

on the abstraction of p oints-to escap e graphs, whichchar-

metho d's activation record instead of in the heap. Our ex-

acterize how lo cal variables and elds in ob jects refer to

p erimental results show that the algorithms can eliminate a

other ob jects. Each p oints-to escap e graph also contains es-

signi cantnumb er of heap allo cations and synchronization

cap e information, whichcharacterizes how ob jects allo cated

op erations.

in one region of the program can escap e to b e accessed by

another region.

1.2 Analysis Prop erties

A key concept underlying this abstraction is the goal

of representing interactions between analyzed and unana-

Our analysis has several imp ortant prop erties:

lyzed regions of the program. Points-to escap e graphs make

a clean distinction between ob jects and references created

 It is an interpro cedural analysis. It is designed to com-

within an analyzed region and those created in the rest of

bine analysis results from multiple metho ds to obtain

the program. They therefore enable a exible analysis that

precise p oints-to and escap e information.

is capable of analyzing arbitrary parts of the program, with

 It is a comp ositional analysis. It is designed to analyze

each metho d once to pro duce a single parameterized

analysis result that can b e sp ecialized for use at all of

1

the call sites that mayinvoke the metho d.

1

Recursive metho ds require an iterative algorithm that may ana-

lyze metho ds multiple times to reach a xed p oint.

b e a ected by unanalyzed regions of the program. Our al-  It is a partial program analysis in two senses. First, it

gorithm is designed to analyze arbitrary regions of complete is designed to analyze each metho d indep endently of

or incomplete programs, obtaining complete information for its callers. Second, it is capable of analyzing a metho d

ob jects that do not escap e the analyzed regions. without analyzing all of the metho ds that it invokes,

with the analysis result b ecoming more precise as more

of the invoked metho ds are analyzed.

1.4 Contributions

Because the analysis is comp ositional and can analyze

This pap er makes the following contributions:

pieces of the program indep endently of their callers and

 Analysis Algorithms: It presents a new combined

callees, it is esp ecially appropriate for dynamically loaded

p ointer and escap e analysis algorithm. The algorithm

programs. Our current implementation runs in a dynamic

is comp ositional and is designed to deliver useful in-

compiler and analyzes libraries indep endently of applica-

formation without analyzing the entire program.

tions. When the compiler loads an application, it retrieves

the previously computed analysis results for the libraries and

 Analysis Approach: It presents an analysis approach

uses these results in the analysis of the application.

that explicitly di erentiates between references and

ob jects from analyzed and unanalyzed regions of the

1.3 Basic Approach

program. This approach allows the algorithm to cap-

ture the interactions between these regions, and to

The analysis is based on an abstraction we call p oints-to

clearly identify where it do es and do es not have com-

escap e graphs. The no des of this abstraction represent ob-

plete information ab out the extracted relationships.

jects; the edges represent references b etween ob jects. The

abstraction also contains information ab out which ob jects

 Analysis Uses: It presents two optimizations en-

escap e to other metho ds or threads. For example, an ob-

abled by the extracted p oints-to and escap e informa-

ject escap es if it is returned to an unanalyzed region of the

tion: synchronization elimination and stack allo cation.

program or passed as a parameter to an unanalyzed metho d.

For each metho d, the analysis pro duces a p oints-to es-

 Exp erimental Results: It presents exp erimental re-

cap e graph that characterizes the p oints-to relationships cre-

sults from a prototyp e implementation of the algo-

ated within the metho d and the interaction of the metho d

rithms. These results show that the algorithms can

with the p oints-to relationships created by the rest of the

eliminate a signi cant amount of synchronization op-

program. The analysis represents these interactions, in part,

erations and heap allo cations.

by maintaining a distinction b etween two kinds of edges: in-

The remainder of the pap er is organized as follows. Sec-

side edges, which represent references created inside the cur-

tion 2 presents examples that illustrate how the analysis

rently analyzed region, and outside edges, which represent

works. Section 3 describ es how the analysis represents the

references created outside the currently analyzed region.

program and the analysis ob jects that the algorithm uses.

Similarly, there are two kinds of no des: inside nodes,

Section 4 presents the intrapro cedural analysis, and Sec-

which represent ob jects created inside the currently ana-

tion 5 presents the interpro cedural analysis. In Section 6 we

lyzed region and accessed via inside edges, and outside nodes,

present an abstraction relation that characterizes the corre-

which represent ob jects that are either created outside the

sp ondence b etween p oints-to escap e graphs and the ob jects

currently analyzed region or accessed via outside edges. Out-

and references that the program creates as it runs. Sec-

side edges representinteractions in which the analyzed re-

tion 7 presents the synchronization elimination and stack

gion reads a reference created in an unanalyzed region. In-

allo cation transformations. Section 8 presents the exp eri-

side edges from outside no des or no des reachable from out-

mental results from our implementation. Section 9 discusses

side no des representinteractions in which the analyzed re-

related work; we conclude in Section 10.

gion creates a reference that the unanalyzed region may

read.

Edges b etween inside no des represent references created

2 Example

within the currently analyzed region. If the inside no des

do not escap e, they have no interaction with the rest of

In this section we present several examples that illustrate

the program and their edges in the p oints-to escap e graph

how our analysis works.

completely characterize the p oints-to relationships b etween

the ob jects represented by these no des. Because p oints-to

2.1 Return Values

escap e graphs record the interactions of metho ds with their

Figure 1 presents a complex number arithmetic example.

callers, the analysis can recover complete information ab out

The add metho d adds two complex numb ers, storing the

ob jects that escap e the currently analyzed metho d but are

result in a newly allo cated complex numb er ob ject and re-

recaptured in the caller.

turning the new ob ject. The multiply metho d op erates sim-

Our analysis is comp ositional | it analyzes each metho d

ilarly, but multiplies the numb ers instead of adding them.

indep endently of its callers. Unlike all other p ointer and es-

Because each metho d returns the new ob ject, the new ob ject

cap e analysis algorithms that we know of, our algorithm is

escap es from the metho d.

also capable of analyzing a metho d indep endently of meth-

The multiplyAdd metho d multiplies its two arguments,

o ds that it mayinvoke. When the algorithm skips the anal-

then returns the sum of the pro duct and the receiver. In this

ysis of a p otentially invoked metho d, it records that all

case, only the add result escap es | the temp orary result

ob jects passed as parameters to the invoked but not an-

from multiply is inaccessible outside of the multiplyAdd

alyzed metho d escap e from the scop e of the analysis. This

metho d. It is therefore p ossible to generate a sp ecialized

escap e information allows the algorithm to clearly identify

version of the multiply metho d that exp ects the storage for

b oth those ob jects for which it has complete p oints-to infor-

the result ob ject to b e allo cated on the stack of its caller.

mation and those ob jects whose p oints-to relationships may

It would then generate the result into the stack-allo cated

public class Server extends Thread { class complex {

ServerSocket serverSocket; double x,y;

int duplicateConnections; complexdouble a, double b { x = a; y = b; }

complex multiplycomplex a {

ServerServerSocket s { complex product =

serverSocket = s; new complexx*a.x - y*a.y,x*a.y + y*a.x;

duplicateConnections = 0; returnproduct;

} }

public void run  { complex addcomplex a {

try { complex sum = new complexx+a.x,y+a.y;

Vector connectorList = new Vector; return sum;

while true { }

Socket clientSocket = complex multiplyAddcomplex a, complex b {

serverSocket.accept; complex product = a.multiplyb;

new ServerHelperclientSocket.sta rt ; complex sum = this.addproduct;

InetAddress addr = returnsum;

clientSocket.getInetAddress ; }

if connectorList.indexOfaddr < 0 }

connectorList.addElementad dr;

else duplicateConnections++;

Figure 1: Complex Numb er Example

}

} catch IOException e { }

}

}

Figure 3: Server Example

the program with no des that corresp ond to the ob ject cre-

ation site. So, for example, product p oints to the no de that

represents ob jects created at the ob ject creation site inside

multiply . Because this no de represents ob jects created in-

side the analyzed region, it is an inside no de.

The no de that product p oints to is accessible only via

the lo cal variable product . We therefore say that this no de

is captured. As so on as the multiplyAdd metho d returns, all

of the ob jects that this no de represents will b e inaccessible.

It is therefore legal to tie the lifetime of the ob ject to the

lifetime of the metho d invo cation, and allo cate the ob ject

on the activation record of the multiplyAdd metho d.

Figure 2: Analysis Result for multiplyAdd

The multiplyAdd metho d returns the ob ject that sum

p oints to. Even though the ob ject escap es this metho d, it

may b e recaptured by a metho d ab ove multiplyAdd in the

ob ject rather than dynamically allo cating a new complex

call chain. The analysis maintains enough information to

ob ject to hold the result.

recognize such situations.

Figure 2 presents the analysis result for the multiplyAdd

metho d. The no des in this graph represent variables and

2.2 Thread-Private Objects

ob jects; the edges represent references. The parameter vari-

ables a and b p oint to no des that represent the parame-

The next example illustrates how the analysis can detect

ter ob jects; the variable this p oints to a no de that rep-

thread-private ob jects, or ob jects that are accessed by only

resents the receiver. Because these ob jects are created out-

one thread. The goal is to eliminate synchronization op-

side the metho d, the corresp onding no des are outside no des.

erations on thread-private ob jects. The co de in Figure 3

Note that the analysis assumes that the parameters are not

implements a simple server. The server waits for incoming

aliased. If the analysis result is used at a metho d invo-

connection requests on the serverSocket . Whenever it gets

cation site where the parameters are aliased, the mapping

a request, it creates a new ServerHelper thread to service

algorithm presented in Section 5 merges the two parameter

the new connection. The server counts the numb er of du-

no des b efore it uses the analysis result. This mechanism

plicate connections | i.e., the numb er of times it connects

ensures that the algorithm can analyze the program under

to a client that it has previously connected to.

the assumption that di erent parameters and ob ject elds

To correctly compute this count, the server maintains a

are not aliased, but use the analysis results in contexts con-

list of the Internet addresses of clients that it has connected

taining aliases.

to. Whenever it receives a new request, the server checks

The product and sum variables p oint to no des that rep-

the list to determine if it has previously connected to the

resent the new complex numb er ob jects allo cated, resp ec-

client. If so, it increments duplicateConnections , whichis

tively, in the multiply and add metho ds. The algorithm

the count of the numb er of duplicate connections that the

represents ob jects allo cated within the analyzed region of

server has established. If not, it inserts the Internet address

of the client into the list so that it can detect any future

duplicate connections from this client.

In this example, the server uses a Vector to hold the

class multisetElement {

list of Internet addresses. This class is thread safe | its

Object element;

metho ds are guaranteed to execute atomically even when

int count;

concurrently invoked by di erent threads. Thread safetyis

multisetElement next;

an imp ortant prop erty. Without it, multithreaded programs

can fail in very subtle ways b ecause of data races, or unantic-

multisetElementObject e, multisetElement n {

ipated interactions b etween threads that concurrently access

count = 1;

the same ob ject.

element = e;

The problem with atomic op erations is the overhead from

next = n;

the synchronization op erations that make the metho ds ex-

}

ecute atomically. In our example, the standard implemen-

synchronized boolean checkObject e {

tations of indexOf and addElement acquire and release the

if element.equalse {

lo ck in their receiver ob ject. This overhead is esp ecially

count++;

regrettable when, as in our example, the Vector is a thread-

returntrue;

private ob ject.

} else return false;

The analysis of the run metho d in our example pro ceeds

}

as follows. The algorithm rst analyzes the invoked metho ds

synchronized multisetElement insertObject e {

to obtain a p oints-to escap e graph for each metho d. Each

multisetElement m = this;

graph summarizes how the metho d a ects the p oints-to re-

while m != null {

lationships b etween ob jects and how the ob jects it creates

if m.checke return this;

and accesses escap e to other threads or to the caller. In

m = m.next;

our example, the new clientSocket ob ject escap es to the

}

ServerHelper thread. The indexOf and addElement meth-

return new multisetElemente, this;

ods maychange the ob jects that the receiver p oints to, but

}

they do not change the escap e status of the receiver.

}

Based on this information, and on its analysis of the run

metho d, the analysis determines that the connectorList

class multiset {

ob ject is captured within the run metho d, and is accessible

multisetElement elements;

only to the run metho d's thread. The synchronization in the

multiset {

addElement and indexOf metho ds is therefore redundant.

elements = null;

The compiler can generate sp ecialized, synchronization-free

}

versions of these metho ds. The generated co de for the run

synchronized void addElementObject e {

metho d invokes these sp ecialized versions, and the compu-

if elements == null

tation executes without synchronization overhead for the

elements = new multisetElemente,null;

Vector ob ject.

else elements = elements.inserte;

It is also p ossible to address the problem of synchro-

}

nization overhead for thread-private ob jects by providing

}

the programmer with two implementations of each class: a

thread-safe implementation with synchronization used for

ob jects shared b etween threads, and a thread-unsafe imple-

Figure 4: Multiset Example

mentation with no synchronization used for thread-private

ob jects. The JDK 1.2 collections API takes this approach.

The problem with this approach is that it complicates the

API and burdens the programmer with the resp onsibility

of determining which ob jects are thread safe and which are

not. The last problem can b ecome esp ecially severe for pro-

grammers maintaining multithreaded programs. If a change

makes a previously thread-private ob ject accessible to mul-

tiple threads, the programmer must search the program to

manually replace the thread-unsafe version with the thread-

safe version.

2.3 Recursive Data Structures

We next present an example that illustrates how our algo-

rithm deals with recursive data structures. Let us assume

that in the previous example, the server would like to count

the numb er of connections from each client, instead of sim-

ply counting the numb er of duplicate connections. In this

case, the server could use the multiset class in Figure 4,

which implements a multiset as a list of elements. Each ele-

ment has a count of the numb er of times it is present in the

Figure 5: Analysis Result for insert

multiset.

If the server used a multiset instead of a Vector , the

 A copy statement l = v. analysis of the run metho d in Figure 3 would still recognize

the multiset ob ject as thread-private. In this section, we

 A load statement l = l :f .

1 2

fo cus on the treatmentof recursive data structures in the

analysis of the insert metho d for multisetElement s.

 A store statement l :f = l .

1 2

Figure 5 presents the analysis result at the end of the

insert metho d. The this variable p oints to the receiver

 A global load statement l = cl:f .

no de, which represents the receiver ob ject, while e p oints to

 A global store statement cl :f = l.

a parameter no de, which represents the parameter ob ject.

Both no des are outside no des. The other no des represent

 A return statement return l, which identi es the re-

ob jects created or accessed during the execution of insert .

turn value l of the metho d.

The next edge from the receiver no de p oints to a no de that

represents all ob jects referenced by the m.next eld at the

 An ob ject creation site of the form l = new cl . There

load statement m = m.next , which loads the reference from

are two kinds of ob ject creation sites:

m 's next eld backinto m. The edge is an outside edge, b e-

cause it p oints to a no de that represents an ob ject allo cated

{ If cl inherits from the class Thread, then the ob-

outside the scop e of the analysis of insert , and it p oints

ject creation site is also a thread creation site.

to an outside no de. In particular, it p oints to a load node,

{ If cl do es not inherit from the class Thread, then

b ecause there is one of these no des for each load statement

the ob ject creation site is not a thread creation

in the program.

site.

There is also an outside edge from the load no de back

to itself. As the lo op in the insert metho d walks down

 A metho d invo cation site of the form

the list, it traverses the references in the next elds of the

l = l :op l ;:: :;l . Each metho d invo cation site

0 1

k

no des. Because all of these references are loaded at the same

corresp onds to a metho d invo cation site m 2 M .

statement, the outside edges that represent the references

The control ow graph for each metho d op starts with the

p oint to the same outside no de, and there is cycle in the

enter statement enter and ends with an exit statement

op

graph. This mechanism ensures that the analysis terminates

exit .

op

for programs that manipulate recursive data structures.

The analysis represents the control ow relationships b e-

The remaining no de represents the new multisetElement

tween statements as follows: predst  is the set of state-

ob ject that insert may allo cate to hold the new multiset

ments that may execute immediately b efore st, and succst 

entry. This no de has an inside edge from the next eld

is the set of statements that may execute immediately after

to the receiver no de, and an inside edge from the element

st . There are two program p oints for each statement st,

eld to the parameter no de. These edges represent refer-

the program p oint st immediately b efore st executes, and

ences to these no des created during the execution of insert .

the program p oint st  immediately after st executes.

The interpro cedural analysis uses call graph informa-

3 Analysis Algorithm

tion to compute sets of metho ds that may be invoked at

metho d invo cation sites. For each metho d invo cation site

The algorithm analyzes the program at the granularity of

m, callees m is the set of metho ds that m may invoke.

metho ds. The analysis of each metho d incorp orates infor-

Given a metho d op, callers op is the set of metho d invo ca-

mation from the metho d and from some subset of the meth-

tion sites that mayinvoke op . The current implementation

o ds that it invokes. The combination of the currently ana-

obtains this call graph information using a variant of class

lyzed metho d and all of the analyzed metho ds that it invokes

hierarchy analysis [13 ], but the algorithm can use any con-

is called the current analysis scope.

servative approximation to the actual call graph generated

when the program runs.

3.1 Program Representation

The algorithm represents the program using the following

3.2 Object Representation

analysis ob jects. There is a set l 2 L of lo cal variables

The analysis represents the ob jects that the program ma-

and a set p 2 P of formal parameter variables. There is

nipulates using a set n 2 N of no des. There are several

one formal parameter variable for each formal parameter

kinds of no des:

of each metho d in the program. Together, the lo cal and

formal parameter variables make up the set v 2 V = L [ P of

 There is the set N of inside no des, which represent

I

variables.

ob jects created within the current analysis scop e and

There is also a set cl 2 CL of classes and a set op 2 OP of

accessed via references created within the current anal-

metho ds. Each metho d has a receiver class cl and a formal

ysis scop e. There is one inside no de for each ob ject

parameter list p ;:::; p . We adopt the convention that

0 k

creation site; that inside no de represents all ob jects

parameter p p oints to the receiver ob ject of the metho d.

0

created within the current analysis scop e at that site.

Finally, there is a set f 2 F of ob ject elds. Ob ject

N is the set of thread no des, or inside no des that

T

elds are accessed using syntax of the form v:f. Static class

corresp ond to thread creation sites. Since each thread

variables are accessed using syntax of the form cl:f .

corresp onds to an ob ject that inherits from the class

The algorithm represents the computation of each metho d

Thread , thread creation sites are also ob ject creation

using a control ow graph. The no des of control ow graphs

sites, and N N .

T I

are statements st 2 ST. The algorithm analyzes only those

statements that a ect the p oints-to and escap e information.

 There is the set N of outside no des, which repre-

O

We assume the program has b een prepro cessed so that all

sent ob jects created outside the current analysis scop e

statements relevant to the analysis are in one of the following

or accessed via references created outside the current

forms: analysis scop e.

Both O and I are graphs with edges lab eled with a eld  There is the set CL of class no des. Conceptually, each

from F . We de ne the following op erations on no des of the class no de represents a statically allo cated ob ject whose

graphs: elds are the static class variables for the corresp ond-

ing class.

0 0

edgesToO; n = fhn ;ni2O g [ fh hn ; f i;ni2 O g

0 0

The set N of outside no des is futher divided into the

O

edgesFromO; n = fhn; n i2O g [ fh hn; fi;n i2 O g

following kinds of no des:

edges O; n = edgesToO; n [ edgesFromO; n

0 0

O n = fn :hn; n i2O g

 There is a set N of parameter no des. There is one

P

0 0

O n; f  = fn :hhn; fi;n i2O g

parameter no de for each formal parameter in the pro-

gram. Each parameter no de represents the ob ject that

The following op eration removes a set S of no des from a

its parameter p oints to during the execution of the

0 0 0 0

p oints-to escap e graph. hO ;I ;e ;r i = removeS; hO ; I ; e; r i,

analyzed metho d. When the analysis starts, each of

where

0

the metho d's formal parameter variables p oints to its

O = O [fedges O; n:n 2 S g

0

corresp onding parameter no de. The receiver ob ject is

I = I [fedges I; n:n 2 S g



treated as the rst parameter of each metho d.

en if n 62 S

0

e n =

; otherwise

 There is a set N of global no des. There is one global

G

0

r = r S

no de for each static class variable cl:f in the program.

Each global no de represents the ob jects that its static

class variable may p oint to during the execution of

3.4 Reachability and Escap ed No des

the current metho d. When the analysis of the cur-

We identify two di erent kinds of escap e information, with

rent metho d starts, each accessed static class variable

the distinction based on when an ob ject b ecomes accessi-

p oints to its global no de.

ble outside the current metho d. If an ob ject was created

 There is a set N of load no des. There is one load

outside the current analysis scop e or was accessed via a ref- L

no de for each load statement in the program. When

erence created outside the current analysis scop e, the ob ject

the load statement executes, it loads a value from a

is already accessible to other parts of the program. In this

eld in an ob ject. If the loaded value is a reference, the

case wesay that the ob ject has escaped or is escaped.

analysis must represent the ob ject that the reference

If an ob ject has not escap ed but will b e returned by the

p oints to. Each load no de represents all of the ob jects

metho d to its caller, wesay that the ob ject wil l escape.

whose references are loaded by the corresp onding load

An ob ject has directly escap ed the current metho d in all

statement.

of the following situations:

 There is a set N of return no des. There is one return

 A reference to the ob ject was passed as a parameter R

no de for each metho d invo cation site in the program.

into the current metho d.

Return no des are used to represent the return values

 A reference to the ob ject was written into a static class

of invo cations of unanalyzed metho ds.

variable.

The analysis represents each array with a single no de.

 A reference to the ob ject was passed as a parameter to

This no de has a eld elements , which represents all of the

an invoked metho d, and there is no information ab out

elements of the array. Because the p oints-to information for

what the invoked metho d did with the ob ject.

all of the array elements is merged into this eld, the analysis

do es not make a distinction b etween di erent elements of the

 The ob ject is a thread ob ject.

same array.

An ob ject has escap ed if it is reachable via some sequence of

ob ject references from a directly escap ed ob ject. An ob ject 3.3 Points-To Escap e Graphs

is captured if it has not escap ed.

A p oints-to escap e graph is a quadruple of the form hO ; I ; e; r i,

Given our abstraction of p oints-to escap e graphs, we can

where

formalize the concepts of reachability and escap ed no des as

0

follows. Anoden is directly reachable from a no de n in a

 O N  F  N [ N  is a set of outside edges.

L G

0 0 0

graph O if n = n , hn; n i2 O or 9f:hhn; fi;n i2 O . Anode

Outside edges represent references created outside the

0 0

n is reachable from a no de n in O if n = n or there exists a

current analysis scop e, either by the caller, by a thread

0

sequence n    n such that n = n , n = n , and for all

0 0

k k

running in parallel with the current thread, or byan

0  i

i+1 i

unanalyzed invoked metho d.

0

anoden is reachable from a set of no des S in O if there

0

exists a no de n 2 S such that n is reachable from n in O .

 I V [ N  F  N is a set of inside edges. Inside

We de ne reachable O ; S; nif n is reachable from S in O .

edges represent references created inside the current

Given a p oints-to escap e graph hO ; I ; e; r i,anoden di-

analysis scop e.

rectly escap es if n 2 N [ N [ N [ CL or en 6= ;. We

P R T

M

 e : N ! 2 is an escap e function that records the

de ne escap edhO ; I ; e; r i;nif n is reachable in O [ I from

set of unanalyzed metho d invo cation sites that a no de

a no de that directly escap es, and capturedhO ; I ; e; r i;nif

escap es down into.

not escap edhO ; I ; e; r i;n.

 r N is a return set that represents the set of ob-

jects that may b e returned by the currently analyzed metho d.

then generating additional inside edges from l to all of the 4 Intrapro cedural Analysis

no des that v p oints to.

The algorithm uses a data ow analysis to generate a p oints-

Kill = edges I; l

to escap e graph at each p oint in the metho d. The analysis

I

Gen = fl gI v

of each metho d starts with the construction of the p oints-

I

0

I = I Kill  [ Gen to escap e graph hO ;I ;e ;r i for the rst statementinthe

I I 0 0 0 0

metho d. Given a metho d op with formal parameter list

p ;:: :;p and accessed static class variables cl :f ;:::; cl :f ,

1 1 j j

0 k

4.2 Load Statements

the initial p oints-to escap e graph hO ;I ;e ;r i is de ned as

0 0 0 0

A load statement of the form l = l :f makes l p ointtothe

1 2 1

follows.

ob ject that l :f p oints to. The analysis mo dels this change

2

 In I , each parameter p p oints to its corresp onding

0

by constructing a set S of no des that represent all of the

i

. The formal parameter p p oints parameter no de n

p

ob jects to which l :f may p oint, then generating additional

0

2

i

to the receiver ob ject of the metho d; the no de n

p

inside edges from l to every no de in this set.

1

0

represents the receiver ob ject in the analysis of the

All no des accessible via inside edges from l :f should

2

metho d.

clearly b e in S . But if l p oints to an escap ed no de, other

2

I = fhp ;n i:0  i  k g

0 p

parts of the program such as the caller or threads executing

i

i

in parallel with the current thread can access the referenced

 In O , each accessed static class variable cl :f p oints

0 i i

ob ject and store values in its elds. In particular, the value

to its corresp onding global no de n .

cl :f

in l :f may have b een written by the caller or a thread i

i

2

running in parallel with the current thread | in other words,

O = fhhcl ; f i;n i:1  i  j g

0 i i

cl :f

i i

l :f may contain a reference created outside of the current

2

analysis scop e. The analysis uses an outside edge to mo del

 For all n, e n= ;.

this reference. The outside edge p oints to the load no de for

0

the load statement, which is the outside no de that represents

 r = ;.

0

the ob jects that the reference may p oint to.

The analysis must therefore consider two cases: the case

Note that the algorithm analyzes the metho d under the as-

when l do es not p ointto an escap ed no de, and the case

2

sumption that the parameters and accessed static class vari-

when l do es p oint to an escap ed no de. The algorithm

2

ables all p oint to di erent ob jects. If the metho d maybe

determines which case applies by computing S , the set of

E

invoked in a calling context in which some of these p ointers

escap ed no des to which l p oints. S is the set of no des

2 I

p oint to the same ob ject, this ob ject will b e represented by

accessible via inside edges from l :f .

2

multiple no des during the analysis of the metho d. In this

case, the mapping op eration describ ed b elow in Section 5

S = fn 2 I l :escap ed hO ; I ; e; r i;n g

E 2 2 2

will merge the corresp onding outside ob jects when it maps

S = [fI n ; f:n 2 I l g

I 2 2 2

the nal analysis result for the metho d into the calling con-

text at the metho d invo cation site. Because the mapping

If S = ; i.e., l do es not p oint to an escap ed no de,

E 2

retains all of the edges from the merged ob jects, it conser-

S = S and the transfer function simply kills all edges from

I

vatively mo dels the actual e ect of the metho d.

l , then generates inside edges from l to all of the no des

1 1

Once it has nished constructing the initial p oints-to es-

in S .

cap e graph, the analysis continues by propagating p oints-

Kill = edges I; l 

I 1

to escap e graphs through the statements of the metho d's

Gen = fl gS

I 1

0 0 0 0

0

control ow graph. The transfer function hO ;I ;e ;r i =

I = I Kill  [ Gen

I I

[[ st ]] hO ; I ; e; r i de nes the e ect of each statement st on

If S 6= ; i.e., l p oints to at least one escap ed no de, S =

the current p oints-to escap e graph. Most of the statements

E 2

S [fng, where n is the load no de for the load statement. In

rst kill a set of inside edges, then generate additional inside

I

addition to killing all edges from l , then generating inside

and outside edges. In this case, the transfer function has the

1

edges from l to all of the no des in S , the transfer function

following general form:

1

also generates outside edges from the escap ed no des to n.

0

I = I Kill  [ Gen

I I

0

Kill = edges I; l  O = O [ Gen

I 1 O

Gen = fl gS

I 1

0

Figure 6 graphically presents the rules that determine the

I = I Kill  [ Gen

I I

sets of generated edges for the di erent kinds of statements.

Gen = S ffg fng

O E

0

Eachrow in this gure contains four items: a statement, a

O = O [ Gen

O

graphical representation of existing edges, a graphical rep-

resentation of the existing edges plus the new edges that the

4.3 Store Statements

statement generates, and a set of side conditions. The inter-

pretation of eachrow is that whenever the p oints-to escap e

A store statement of the form l :f = l nds the ob ject to

1 2

graph contains the existing edges and the side conditions are

which l p oints, then makes the f eld of this ob ject p oint

1

satis ed, the transfer function for the statement generates

to same ob ject as l . The analysis mo dels the e ect of this

2

the new edges.

assignment by nding the set of no des that l p oints to,

1

then generating inside edges from all of these no des to the

no des that l p oints to.

2

4.1 Copy Statements

A copy statement of the form l = v makes l p oint to the

Gen = I l  ffg  I l 

I 1 2

0

ob ject that v p oints to. The transfer function up dates I to

I = I [ Gen

I

re ect this change by killing the current set of edges from l ,

Figure 6: Generated Edges for Basic Statements

4.9 Control-Flow Join Points 4.4 Global Load Statements

To analyze a statement, the algorithm rst computes the A global load statement of the form l = cl:f makes l p oint

join of the p oints-to escap e graphs owing into the statement to the same ob ject that cl :f p oints to. The analysis mo dels

from all of its predecessors. It then applies the transfer this change by killing all edges from l and generating inside

function to obtain a new p oints-to escap e graph at the p oint edges from l to all of the no des that cl :f p oints to.

after the statement. The join op eration is de ned as follows.

Kill = edgesI; l

I

hO ;I ;e ;r ithO ;I ;e ;r i = hO ; I ; e; r i, where O = O [

1 1 1 1 2 2 2 2 1

Gen = flgO [ I cl; f

I

O , I = I [ I , 8n 2 N:en=e n [ e n, and r = r [ r .

0

2 1 2 1 2 1 2

I = I Kill  [ Gen

I I

The corresp onding partial order for p oints-to escap e graphs

is hO ;I ;e ;r ivhO ;I ;e ;r i if O O , I I , 8n 2

1 1 1 1 2 2 2 2 1 2 1 2

4.5 Global Store Statements

N:e n e n, and r r . The b ottom p oints-to escap e

1 2 1 2

graph is h;; ;;e ; ;i, where e n=; for all n.

? ?

A global store statementof the form cl:f = l makes the

The analysis of each metho d pro duces analysis results

static class variable cl :f p oint to the same ob ject as l. The

st  and st b efore and after each statement st in the

analysis mo dels this change by generating inside edges from

metho d's control ow graph. The analysis result satis es

cl :f to all of the no des that l p oints to.

the following equations:

Gen = fhcl; fig  I l

I

0

I = I [ Gen

I

enter  = hO ;I ;e ;r i

op 0 0 0 0

0 0

st  = tf st :st 2 predstg

st  = [[ st ]] st 

4.6 Object Creation Sites

An ob ject creation site of the form l = new cl allo cates a

The nal analysis result of metho d op is the analysis result

new ob ject and makes l p oint to the ob ject. The analysis

at the program p oint after the exit no de, i.e., exit .

op

represents all ob jects allo cated at a sp eci c creation site

As describ ed b elow in Section 5.3, the analysis solves these

with the creation site's inside no de n. The transfer function

equations using a standard worklist algorithm.

mo dels the e ect of the statementby killing all edges from

l , then generating an inside edge from l to n.

4.10 Strong Up dates

Kill = edges I; l

I

Our analysis mo dels the execution of statements that up date

Gen = fhl ;nig

I

0

memory lo cations by adding edges to the no des that repre-

I = I Kill  [ Gen

I I

sent the up dated lo cations. There are two p ossible kinds of

up dates: weak up dates, which leave the existing edges in

4.7 Return Statements

place, and strong up dates, which remove the existing edges.

Because strong up dates leave fewer edges in the p oints-to es-

A return statement return l sp eci es the return value for

cap e graph, they may pro duce more precise analysis results.

the metho d. The immediate successor of each return state-

In the analysis presented so far, up dates to lo cal variables

ment is the exit statement of the metho d. The analysis

are strong and all other up dates are weak.

mo dels the e ect of the return statementby up dating r to

The analysis can legally p erform a strong up date when-

include all of the no des that l p oints to.

ever the up dated no de is captured and represents exactly

0

r = I l

one up dated memory lo cation when the program runs. For

a store statementof the form l :f = l , this condition is

1 2

4.8 Exit Statements

satis ed whenever l p oints to a single captured no de n,

1

n represents a single ob ject, and f represents a single lo-

The transfer function for an exit statement exit pro duces

op

cation in that ob ject. The last condition is satis ed unless

the nal analysis result for the metho d. This result is used,

f = elements , the sp ecial eld identi er for array elements

at all call sites that may invoke the metho d, to compute

discussed in Section 3.2.

the e ect of the metho d on the analysis of its caller. The

primary activity of the transfer function is to remove infor-

4.10.1 Strong Up dates for Singular No des and Fields

mation from the p oints-to escap e graph that should not b e

visible to the caller. The algorithm rst computes the set

The no de n is singular if it represents one ob ject. This is the

S of no des that are not reachable from the static class vari-

case, for example, if n is an inside no de that corresp onds to a

ables, the parameters, or the return values. It then removes

statement executed at most once within the current analysis

these no des from the p oints-to escap e graph.

scop e. The eld f is singular if it represents a single lo cation

within an ob ject. Given these de nitions, we can provide S = fn 2 N:not reachable O [ I; CL [ P [ r;ng

0

0 0 0 0

the following de nition of I for the transfer function of a

hO ;I ;e ;r i = removeS; hO ; I ; e; r i

store statement l :f = l . With this de nition, the transfer

1 2

The analysis uses the reachability information at metho d

function for store statements p erforms strong up dates.

exit p oints to b ound ob ject lifetimes. If an inside no de is



not reachable from the static class variables, the parameters,

edgesFromn; f if I l =fng; n; f singular ;

1

the return values, or a thread ob ject, then all of the ob jects

and capturedhI ; O ; e; r i;n

Kill =

I

it represents i.e., all of the ob jects allo cated at the corre-

; otherwise

sp onding ob ject allo cation site within the current analysis

Gen = I l  ff g  I l 

I 1 2

0

scop e are inaccessible b oth to the metho d's caller and to

I = I Kill  [ Gen

I I

other threads. When the metho d returns, these ob jects are

inaccessible to the rest of the computation. It is therefore

legal to allo cate these ob jects in the activation record of the

metho d and to access the ob jects without synchronization.

 The old graph is the p oints-to escap e graph hO ; I ; e; r i 4.10.2 Summary No des

at the p oint b efore the metho d invo cation site.

It is p ossible to extend the ob ject representation so that each

inside no de would represent only the last ob ject allo cated at

 The incoming graph is the nal analysis result

the corresp onding allo cation site. All other no des allo cated

hO ;I ;e ;r i = exit  for the invoked metho d.

R R R R op

at this site would b e represented by a summary no de. With

 The mapping algorithm pro duces the new graph

this approach, the analysis would p erform strong up dates for

hO ;I ;e ;r i = mapm; hO ; I ; e; r i; op.

M M M M

all inside no des, and weak up dates only for summary no des.

Similar approaches have b een prop osed for intrapro cedural

shap e analysis algorithms [24 , 12 ].

5.2.1 Overview

Conceptually, the algorithm rst builds an initial mapping

4.11 An Optimization for Static Class Variables

N

 : N ! 2 from the no des of the incoming graph to the

no des of the old graph. For each outside no de n in the in-

In the absence of information ab out how parallel threads ac-

coming graph, n is the set of no des from the old graph

cess static class variables, the extracted analysis information

that n represents during the analysis of the invoked metho d

imp oses no limit on the lifetimes or p oints-to relationships

op . This initial value of  maps global no des, parameter

of ob jects reachable from these variables. The implemented

no des, and load no des to their corresp onding no des in the

compiler therefore adopts a more compact representation

new graph. It builds the mapping for load no des by tracing

for the p oints-to relationships involving no des that repre-

correlated paths in the old graph and the incoming graph.

sent these ob jects. Instead of recording the sp eci c set of

Each path in the old graph consists of a sequence of inside

static class variables that may p oint to a no de, the analysis

edges; the corresp onding path in the incoming graph con-

simply records that at least one do es. This representation

sists of the sequence of outside edges that represent the cor-

reduces the size of the p oints-to escap e graph without re-

resp onding inside edges during the analysis of the metho d.

ducing the amount of useful information.

The algorithm then builds the new p oints-to escap e graph

hO ;I ;e ;r i = mapm; hO ; I ; e; r i; op. The algorithm

M M M M

5 Interpro cedural Analysis

starts by initializing the new graph to the old graph hO ; I ; e; r i.

It then uses the mapping  to map no des and edges from the

At each metho d invo cation site, the analysis has the option

incoming graph into the new graph. The mapp ed no des rep-

of either skipping the site or analyzing the site. If it analyzes

resent ob jects allo cated or accessed by the invoked metho d.

the site, it collects the nal analysis results from all of the

The mapp ed edges represent references that the invoked

p otentially invoked metho ds, maps each analysis result into

metho d created or read. During the mapping pro cess, edges

the p oints-to escap e graph from the program p oint b efore

are mapp ed from pairs of no des in the incoming graph to

the metho d invo cation site, then merges the mapp ed results

corresp onding pairs of no des in the new graph. In this case

to derive the p oints-to escap e graph at the p oint after the

wesay that the no des in the new graph acquire the edges

metho d invo cation site. If the analysis skips a metho d invo-

from the no des in the incoming graph.

cation site, it marks all of the parameters as escaping down

As part of the mapping pro cess, the algorithm extends

into the site.

 so that if the algorithm maps a no de n from the incoming

graph into the new graph, n 2 n. In this case we say

5.1 Skipp ed Metho d Invo cation Sites

that n is present in the new graph.

We next describ e the roles that the di erent kinds of

The transfer function for a skipp ed metho d invo cation site is

no des in the incoming graph play in the analysis, and discuss

de ned as follows. Given a skipp ed metho d invo cation site

how these roles a ect each no de's presence in the new graph,

m of the form l = l :op l ;:::;l  with return no de n

0 1 R

k

its e ect on the mapping , and the edges that are mapp ed

and a current p oints-to escap e graph hO ; I ; e; r i, the p oints-

0 0

into the new graph.

to escap e graph hO; I ;e;r i = [[ m]] hO ; I ; e; r i after the site

is de ned as follows:

 n 2 N : If n is a parameter no de, it represents the

P

0

I = I edges I; l [fhl;n ig

R



no des that the corresp onding actual parameter p oints

en [fmg if 90  i  k:n 2 I l 

i

0

to. These no des should acquire n's edges, but n itself

e n =

en otherwise

should not b e present in the new graph.

n is the set of no des in the current graph that the

The return no de n is an outside no de used to represent the

R

corresp onding actual parameter p oints to.

return value of the invoked metho d.

 n 2 N : If n is a global no de, it represents the ob jects

G

5.2 Analyzed Metho d Invo cation Sites

that the corresp onding static class variable p oints to.

These no des should acquire n's edges, and n should b e

Given an analyzed metho d invo cation site m of the form

present in the new graph. Note that n is also escap ed

l = l :op l ; :::;l  and a current p oints-to escap e graph

0 1

k

0 0 0 0

in the the new graph.

hO ; I ; e; r i, the new p oints-to escap e graph hO ;I ;e ;r i =

[[ m ]] hO ; I ; e; r i after the site is de ned as follows:

n is the set of no des in the current graph that the

0 0 0 0

corresp onding static class variable p oints to. Note that

hO ;I ;e ;r i = tfmapm; hO ; I ; e; r i; op:op 2 callees mg

n 2 n.

We next present the mapping algorithm for metho d in-

 n 2 N :Ifn is a load no de, all of the no des that it rep-

L

vo cation sites. We assume a metho d invo cation site m of the

resents should acquire its edges. If n may representan

form l = l :op l ;: ::;l  and an invoked metho d op with

0 1

k

ob ject created outside the analysis scop e of the caller

formal parameter list p ;:: :;p and accessed static class

0 k

or an ob ject accessed via a reference created outside

variables cl :f ;: ::;cl :f . There are three p oints-to es-

1 1 j j

cap e graphs involved in the algorithm:

mappings and edges. Figure 9 graphically presents the gen- that scop e, it and its edges should also b e mapp ed into

erated mappings and edges for some of the crucial rules. the new graph.

Each row in this gure contains four items: an inference

The basic idea is that n should b e present in the new

rule, a graphical representation of existing edges and map-

graph only if an escap ed no de would p oint to it |

pings, a graphical representation of the existing edges and

there should b e no outside edges from captured no des.

mappings plus the new edges and mappings that the rule

The analysis determines if n should b e presentby ex-

generates, and a set of side conditions. The no des on the

amining all of the no des in the incoming graph that

b ottom row of each graphical representation are from the

p ointton. There are two cases:

incoming graph, while the no des on the top row are from

0

the new graph. The interpretation of eachrow is that if the

{ If at least one of these no des say n  is present

existing edges and mappings are present in the incoming

and escap ed in the new graph, n should also b e

graph and new graph, and the side conditions are true, then

present, and there should b e an outside edge in

0

the inference rule generates the new edges and mappings.

the new graph from n to n.

{ If at least one of these no des is mapp ed to an es-

0

0  i  k; n 2 I l 

i

cap ed no de say n  from the old graph, n should

1

n 2 n 

p

b e present, and there should b e an outside edge

i

0

in the new graph from n to n.

1  i  j

2

cl 2 cl 

i i In b oth cases, n is escap ed in the new graph. n

includes the set of no des in the old graph that n rep-

hhn ; fi;n i2 O ; hhn ; fi;n i2I;

1 2 R 3 4

resented during the analysis of the metho d. And n 2

n 2 n ;n 62 N

3

3 1 1 I

nifn is present in the new graph.

n 2 n 

4 2

 n 2 N : If n is an inside no de, it and its inside edges

I

should b e present in the new graph if the ob jects that

Figure 7: Initial Rules for 

it represents are reachable in the caller. The analysis

determines if n should b e presentby examining all of

O O I edges l I

M M

4

the no des in the incoming graph that p oint to it. As

en e n r r

M M

for load no des, there are two cases:

hhn ; fi;n i2 I

0

1 2 R

{ If at least one of these no des say n  is present

5

n  ffg  n  I

1 2 M

in the new graph, n should also b e present, and

0

there should b e an edge in the new graph from n

hhn ; fi;n i2 I ;n 2 n ;n 2 N [ N

1 2 R 1 2 I R

6

to n.

n 2 n 

2 2

{ If at least one of these no des represents a no de

0

hhn ; fi;n i2O ;n 2 n ;

1 2 R 1

say n  in the old graph, n should be present,

escap ed hO ;I ;e ;r i;n

7

M M M M

and there should be an edge in the new graph

0

hhn; fi;n i2 O ;n 2 n 

2 M 2 2

from n to n.

n 2 r \ N [ N 

R I R

A primary di erence between load no des and inside

8

n 2 n

no des is that load no des are present in the new graph

only if they are escap ed in that graph. Inside no des

0

e n 6= ;;n 2 n

R

9

are present if they are reachable, and can be either

0

m 2 e n 

M

escap ed or captured in the new graph.

[fn:n 2 r g I l 10

R M

If an inside no de n is present in the new graph, n=

fng. Otherwise n=;.

Figure 8: Rules for  and hO ;I ;e ;r i

M M M M

 n 2 N : If n is a return no de these no des represent

R

return values from invoked but unanalyzed metho ds,

the conditions are the same as for inside no des.

5.2.3 Initial Rules

5.2.2 Constraints for the Mapping Algorithm

The initial set of contraints, Rules 1 through 3, set up the

initial mapping  so that all outside no des in the incoming

Figures 7 and 8 presenta formal sp eci cation of the con-

graph are mapp ed to the no des in the old graph that they

straints that the mapping  and the new p oints-to escap e

represent during the analysis of the invoked metho d op.

graph hO ;I ;e ;r i must satisfy. The constraints are

M M M M

sp eci ed as a set of inference rules that the nal mapping

 Rule 1 maps each parameter no de n to the no des

p

2

i

and p oints-to escap e graph must satisfy.

I l  in the old graph that it represents during the

i

Conceptually, many of these inference rules start with an

3

analysis of the metho d.

existing mapping and set of edges, then generate additional

2

 Rule 2, along with Rule 3, ensures that each accessed

Each inference rule is written in the standard form

global no de is mapp ed to the no des in the old graph

c ;:::;c

1

k

0 0

3

;:::;c c

Recall that the metho d invo cation site has actual parameters

j 1

l ;:: :;l , the metho d has formal parameters p ;:::;p , and n

0 p

k

0 k

i

which imp oses the constraint that if all of c ;:::;c are true, then

1

k

is the parameter no de for p .

0 0

i

all of c ; :::;c must b e true.

1 j

Figure 9: Generated Edges and Mappings for Inference Rules

4

that it represents during the analysis of the metho d. inside edges into the new graph take place inside mapNo de,

it is resp onsible for ensuring that the new graph contains

 Rule 3 maps outside no des to the no des that they rep-

the right inside edges.

resent during the analysis of the metho d, matching

The control structure of the algorithm is based on three

outside edges in the incoming graph with corresp ond-

worklists. Eachworklist corresp onds to one of the rules that

ing inside edges in the old graph. The constraint starts

map no des from the incoming graph to the new graph. Each

with a no de n from the incoming graph that already

1

worklist entry contains a no de n from the new graph and

maps to a no de n in the old graph. It then matches

3

an edge hhn ; fi;n i from the incoming graph. All worklist

1 2

edges from these two no des to nd a load no de n from

2

entries have the prop erty that n 2 n .

1

the incoming graph that representsanoden in the

4

When the algorithm pro cesses the worklist entry,itchecks

new graph during the analysis of the metho d. The

to see if it should up date the mapping for n . The details

2

constraint maps n to n .

2 4

of the check dep end on the sp eci c worklist.

Rules 1 through 3 map outside no des only. Rule 6, how-

 The worklist W corresp onds to Rule 3, which matches

E

ever, may map an inside no de into the new graph. It is

outside edges from the incoming graph against inside

imp ortant that Rule 3 do es not match outside edges from

edges in the old graph. All edges in this worklist are

these inside no des against inside edges in the old graph. The

outside edges. To pro cess an entry from this worklist,

reason for this restriction is that inside no des represent ob-

the algorithm matches the edge hhn ; fi;n i from the

1 2

jects that were allo cated inside the current analysis scop e.

worklist against all corresp onding inside edges

These ob jects did not exist when the analyzed metho d was

hhn; fi;n i in the old graph. For each corresp onding

4

invoked. Any outside edges from inside no des therefore rep-

edge it maps n to n .

2 4

resent references created either by threads running in paral-

 The worklist W corresp onds to Rule 6, which maps

lel with the current analysis scop e, or by unanalyzed meth-

I

inside no des into the new graph. All edges in this

ods invoked by the current analysis scop e. In either case

worklist are inside edges. To pro cess an entry from

the reference did not exist when the metho d was invoked.

this worklist, the algorithm extracts the no de n that

Because all inside edges in the old graph existed when the

2

the worklist edge p oints to, and maps n into the new

analyzed metho d was invoked, the outside edge in the in-

2

graph.

coming graph did not represent these edges.

 The worklist W corresp onds to Rule 7, which maps

O

5.2.4 Remaining Rules

outside no des into the new graph. All edges in this

worklist are outside edges. To pro cess an entry from

Rule 4 initializes the new graph to include the old graph.

this worklist, the algorithm inserts a corresp onding

Rule 5 maps inside edges from the incoming graph into the

outside edge hhn; f i;n i into the new graph and maps

2

new graph. There is an inside edge b etween two no des in

n into the new graph.

2

the new graph if there is an inside edge b etween their two

corresp onding no des in the incoming graph.

There is a slight complication in the algorithm. As the

Rules 6 through 8 map no des from the incoming graph

algorithm executes, it p erio dically maps inside edges from

into the new graph. Rule 6 maps an inside or return no de

the incoming graph into the new graph. Whenever a no de

into the new graph if it is reachable from a no de that is

n is mapp ed to n, the algorithm maps each inside edge

1

already mapp ed into the new graph. Rule 7 maps a load

hhn ; fi;n i from the incoming graph into the new graph.

1 2

no de into the new graph if it is reachable from an escap ed

This mapping pro cess inserts a corresp onding inside edge

no de that is already mapp ed into the new graph. Finally,

from n to each no de that n maps to; i.e., to eachnodein

2

Rule 8 maps an inside or return no de into the new graph if

n . The algorithm must ensure that when it completes,

2

it is returned by the metho d.

there is one such edge for each no de in the nal set of no des

Rule 9 ensures that all no des passed into unanalyzed

n . But when the algorithm rst maps hhn ; fi;n i into

2 1 2

metho ds are marked appropriately in the escap e function.

the new graph, n may not b e complete. In this case,

2

Rule 10 makes l the lo cal variable that is assigned to the

the algorithm will eventually map n to more no des, in-

2

value that the metho d returns p oint to the no des that rep-

creasing the set of no des in n . There should b e edges

2

resent the return value of the metho d.

from n to all of the no des in the nal n , not just to

2

those no de that were presentin n  when the algorithm

2

5.2.5 Constraint Solution Algorithm

mapp ed hhn ; fi;n i into the new graph.

1 2

The algorithm ensures that all of these edges are present

Figures 10 and 11 present an algorithm for solving the con-

in the nal graph by building a set  n  of delayed actions.

2

straint system in Figures 7 and 8. The algorithm directly

Each delayed action consists of a no de in the new graph and

re ects the structure of the inference rules. At each step it

a eld in that no de. Whenever the no de n is mapp ed to a

2

detects an inference rule antecedent that b ecomes true, then

0 0

new no de n i.e., the algorithm sets n =n  [fn g,

2 2

takes action to ensure that the consequent is also true.

the algorithm establishes a new inside edge for each delayed

The mapNo den ;n pro cedure is invoked whenever the

1

action. The new edge go es from the no de in the action to

algorithm maps a no de n from the incoming graph to a

1

0

the newly mapp ed no de n . These edges ensure that the

no de n in the new graph. It rst up dates the escap e func-

nal set of inside edges satis es the constraints.

tion e n to re ect any new escap e information. It then

M

maps new inside edges b oth to and from n to re ect the

5.3 Global Fixed-Point Analysis Algorithm

corresp onding inside edges from the incoming graph. Rule 5

is the only constraint that maps inside edges from the in-

Figure 12 presents the global xed-p oint algorithm that the

coming graph into the new graph. Because all insertions of

compiler uses to solve the combined intrapro cedural and in-

4

terpro cedural data ow equations. It maintains a worklist of

Recall that the metho d accesses static class variables

cl :f ; :::;cl :f , and that n is the global no de for cl :f .

p ending statements. At each step, it removes a statement

0 0 j j i i

cl :f

i i

Initialize analysis results

for all st 2 ST do

mapNo den ;n

st = st=h;; ;;e ; ;i

1

?

if hn ;ni62D then

for all op 2 OP do

1

n =n  [fng

enter =hO ;I ;e ;r i

1 1

op 0 0 0 0

D = D [fhn ;nig

Initialize the worklist

1

Update e n to satisfy Rule 9

W = fenter :op 2 OPg

M

op

if e n  6= ; then e n= e n [fmg

while W 6= ;do

R 1 M M

Update I to satisfy Rule 5

Remove a statement from worklist

M

I = I [  n  fng

W = W fst g

M M 1

for all hhn ; fi;n i2 I do

Process the statement

1 2 R

0 0

I = I [fhn; fig  n 

st =tf st :st 2 predstg

M M 2

 n = n  [fhn; f ig

st =[[ st ]] st 

2 2

Update worklists for Rules 3,6 and 7

if stchanged then

W = W [ fngedgesFromO ;n 

Put potential ly a ected statements

E E R 1

if n = n then

onto worklist

1

W = W [ fngedgesFromI ;n 

W = W [ succst

I I R 1

W = W [ fngedgesFromO ;n 

if st exit then

O O R 1

op

W = W [ callers op 

Figure 10: Pro cedure for Mapping One No de to Another

Figure 12: Fixed-Point Analysis Algorithm

Initialize worklists

W = W = W = ;

E I O

Initialize new graph to satisfy Rule 4

and up dates the analysis results b efore and after the state-

hO ;I ;e ;r i = hO; I edgesl ;e;ri

M M M M

ment. If the analysis result after the statement changed,

Map parameter nodes to satisfy Rule 1

it inserts all of its successors or for exit no des, all of the

for 0  i  k do

callers of its metho d into the worklist.

for all n 2 I l  do mapNo den ;n

i p

The order in which the algorithm analyzes metho ds can

i

Map classes to satisfy Rule 2

have a signi cant impact on the analysis. For non-recursive

for 1  i  j do mapNo decl ; cl 

metho ds, a b ottom-up analysis of the program yields the full

i i

Map return values to satisfy Rule 8

result with one analysis p er metho d. For recursive metho ds,

for all n 2 r \ N [ N  do mapNo den; n

the analysis results must b e iteratively recomputed within

R I R

done = false

each strongly connected comp onent of the call graph us-

while not done do

ing the current b est result until the analysis reaches a xed

done = true

p oint.

if cho ose hn ; hhn ; fi;n ii 2 W such that

It is p ossible to extend the algorithm so that it initially

3 1 2 E

n 62 N then

skips the analysis of metho d invo cation sites. If the analysis

1 I

W = W fhn ; hhn ; fi;n iig

result is not precise enough, it can incrementally increase

E E 3 1 2

Map nodes to satisfy Rule 3

the precision by analyzing metho d invo cation sites that it

for all hhn ; f i;n i2I do mapNo den ;n 

originally skipp ed. The algorithm will then propagate the

3 4 2 4

done = false

new, more precise result to up date the analysis results at

else if cho ose hn; hhn ; fi;n ii 2 W such that

a ected program p oints.

1 2 I

n 2 N [ N then

2 I R

W = W fhn; hhn ; fi;n iig

I I 1 2

6 Abstraction Relation

Map inside node to satisfy Rule 6

mapNo den ;n 

2 2

In this section, wecharacterize the corresp ondence b etween

done = false

p oints-to escap e graphs and the ob jects and references cre-

else if cho ose hn; hhn ; fi;n ii 2 W such that

1 2 0

ated during the execution of the program. Akey prop erty

escap ed hO ;I ;e ;r i;n then

M M M M

of this corresp ondence is that a single concrete ob ject in

W = W fhn; hhn ; fi;n iig

O O 1 2

the execution of the program may b e represented bymulti-

Map outside edge and outside node

ple no des in the p oints-to escap e graph. We therefore state

to satisfy Rule 7

the prop erties that characterize the corresp ondence using an

O = O [ fhhn; f i;n ig

M M 2

abstraction relation, which relates each ob ject to all of the

mapNo den ;n 

2 2

no des that represent it.

done = false

As the program executes, it creates a set of concrete

Update I l to satisfy Rule 10

M

ob jects o 2 C and a set of references r 2 R V  C [

I l =[fn:n 2 r g

M R

CL  F  C [ C  F  Cbetween ob jects. At each p oint

in the execution of the program, it is p ossible to de ne the

following sets of references and ob jects:

Figure 11: Constraint Solution Algorithm for  and

 R is the set of references created by the current ex-

C

hO ;I ;e ;r i

M M M M

ecution of the current metho d and all of the analyzed

metho ds that it invokes.

These prop erties ensure that captured ob jects are reach-  R is the set of references read by the current exe-

R

able only via paths that start with the lo cal variables. cution of the current metho d and all of the analyzed

If an ob ject is captured at a metho d exit p oint, it will metho ds that it invokes.

therefore b ecome inaccessible as so on as the metho d

 C is the set of ob jects reachable from the lo cal vari-

R

returns.

ables, static class variables, and parameters by follow-

 The p oints-to information in the p oints-to escap e graph ing references in R [ R .

C R

completely characterizes the references between ob-

 R = R \ C  F  C  is the set of inside refer-

I C R R

jects represented by captured no des:

ences. These are the references represented by the set

{ capturedhO ; I ; e; r i;n ; capturedhO ; I ; e; r i;n ; of inside edges in the analysis.

1 2

n 2 o ;n 2 o  and hhn ; fi;n i62 I implies

1 1 2 2 1 2

 R =R \ C  F  C  R is the set of outside

O R R R I

hho ; fi;o i62 R

1 2

references. These are the references represented by the

set of outside edges in the analysis.

7 Optimizations

It is always p ossible to construct an abstraction relation

For optimization purp oses, the two most imp ortant prop er-

 C  N between the ob jects and the no des in the p oints-

ties are that the absence of edges b etween captured no des

to escap e graph hO ; I ; e; r i at the current program p oint.

guarantees the absence of references b etween the represented

This relation relates each ob ject to all of the no des in the

ob jects, and that captured ob jects are inaccessible when the

p oints-to escap e graph that represent the ob ject during the

metho d returns. In this section we discuss two transforma-

analysis of the metho d. The abstraction relation has all of

tions, stack allo cation of ob jects and synchronization elimi-

the prop erties describ ed b elow.

nation, that the escap e information enables.

 Reachable ob jects are represented by their allo cation

sites. If o was created at an ob ject creation site within

7.1 Synchronization Elimination

the current execution of the current metho d or ana-

If an ob ject do es not escap e a thread, it is legal to remove

lyzed metho ds that it invokes, and o is reachable i.e.

the lo ck acquire and release op erations from synchronized

o 2 C , n 2 o, where n is the ob ject creation site's

R

metho ds that execute on that ob ject. The b ene t of this

inside no de.

transformation is the elimination of synchronization over-

5

 Each ob ject is represented by at most one inside no de:

head

To apply the transformation, the compiler generates a

{ n ;n 2 o and n ;n 2 N implies n = n

1 2 1 2 I 1 2

sp ecialized, synchronization-free version of each synchronized

metho d that may execute on a captured ob ject. At each call

 All outside references have a corresp onding outside

site that invokes the metho d only on captured ob jects, the

edge in the p oints-to escap e graph:

compiler generates co de that invokes the synchronization-

{ hv ;oi2 R implies O v  \ o 6= ;

free version.

O

An issue that the compiler must deal with is the dis-

{ hhcl ; fi;oi2 R implies O cl ; f \ o 6= ;

O

tance in the call graph b etween the metho d that captures

{ hho ; fi;o i2 R implies

1 2 O

the ob ject and the synchronized metho d. In addition to

o  ffg  o  \ O 6= ;

1 2

removing synchronization op erations from the synchronized

metho d, the compiler must also generate sp ecialized ver-

 All inside references have a corresp onding inside edge

sions of all metho ds in the call chain from the capturing

in the p oints-to escap e graph:

metho d to the metho d containing the call site that invokes

the synchronization-free version. Our sp ecialization p olicy

{ hv ;oi2 R implies I v \ o 6= ;

I

imp oses no limit on the numb er of sp ecialized metho ds |

{ hhcl ; fi;oi2 R implies I cl; f \ o 6= ;

I

it applies the transformation whenever p ossible.

{ hho ; fi;o i2 R implies

1 2 I

A nal problem is nding the paths in the call graph that

o  ffg  o  \ I 6= ;

1 2

go from captured ob jects to synchronized metho ds. The

compiler solves this problem by augmenting the analysis to

 If an ob ject is represented bya captured no de, it is

record, for each no de in the p oints-to escap e graph, paths in

represented by only that no de:

the call graph that lead to synchronized op erations on the

ob jects represented by that no de. For captured ob jects, the

{ n 2 o and capturedhO ; I ; e; r i;n implies

compiler can use this information to trace a path from the

o= fng

captured ob ject down to the call sites that invoke synchro-

Given this prop erty,we de ne that an ob ject is cap- nized metho ds on that ob ject.

tured if it is represented by a captured no de. All ref-

erences to captured ob jects are either from lo cal vari-

7.2 Stack Allo cation of Objects

ables or from other captured ob jects:

If an ob ject do es not escap e a metho d, it can b e allo cated

{ n 2 o, capturedhO ; I ; e; r i;n, and hv ;oi2R

on the stack instead of in the heap. One b ene t of this

implies v 2 L

transformation is the elimination of garbage collection over-

head. Instead of b eing pro cessed by the collector, the ob ject

{ n 2 o , capturedhO ; I ; e; r i;n , and

2 2 2

hho ; fi;o i 2 R implies 9n 2 N:o  = fn g

5

1 2 1 1 1

To preserve the semantics of the Java memory mo del on machines

and capturedhO ; I ; e; r i;n 

1

that implementweak memory consistency mo dels, the compiler may

need to insert memory barriers at the original lo ck acquire and release

sites when it removes the acquire and release constructs.

will b e implicitly collected when the metho d returns and the This approach provides excellent supp ort for dynamically

stack rolls back. Another b ene t is b etter memory lo cality loaded programs. It allows the compiler to analyze a large,

b ecause the stack frame will tend to b e resident in the cache. commonly used package such as the Java Class Libraries

Finally, if the compiler can precisely compute the lo cation once, then reuse the analyze results every time a program is

of the ob ject on the stack, it can generate co de that accesses loaded that uses the package. It also supp orts the delivery

the ob ject elds directly in the stack frame. of preanalyzed packages. Instead of requiring the analysis

The largest p otential drawback of stack allo cation is that to b e p erformed when the package is rst loaded into a cus-

it may increase memory consumption by extending the life- tomer's virtual machine, a vendor could p erform the analysis

time of ob jects allo cated on the stack. Because this prob- as part of the release pro cess, then ship the analysis results

lem may b e esp ecially acute for allo cation sites that create along with the co de.

a statically unb ounded numb er of ob jects on the stack, our When the Jalap eno ~ compiler generates co de for a dynam-

implementation allo cates ob jects on the stack only if the al- ically loaded class, it uses the information in the analysis le

lo cation site will b e executed at most once p er invo cation to apply the stack allo cation and synchronization elimina-

6

of the al locating method, or the metho d that contains the tion optimizations describ ed ab ove. One unusual feature

ob ject allo cation site. of the system is that the analysis is applied to the compiler

In the simplest case, a captured ob ject do es not escap e and virtual machine during the b o otstrap pro cess. The en-

its allo cating metho d. In this case, the compiler can sim- tire system, including the dynamic compiler and the virtual

ply replace the normal ob ject allo cation instruction with a machine as well as the applications, therefore executes with

the optimizations applied. sp ecial instruction that allo cates the ob ject on the heap.

If an ob ject escap es its allo cating metho d, but is recap-

tured within a caller direct or indirect, it may b e p ossi-

8.2 Benchmark Set

ble to allo cate the ob ject on the stack of the metho d that

7

Our b enchmark set includes four programs. Javac is the

captures the ob ject. In addition to requiring that stack-

8

standard Java compiler. Javacup isayacc-like parser gen-

allo cated ob jects come from allo cation sites that are exe-

9

erator for Java. Javalex is a lexical analyzer generator for

cuted at most once p er invo cation of the allo cating metho d,

Java. Pb ob is a transaction-pro cessing b enchmark designed

the compiler also requires that the allo cating metho d b e in-

to quantify the p erformance of simple transactional server

voked at most once for each call site in the captured metho d

workloads written in Java.

that leads to the allo cating metho d.

Wechose these applications in part b ecause they are all

If the allo cation site satis es this restriction, the compiler

complete programs in regular use. We therefore exp ect them

generates a sp ecialized version of the allo cating metho d. In-

to exhibit realistic ob ject creation and access patterns. They

stead of allo cating the ob ject itself, the sp ecialized version

were also chosen to test the scalability of our analysis. With

takes, as a parameter, a p ointer to preallo cated space in the

the asso ciated and analyzed class libraries, all three pro-

capturing metho d's activation record. The allo cation site

grams are on the order of tens of thousands of lines of co de,

is transformed to initialize the ob ject at the given lo cation,

and many existing p ointer analysis algorithms are impracti-

rather than to allo cate it in the heap. As for the synchro-

cal for programs of this size. Figure 13 presents the number

nization elimination transformation describ ed ab ove, the com-

of lines of co de and the total sizes of the class les for these

piler must generate sp ecialized versions of all metho ds on

applications. JVM is the Jalap eno ~ virtual machine.

the call chain from the capturing metho d to the allo cating

metho d.

For this transformation to interop erate correctly with

Lines of Class File

the rest of the system, the garbage collector must recog-

Benchmark Co de Size bytes

nize stack-allo cated ob jects. The recognition mechanism is

javac - 87,801

simple | it examines the address of the ob ject to deter-

javacup 10,574 157,057

mine if it is allo cated on a stack or in the heap. During

8,159 95,229 javalex

a collection, the collector still scans stack-allo cated ob jects

pb ob 18,370 253,752

normally. But it do es not attempt to move or collect stack-

JVM 70,536 456,494

allo cated ob jects.

8 Exp erimental Results

Figure 13: Benchmark Characteristics

Wehave implemented a combined p ointer and escap e analy-

We staged the analysis of the applications as follows.

sis based on the algorithm describ ed in Sections 4 and 5. We

We rst analyzed the virtual machine and the standard li-

implemented the analysis in the compiler for the Jalap eno ~

braries, writing the analysis results out to the appropriate

JVM [8 ], a Java virtual machine written in Java with a few

les. We then analyzed the application co de, reusing the

unsafe extensions for p erforming low-level system op erations

analysis results from the standard libraries.

such as explicit memory management and p ointer manipu-

lation.

8.1 Compiler Structure

The analysis is implemented as a separate phase of the

Jalap eno ~ dynamic compiler, which op erates on the Jalap eno ~

intermediate representation. To analyze a class, the algo-

rithm loads the class, converts its metho ds into the interme-

diate representation, then analyzes the metho ds. The nal

analysis results for the metho ds are written out to a le.

Number of Number of

Number of Number of

Synchronization Synchronization

Heap-Allo cated Heap-Allo cated

Op erations Op erations

Ob jects Ob jects

Before After

Before After

Benchmark Optimization Optimization

Benchmark Optimization Optimization

javac 4,583,013 2,925,653

javac 2,845,485 2,033,525

1,004,409 330,952 javacup

1,913,594 1,495,141 javacup

javalex 2,038,945 1,058,456

javalex 4,389,452 214,803

pb ob 205,198,941 155,764,374

pb ob 56,708,490 39,751,260

Figure 14: Number of Synchronization Op erations Before

Figure 16: Numb er of Heap-Allo cated Ob jects Before and

and After Optimization After Optimization

100% 100%

80% 80%

60% 60%

40% 40%

20% 20%

0% 0%

javac javacup javalex pbob javac javacup javalex pbob

Figure 15: Percentage of Eliminated Synchronization Op er-

Figure 17: Percentage of Ob jects Allo cated on the Stack

ations

Size of Size of

Heap-Allo cated Heap-Allo cated

8.3 Synchronization Elimination

Ob jects Ob jects

Figure 14 presents the dynamic numb er of synchronization

Before After

op erations for all of the applications b oth b efore and af-

Benchmark Optimization Optimization

ter the synchronization elimination optimization. Figure 15

javac 81,573,448 60,843,884

plots the p ercentage of eliminated synchronization op era-

javacup 54,089,120 43,038,640

tions. The analysis achieves reasonable results, eliminating

107,347,648 8,887,836 javalex

between 24 to 64 of the synchronization op erations.

pb ob 25,950,641,024 18,046,027,560

8.4 Stack Allo cation

Figure 18: Total Size of Heap-Allo cated Ob jects Before and

Figure 16 presents the dynamic number of heap allo cated

After Optimization bytes

ob jects for all of the applications b oth b efore and after the

stack allo cation optimization. Figure 17 plots the p ercent-

age of ob jects allo cated on the stack. Once again, the anal-

hieves reasonable results, allo cating b etween 22 and

ysis ac 100%

95 of the ob jects on the stack.

18 presents the total size of the heap allo cated

Figure 80%

ob jects for all of the applications b oth b efore and after the

k allo cation optimization. Figure 17 plots the p ercent-

stac 60%

age of memory allo cated on the stack instead of in the heap.

e amount of ob ject memory allo cated on the stack

The relativ 40%

is closely correlated with the numb er of ob jects allo cated on k.

the stac 20%

6

The currentversion of the Jalap eno ~ JVM do es not supp ort stack

allo cation. All ob jects are therefore allo cated on the heap. The com- 0%

piler do es, however, generate all of the sp ecialized metho ds necessary

k allo cation. The compiler fully implements the syn-

to supp ort stac javac javacup javalex pbob

chronization elimination optimization.

7

See the package sun.to ols.javac in the standard Java distribution.

8

Available at www.cs.princeton.edu/ app el/mo dern/java/CUP/

9

Figure 19: Percentage of Ob ject Memory Allo cated on the

Available at www.cs.princeton.edu/ app el/mo dern/java/JLex/

Stack

9 Related Work 9.1.2 for Multithreaded Programs

Rugina and Rinard recently develop ed a ow-sensitive p ointer

In this section, we discuss several areas of related work:

analysis algorithm for multithreaded programs [23 ]. The key

p ointer analysis, escap e analysis, and synchronization op-

complication in this algorithm is characterizing the interac-

timizations.

tion b etween multiple threads and how this interaction af-

fects the p oints-to relationship. The analysis presented in

9.1 Pointer Analysis

this pap er simply sidesteps this issue. It generates precise re-

sults only for ob jects that do not escap e. If an ob ject escap es

Pointer analysis for sequential programs is a relatively ma-

to another thread, our algorithm is not designed to main-

ture eld. Flow-insensitive analyses, as the name suggests,

tain precise information ab out its p oints-to relationships. It

do not take statement ordering into account, and typically

is imp ortant to realize that this is a key design decision that

use some form of constraint-based analysis to pro duce a

allows us to obtain go o d escap e analysis results with what

single p oints-to graph that is valid across the entire pro-

is essentially a lo cal analysis.

gram [2, 27 , 26 ]. Flow-insensitive analyses extend trivially

Corb ett describ es a shap e analysis algorithm for multi-

from sequential programs to multithreaded programs. Be-

threaded Java programs [12 ]. The algorithm can be used

cause they are insensitive to the statement order, they triv-

to nd ob jects that are accessible to a single thread or ob-

ially mo del all of the interleavings of the parallel executions.

jects that are accessed by at most one thread at any given

time. The goal is to use this information to reduce the size

9.1.1 Flow-Sensitive Analyses

of the state space explored by a mo del checker. The algo-

Flow-sensitive analyses take the statement ordering into ac-

rithm is intrapro cedural, and do es not address the problem

count, typically using a data ow analysis to pro duce a p oints-

of analyzing programs with multiple pro cedures.

to graph or set of alias pairs for each program p oint [25 , 22 ,

28 , 18 , 10 , 20 ]. One approach analyzes the program in a top-

9.2 Escap e Analysis

down fashion starting from the main pro cedure, reanalyzing

There has b een a fair amount of work on escap e analysis

each p otentially invoked pro cedure in each new calling con-

in the context of functional languages [4 , 3, 29 , 14, 15 , 5,

text [28 , 18 ]. This approach exp oses the compiler to the

19 ]. The implementations of functional languages create

p ossibility of sp ending signi cant amounts of time reanalyz-

many ob jects for example, cons cells and closures implic-

ing pro cedures. It also commits the compiler to an analysis

itly. These ob jects are usually allo cated in the heap and

of the entire program and is impractical for situations in

reclaimed later by the garbage collector. It is often p ossible

which the entire program is not available.

to use a lifetime or escap e analysis to deduce b ounds on the

Another approach analyzes the program in a b ottom-up

lifetimes of these dynamically created ob jects, and to p er-

fashion, extracting a single analysis result for each pro ce-

form optimizations to improve their memory management.

dure. The analysis reuses the result at each call site that

Deutsch [14 ] describ es a lifetime and sharing analysis for

mayinvoke the pro cedure. Sathyanathan and Lam present

higher-order functional languages. His analysis rst trans-

an algorithm for programs with non-recursive p ointer data

lates a higher-order functional program into a sequence of

typ es [25 ]. This algorithm supp orts strong up dates and ex-

op erations in a low-level op erational mo del, then p erforms

tracts enough information to obtain fully context-sensitive

an analysis on the translated program to determine the life-

results in a comp ositional way. Chatterjee, Ryder, and

times of dynamically created ob jects. The analysis is a

Landi describ e an approach that extracts multiple analysis

whole-program analysis. Park and Goldb erg [3 ] also describ e

results; each result is indexed under the calling context for

an escap e analysis for higher-order functional languages.

whichitisvalid [9 ]. Our approach extracts a single analysis

Their analysis is less precise than Deutsch's. It is, how-

result for calling contexts with no aliases, and merges no des

ever, conceptually simpler and more ecient. Their main

for calling contexts with aliases. The disadvantage of this

contribution was to extend escap e analysis to include lists.

approach is that it may pro duce less precise results than

Deutsch [15 ] later presented an analysis that extracts the

approaches that maintain information for multiple calling

same information but runs in almost linear time. Blanchet [5]

contexts. The advantage is that it leads to a simpler algo-

extended this algorithm to work in the presence of imp era-

rithm and smaller analysis results.

tive features and p olymorphism. He also provides a correct-

Many p ointer and shap e analysis algorithms are based

ness pro of and some exp erimental results.

on the abstraction of p oints-to graphs [24 , 18 , 28]. The

Baker [4 ] describ es an novel approach to higher-order

no des in these graphs typically represent memory lo cations,

escap e analysis of functional languages based on the typ e

ob jects, or variables, and the edges represent references b e-

inference uni cation technique. The analysis provides es-

tween the represented entities. A standard technique is to

cap e information for lists only. Hannan also describ es a

use invisible variables [20 , 18 , 28] to parameterize the anal-

typ e-based analysis in [19 ]. He uses annotated typ es to de-

ysis result so that it can b e used at di erent call sites. Like

scrib e the escap e information. He only gives inference rules

outside ob jects, invisible variables represent entities from

and no algorithm to compute annotated typ es.

outside the analysis context. But the relationships b etween

invisible variables are typically determined by the aliasing

relationships in the calling context or by the typ e system.

9.3 Synchronization Optimizations

In our approach, the relationships b etween outside ob jects

Diniz and Rinard [16 , 17 ] describ e several algorithms for p er-

are determined by the structure of the analyzed metho d |

forming synchronization optimizations in parallel programs.

each load statement corresp onds to a single outside ob ject.

The basic idea is to drivedown the lo cking overhead by coa-

This approach leads to a simple and ecient treatmentof

lescing multiple critical sections that acquire and release the

recursive data structures. Another di erence between the

same lo ckmultiple times into a single critical section that

approach presented in this pap er and other previously pre-

acquires and releases the lo ck only once. When p ossible, the

sented approaches is the explicit distinction b etween inside

algorithm also coarsens the lo ck granularityby using lo cks and outside edges.

algorithm can distinguish where it do es and do es not have in enclosing ob jects to synchronize op erations on nested ob-

complete information. As develop ers continue to moveto jects. Plevyak and Chien describ e similar algorithms for

a mo del of dynamically loaded, comp onent-based software, reducing synchronization overhead in sequential executions

we b elieve this general approach will b ecome increasingly of concurrent ob ject-oriented programs [21 ].

relevant and comp elling. Several research groups have recently develop ed synchro-

nization optimization techniques for Java programs. Aldrich,

Chamb ers, Sirer, and Eggers describ e several techniques for

Acknowledgements

reducing synchronization overhead, including synchroniza-

tion elimination for thread-private ob jects and several opti-

The authors would liketoacknowledge discussions with Radu

mizations that eliminate synchronization from nested moni-

Rugina and Darko Marinov regarding p ointer and escap e

tor calls [1 ]. Blanchet describ es a pure escap e analysis based

analysis. The second author would liketoacknowledge dis-

on an abstraction of a typ e-based analysis [6 ]. The imple-

cussions with Mo oly Sagiv regarding p ointer, escap e, and

mentation uses the results to eliminate synchronization for

shap e analysis. The rst author would liketoacknowledge

thread-private ob jects and to allo cate captured ob jects on

Jong-Deok Choi for intro ducing him to the concept of using

the stack. Bogda and Ho elzle describ e a ow-insensitive es-

escap e information for synchronization elimination, while he

cap e analysis based on global set inclusion constraints [7 ].

was an IBM employee in the summer of 1998. The authors

The implementation uses the results to eliminate synchro-

would like to thank the develop ers of the IBM Jalap eno ~ vir-

nization for thread-private ob jects. A limitation is that the

tual machine and the MIT Flex pro ject for providing the

analysis is not designed to nd captured ob jects that are

compiler infrastructure in which this research was imple-

reachable via paths with more than two references.

mented, and the anonymous reviewers for their comments.

Choi, Gupta, Serrano, Sreedhar, and Midki presenta

comp ositional data ow analysis for computing reachability

References

information [11 ]. The analysis results are used for synchro-

nization elimination and stack allo cation of ob jects. Like

[1] J. Aldrich, C. Chamb ers, E. Sirer, and S. Eggers. Static

the analysis presented in this pap er, it uses an extension

analyses for eliminating unnecessary synchronization

of p oints-to graphs with abstract no des that may represent

from java programs. In Proceedings of the 6th Inter-

multiple ob jects. It do es not distinguish b etween inside and

national Static Analysis Symposium, Septemb er 1999.

outside edges, but do es contain an optimization, deferred

edges, that is designed to improve the eciency of the anal-

[2] Lars Ole Andersen. Program Analysis and Specializa-

ysis. The approach classi es ob jects as globally escaping,

tion for the C Programming Language. PhD thesis,

escaping via an argument, and not escaping. Because the

DIKU, University of Cop enhagen, May 1994.

primary goal was to compute escap e information, the anal-

[3] B. Goldb erg and Y. Park. Escap e analysis on lists. In

ysis collapses globally escaping subgraphs into a single no de

Proceedings of the SIGPLAN '92 Conference on Pro-

instead of maintaining the extracted p oints-to information.

gram Language Design and Implementation, pages 116{

Our analysis retains this information, in part b ecause we

127, July 1992.

anticipate further thread interaction analyses for example,

extensions of existing p ointer analysis algorithms for multi-

[4] H. Baker. Unifying and conquer garbage, up dating,

threaded programs [23 ] that will use this information.

aliasing ... in functional languages. In Proceedings of

the ACM Conference on Lisp and Functional Program-

10 Conclusion

ming, pages 218{226, 1990.

This pap er presents a new combined p ointer and escap e

[5] B. Blanchet. Escap e analysis: Correctness pro of, imple-

analysis algorithm for ob ject-oriented programs. The al-

mentation and exp erimental results. In Proceedings of

gorithm is designed to analyze arbitrary parts of complete

the 25th Annual ACM Symposium on the Principles of

or incomplete programs, obtaining complete information for

Programming Languages,Paris, France, January 1998.

ob jects that do not escap e the analyzed parts.

ACM, ACM, New York.

Wehave implemented the algorithm in the IBM Jalap eno ~

[6] B. Blanchet. Escap e analysis for ob ject oriented lan-

virtual machine, and applied the analysis information to

guages. application to java. In Proceedings of the 14th

two optimization problems: synchronization elimination and

Annual Conference on Object-Oriented Programming

stack allo cation of ob jects. For our b enchmark programs,

Systems, Languages and Applications, Denver, CO,

our algorithms enable the stack allo cation of b etween 22

Novemb er 1999.

and 95 of the ob jects. They also enable the elimination of

between 24 and 67 of the synchronization op erations.

[7] J. Bo dga and U. Ho elzle. Removing unnecessary syn-

In the long run, we b elieve the most imp ortant concept

chronization in java. In Proceedings of the 14th Annual

in this researchmay turn out to b e designing analysis algo-

Conference on Object-Oriented Programming Systems,

rithms from the p ersp ective of extracting and representing

Languages and Applications, Denver, CO, November

interactions b etween analyzed and unanalyzed regions of the

1999.

program. This concept leads to representations, such as the

p oints-to escap e graphs presented in this pap er, that makea

[8] M. Burke, J. Choi, S. Fink, D. Grove, M. Hind,

clean distinction b etween information created lo cally within

V. Sarkar, M. Serrano, V. Sreedhar, H. Srinivasan, and

an analyzed region and information created in the rest of

J. Whaley. The jalap eno ~ dynamic

the program outside the analyzed region.

for java. In Proceedings of the ACM SIGPLAN 1999

These representations supp ort exible analyses that are

Java Grande Conference, June 1999.

capable of analyzing arbitrary parts of the program, with

the analysis result b ecoming more precise as more of the

program is analyzed. At every stage of the analysis, the

[21] J. Plevyak, X. Zhang, and A. Chien. Obtaining sequen- [9] R. Chatterjee, B. Ryder, and W. Landi. Relevant

tial eciency for concurrent ob ject-oriented languages. context inference. In Proceedings of the 26th Annual

In Proceedings of the 22nd Annual ACM Symposium on ACM Symposium on the Principles of Programming

the Principles of Programming Languages, San Fran- Languages, San Antonio, TX, January 1999.

cisco, CA, January 1995. ACM, New York.

[10] J. Choi, M. Burke, and P. Carini. Ecient

ow-sensitive interpro cedural computation of p ointer-

[22] E. Ruf. Context-insensitive reconsidered.

induced aliases and side e ects. In ConferenceRecordof

In Proceedings of the SIGPLAN '95 ConferenceonPro-

the Twentieth Annual Symposium on Principles of Pro-

gram Language Design and Implementation, La Jolla,

gramming Languages, Charleston, SC, January 1993.

CA, June 1995.

ACM.

[23] R. Rugina and M. Rinard. Pointer analysis for mul-

[11] J. Choi, M. Gupta, M. Serrano, V. Sreedhar, and

tithreaded programs. In Proceedings of the SIGPLAN

S. Midki . Escap e analysis for java. In Proceedings

'99 Conference on Program Language Design and Im-

of the 14th Annual Conference on Object-OrientedPro-

plementation,Atlanta, GA, May 1999.

gramming Systems, Languages and Applications, Den-

[24] M. Sagiv, T. Reps, and R. Wilhelm. Solving shap e-

ver, CO, Novemb er 1999.

analysis problems in languages with destructive up dat-

[12] J. Corb ett. Using shap e analysis to reduce nite-state

ing. ACM Transactions on Programming Languages

mo dels of concurrentjava programs. In Proceedings of

and Systems, 201:1{50, January 1998.

the International Symposium on Software Testing and

Analysis, March 1998.

[25] P. Sathyanathan and M. Lam. Context-sensitive in-

terpro cedural p ointer analysis in the presence of dy-

[13] J. Dean, D. Grove, and C. Chamb ers. Optimization

namic aliasing. In Proceedings of the Ninth Workshop

of ob ject-oriented programs using static class hierarchy

on Languages and for Paral lel Computing,

analysis. In Proceedings of the 9th European Confer-

San Jose, CA, August 1996. Springer-Verlag.

ence on Object-Oriented Programming, Aarhus, Den-

mark, August 1995.

[26] M. Shapiro and S. Horwitz. Fast and accurate ow-

insensitive p oints-to analysis. In Proceedings of the 24th

[14] A. Deutsch. On determining lifetime and aliasing of

Annual ACM Symposium on the Principles of Program-

dynamically allo cated data in higher-order functional

ming Languages,Paris, France, January 1997.

sp eci cations. In Proceedings of the 17th Annual ACM

Symposium on the Principles of Programming Lan-

[27] Bjarne Steensgaard. Points-to analysis in almost linear

guages, pages 157{168, San Francisco, CA, January

time. In Proceedings of the 23rdAnnual ACM Sympo-

1990. ACM, ACM, New York.

sium on the Principles of Programming Languages, St.

Petersburg Beach, FL, January 1996.

[15] A. Deutsch. On the complexity of escap e analysis. In

Proceedings of the 24th Annual ACM Symposium on the

[28] R. Wilson and M. Lam. Ecient context-sensitive

Principles of Programming Languages, Paris, France,

p ointer analysis for C programs. In Proceedings of the

January 1997. ACM, ACM, New York.

SIGPLAN '95 Conference on Program Language De-

sign and Implementation, La Jolla, CA, June 1995.

[16] P. Diniz and M. Rinard. Lo ck coarsening: Eliminat-

ACM, New York.

ing lo ckoverhead in automatically parallelized ob ject-

based programs. In Proceedings of the Ninth Workshop

[29] Y. Tang and P. Jouvelot. Control- ow e ects for escap e

on Languages and Compilers for Paral lel Computing,

analysis. In Workshop on Static Analysis, pages 313{

pages 285{299, San Jose, CA, August 1996. Springer-

321, Septemb er 1992.

Verlag.

[17] P. Diniz and M. Rinard. Synchronization transforma-

tions for parallel computing. In Proceedings of the 24th

Annual ACM Symposium on the Principles of Program-

ming Languages, pages 187{200, Paris, France, January

1997. ACM, New York.

[18] M. Emami, R. Ghiya, and L. Hendren. Context-

sensitiveinterpro cedural p oints-to analysis in the pres-

ence of function p ointers. In Proceedings of the SIG-

PLAN '94 Conference on Program Language Design

and Implementation, pages 242{256, Orlando, FL, June

1994. ACM, New York.

[19] J. Hannan. Atyp e-based analysis for blo ck allo cation

in functional languages. In Proceedings of the Second

International Static Analysis Symposium.ACM, ACM,

New York, Septemb er 1995.

[20] W. Landi and B. Ryder. A safe approximation algo-

rithm for interpro cedural p ointer aliasing. In Proceed-

ings of the SIGPLAN '92 ConferenceonProgram Lan-

guage Design and Implementation, San Francisco, CA, June 1992.