Comp ositional Pointer and Escap e Analysis for Multithreaded

Programs

Martin Rinard John Whaley

Lab oratory for Computer Science IBM Tokyo Research Lab oratory

Massachusetts Institute of Technology Network Computing Platform

Cambridge, MA 02139 1623-14 Shimotsuruma

Yamato-shi, Kanagawa-ken 242-8502 Japan

[email protected] [email protected]

For example, an ob ject escap es if it is reachable from an un- Abstract

analyzed running in parallel with the current thread

This pap er presents a new combined p ointer and escap e

or returned to an unanalyzed region of the program.

analysis algorithm for Java programs with unstructured mul-

Combining p oints-to and escap e information in the same

tithreading. The algorithm is based on the abstraction of

analysis enables the algorithm to represent all p otential in-

paral lel interaction graphs, whichcharacterize the p oints-to

teractions b etween the analyzed and unanalyzed regions of

and escap e relationships b etween ob jects and the ordering

the program. The algorithm represents these interactions,

relationships between actions p erformed by multiple par-

in part, by distinguishing b etween two kinds of edges: in-

allel threads. To our knowledge, this algorithm is the rst

side edges, which represent references created within the cur-

interpro cedural, ow-sensitive p ointer analysis algorithm ca-

rently analyzed region, and outside edges, which represent

pable of extracting the p oints-to relationships generated by

references created outside this region. Each outside edge

the interactions between unstructured parallel threads. It

represents a p otential interaction in which the analyzed re-

is also the rst algorithm capable of analyzing interactions

gion reads a reference created in an unanalyzed region. Each

between threads to extract precise escap e information even

inside edge from escap ed no de represents a p otential inter-

for ob jects accessible to multiple threads.

action in which the analyzed region creates a reference that

Wehave implemented our analysis in the IBM Jalap eno ~

an unanalyzed region may read.

dynamic for Java and used the analysis results

Representing p otential interactions with inside and out-

to eliminate redundant synchronization. For our b ench-

side edges leads to an analysis that is comp ositional in two

mark programs, the thread interaction analysis signi cantly

senses:

improves the e ectiveness of the synchronization elimina-

 Metho d Comp ositionality: The algorithm analyzes

tion algorithm as compared with previously published tech-

each metho d once to derive a single parameterized

niques, which do not analyze these interactions.

analysis result that records all p otential interactions of

1

the metho d with its callers. At each call site, the al-

1 Intro duction

gorithm matches outside edges from the callee against

inside edges from the caller to compute the e ect of

This pap er presents a new, combined p ointer and escap e

the callee on the p oints-to and escap e information of

analysis algorithm for multithreaded programs. To our knowl-

the caller.

edge, this algorithm is the rst interpro cedural, ow-sensitive

p ointer analysis algorithm for programs with the unstruc-

 Thread Comp ositionality: The algorithm analyzes

tured form of multithreading present in Java and similar

each thread once to derive an analysis result that records

languages. It is also the rst algorithm capable of analyz-

all of the p otential interactions of the thread with other

ing interactions between threads to extract precise escap e

parallel threads. The analysis can then combine analy-

information even for ob jects accessible to multiple threads.

sis results from multiple parallel threads by matching

each outside edge from each thread against all cor-

resp onding inside edges from parallel threads. The

1.1 Analysis Overview

result is a single parallel interaction graph that com-

The analysis is based on an abstraction we call paral lel in-

pletely characterizes the p oints-to and escap e informa-

teraction graphs. The no des in this graph represent ob jects;

tion generated by the combined parallel execution of

the edges between no des represent references between ob-

the threads. Unlike previously published algorithms,

jects. For each no de, the analysis also records information

which use an iterative xed-p oint algorithm to com-

that characterizes how it escap es the current analysis region.

pute the interactions [37 , 16 ], the algorithm presented

in this pap er can compute the interactions between

two parallel threads with a single pass over the paral-

lel interaction graphs from the threads.

Finally, the combination of p oints-to and escap e infor-

mation in the same analysis leads to an algorithm that is

1

Recursive metho ds require an iterative algorithm that may ana-

lyze metho ds multiple times to reach a xed p oint.

2 Example designed to analyze arbitrary regions of complete or incom-

plete programs, with the analysis result b ecoming more pre-

In this section we present an example that illustrates how

cise as more of the program is analyzed. Atevery stage in

the analysis works. Figure 1 presents the Java co de for the

the analysis, the current parallel interaction graph provides

example. The sum metho d in the Sum class computes the

complete information ab out the p oints-to relationships for

sum of the numb ers from 0 to n, storing the result into a

ob jects that do not escap e the currently analyzed region

destination accumulator a. It computes this sum by creating

of the program. The algorithm can therefore obtain useful

aworker thread to compute the sum of the even numb ers

analysis information without analyzing the entire program.

while it computes the sum of the o dd numb ers. When they

nish, b oth threads add their contribution into the destina-

1.2 Analysis Uses

tion accumulator. The sum metho d rst constructs a work

Parallel interaction graphs also record the actions that each

vector v of Integers for the worker thread to sum up, then

thread p erforms and contain ordering information for these

initializes the worker ob ject to p oint to the work vector and

actions relative to the actions p erformed by other threads.

the destination accumulator. It starts the worker thread

Optimization and analysis algorithms can use this informa-

running by invoking its start metho d, which invokes the

tion to determine that actions from di erent threads can

run metho d in a new thread running in parallel with the

never execute concurrently and are therefore indep endent.

current thread. This mechanism of initializing a thread ob-

Our compiler uses this ordering information to improve the

ject to p oint to its conceptual parameters is the standard

precision of the thread interaction analysis. It also uses this

way for Java programs to provide threads with the informa-

information to implement a synchronization elimination op-

tion they need to initiate their computation.

timization | if all lo ck acquire and release actions on a

given ob ject are indep endent, they have no e ect on the

class Accumulator {

computation and can b e removed.

int value = 0;

The analysis also provides information that is generally

synchronized void addint v {

useful to and program analysis to ols for multi-

value += v;

threaded programs. Potential applications of our analysis

}

include: sophisticated software engineering to ols such as

}

static race detectors and program slicers [28 , 36]; memory

class Sum {

system optimizations such as prefetching and moving com-

public static void sumint n, Accumulator a {

putation to remote data; automatic batching of long latency

1: Vector v = new Vector;

le system op erations; memory bank disambiguation in com-

for int i = 0; i < n; i += 2 {

pilers for distributed memory machines [8 ]; memory mo dule

v.addElementnew Integeri;

splitting in compilers that generate hardware directly from

}

high-level languages [6 ]; lo ck coarsening [33 , 22 ]; synchro-

2: Worker t = new Worker;

nization elimination and stack allo cation[41 , 10 , 12 , 15 ]; and

t.initv,a;

to provide information required to apply traditional com-

t.start;

piler optimizations such as constant propagation, common

int s = 0;

sub expression elimination, register allo cation, co de motion

for int i = 1; i < n; i+= 2 {

and elimination to multithreaded pro-

s = s + i;

grams.

}

a.adds;

1.3 Contributions

}

}

This pap er makes the following contributions:

class Worker extends Thread {

 Analysis Algorithm: It presents a new combined

Vector work;

p ointer and escap e analysis algorithm for multithreaded

Accumulator dest;

programs. The algorithm is comp ositional at b oth the

void initVector v, Accumulator a {

metho d and thread levels and is designed to deliver

work = v;

useful information without analyzing the entire pro-

dest = a;

gram.

}

public void run {

 Analysis Uses: It shows how to use the action order-

3: Enumeration e = work.elements;

ing information present in parallel interaction graphs

int s = 0;

to p erform a synchronization elimination optimization.

while e.hasMoreElements {

Integer i = Integer e.nextElement;

 Exp erimental Results: It presents exp erimental re-

s = s + i.intValue;

sults from a prototyp e implementation of the algo-

}

rithms. These results show that the algorithm can

4: dest.adds;

eliminate a signi cantnumb er of synchronization op-

}

erations.

}

The remainder of the pap er is organized as follows. Sec-

tion 2 presents an example that illustrates how the analy-

sis works. Sections 3 through 11 present the analysis algo-

Figure 1: Sum Example

rithms. Section 14 presents exp erimental results, Section 15

presents related work, and we conclude in Section 16.

We contrast the unstructured form of multithreading in

this example with the structured, fork-join form of multi-

Figure 2: Callee-Caller Interaction Between init and sum

Figure 3: Parallel Thread Interaction Between run and sum

Figure 4: Result of Parallel Thread Interaction

In addition to the p oints-to information, the parallel in- threading found in, for example, the Cilk programming lan-

teraction graph also records the synchronization actions that guage [11 ]. Once the thread is the example is created, it exe-

the metho d p erforms and the ob jects to which the actions cutes indep endently of its parent thread and in parallel with

are applied. In this case, the run metho d synchronizes on the rest of its parent thread's computation. Cilk threads, on

the work vector and the destination accumulator. The ac- the other hand, must join with their parent thread, complet-

tions hsync ; 3i and hsync; 4i record these synchronizations. ing b efore their parent thread returns from the pro cedure in

The synchronization on the work vector happ ens inside the which they were spawned.

enumeration's nextElement metho d | Java library classes We now illustrate the analysis by discussing its op era-

suchas Vector often come with the synchronization required tion on this example. We start with the init metho d. This

for correct execution in the face of concurrent access by par- metho d is passed the work vector and destination accumula-

allel threads. In this case, however, the synchronizations tor and initializes the worker to p oint to them. The analysis

are unnecessary. Even though the work vector is accessed result for this metho d is a paral lel interaction graph. Fig-

bymultiple threads, the accesses are separated temp orally ure 2 contains the p oints-to graph from the parallel interac-

by thread start events. Among other things, this example tion graph at the end of the init metho d. In general, our

will showhow the analysis detects the indep endence of the p oints-to graphs contain two kinds of no des: inside no des,

synchronization actions on the work vector. which represent ob jects created during the computation of

We next move to the parallel interaction graph at the end the metho d, and outside no des, which represent ob jects cre-

of the sum metho d. In addition to the p oints-to and action ated outside its computation. All of the no des in the p oints-

information, the graph records the threads that the metho d to graph for the init metho d represent the receiver or ob-

starts and ordering information between the metho d's ac- jects passed into the metho d as parameters. These no des are

tion and started threads. In this case, sum starts the worker therefore outside no des. Our p oints-to graphs also contain

thread, which is represented in the analysis bynode 2. The two kinds of edges: inside edges, which represent references

ordering relation hsync ; 0ijj2 records the fact that a syn- created during the computation of the metho d, and out-

chronization action on ob ject 0 the destination accumula- side edges, which represent references created outside the

tor may execute in parallel with the actions of the worker computation. Because the init metho d reads no references

thread. Note that there is no parallel ordering relation b e- created by other metho ds or threads, all of the edges in its

tween the worker thread and sum's synchronization actions p oints-to graph are inside edges.

on the work vector ob ject. This absence indicates that all

of these actions o ccur b efore the worker thread starts, and

2.1 Interaction Between Caller and Callee

therefore do not execute in parallel with any of the thread's

Figure 2 also presents the p oints-to graph from the sum

actions.

metho d just b efore the call to init. This graph contains

To mo del the parallel interaction, the analysis constructs

one outside no de no de 0, which represents the destination

a bidirectional mapping b etween the no des of the parallel

accumulator passed as a parameter to sum. It also contains

interaction graphs. Initially, the receiver ob ject of the run

two inside no des | no de 1, which represents the work vec-

metho d is mapp ed to the worker thread ob ject from the sum

tor, and no de 2, which represents the worker thread ob ject.

metho d. The analysis then matches outside edges from one

Each inside no de corresp onds to an ob ject creation site and

graph to corresp onding inside edges from the other graph,

represents all ob jects created at that site. In the example, we

using the match to extend the mapping from outside no des

lab el inside no des with the line numb er of the corresp onding

to inside no des. When the mapping is complete, it is used

ob ject creation site in Figure 1.

to combine the graph from the run metho d into the graph

We next discuss how the analysis combines this p oints-to

from the sum metho d. Figure 4 presents the combined graph.

graph with the p oints-to graph from the init metho d to de-

In addition to the p oints-to information, the combination

rive the p oints-to graph after the call to init. The algorithm

algorithm also translates the synchronization actions from

uses the corresp ondence b etween the formal and actual pa-

the worker thread into the new parallel interaction graph. It

rameters to construct a mapping from the outside no des of

tags each action with the thread that p erforms the action.

the init metho d to the no des of the sum metho d. This

So, for example, the action hsync ; 0; 2 i indicates that no de 2

mapping is then used to translate the inside edges from the

the worker thread's no de may p erform a synchronization

init metho d into the p oints-to graph from the sum metho d.

action on an ob ject represented bynode 0.

Figure 2 presents the result of this mapping, which yields

The run metho d also invokes the work.elements metho d,

the p oints-to graph after the call to init.

which creates an enumerator ob ject to enumerate through

elements of the vector. The enumerator ob ject is represented

by an inside no de. Note that b ecause the enumeration ob-

2.2 Interaction Between Parallel Threads

ject is captured in the run metho d, it cannot a ect the com-

We next discuss how the analysis computes the interaction

putation outside this metho d. Therefore, the algorithm do es

between the two threads in the example. Figure 3 presents

not transfer its inside no de into the combined graph.

the parallel interaction graph from the end of the worker

thread's run metho d. This metho d loads the work and dest

2.3 Information in the Combined Graph

references from the receiver ob ject. Because these references

were created outside the run metho d, the analysis uses an

The compiler can extract the following information from the

outside edge to represent each reference. Each of these out-

combined graph. First, the work vector no de is captured in

side edges p oints to sp eci c kind of outside no de called a load

this graph. Even though it is accessed bymultiple threads,

node. In general, there is one load no de for each statement

it is not reachable from outside the total computation of

in the program that loads a reference from an escap ed ob-

the sum metho d. The combined graph therefore completely

ject; that load no de represents all of the ob jects to which the

characterizes the p oints-to information and actions involv-

reference may p oint. In Figure 3, wehave lab eled the load

ing the work vector ob ject. Second, b oth the thread exe-

no des with the numb er of the corresp onding load statement

cuting the sum metho d and the worker thread synchronize from Figure 1.

on the work vector ob ject. But the ordering information The analysis represents the control ow relationships b e-

indicates that none of these synchronizations can o ccur con- tween statements as follows: predst  is the set of state-

currently. The synchronization actions have no e ect on the ments that may execute immediately b efore st, and succst 

2

computation and can therefore b e removed. Finally, b oth is the set of statements that may execute immediately after

thread synchronize on the destination accumulator ob ject. st . There are two program p oints for each statement st,

But in this case, the synchronization actions from the sum the program p oint st immediately b efore st executes, and

thread may execute in parallel with the synchronization ac- the program p oint st  immediately after st executes. The

tions from the worker thread. The compiler will not b e able control ow graph for each metho d op starts with an enter

to remove the synchronization from the destination accumu- statement enter and ends with an exit statement exit .

op op

lator ob ject. The interpro cedural analysis uses call graph information

to compute sets of metho ds that maybeinvoked at metho d

invo cation sites. For each metho d invo cation site m 2 M ,

3 Analysis Abstractions

callees m is the set of metho ds that m mayinvoke. Given a

metho d op, callers op is the set of metho d invo cation sites

In this section we formally present the basic abstractions

that mayinvoke op. The current implementation obtains

that the analysis uses: the program and ob ject represen-

this call graph information using a variant of class hierarchy

tations, p oints-to escap e graphs, and parallel interaction

analysis [17 ], but the algorithm can use any conservative

graphs.

approximation to the actual call graph generated when the

program runs.

3.1 Program Representation

The algorithm represents the program using the following

3.2 Object Representation

analysis ob jects. There is a set l 2 L of lo cal variables and

The analysis represents the ob jects that the program ma-

a set p 2 P of formal parameter variables. There is one for-

nipulates using a set n 2 N of no des. There are several

mal parameter variable for each formal parameter of each

kinds of no des:

metho d in the program. There is also a set cl 2 CL of class

names. The analysis mo dels static class variables using a

 There is a set N of inside no des. Inside no des rep-

I

one-of-a-kind no de for each class; the elds of this no de are

resent inside objects, which are ob jects created within

the static class variables for the corresp onding class. The

the current analysis scop e and accessed via references

analysis therefore treats the class name cl as a read-only

created within the current analysis scop e. This set

variable that p oints to the corresp onding one-of-a-kind no de

consists of two subsets:

that contains the class's static class variables. Together, the

{ No des in N represent ob jects created by the cur-

I

lo cal, formal parameter, and class name variables makeup

rent thread. There is one no de in N for each ob-

I

the set v 2 V = L [ P [ CL of variables. There is also a set

ject creation site; that no de represents all ob jects

f 2 F of ob ject elds and a set op 2 OP of metho ds. Ob-

created at that site by the current thread.

ject elds are accessed using syntax of the form v:f . Static

class variables are accessed using syntax of the form cl:f .

N represent ob jects created by threads { No des in

I

Each metho d has a receiver class cl and a formal parameter

running in parallel with the current thread. There

list p ;:::; p . We adopt the convention that parameter p

N for each ob ject creation site; is one no de in

I

0 k 0

p oints to the receiver ob ject of the metho d.

that no de represents all ob jects created at that

The algorithm represents the computation of each metho d

site by threads running in parallel with the cur-

using a control ow graph. The no des of these graphs are

rent thread.

statements st 2 ST . We assume the program has b een pre-

Two no des n 2 N and n 2 N are corresp onding

1 I 2 I

pro cessed so that all statements relevant to the analysis are

no des if they represent ob jects created at the same

in one of the following forms:

ob ject creation site.

 A copy statement l = v.

In Java, each thread corresp onds to an ob ject that im-

plements the Runnable interface. The set N N

T I

 A load statement l = l :f .

1 2

of runnable no des represents runnable ob jects. N

T

represents runnable ob jects created by the current N

I

 A store statement l :f = l .

1 2

thread, and N N represents runnable ob jects cre-

T I

 A monitor acquire statement acquire l.

ated by threads running in parallel with the current

thread. N = N [ N .

T T T

 A monitor release statement release l.

 There is a set N of outside no des. Outside no des rep-

O

 A return statement return l , which identi es the re-

resent outside objects, which are ob jects created out-

turn value l of the metho d.

side the current analysis scop e or accessed via refer-

ences created outside the current analysis scop e. This

 An ob ject creation site of the form l = new cl .

set consists of several subsets:

 A metho d invo cation site m 2 M of the form

{ There is a set N of load no des. When a load

L

l = l :opl ;::: ;l .

0 1

k

statement executes, it loads a value from a eld

 A thread start site of the form l:start .

in an ob ject. If the loaded value is a reference, the

analysis must represent the ob ject that the refer-

2

To satisfy the Java memory mo del, the compiler mayhaveto

ence p oints to. Each load no de represents outside

leave memory barriers b ehind at the old synchronization p oints.

ob jects whose references are loaded by the corre-

sp onding load statement. There are two kinds of load no des:

 I N  F  N  [ V  N  is a set of inside edges. In-  N contains one no de for each load statement

L

side edges represent references created inside the cur- in the program. That no de represents outside

rent analysis scop e. ob jects whose references are loaded at that

statementby the current thread.

N [N [N [M

P C T

 e : N ! 2 is an escap e function that

 N contains one no de for each load statement

L

records the escap e information for each no de. Anode

in the program. That no de represents outside

escap es if it is reachable from a parameter no de n 2

P

ob jects whose references are loaded at that

N , a static class variable represented by a eld of a

P

statementby threads running in parallel with

class no de n 2 N , a thread no de n 2 N run-

C C T T

the current thread.

ning in parallel with the current thread, or an ob ject

Two no des n 2 N and n 2 N are corresp ond-

1 L 2 L

passed as a parameter to or returned from an unana-

ing no des if they represent outside ob jects whose

lyzed metho d invo cation site.

references are loaded at the same load statement.

 r N is a return set that represents the set of ob-

{ There is a set of return no des N . When the al-

R

jects that may b e returned by the currently analyzed

gorithm skips the analysis of a metho d invo cation

metho d. All no des in the return set escap e to the caller

site, it uses a return no de to represent the return

of the analyzed metho d.

value of the metho d invoked at that site. There

are two kinds of return no des:

Both O and I are graphs with edges lab eled with a eld

from F . We de ne the following op erations on no des of the

contains one no de for each skipp ed metho d  N

R

graphs:

invo cation site in the program. That no de

represents ob jects returned by metho ds in-

0

edgesToI; n = fhv;ni2I g[fhhn ; f i;ni2 I g

voked at that site by the current thread.

edgesFromI; v = fhv;ni2I g

N contains one no de for each skipp ed metho d 

0

R

edgesFromI; n = fhhn; fi;n i2 I g

invo cation site in the program. That no de

edges I; v = edgesFromI; v

represents ob jects returned by metho ds in-

edgesI; n = edgesToI; n [ edgesFromI; n

voked at that site by threads running in par-

I v = fn:h v;ni2 I g

allel with the current thread.

0 0

I n; f = fn :hhn; fi;n i2 I g

Two no des n 2 N and n 2 N are corresp ond-

1 R 2 R

For eachnode n, the escap e function en and the return

ing no des if they represent ob jects returned at the

set r together record all of the di erentways the no de and

same skipp ed metho d invo cation site.

the ob jects that it represents may escap e from the current

{ N : There is one parameter no de n 2 N for each

P P

analysis scop e. Here are the p ossibilities:

formal parameter in the program. Each param-

eter no de represents the ob ject that its parame-

 If a parameter no de n 2 en, then n represents an

p

ter p oints to during the execution of the analyzed

ob ject that may b e reachable from p.

metho d. The receiver ob ject is treated as the rst

 If a class no de n 2 en, then n represents an ob-

parameter of each metho d. Given a parameter p ,

cl

ject that may b e reachable from one of the static class

the corresp onding parameter no de is n . There

p

variables of the class cl .

is always an inside edge from p to n .

p

{ N : There is one class no de n 2 N for each class

C C

 If a thread no de n 2 en, then n represents an ob ject

T

in the program. The elds of this no de represent

that may b e reachable from a runnable ob ject repre-

the static class variables of its class. Given a class

sented by n .

T

cl, the corresp onding class no de is n . There is

cl

 If a metho d invo cation site m 2 en, then n represents

always an inside edge from cl to n .

cl

an ob ject that may b e reachable from the parameters

The set N = N [N [N , and the set N = N [N [N .

I L R I L R

or the return value of an unanalyzed metho d invoked

Given a no de n 2 N , n represents the corresp onding no de

at m.

N . Givenanoden 2 N , n represents the corresp onding in

 If n 2 r , then n represents an ob ject that may be

. no de in N

returned to the caller of the analyzed metho d.

The analysis represents each array with a single no de.

This no de has a eld elements , which represents all of the

The escap e information must satisfy the escap e informa-

elements of the array. Because the p oints-to information for

tion propagation invariant that if n p oints to n , then n

1 2 2

all of the array elements is merged into this eld, the analysis

escap es in at least all of the ways that n escap es. We for-

1

do es not make a distinction b etween di erent elements of the

malize this invariant with the following inference rule, which

same array.

states that if there is an edge from n to n , en  en .

1 2 1 2

When the analysis adds an edge to the p oints-to escap e

3.3 Points-To Escap e Graphs

graph, it may need to up date the escap e information.

A p oints-to escap e graph is a quadruple of the form hO ; I ; e; r i,

hhn ; fi;n i2 O [ I

1 2

where

en  en 

1 2

 O N  F  N is a set of outside edges. Outside

L

We say that a no de n violates the propagation con-

1

edges represent references created outside the current

straint if there is an edge from n to n and en  6 en .

1 2 1 2

analysis scop e, either by the caller, by a thread run-

During the analysis of a metho d, the algorithm may add

ning in parallel with the current thread, or by an un-

edges to the p oints-to escap e graph. These edges may make

analyzed invoked metho d.

the up dated no des the no des that the new edges p oint from

5 Parallel Interaction Graphs temp orarily violate the propagation constraint. Whenever it

adds a new edge, the analysis uses the propagate hO ; I ; e; r i;S

The algorithm uses a data ow analysis to generate, at each

algorithm in Figure 5 to propagate escap e information from

program p oint in the metho d, a parallel interaction graph

the no des in S to restore the invariant and pro duce a new

0

hG;  ; ;  i.

escap e function e that satis es the propagation constraint.

This algorithm takes a p oints-to escap e graph and a set S

 G is a p oints-to escap e graph that summarizes the

of no des that may violate the constraint, then uses a work-

p oints-to and escap e information for the current thread.

list approach to propagate the escap e information from the

up dated no des to the other no des in the graph.

 The parallel thread map  : N !f0; 1; 2g counts the

T

numb er of instances of each thread that may execute

in parallel with the current thread at the current pro-

propagate hO ; I ; e; r i;S

gram p oint. If  n = 1, then at most one instance of

Initialize worklist and new escape function

n may execute in parallel with the current thread at

0

e = e

the current program p oint; if  n = 2, then multiple

W = S

instances of n may execute in parallel with the current

while W 6= ;do

thread at the current program p oint. We de ne the

Remove a node from worklist

following op erations:

W = W fn g

1

{ x  y = min2;x + y 

Propagate escape information to al l nodes



that n points to

1

2 if x  2

for all hhn ; fi;n i2O [ I do

{ x y =

1 2

max0;x y  otherwise

Restoreconstraint for n

2

0 0 0

n =e n  [ e n  e

2 2 1

 The action set A records the set of actions exe-

0

if e n changed then

2

cuted by the analyzed computation.

W = W [fn g

2

0

 The parallel action relation  A  N records or- returne 

T

dering information b etween the actions of the current

thread and threads that execute in parallel with the

Figure 5: Escap e Information Propagation Algorithm

current thread. Sp eci cally, ha; n i2 if a mayhave

T

happ ened after at least one of the thread ob jects rep-

resented by n started executing. In this case, the

Given our abstraction of p oints-to escap e graphs, we can

T

actions of a thread ob ject represented by n may af-

de ne the concepts of escap ed and captured no des as follows:

T

fect a.

 escap ed hO ; I ; e; r i;nifen 6= ; or n 2 r , and

removehG;  ; ;  i;S denotes the parallel interaction graph

 capturedhO ; I ; e; r i;nifen=; and n 62 r .

obtained by removing all of the no des in S from hG;  ; ;  i.

The analysis uses parallel interaction graphs to compute

the interactions b etween parallel threads. During the anal-

4 Actions

ysis, one of the threads is the current thread; conceptually,

no des from the other threads moveinto the context of the

The algorithm is designed to record various actions that the

current thread. In the analyis context of the current thread,

program p erforms on ob jects. Each action consists of an

all of the no des from the other threads come from outside

action lab el that identi es the kind of action p erformed, a

the current thread. The analysis mo dels this by replacing

no de that represents the ob ject on which the action was

each no de from an other thread with its corresp onding ver-

p erformed, and an optional thread no de that represents the

sion from outside the current thread. Given a parallel inter-

thread that p erformed the action. For the purp oses of this

action graph hhO ; I ; e; r i;; ;i, hh O; I; e; r i; ; ;  i is the

pap er, the set of action lab els is b 2 B = fld ; sync g. The

parallel interaction graph with no des replaced with the cor-

set of actions is a 2 A =B  N  [ B  N  N . Here is

T

resp onding no des from outside the current thread, de ned

the meaning of the actions:

as follows:

 hsync ;ni records a synchronization action either a



n if n 2 N

monitor acquire or release p erformed by the current

n =

n otherwise

thread on an ob ject represented by n.

 S  = fn:n 2 S g

 hsync ;n;n i records a synchronization action p erformed

T

O = fhhn ; fi;n i:hhn ; fi;n i2O g

1 2 1 2

by a thread represented by n .

T

I = fhhn ; fi;n i:hhn ; fi;n i2I g[

1 2 1 2

fhv;ni:hv;ni2I g

 hld;ni records a load by the current thread on an es-



cap ed no de. The result of the load is a reference to an

 en [  en if n 2 N

ob ject represented by the load no de n.

en =

; if n 2 N

 en otherwise

 hld;n;n i records a load p erformed by the thread n .

T T

r = fn:n 2 r g



 if n 2 N  n   n

It is straightforward to augment the set of action lab els and

 n =

0 otherwise

the analysis to record arbitrary actions such as reading and



writing ob jects or invoking a given metho d on an ob ject.

hb ;ni if a = hb;ni

 a =

A

It is also straightforward to generalize the set of actions to

hb ;n;n i if a = hb;n;n i

T T

include actions p erformed on multiple ob jects.

= f a:a 2 g

A

 = fh a;n i:h a; n i2 g A

statement, a graphical representation of existing edges, a The following op eration removes a set of no des S from a

graphical representation of the existing edges plus the new parallel interaction graph hhO ; I ; e; r i;; ;i.

edges that the statement generates, and a set of side condi-

0 0 0 0 0 0 0

hhO ;I ;e ;r i; ; ; i = removehhO ; I ; e; r i;; ;i;S

tions. The interpretation of eachrow is that whenever the

p oints-to escap e graph contains the existing edges and the

where

side conditions are satis ed, the transfer function for the

statement generates the new edges. Wewould like to p oint

0

S = N S 

out several asp ects of the intrapro cedural analysis:

0 0 0

O = O \ S  F   S 

0 0 0

I = I \ S  F   S 

 start Statements: At each start statement of the

0 0

e n = en \ S [ M 

form l.start, l may p oint to several thread no des.

0 0

r = r \ S

The analysis adds all of these no des to the new parallel



0

thread map.

 n if n 2 S

0

 n =

0 otherwise

 Synchronization: The transfer function for synchro-

0 0 0 0

= \ B  S  [ B  S  S 

nization statements adds a synchronization action to

0 0 0 0 0

 =  \ B  S  [ B  S  S   S 

the new parallel interaction graph. This synchroniza-

tion action is recorded as executing in parallel with all

6 Intrapro cedural Analysis

of the thread no des in the current parallel thread map.

This ordering information is used later in the analysis

The analysis of each metho d op starts with the initial par-

to help determine if synchronization actions on a given

allel interaction graph hhO ;I ;e ;r i; ; ; i, de ned as

0 0 0 0 0 0 0

no de are indep endent.

follows:

 Outside Edges: A load statementmay add an out-

side edge to the current parallel interaction graph. In

 In I , each formal parameter p oints to its corresp ond-

0

this case, the transfer function also records the fact

ing parameter no de and each class p oints to its corre-

that the outside edge is created in parallel with all

sp onding class no de.

of the thread no des in the parallel thread map. This

I = fhp;n i:p 2 Pg[fhcl;n i:cl 2 CLg

ordering information is used during the thread interac-

0 p

cl

tion algorithm to ensure that these outside edges are

not matched with inside edges from threads whose ex-

 The initial set of outside edges is empty: O = ;

0

ecution starts after the execution of the load statement

 The initial escap e function e is set up so that each

that generated the outside edge.

0

parameter or class no de is marked as escaping via it-

We next present the data ow analysis framework from

self.



the intrapro cedural analysis. This framework includes the

fn g if n 2 N [ N

P C

e n=

0

transfer functions for the basic statements and the de nition

; otherwise

of the con uence op erator at merge p oints in the control- ow

graph.

 The initial return set and action set are empty: r = ;,

0

and = ;.

0

6.1 Copy Statements

 Initially there are no threads running in parallel with

A copy statement of the form l = v makes l p oint to the

the current thread. 8n 2 N : n =0.

T T 0 T

ob ject that v p oints to. The transfer function up dates I to

 The initial parallel action relation is empty:  = ;.

re ect this change by killing the current set of edges from l,

0

then generating additional inside edges from l to all of the

The algorithm analyzes the metho d under the assump-

no des that v p oints to.

tion that the parameters and static class variables all p oint

Kill = edges I; l to di erent ob jects. If the metho d maybeinvoked in a call-

I

Gen = fl gI v ing context in which some of these p ointers p oint to the same

I

0

ob ject, this ob ject will b e represented bymultiple no des dur-

I = I Kill  [ Gen

I I

ing the analysis of the metho d. In this case, the analysis

describ ed b elow in Section 9 will merge the corresp onding

6.2 Load Statements

outside ob jects when it combines the nal analysis result for

A load statement of the form l = l :f makes l p ointtothe

1 2 1

the metho d into the calling context at the metho d invo ca-

ob ject that l :f p oints to. The analysis mo dels this change

2

tion site. Because the combination algorithm retains all of

by constructing a set S of no des that represent all of the

the edges from the merged ob jects, it conservatively mo dels

ob jects to which l :f may p oint, then generating additional

2

the actual e ect of the metho d.

inside edges from l to every no de in this set.

1

The intrapro cedural analysis is a data ow analysis that

All no des accessible via inside edges from l :f should

2

propagates parallel interaction graphs through the state-

clearly b e in S . But if l p oints to an escap ed no de, other

2

ments of the metho d's control ow graph. The transfer func-

0 0 0 0 0 0 0

parts of the program such as the caller or threads executing

tion hhO ;I ;e ;r i; ; ; i = [[ st ]] hhO ; I ; e; r i; ; ;i de-

in parallel with the current thread can access the referenced

nes the e ect of each statement st on the current parallel

ob ject and store values in its elds. In particular, the value

interaction graph. Most of the statements rst kill a set

in l :f may have b een written by the caller or a thread

2

of inside edges, then generate additional inside and outside

running in parallel with the current thread | in other words,

edges. Figure 6 graphically presents the rules that deter-

l :f may contain a reference created outside of the current

2

mine the sets of generated edges for the di erent kinds of

analysis scop e. The analysis uses an outside edge to mo del

statements. Eachrow in this gure contains four items: a

Figure 6: Generated Edges for Basic Statements

this reference. The outside edge p oints to the load no de for 6.5 Object Creation Sites

the load statement, which is the outside no de that represents

An ob ject creation site of the form l = new cl allo cates a

the ob jects that the reference may p oint to.

new ob ject and makes l p oint to the ob ject. The analysis

The analysis must therefore consider two cases: the case

represents all ob jects allo cated at a sp eci c creation site

when l do es not p oint to an escap ed no de, and the case

2

with the creation site's inside no de n. The transfer function

when l do es p oint to an escap ed no de. The algorithm

2

mo dels the e ect of the statementby killing all edges from

determines which case applies by computing S , the set of

E

l, then generating an inside edge from l to n.

escap ed no des to which l p oints. S is the set of no des

2 I

accessible via inside edges from l :f .

2

Kill = edges I; l

I

Gen = fhl;nig

S = fn 2 I l :escap ed hO ; I ; e; r i;n g

I

E 2 2 2

0

I = I Kill  [ Gen

S = [fI n ; f:n 2 I l g

I I

I 2 2 2

If S = ; i.e., l do es not p oint to an escap ed no de,

E 2

6.6 Return Statements

S = S and the transfer function simply kills all edges from

I

l , then generates inside edges from l to all of the no des

1 1

A return statement return l sp eci es the return value for

in S .

the metho d. The immediate successor of each return state-

Kill = edges I; l 

I 1

ment is the exit statement of the metho d. The analysis

Gen = fl gS

I 1

mo dels the e ect of the return statementby up dating r to

0

I = I Kill  [ Gen

I I

include all of the no des that l p oints to.

If S 6= ; i.e., l p oints to at least one escap ed no de,

E 2

0

r = I l

S = S [fn g, where n is the load no de for the load

I L L

statement. In addition to killing all edges from l , then

1

generating inside edges from l to all of the no des in S , the

1

6.7 Control-Flow Join Points

transfer function also generates outside edges from the es-

To analyze a statement, the algorithm rst computes the

cap ed no des to n and propagates the escap e information

L

join of the parallel interaction graphs owing into the state-

from the escap ed no des through n . It also generates a

L

ment from all of its predecessors. It then applies the transfer

load action hld ;ni and up dates the parallel action relation

function to obtain a new parallel interaction graph at the

to record the fact that the action may execute in parallel

p oint after the statement. The join op eration t is de ned

with thread ob jects represented by the current set of paral-

as follows.

lel thread no des. In this case, the new outside edges may

represent references created by these parallel thread ob jects.

hhO ; I ; e; r i;; ;i =

Kill = edges I; l 

hhO ;I ;e ;r i; ; ; it hhO ;I ;e ;r i; ; ; i

I 1

1 1 1 1 1 1 1 2 2 2 2 2 2 2

Gen = fl gS

I 1

0

where O = O [ O , I = I [ I , 8n 2 N:en=e n [ e n,

I = I Kill  [ Gen

1 2 1 2 1 2

I I

r = r [ r , 8n 2 N: n = max n; n, = [ ,

Gen = S ffg fn g

1 2 1 2 1 2

O E L

0

and  =  [  .

O = O [ Gen

1 2

O

0 0 0

The corresp onding partial order v is

e = propagate hO ;I ;e;ri;S 

E

0

= [fhld ;n ig

L

0

hhO ;I ;e ;r i; ; ; iv hhO ;I ;e ;r i; ; ; i

1 1 1 1 1 1 1 2 2 2 2 2 2 2

 =  [ fh ld;n igfn : n  > 0g

L T T

if O O , I I , 8n 2 N:e n e n, r r ,

1 2 1 2 1 2 1 2

6.3 Store Statements

8n 2 N: n   n, , and   . Bottom is

1 2 1 2 1 2

hh;; ;;e ; ;i; ; ;; ;i, where 8n 2 N:e n = ; and 8n 2

A store statement of the form l :f = l nds the ob ject

? ? ?

1 2

N: n=0.

to which l p oints, then makes the f eld of this ob ject

?

1

p oint to same ob ject as l . The analysis mo dels the e ect

2

of this assignmentby nding the set of no des that l p oints

1

6.8 Analysis Results

to, then generating inside edges from all of these no des to

The analysis of each metho d pro duces analysis results st 

the no des that l p oints to. It also propagates the escap e

2

and st b efore and after each statement st in the metho d's

information from all of the no des that l p oints to through

1

control ow graph. The analysis result satis es the fol-

the no des that l p oints to.

2

lowing equations:

Gen = I l  ff g  I l 

I 1 2

0

I = I [ Gen

I

enter  = hhO ;I ;e ;r i; ; ; i

op 0 0 0 0 0 0 0

0 0 0

0 0

e = propagate hO ;I ;e;ri;Il 

1

st  = tf st :st 2 predstg

st  = [[ st ]] st 

6.4 Acquire and Release Statements

The nal analysis result of metho d op is the analysis result

An acquire statement of the form acquire l nds the ob-

at the program p oint after the exit no de, i.e., exit .

op

ject to which l p oints, then acquires that ob ject's lo ck. A

As describ ed b elow in Section 9.4, the analysis solves these

release statement of the form release l nds the ob ject to

equations using a standard worklist algorithm.

which l p oints, then releases that ob ject's lo ck. The analy-

sis mo dels the e ect of these statements by nding the set

7 Matching Inside and Outside Edges

of no des that l p oints to, then recording synchronization

actions on all of these no des.

In the interpro cedural analysis, outside edges in the callee's

0

= [ fsync gI l

parallel interaction graph represent inside edges in the caller's

0

 =  [ fsync gI l fn : n  > 0g

T T

parallel interaction graph. To compute the e ect of a metho d

call, the analysis matches the callee's outside edges against 8 Combining Points-To Escap e Graphs

the corresp onding inside edges from the caller. In the inter-

Once the matching algorithm has set up the mapping b e-

thread analysis, outside edges in the parallel interaction

tween inside and outside no des in the two graphs, the combi-

graphs of each thread represent inside edges in the parallel

nation algorithm uses the mapping to generate a new p oints-

interaction graph of the other thread. To compute the inter-

to escap e graph that re ects the nal p oints-to and escap e

actions, the analysis matches outside edges from each thread

relationships generated by the interaction.

against inside edges from the other thread. The matching

The combination algorithm takes two p oints-to escap e

pro cess is conceptually similar in b oth cases. This section

graphs hO ;I ;e ;r i i 2f1; 2g and two initial mappings

i i i i

discusses the matching algorithm we use for b oth the inter-

 : N ! N . It pro duces the nal p oints-to escap e graph

i

pro cedural and the inter-thread analyses.

0 0 0 0

hO ;I ;e ;r i. The combination algorithm p erforms two ba-

The matching algorithm takes two p oints-to escap e graphs

sic tasks: it traces out reachable edges and no des from the

hO ;I ;e ;r i i 2f1; 2g and two initial mappings  : N !

i i i i i

0

two input p oints-to escap e graphs so that they are present

N . It pro duces two new mappings  : N ! N that extend

i

in the nal graph, and it uses the mappings b etween out-

the initial mappings to map the outside no des in each graph

side and inside no des to translate inside edges into the nal

to the corresp onding no des in the other graph.

graph.

0 0

h ; i = matchhO ;I ;e ;r i; hO ;I ;e ;r i; ; 

1 1 1 1 2 2 2 2 1 2

1 2

0 0 0 0 0 0

hhO ;I ;e ;r i; ; i =

1 2

We formulate the mapping using set inclusion constraints [1 ].

combinehO ;I ;e ;r i; hO ;I ;e ;r i; ; 

1 1 1 1 2 2 2 2 1 2

This formulation enables us to present a compact, simple

As for the mapping algorithm, we sp ecify the combina-

sp eci cation of the mapping result using a set of constraint

tion result using a system of set inclusion constraints. Fig-

rules. Figure 7 presents the constraints that the matching

ure 8 presents the constraints that sp ecify the result of the

algorithm must satisfy. Note that these contraints use the

combination algorithm. These constraints extend the initial

i to represent the complement of i; i.e. 1 = 2, notation

0

mappings to two new mappings  : N ! N . These new

2=1. The constraints basically sp ecify that if an outside

i

0

mappings have the prop erty that n 2  nif n is reachable

edge from one graph matches an inside edge from the other

i

in the nal p oints-to escap e graph and should be present

graph, then the mappings must map the outside edge's no de

in that graph. These constraints start with a set of no des

to the inside edge's no de. This no de mapping p otentially

that will b e mapp ed into the combined graph. They then

enables more edge matchings.

trace out the reachable no des to determine the complete

Figure 9 presents the algorithm that solves these con-

set of no des that should b e present in the combined graph.

straints. It op erates by rep eatedly nding a no de in one

Figures 10 and 11 present an algorithm for solving the con-

graph that is already mapp ed to a no de in the other graph.

straint system in Figure 8. We highlight several prop erties

It then checks if there is an inside edge from one of these

of this algorithm:

no des that it can match up with a corresp onding outside

edge from the other no de. If so, it up dates one of the map-

 Base No des: The analysis starts with a set of base

pings to re ect the fact that the no de that the outside edge

no des mapp ed into the combined graph. In the case of

p oints to is mapp ed to the no de that the inside edge p oints

caller/callee interaction, all of the no des from the caller

to.

are mapp ed directly into the combined graph. The

class no des from the callee are also mapp ed directly,

while the parameter no des are mapp ed indirectly to

matchhO ;I ;e ;r i; hO ;I ;e ;r i; ; 

1 1 1 1 2 2 2 2 1 2

mo del the semantics of metho d invo cation.

Initialize worklists and results

for i =1; 2do

0

 Inside No des: An inside no de from one of the parallel

 = 

i

i

interaction graphs is present in the combined graph if

W = fhn ;n i:n 2  n g

i 1 3 3 i 1

it is reachable from the base no des.

D = ;

i

while cho ose hn ;n i2W do

1 3 i

 Inside Edges: If two no des are mapp ed into the com-

Remove a pair from worklist

bined graph, the analysis uses the mapping to translate

W = W fhn ;n ig

i i 1 3

insided edges b etween the two no des into the graph.

D = D [fhn ;n ig

i i 1 3

Check outside edges for rule 2

 Load No des: A load no de is mapp ed into the com-

for all hhn ; f i;n i2O do

bined graph if it is reachable from an escap ed base

1 2 i

for all hhn ; fi;n i2I do

no de.

3 4

i

0 0

n  [fn g n = 

2 4 2

i i

 Outside Edges: Outside edges are translated into

if hn ;n i62D then

2 4 i

the combined graph if the no de that they come from

W = W [fhn ;n ig

i i 2 4

is mapp ed into the graph and at least one of the no des

Check inside edges for rule 3

that it maps to is escap ed in the graph.

for all hhn ; f i;n i2I do

1 2 i

for all hhn ; fi;n i2O do

3 4

i

0 0

n = n  [fn g 

4 4 2

i i

8.0.1 Constraint Solution Algorithm

if hn ;n i62D then

4 2

i

= W [hn ;n i W

4 2

We next discuss the constraint solution algorithm in Fig-

i i

0 0

return h ; i

ures 10 and 11 for the constraint system in Figure 8. The al-

1 2

gorithm directly re ects the structure of the inference rules.

At each step it detects an inference rule antecedent that b e-

Figure 9: Algorithm for Matching Inside and Outside No des

comes true, then takes action to ensure that the consequent is also true.

Figure 7: Constraints for Matching Inside and Outside Edges

Figure 8: Constraints for Combining Points-to Escap e Graphs

The mapNo den ;n;i pro cedure in Figure 10 is invoked

1

whenever the algorithm maps a no de n from p oints-to es-

1

cap e graph i toanoden in the new graph. It rst matches

inside edges involving n in p oints-to escap e graph i to inside

1

edges involving n in the new graph. The pro cedure checks

mapNo den ;n;i

1

any edges to n that have already b een previously translated

1

if hn ;n;ii62 D then

1

into the new graph to see if they should also b e translated to

D = D [fhn ;n;iig

1

p ointton in the new graph. It also checks all of the inside

0 0

 n = n  [fng

1 1

i i

edges from n to see if they should b e translated into the

1

Add delayed inside edges for rule 7

new graph. The pro cedure then checks outside edges from

0 0

I = I [  n  fng

i 1

n to see if they should b e translated into the new graph.

1

0 0

0

S = fn 2 N:h n ; fi2 n g

1

Finally, the pro cedure up dates the new escap e function e

Check inside edges for rules 5 and 7

to re ect the e ect of the newly mapp ed no des and newly

for all hhn ; fi;n i2 I

1 2 i

translated edges.

Add inside edges for rule 7

Figure 11 presents the driver for the constraint solution

0 0 0

I

I = I [fhn; fig   n 

2

i

algorithm. It maintains a worklist W of inside no des from

i

S = S [fng

graph i that should be mapp ed into the new graph, and

O

 n = n  [fhn; f ig

i 2 i 2

a worklist W of outside edges that should be translated

i

Check conditions for rule 5

into the new graph if the edge's source no de is escap ed in

if n 2 N [ N

2 I R

the new graph. When the algorithm pro cesses a no de from

I I

I

W = W [fn g

2

W , it calls mapNo de to map that no de into the new graph.

i i

i

O

Check outside edges for rule 6

When the algorithm pro cesses an outside edge from W ,it

i

for all hhn ; fi;n i2 O

1 2 i

translates the edge into the new graph and maps the edge's

O O

W = W [fhhn; fi;n ig

target no de into the new graph.

2

i i

Update escape information for rules 8 and 9

There is a slight complication in the algorithm. As the al-

0 0

e n=e n [ e n 

gorithm executes, it p erio dically translates inside edges into

i 1

0 0 0 0 0

e = propagate hO ;I ;e ;r i;S

the new graph. Whenever a no de n from graph i is mapp ed

1

to n, the algorithm translates each inside edge hhn ; fi;n i

1 2

from graph i into the new graph. This translation pro cess in-

serts a corresp onding inside edge from n to each no de that

0

Figure 10: Algorithm for Mapping One No de to Another

n maps to; i.e., to each no de in  n . The algorithm

2 2

i

must ensure that when it completes, there is one such edge

0

combinehO ;I ;e ;r i; hO ;I ;e ;r i; ; 

1 1 1 1 2 2 2 2 1 2

n . But when the for each no de in the nal set of no des 

2

i

Initialize worklists and results

algorithm rst translates hhn ; fi;n i into the new graph,

1 2

0

D = ;

 n  may not be complete. In this case, the algorithm

2

i

for i =1; 2do

will eventually map n to more no des, increasing the set of

2

O I

0

i = h;; ;i ;W hW

n . There should b e edges from n to all of the no des in 

2 i i

i

0

for all n 2 N do  n=;

no des in the nal  n , not just to those no de that were

i

2

i

0

0

for all n 2 N do  n=;

presentin  n  when the algorithm mapp ed hhn ; fi;n i

2 1 2

i

i

0 0 0

hO ;I ;r i = h;; ;; ;i

into the new graph.

0

for all n 2 N do e n=;

The algorithm ensures that all of these edges are present

Cal l mapNo de for existing mappings

in the nal graph by building a set  n  of delayed actions.

i 2

for all hn ;n;ii such that n 2  n do

Each delayed action consists of a no de in the new graph and

1 i 1

mapNo den ;n;i

a eld in that no de. Whenever the no de n is mapp ed to a

1

2

0 0 0 0

done = false

new no de n i.e., the algorithm sets  n = n  [fn g,

2 2

i i

do

the algorithm establishes a new inside edge for each delayed

I

action. The new edge go es from the no de in the action to

if cho ose n 2 W then

i

0

I I

the newly mapp ed no de n . These edges ensure that the

W = W fng

i i

nal set of inside edges satis es the constraints.

mapNo den; n; i

O

such that else if cho ose hhn; fi;n i2 W

1

i

0 0 0

escap ed hO ;I ;e ; ;i;n then

9 Interpro cedural Analysis

I I

fhhn; fi;n ig = W W

1

i i

The interpro cedural analysis algorithm propagates parallel

Update outside edges for rule 6

0 0

interaction graphs from callees to callers. At thread start

O = O [ fhhn; fi;n ig

1

sites, the analysis adds the no des that represent the started

mapNo den ;n ;i

1 1

thread to the parallel thread map. It also marks the started

else done = true

thread no des as escaping via themselves and propagates the

while not done

0 0 0 0 0 0

escap e information. At each metho d invo cation site, the

return hhO ;I ;e ;r i; ; i

1 2

analysis has the option of either skipping the site or analyz-

ing the site. If it skips the site, it marks all of the parameters

and the return value as p ermanently escaping down into the

Figure 11: Algorithm for Combining Points-to Escap e

site. Graphs

 It uses the result mapping to combine the caller and 9.1 Thread Start Sites

callee graphs, generating the new parallel interaction

To simplify the presentation of the analysis, we assume that

graph. In addition to combining the p oints-to and es-

at each thread start site l:start , I l  N . This is al-

T

cap e information, the analysis must also translate the

most invariably the case in practice | most threads are

actions and parallel threads from the callee into the

started in the same metho d in which they are allo cated.

caller. It must also record the fact that all of the ac-

The algorithm adds the thread no des that l p oints to into

tions from the callee o ccur in parallel with all of the

the set of parallel threads. It also makes the escap e function

parallel threads from the caller.

for eachnode n 2 I l include n , and propagates the new

T T

The transfer function itself merges the combined results

escap e information.

from all p otentially invoked metho ds to derive the p oints-to



escap e graph at the p oint after the metho d invo cation site.

 n   1 if n 2 I l

T T

0

 n  =

T

 n  otherwise

T



en [fng if n 2 I l

mapUphhO ; I ; e; r i;; ;i;m;op 

e n =

T

en otherwise

Assume m of the form l = l :op l ; :::;l 

0 1

k

0

Assume op has formal parameters p ;:::;p

e = propagate hO; I ; e ;ri;Il

T

0 k

Extract the analysis result from the invoked method

hhO ;I ;e ;r i; ; ; i = exit 

R R R R R R R op

9.2 Skipp ed Metho d Invo cation Sites

Compute initial mappings



The transfer function for a skipp ed metho d invo cation site is

fng if n 2 N

C

 n =

C

de ned as follows. Given a skipp ed metho d invo cation site

; otherwise



m of the form l = l :op l ;:::;l  with return no de n

0 1 R

k

fng if n 2 N

C

and a current parallel interaction graph hhO ; I ; e; r i;; ;i,

m

I l  if n = n

 n =

i p

op

i

the parallel interaction graph

; otherwise

0 0 0 0 0 0 0

hhO ;I ;e ;r i; ; ; i = [[ m ]] hhO ; I ; e; r i; ; ;i after the

Match outside nodes from cal leetocal ler nodes

site is de ned as follows:

m

h ; i = matchh;;I;e;ri; hO ;I ;e ;r i; ; 

1 2 R R R R C

op

0

Compute mappings for combination

I = I edges I; l  [fhl;n ig

R



0

 n [fng if  n > 0orn 2 r or

1 R

O = O

0 0 0

9v 2 V :hv;ni2I

n =

e = propagate hO ;I ;e ;ri;S 

m m

0

 n otherwise

1

r = r



 n [fng if  n > 0or

2 R

n 2 r N [ N 

 n =

R P L

R

where S = [fI l :0  i  k g and

m i

 n otherwise

2



0

e n = e n P

R

R

en [fmg if n 2 S or n = n

m R

e n =

m

Combine points-to escapegraphs

en otherwise

0 0 0 0

hhO ;I ;e ;r i; ; i =

C C

1 2

0

combinehO ; I ; e; r i; hO ;I ;e ;r i;; 

R R R R

R

Recall that the return no de n is an outside no de used to

R

Generate the nal set of inside edges and return set

S

represent the return value of the invoked metho d.

0 0

n  I = I [ I edges l [ fl g

C

2

n2r

R

0

9.3 Analyzed Metho d Invo cation Sites

r = r

Compute new thread map

Given an analyzed metho d invo cation site m and a current

0

 n= n   n

R

parallel interaction graph hhO ; I ; e; r i;; ;i, the new par-

Compute action mapping from cal leetocal ler



allel interaction graph

0

0 0 0 0 0 0 0

fb g n if a = hb;ni

2

hhO ;I ;e ;r i; ; ; i = [[ m ]] hhO ; I ; e; r i; ; ;i after the

 a=

A

0

fb g n fn g if a = hb;n;n i

T T

2

site is de ned as follows:

Combine actions from cal ler and cal lee

S

0 0 0 0 0 0 0

0

hhO ;I ;e ;r i; ; ; i =

= [   a

A

tfmapUphhO ; I ; e; r i;; ;i;m;op :op 2 callees mg

a2

R

Compute new paral lel action relation

S

0

Figure 12 presents the combination algorithm, which p er-

 =  [   a fn g[

A T

forms the following steps:

ha;n i2

T R

S

  a fn : n  > 0g

A T T

 It retrieves the analysis result from the exit statement

a2

R

of the invoked metho d op.

Return the new paral lel interaction graph

0 0 0 0 0 0 0

return hhO ;I ;e ;r i; ; ; i

 It builds an initial mapping. This mapping maps the

parameter no des from the callee to the corresp onding

no des in the caller that represent the actual parame-

ters, and the class no des to themselves.

Figure 12: Callee/Caller Interaction Algorithm

 It uses the initial mapping to match outside edges

in the callee to the corresp onding inside edges in the

caller. The result is a mapping from the outside no des

9.4 Fixed-Point Analysis Algorithm

of the callee to the corresp onding no des in the caller.

Figure 13 presents the interpro cedural xed-p oint algorithm

that the compiler uses to generate the interpro cedural anal-

ysis results. It uses a worklist of p ending statements to solve e ect of the parallel threads on the p oints-to and escap e in-

the combined intrapro cedural and interpro cedural data ow formation at that p oint.

equations. At each step, it removes a statement from the

worklist and up dates the analysis results b efore and after

10.1 Thread Interaction

the statement. If the analysis result after the statement

We next discuss the interaction algorithm for parallel threads.

changed, it inserts all of its successors or for exit no des,

The interaction takes place b etween a starter thread a thread

all of the callers of its metho d into the worklist. As sp ec-

that starts a parallel thread and a startee thread the thread

i ed, the algorithm is therefore b oth intrapro cedural and

that is started. The interaction algorithm is given the par-

interpro cedural.

allel interaction graph hhO ; I ; e; r i;; ;i from the starter

thread, a no de n that represents the startee thread, and a

T

run metho d op with receiver ob ject p that runs when the

Initialize analysis results

0

thread ob ject represented by n starts.

for all st 2 ST do

T

st = st=hh;; ;;e ; ;i; ; ;; ;i

0

;

0 0 0 0 0 0 0

hhO ;I ;e ;r i; ; ; i = interact hhO ; I ; e; r i;; ;i;n ; op 

T

where e n=; for all n 2 N and

;

 n  = 0 for all n 2 N

0 T T T

Figure 14 presents the interaction algorithm. The algorithm

for all op 2 OP do

p erforms the following steps:

enter =hhO ;I ;e ;r i; ; ; i

op 0 0 0 0 0 0 0

Initialize the worklist

 It extracts the nal analysis result for the startee thread.

W = fenter :op 2 OPg

op

This is the analysis result after the exit no de of the

while W 6= ;do

startee thread's run metho d.

Remove a statement from worklist

W = W fst g

 It matches corresp onding inside and outside edges from

Process the statement

the two threads. All of the outside edges from the star-

0 0

st =tf st :st 2 predstg

tee thread participate in the matching. Outside edges

st =[[ st ]]  st 

from the starter participate only if they represent loads

if stchanged then

that mayhave o ccurred after the startee thread b egan

Put potential ly a ected statements

its execution. The algorithm uses the event ordering

onto worklist

information to determine which of these outside edges

W = W [ succst

should participate.

if st exit then

op

 It combines the two parallel interaction graphs to gen-

W = W [ callers op 

erate the nal parallel interaction graph. In addition

to combining the p oints-to and escap e information, the

analysis must also translate the actions and started

threads from the two graphs into the combined graphs.

Figure 13: Fixed-Point Analysis Algorithm

It must also up date the ordering information as fol-

lows:

The order in which the algorithm analyzes metho ds can

have a signi cant impact on the analysis. For non-recursive

{ Each thread has ordering information that sp ec-

metho ds, a b ottom-up analysis of the program yields the full

i es which of its actions may execute in parallel

result with one analysis p er metho d. For recursive metho ds,

with the threads that it starts. For b oth threads,

the analysis results must b e iteratively recomputed within

this ordering information is translated into the

each strongly connected comp onent of the call graph us-

new p oints-to escap e graph.

ing the current b est result until the analysis reaches a xed

{ All actions from the starter thread that o ccur in

p oint.

parallel with the startee thread also o ccur in par-

It is p ossible to extend the algorithm so that it initially

allel with all of the startee's threads.

skips the analysis of metho d invo cation sites. If the analysis

result is not precise enough, it can incrementally increase

{ All actions from the startee thread o ccur in par-

the precision by analyzing metho d invo cation sites that it

allel with all of the starter's threads.

originally skipp ed. The algorithm will then propagate the

Note that b ecause the parallel interaction graphs rep-

new, more precise result to up date the analysis results at

resent all p otential interactions b etween threads, the algo-

a ected program p oints.

rithm can compute the actual interactions with a single pass

over the two graphs.

10 Inter-thread Analysis

10.2 Resolution

The interpro cedural algorithm describ ed ab ove generates a

parallel interaction graph for every p oint in the program.

To generate the nal parallel interaction graph that summa-

This graph records all of the p oints-to relationships created

rizes all of the interactions, the algorithm resolves all of the

by the current thread and all of the p otential interactions

interactions between the parallel threads. The resolution

of that thread with other threads. The thread interaction

algorithm rep eatedly takes the current parallel interaction

algorithm resolves the p otential interactions by computing

graph, cho oses one of the startee threads, and computes

whichinteractions may actually o ccur b etween the current

the interactions b etween the current graph and the startee

thread and the parallel threads that it transitively starts.

thread to derive a new current graph. It continues this pro-

The basic idea is to rep eatedly match corresp onding inside

cess until it reaches a xed p oint. In the absence of lo op

and outside edges from parallel threads. The result is a sin-

or recursively generated concurrency, the algorithm reaches

gle parallel interaction graph that summarizes the combined

a xed p oint after pro cessing each thread once. The result

of the resolution algorithm resolvehG;  ; ;  imust satisfy

the following equation:

interact hhO ; I ; e; r i;; ;i;n ; op 

T

Assume p represents receiver of op

0

resolvehG;  ; ;  i=

F

Extract the analysis result for the paral lel thread

fresolve interact hG;  ; ;i;n ; op g

n T

T

hhO ;I ;e ;r i; ; ; i =  exit 

R R R R R R R op

hn ;op i2S

T

Compute outside edges from starter thread that

participate in mapping

where

O = fhhn ; fi;n i:hhld ;n i;n i2 or

n 1 2 2 T

T

hhld ;n ;ni;n i2  g

2 T

 S = fh n ; opi: n  > 0 and op 2 runn g

T T T

Compute initial mappings for the match





 n 1 if n = n

T

fng if n 2 N

C

  n=

n

 n =

C

T

 n otherwise

; otherwise



fng if n 2 N

C

To p erform the resolution, the analysis requires informa-

n

T

fn g if n = n

 n =

T p

op

0

tion ab out the corresp ondence between thread no des and

; otherwise

the run metho d that executes when the start metho d is in-

Match corresponding inside and outside edges

voked on an ob ject that the no de represents. Given a thread

n

T

O ; I ; e ; r i; ;  h ; i = matchhO ;I;e;ri; h

R R R R C 1 2 n

op T

no de n 2 N a no de that represents runnable ob jects,

T

Compute mappings for the combination

runn is the set of run metho ds that may execute when the



 n [fng if  n > 0orn 2 r or

1

start metho d is invoked on an ob ject represented by n. In

9v 2 V:h v;ni2 I

n =

the current compiler, runn is computed using the declared

 n otherwise

1

typ e of n. Figure 15 presents a xed-p oint algorithm that



 n > 0  n [fng if

computes resolvehG;  ; ;  i.

R 2

 n =

R

 n otherwise

2

0

n = e e n P

R R

resolvehG;  ; ;  i

Combine the two paral lel interaction graphs

0 0 0 0

0 0 0 0

hG ; ; ; i = hG;  ; ;  i

hhO ;I ;e ;r i; ; i =

C C

1 2

0

S = ;

combinehO ; I ; e; r i; h O ; I ; e r i;;  ;

R R R R R

S

0 0

while there exists n 2 N such that

T T

fvg n I = I [

C

1

0 0 0 0

 n  > 0 and hn ; ; ; i62S do

T T

hv;ni2I

0 0 0

S

S = S [fhn ; ; ; ig

0 0

T

r =  n

1

0 0 0 0

cho ose n such that  n  > 0 and hn ; ; ; i62S

T T T

n2r

F

0 0 0 0 0 0 0 0

hG ; ; ; i = interacthG ; ; ; i;n ; op

T

Compute action mappings for combination

n

T



op 2runn 

0

T

fb g n if a = hb;ni

A

1

0

a = 

if G changed then S = ;

0 0 0

1

fb g n fn g if a = hb;n;n i

1

T T

0 0 0 0



returnhG ; ; ; i

0

fb g n if a = hb;ni

A

2

a = 

0 0 0

2

fb g i g if a = hb;n;n n fn

2

T T

Compute new paral lel thread map



Figure 15: Fixed-Point Resolution Algorithm

 n 1 if n = n

T

 n =

n

T

 n otherwise  n 

R

Given a statement st, it is p ossible to compute a sin-

0

 n =  n   n

n R

T

gle parallel interaction graph hhO ; I ; e; r i;; ;i that com-

Compute combined action sets

S

pletely summarizes the p oints-to and escap e relationships

0 A

=   a[

1

after st as follows:

a2

S

A

fb g n fn g[ 

T

2

hhO ; I ; e; r i;; ;i = trimresolve st 

hb;n i2

R

S

A 0

fb g n fn g 

where

2

T

0

i2 hb;n ;n

R

T

 S = fn 2 N :en N = ;g

L T

Compute new paral lel action relation

S

0 A 0

 =   a fn g[

 trimhhO ; I ; e; r i; ; ;i=

1

T

0

0

ha;n i2

removehhO; I ; e ;ri;; ;i;S

T

S

A 0 0

:   a fn  > 0g[  n

R

0

1

T T

 e n=en N .

T

ha;n i2

T

S

A 0

The algorithm trims o any outside edges that come

 a fn g[ 

2

T

0

from no des that escap e only b ecause they are reachable from

 ha;n i2

R

T

S

A 0 0

thread no des. It also removes all thread no des from the es-

  a fn : n  > 0g

n

2

T T

T

cap e function. The resolved graph already contains all of

a2

R

the p ossible interactions that may a ect no des that escap e

Return the new paral lel interaction graph

0 0 0 0 0 0 0

only via other threads.

return hhO ;I ;e ;r i; ; ; i

For the program p oint at the exit of the main metho d,

the analysis may also trim o outside edges that come from

no des that escap e only b ecause they are reachable from

static class variables. The resolved graph at this program

Figure 14: Parallel Thread Interaction Algorithm

p oint contains all of the p ossible interactions that may a ect

these no des. In this graph, the only source of uncertaintly eachnoden, a map set ! n of no des that n is mapp ed to

comes from no des passed into or returned from unanalyzed during the analysis of the corresp onding metho d invo cation

metho d invo cation sites. site or metho d. These systems are sp eci ed using set inclu-

sion constraints of the forms ! n  ! n , which sp eci es

1 2

hhO ; I ; e; r i;; ;i = trimMainresolve exit 

that the map set for n includes the map set for n . We use

2 1

main

the notation ! n to indicate the solution of the constraint

m

where

system for n, and similarly ! n for the solution of

m op

for n. The initial map set consists of the set of con-

op 0

 S = fn 2 N :en N [ N =;g

L T C

straints fng ! n, which sp eci es that eachnodeisinits

map set. The constraint systems can b e solved by a simple

 trimhhO ; I ; e; r i;; ;i=

0

constraint propagation algorithm [24 ].

removehhO; I ; e ;ri; ; ;i;S

We de ne the constraint systems using the mappings

0

 e n=en N [ N .

generated by the algorithms in Figures 12 and 14. Sp eci -

T C

m 0 0

cally,  n= n, where  is the mapping computed

op 2 2

by the algorithm in Figure 12 for the metho d op invoked

11 Indep endence Testing

0 0

n, n [  at metho d invo cation site m, and  n=

op

2 1

0 0

where  and  are the mappings computed by the algo-

The indep endence testing algorithm nds ob jects that are

1 2

rithm in Fgiure 14 when applied to the parallel interaction

captured at the end of a metho d, using either the resolved

graph at the program p oint exit . The constraint system

graph as discussed in Section 10.2 or the single thread in- op

at a metho d invo cation site m is de ned as follows:

terpro cedural analysis result as discussed in Section 9. If

an ob ject is captured in either graph, it is unreachable out-

[

m

= f! n  ! n :n 2  n g[

side the metho d. In this case, the graph completely sum-

m 1 2 1 2 op

op

marizes all of the actions that threads may p erform on the

op 2callees m

ob ject. The algorithm uses the parallel action relation to

determine if two con icting actions may o ccur concurrently.

The constraint system is considered to b e nal for a no de

If not, the actions are indep endent and the compiler can

n if it completely summarizes all of the inside no des that n

eliminate all synchronization on the ob jects that the no de

can represent during the analysis of the metho d invo cation

represents. Given a parallel interaction graph hG;  ; ;  i

site. To determine if the constraint system is nal for n,

from the exit no de of a metho d and a captured no de n,

the algorithm checks to see that all of the outside no des

the algorithm in Figure 16 tests if all of the synchronization

that n represented during the analysis have b een completely

actions on n are indep endent.

resolved to inside no des. Formally,

0 m 0

nalm; n=8n 2 ! n; op 2 callees m: n \N = ;

m O

op

indep endenthG;  ; ;  i;n

Check if the node is escapedinG

For each metho d op , the analysis can cho ose whether

if escap edG; n then return false

it wishes to compute the interactions between threads to

Find al l sync actions on n

resolve outside no des. The advantage of doing so is a p oten-

S = fhsync ;ni2 g[fhsync ;n;n i2 g

tial increase in the precision; the disadvantage is a p otential

T

Check if one of the actions a in S may execute in

increase in the analysis time. If the analysis do es not com-

paral lel with a thread n that also synchronizes on n

pute the interactions b etween threads, and nalop;n

T

op

if 9a 2 S; n 2 N :ha; n i2  and hsync ;n;n i2S then

are de ned as follows:

T T T T

return false

S

= [

op 0 m

else return true

m2invo cations op

nalop;n = 8m 2 invo cations op: nalm; n

Figure 16: Indep endence Testing Algorithm

Here invo cations op is the set of all metho d invo cation sites

in op. If the analysis do es compute the interactions, and

op

The compiler tests all inside no des for indep endence in

nalop;n are de ned as follows:

the analysis result exit  for all metho ds op. It also

op

tests all inside no des in the resolved analysis result

= f! n  ! n :n 2  n g[

op 1 2 1 op 2

S

trimresolve exit  for metho ds op that contain thread

op

[

0 m

creation sites and in the resolved analysis result

m2invo cations op

trimMainresolve exit  at the end of the main

main

0 0

nalop;n = 8n 2 ! n: n  \ N = ;

op op O

metho d.

The analysis can mix and match these two approaches on a

12 Resolving Outside No des

p er-metho d basis; an appropriate p olicy is to compute the

interactions only for metho ds that start threads.

It is p ossible to augment the algorithm so that it records,

If a no de is nal in a given constraint system, the analysis

for each outside no de, all of no des that it represents dur-

has determined all of the inside no des that it represents dur-

ing the analysis. This information allows the algorithm to

ing the analysis of the corresp onding metho d invo cation site

go back to the analysis results generated at the various pro-

or metho d. More formally, if nalm; n, then ! n \ N is

m I

gram p oints and resolve each outside no de to the set of inside

the set of inside no des represented by n during the analysis

no des that it represents during the analysis. The basic idea

of the metho d invo cation site m. Similarly, if nalop;n,

is to generate a set of inclusion constraint systems. There is

then ! n \ N is the set of inside no des represented by

op I

one system for each metho d invo cation site m and one

m

n during the analysis of the metho d invo cation site op .

system for each metho d op . These systems sp ecify, for op

13 Abstraction Relation  If an ob ject is represented by a captured no de, it is

represented by only that no de:

In this section, wecharacterize the corresp ondence b etween

{ n 2 o and capturedhO ; I ; e; r i;n implies

parallel interaction graphs and the ob jects and references

o=fng

created during the execution of the program. A key prop erty

of this corresp ondence is that a single concrete ob ject in the

Given this prop erty,we de ne that an ob ject is cap-

execution of the program maybe represented by multiple

tured if it is represented by a captured no de. All ref-

no des in the parallel interaction graph. We therefore state

erences to captured ob jects are either from lo cal vari-

the prop erties that characterize the corresp ondence using an

ables or from other captured ob jects:

abstraction relation, which relates each ob ject to all of the

no des that represent it.

{ n 2 o, capturedhO ; I ; e; r i;n, and hv;oi2R

As the program executes, it creates a set of concrete

implies v 2 L

ob jects o 2 C and a set of references r 2 R V  C [ C 

{ n 2 o , capturedhO ; I ; e; r i;n , and

2 2 2

F   C b etween ob jects. At each p oint in the execution of

hho ; fi;o i 2 R implies 9n 2 N:o  = fn g

1 2 1 1 1

the program, it is p ossible to de ne the following sets of

and capturedhO ; I ; e; r i;n 

1

references and ob jects:

These prop erties ensure that captured ob jects are reach-

 R is the set of references created by the current ex-

C

able only via paths that start with the lo cal variables.

ecution of the current metho d and all of the analyzed

If an ob ject is captured at a metho d exit p oint, it will

metho ds that it invokes.

therefore b ecome inaccessible as so on as the metho d

 R is the set of references read by the current exe-

returns.

R

cution of the current metho d and all of the analyzed

 The p oints-to information in the p oints-to escap e graph

metho ds that it invokes.

completely characterizes the references between ob-

 C is the set of ob jects reachable from the lo cal vari-

jects represented by captured no des:

R

ables, static class variables, and parameters by follow-

{ capturedhO ; I ; e; r i;n ; capturedhO ; I ; e; r i;n ;

1 2

ing references in R [ R .

C R

n 2 o ;n 2 o  and hhn ; fi;n i62 I implies

1 1 2 2 1 2

 R = R \ C  F  C  [ V  C  is the set of

hho ; fi;o i62 R I C R R R

1 2

inside references. These are the references represented

by the set of inside edges in the analysis.

14 Exp erimental Results

 R =R \ C  F  C  R is the set of outside

O R R R I

Wehave implemented a combined p ointer and escap e anal-

references. These are the references represented by the

ysis based on the algorithm describ ed in this pap er. We

set of outside edges in the analysis.

implemented the analysis in the compiler for the Jalap eno ~

It is always p ossible to construct an abstraction relation

JVM [3], a Java virtual machine written in Java with a few

 C  N between the ob jects and the no des in the parallel

unsafe extensions for p erforming low-level system op erations

interaction graph hhO ; I ; e; r i;; ;i at the current program

such as explicit memory management and p ointer manipu-

p oint. This relation relates each ob ject to all of the no des in

lation.

the p oints-to escap e graph that represent the ob ject during

The analysis is implemented as a separate phase of the

the analysis of the metho d. The abstraction relation has all

Jalap eno ~ dynamic compiler, which op erates on the Jalap eno ~

of the prop erties describ ed b elow.

intermediate representation. To analyze a class, the algo-

rithm loads the class, converts its metho ds into the interme-

 Reachable ob jects are represented by their allo cation

diate representation, then analyzes the metho ds. The nal

sites. If o was created at an ob ject creation site within

analysis results for the metho ds are written out to a le.

the current execution of the current metho d or ana-

This approach provides excellent supp ort for dynamically

lyzed metho ds that it invokes, and o is reachable i.e.

loaded programs. It allows the compiler to analyze a large,

o 2 C , n 2 o, where n is the ob ject creation site's

R

commonly used package such as the Java Class Libraries

inside no de.

once, then reuse the analyze results every time a program is

loaded that uses the package. It also supp orts the delivery

 Each ob ject is represented by at most one inside no de:

of preanalyzed packages. Instead of requiring the analysis

{ n ;n 2 o and n ;n 2 N implies n = n

to b e p erformed when the package is rst loaded into a cus-

1 2 1 2 I 1 2

tomer's virtual machine, a vendor could p erform the analysis

 All outside references have a corresp onding outside

as part of the release pro cess, then ship the analysis results

edge in the p oints-to escap e graph:

along with the co de.

Our b enchmark set includes four programs: javac Java

{ hhcl ; fi;oi2 R implies O cl ; f \ o 6= ;

O

compiler, javacup parser generator, server a simple mul-

{ hho ; fi;o i2 R implies

1 2 O

tithreaded web server, and work a compute b enchmark

o  ffg  o  \ O 6= ;

1 2

with multiple worker threads. Figure 17 presents the total

number of synchronizations required to execute each pro-

 All inside references have a corresp onding inside edge

gram. We rep ort counts for three di erent optimization lev-

in the p oints-to escap e graph:

els:

{ hv ;oi2 R implies I v \ o 6= ;

I

 Original: No analysis is p erformed.

{ hhcl ; fi;oi2 R implies I cl; f \ o 6= ;

I

{ hho ; fi;o i2 R implies

1 2 I

o  ffg  o  \ I 6= ;

1 2

 Interpro cedural: The compiler uses the interpro- and recursively generated concurrency. Unlike Rugina and

cedural, single-threaded analysis results as de ned in Rinard's algorithm, it handles programs with unstructured

Section 9. At the end of each metho d, it nds all cap- multithreading.

tured no des and removes all synchronization on the Rugina and Rinard's algorithm propagates information

corresp onding ob jects from the counts. with three sets of edges: the current set of edges C ,inter-

ference edges I from other threads, and the set of edges E

 Interthread: The compiler uses the inter-thread anal-

created by the current thread. Separating C and E enables

ysis as de ned in Section 10. In addition to the In-

the algorithm to p erform strong up dates to shared variables.

terpro cedural optimization describ ed ab ove, the com-

Strong up dates eliminate edges from C , leaving them in E to

piler uses the thread interaction results. At the end of

b e correctly observed by other threads. There are twoways

each metho d that starts a thread, and at the end of

to extend the algorithm presented in this pap er to handle

the main metho d, it resolves the interactions b etween

strong up dates to heap allo cated ob jects. The rst is to

started parallel threads. If all of the synchronization

allow the analysis to p erform strong up dates to captured

actions on a captured no de in the resulting parallel in-

ob jects [41 ]. The second is to adopt the Rugina and Rinard

teraction graph are indep endent, the analysis removes

solution and split the set of inside edges in the parallel inter-

all synchronization on the corresp onding ob jects from

action graphs into a set of current edges and a set of edges

the counts.

created by the current thread, with strong up dates remov-

ing edges from the set of current edges but leaving them in

place in the set of edges created by the current thread.

Original Interpro cedural Interthread Application

The interference edges in Rugina and Rinard's analy-

javac 2,080,116 1,348,814 51,164

sis corresp ond to inside edges from parallel threads in the

javacup 1,704,563 537,040 121,798

analysis presented in this pap er. In general, the set of in-

server 7,091 6,123 1,842

terference edges coming into a current thread dep ends on

21,877 21,317 2,983 work

interactions b etween that current thread and threads that

run in parallel with it. Rugina and Rinard's analysis uses a

Figure 17: Total Numb er of Synchronization Op erations

xed-p oint algorithm to resolve these interactions and com-

pute a complete set of interference edges for the analysis of

For javac and javacup, the inter-thread optimization re-

each thread. This algorithm rep eatedly reanalyzes threads

moves over 92 of the total synchronizations. For server

until the sets of interference edges from parallel threads do

and work, the inter-thread synchronization removes over

not change. The analysis presented in this pap er takes a

74 of the synchronizations. In all of the cases, the In-

di erent approach. It uses outside edges and no des to rep-

terthread optimization signi cantly reduces the number of

resent all p otential interactions of the current thread with

synchronizations as compared to the Interpro cedural opti-

other parallel threads. These outside edges and no des en-

mization. To put these results in p ersp ective, recent research

able the analysis to conceptually derive a complete set of in-

with escap e analysis algorithms less p owerful than our Inter-

terference edges from parallel threads without a xed-p oint

pro cedural optimization level rep orted signi cant sp eedups

algorithm. Instead, the analysis matches inside and outside

from synchronization elimination for a range of Java pro-

edges to compute the interactions without reanalyzing each

grams [12 , 3].

thread.

In general, the analysis of multithreaded programs is a

15 Related Work

relatively unexplored eld. There is an awareness that mul-

tithreading signi cantly complicates program analysis [31 ],

In this section, we discuss several areas of related work:

but a full range of standard techniques haveyet to emerge.

p ointer analysis, escap e analysis, and synchronization op-

Grunwald and Srinivasan present a data ow analysis frame-

timizations.

work for reaching de nitions for explicitly parallel programs [25 ],

and Kno op, Ste en and Vollmer present an ecient data ow

analysis framework for bit-vector problems such as liveness,

15.1 for Multithreaded Programs

reachability and available expressions, but neither frame-

There have b een, to our knowledge, two previously pub-

work applies to p ointer analysis [29 ]. In fact, the application

lished ow-sensitive p ointer analysis algorithms for multi-

of these frameworks for programs with p ointers would re-

threaded programs. Rugina and Rinard published an algo-

quire p ointer analysis information. Zhu and Hendren present

rithm for programs with structured, fork-join parallelism [37 ].

a set of communication optimizations for parallel programs

The algorithm is interpro cedural, context-sensitive, and top-

that use information from their p ointer analysis; this analy-

down, generating calling contexts in a top-down manner

sis uses a ow-insensitive analysis to detect p ointer variable

starting with the main metho d. Each pro cedure is rean-

interference between parallel threads [44 ]. Hicks also has

alyzed for each new calling context. Corb ett published a

develop ed a ow-insensitive analysis sp eci cally for a mul-

algorithm for multithreaded programs that consist of a sin-

tithreaded language [27 ].

gle pro cedure [16 ]. Both analyses use an iterative, xed-

p oint algorithm to compute the interactions b etween paral-

15.2 Escap e Analysis for Multithreaded Programs

lel threads, and must analyze the entire program. Our al-

gorithm, on the other hand, is a b ottom-up, comp ositional,

There have b een, to our knowledge, four previously pub-

interpro cedural algorithm that analyzes each metho d once

lished escap e analysis algorithms for multithreaded programs [41 ,

to derive a parameterized analysis result that can b e sp e-

10 , 12 , 15 ]. All of these algorithms use the escap e informa-

3

cialized for use at all call sites that invoke the metho d.

tion for stack allo cation and synchronization elimination.

Unlike Corb ett's algorithm, it handles multiple pro cedures

They only analyze single threads, and are designed to nd

ob jects that are accessible to only the current thread. If an

3

Recursive metho ds require an iterative algorithm that may ana-

ob ject escap es the current thread, either to another thread

lyze metho ds multiple times to reach a xed p oint.

or by b eing written into a static class variable, it is marked scop e. Our analysis deals with this p ossibilityby using out-

as globally escaping, and there is no attempt to recapture side edges and outside no des to represent the results of loads

the ob ject by analyzing the interactions b etween the threads from escap ed ob jects.

that access the ob ject. These algorithms are therefore fun- There is a similarity between the outside no des in our

damentally sequential program analyses that have b een ad- analysis and invisible variables in previous analyses [30 , 23,

justed to ensure that they op erate conservatively in the pres- 42 ]. Both outside no des and invisible variables are used to

ence of parallel threads. The algorithm presented in this represent ob jects from outside the analysis context during

pap er, on the other hand, is designed to analyze the inter- the analysis of a metho d or pro cedure. One di erence is that

actions b etween parallel threads. Unlike all other previously when the analysis generates contexts in a top-down fashion,

published algorithms, it is capable of extracting precise es- it has a complete characterization of the aliasing and p oints-

cap e information even for ob jects that are accessible to mul- to relationships involving all invisible variables. In this con-

tiple threads. text, invisible variables primarily serve to enable the analysis

to reuse analysis results for contexts with the same aliasing

and p oints-to relationships b etween pro cedure parameters.

15.3 Pointer Analysis for Sequential Programs

Because our analysis is b ottom-up, it knows nothing ab out

Pointer analysis for sequential programs is a relatively ma-

the relationships involving ob jects represented by outside

ture eld. Flow-insensitive analyses, as the name suggests,

no des. It therefore analyzes each metho d under the two as-

do not take statement ordering into account, and often use

sumptions that there are no aliases b etween outside no des,

an analysis based on some form of set inclusion constraints

and that every load from an escap ed no de may access a ref-

to pro duce a single p oints-to graph that is valid across the

erence created outside the metho d. One implication of this

entire program [4 , 40 , 39 ]. Many ow-insensitive algorithms

di erence is that di erentinvisible variables always repre-

scale well to very large programs, in part b ecause they gen-

sent disjoint sets of ob jects, while di erent outside no des

erate one analysis result instead of one p er program p oint

may representoverlapping sets of ob jects.

and in part b ecause of highly optimized implementations

Invisible variables and outside no des also supp ort a par-

of the inclusion constraint solution algorithms [24 ]. Because

ticular kind of precision in the analysis. Consider a metho d

ow-insensitive analyses are insensitive to the order in which

invoked at multiple call sites. It may b e the case that an

statements execute, they mo del all interleavings and extend

ob ject is allo cated inside the metho d, escap es the metho d,

trivially to multithreaded programs. Like many ow insen-

but is recaptured at each call site. In this case, the use

sitive algorithms, we use set inclusion constraints as a fun-

of invisible variables or outside no des enables the analysis

damental to ol in our analysis. A di erence is that our anal-

to recognize that the ob ject was recaptured. It can there-

ysis uses these constraints to formally sp ecify the result of

fore separate the di erent instantiations of the allo cated ob-

interactions b etween parallel interaction graphs during the

ject from each other in the analysis. To our knowledge,

interpro cedural and inter-thread analyses, with eachinterac-

the only published analyses that can separate the di erent

tion generating its own constraint solution problem. The in-

instantiations of the ob ject are b oth ow sensitive and use

trapro cedural analysis uses a standard xed-p oint data ow

some variant of the concept of invisible variables [30 , 23 , 42 ].

approach. Flow-insensitive analyses typically formulate the

Context-insensitive analyses simply merge the information

entire analysis problem as a single collection of set inclusion

from the di erent call sites. In the absence of recursion,

constraints.

other context-sensitive analyses are capable of separating

Flow-sensitive analyses take the statement ordering into

the di erent instantiations [32 ], but merge information from

account, typically using a data ow analysis to pro duce a

recursive call sites in a way that destroys the distinction b e-

p oints-to graph or set of alias pairs for each program p oint [38 ,

tween multiple instantiations of the same variable in a re-

35 , 42 , 23 , 14 , 30]. One approach analyzes the program in a

cursive pro cedure [32 ]. A ow-insensitive, constraint-based

top-down fashion starting from the main pro cedure, reana-

analysis with p olymorphic recursion maybe able to sepa-

lyzing each p otentially invoked pro cedure in each new calling

rate the instantiations and recover this particular kind of

context [42 , 23 ]. Another approach analyzes the program in

precision.

a b ottom-up fashion, extracting a single analysis result for

each pro cedure. The result is reused at each call site that

15.4 Escap e Analysis

mayinvoke the pro cedure [38 , 13 ]. Our algorithm builds on

these previous approaches. It is extended to include escap e There has b een a fair amount of work on escap e analysis

and action ordering information and to explicitly represent in the context of functional languages [7 , 5, 43 , 18, 19 , 9,

p otential interactions using outside edges and no des. These 26 ]. The implementations of functional languages create

extensions enable the algorithm to generalize in a straight- many ob jects for example, cons cells and closures implic-

forward way to mo del interactions b etween parallel threads. itly. These ob jects are usually allo cated in the heap and

Multithreading intro duces one particulary subtle p oint. reclaimed later by the garbage collector. It is often p ossible

Consider a load of a reference from an inside no de, which to use a lifetime or escap e analysis to deduce b ounds on the

represents an ob ject created within the computation of the lifetimes of these dynamically created ob jects, and to p er-

currently analyzed metho d. In a sequential program, the form optimizations to improve their memory management.

load would always return a reference created within the anal- Deutsch [18 ] describ es a lifetime and sharing analysis for

ysis of the current metho d | b ecause the inside no de did higher-order functional languages. His analysis rst trans-

not exist b efore the metho d was invoked, no unanalyzed lates a higher-order functional program into a sequence of

co de could have executed to write a reference into the ob- op erations in a low-level op erational mo del, then p erforms

ject. But in a multithreaded program, an unanalyzed par- an analysis on the translated program to determine the life-

allel thread may write references into an ob ject as so on as times of dynamically created ob jects. The analysis is a

it escap es. The analysis for multithreaded programs must whole-program analysis. Park and Goldb erg [5 ] also describ e

therefore assume that every load from an escap ed ob ject an escap e analysis for higher-order functional languages.

may access a reference created outside the current analysis Their analysis is less precise than Deutsch's. It is, how-

It extends the current state of the art in two ways: it is ever, conceptually simpler and more ecient. Their main

the rst interpro cedural, ow-sensitive p ointer analysis al- contribution was to extend escap e analysis to include lists.

gorithm for unstructured multithreaded programs, and it is Deutsch [19 ] later presented an analysis that extracts the

the rst algorithm to extract precise escap e analysis infor- same information but runs in almost linear time. Blanchet [9]

mation for ob jects accessible to multiple threads. Wehave extended this algorithm to work in the presence of imp era-

implemented the algorithm in the IBM Jalap eno ~ virtual ma- tive features and p olymorphism. He also provides a correct-

chine, and used the analysis results to p erform a synchro- ness pro of and some exp erimental results.

nization elimination optimization. Our exp erimental results Baker [7] describ es an novel approach to higher-order

show that, for our set of b enchmark applications, the anal- escap e analysis of functional languages based on the typ e

ysis can successfully removebetween 75 and 95 of the inference uni cation technique. The analysis provides es-

total synchronizations. cap e information for lists only. Hannan also describ es a

In the long run, we b elieve the most imp ortant concept typ e-based analysis in [26 ]. He uses annotated typ es to de-

in this research may turn out to be designing analysis al- scrib e the escap e information. He only gives inference rules

gorithms from the p ersp ective of extracting and represent- and no algorithm to compute annotated typ es.

ing interactions b etween analyzed and unanalyzed regions of

the program. This approach leads to clean, comp ositional

15.5 Synchronization Optimizations

algorithms that are capable of analyzing arbitrary parts of

Diniz and Rinard [20 , 21 ] describ e several algorithms for p er-

complete or incomplete programs.

forming synchronization optimizations in parallel programs.

The basic idea is to drivedown the lo cking overhead by coa-

References

lescing multiple critical sections that acquire and release the

same lo ckmultiple times into a single critical section that

[1] A. Aiken and E. Wimmers. Solving systems of set con-

acquires and releases the lo ck only once. When p ossible, the

straints. In Proceedings of the Seventh Annual IEEE

algorithm also coarsens the lo ck granularityby using lo cks

Symposium on Logic in Computer Science, Santa Cruz,

in enclosing ob jects to synchronize op erations on nested ob-

CA, June 1992.

jects. Plevyak and Chien describ e similar algorithms for

reducing synchronization overhead in sequential executions

[2] J. Aldrich, C. Chamb ers, E. Sirer, and S. Eggers. Static

of concurrent ob ject-oriented programs [34 ].

analyses for eliminating unnecessary synchronization

Several research groups have recently develop ed synchro-

from java programs. In Proceedings of the 6th Inter-

nization optimization techniques for Java programs. Aldrich,

national Static Analysis Symposium, Septemb er 1999.

Chamb ers, Sirer, and Eggers describ e several techniques for

[3] B. Alp ern, D. Attanasio, A. Co chi, D. Lieb er, S. Smith,

reducing synchronization overhead, including synchroniza-

T. Ngo, and J. Barton. Implementing jalap eno ~ in

tion elimination for thread-private ob jects and several opti-

java. In Proceedings of the 14th Annual Conferenceon

mizations that eliminate synchronization from nested moni-

Object-OrientedProgramming Systems, Languages and

tor calls [2 ]. Blanchet describ es a pure escap e analysis based

Applications, Denver, CO, Novemb er 1999.

on an abstraction of a typ e-based analysis [10 ]. The imple-

mentation uses the results to eliminate synchronization for

[4] Lars Ole Andersen. Program Analysis and Specializa-

thread-private ob jects and to allo cate captured ob jects on

tion for the C Programming Language. PhD thesis,

the stack. Bogda and Ho elzle describ e a ow-insensitive es-

DIKU, University of Cop enhagen, May 1994.

cap e analysis based on global set inclusion constraints [12 ].

The implementation uses the results to eliminate synchro-

[5] B. Goldb erg and Y. Park. Escap e analysis on lists. In

nization for thread-private ob jects. A limitation is that the

Proceedings of the SIGPLAN '92 Conference on Pro-

analysis is not designed to nd captured ob jects that are

gram Language Design and Implementation, pages 116{

reachable via paths with more than two references.

127, July 1992.

Choi, Gupta, Serrano, Sreedhar, and Midki presenta

comp ositional data ow analysis for computing reachability

[6] J. Babb, M. Rinard, A. Moritz, W. Lee, M. Frank,

information [15 ]. The analysis results are used for synchro-

R. Barua, and S. Amarasinghe. Parallelizing applica-

nization elimination and stack allo cation of ob jects. Like

tions into silicon. In Proceedings of the IEEE Workshop

the analysis presented in this pap er, it uses an extension

on FPGAs for Custom Computing Machines, Napa Val-

of p oints-to graphs with abstract no des that may represent

ley, CA, April 1999.

multiple ob jects. It do es not distinguish b etween inside and

[7] H. Baker. Unifying and conquer garbage, up dating,

outside edges, but do es contain an optimization, deferred

aliasing ... in functional languages. In Proceedings of

edges, that is designed to improve the eciency of the anal-

the ACM Conference on Lisp and Functional Program-

ysis. The approach classi es ob jects as globally escaping,

ming, pages 218{226, 1990.

escaping via an argument, and not escaping. Because the

primary goal was to compute escap e information, the anal-

[8] R. Barua, W. Lee, S. Amarasinghe, and A. Agarwal.

ysis collapses globally escaping subgraphs into a single no de

Maps: A compiler-managed memory system for Raw

instead of maintaining the extracted p oints-to information.

machines. In Proceedings of the 26th International Sym-

Our analysis retains this information, which is crucial for de-

posium on Computer Architecture, Atlanta, GA, May

veloping a p ointer analysis algorithm that takes interactions

1999.

between threads into account.

[9] B. Blanchet. Escap e analysis: Correctness pro of, imple-

mentation and exp erimental results. In Proceedings of

16 Conclusion

the 25th Annual ACM Symposium on the Principles of

Programming Languages,Paris, France, January 1998.

This pap er presents a new combined p ointer and escap e

ACM, ACM, New York.

analysis algorithm for unstructured multithreaded programs.

[10] B. Blanchet. Escap e analysis for ob ject oriented lan- Annual ACM Symposium on the Principles of Program-

guages. application to java. In Proceedings of the 14th ming Languages, pages 187{200, Paris, France, January

Annual Conference on Object-Oriented Programming 1997. ACM, New York.

Systems, Languages and Applications, Denver, CO,

[22] P. Diniz and M. Rinard. Lo ck coarsening: Eliminat-

Novemb er 1999.

ing lo ckoverhead in automatically parallelized ob ject-

[11] R. Blumofe, C. Jo erg, B. Kuszmaul, C. Leiserson, based programs. Journal of Paral lel and Distributed

K. Randall, and Y. Zhou. Cilk: An ecient multi- Computing, 492:2218{244, March 1998.

threaded runtime system. In Proceedings of the 5th

[23] M. Emami, R. Ghiya, and L. Hendren. Context-

ACM SIGPLAN Symposium on Principles and Prac-

sensitiveinterpro cedural p oints-to analysis in the pres-

ticeofParal lel Programming, Santa Barbara, CA, July

ence of function p ointers. In Proceedings of the SIG-

1995. ACM, New York.

PLAN '94 Conference on Program Language Design

[12] J. Bogda and U. Ho elzle. Removing unnecessary syn- and Implementation, pages 242{256, Orlando, FL, June

chronization in java. In Proceedings of the 14th Annual 1994. ACM, New York.

Conference on Object-OrientedProgramming Systems,

[24] M. Fahndrich, J. Foster, Z. Su, and A. Aiken. Partial

Languages and Applications, Denver, CO, November

online cycle elimination in inclusion constraint graphs.

1999.

In Proceedings of the SIGPLAN '98 ConferenceonPro-

[13] R. Chatterjee, B. Ryder, and W. Landi. Relevant gram Language Design and Implementation, Montreal,

context inference. In Proceedings of the 26th Annual Canada, June 1998.

ACM Symposium on the Principles of Programming

[25] D. Grunwald and H. Srinivasan. Data ow equations for

Languages, San Antonio, TX, January 1999.

explicitly parallel programs. In Proceedings of the 4th

[14] J. Choi, M. Burke, and P. Carini. Ecient ACM SIGPLAN Symposium on Principles and Practice

ow-sensitive interpro cedural computation of p ointer- of Paral lel Programming, San Diego, CA, May 1993.

induced aliases and side e ects. In ConferenceRecordof

[26] J. Hannan. Atyp e-based analysis for blo ck allo cation

the Twentieth Annual Symposium on Principles of Pro-

in functional languages. In Proceedings of the Second

gramming Languages, Charleston, SC, January 1993.

International Static Analysis Symposium.ACM, ACM,

ACM.

New York, Septemb er 1995.

[15] J. Choi, M. Gupta, M. Serrano, V. Sreedhar, and

[27] J. Hicks. Exp eriences with compiler-directed storage

S. Midki . Escap e analysis for java. In Proceedings

reclamation. In Proceedings of the 5th ACM Conference

of the 14th Annual Conference on Object-OrientedPro-

on Languages and Computer

gramming Systems, Languages and Applications, Den-

Architecture, pages 95{105, June 1993.

ver, CO, Novemb er 1999.

[28] S. Horowitz, T. Reps, and D. Binkley. Interpro cedural

[16] J. Corb ett. Using shap e analysis to reduce nite-state

slicing using dep endence graphs. In Proceedings of the

mo dels of concurrentjava programs. In Proceedings of

SIGPLAN '88 Conference on Program Language De-

the International Symposium on Software Testing and

sign and Implementation,Atlanta, GA, June 1988.

Analysis, March 1998.

[29] J. Kno op, B. Ste en, and J. Vollmer. Parallelism for

[17] J. Dean, D. Grove, and C. Chamb ers. Optimization

free: Ecient and optimal bitvector analyses for paral-

of ob ject-oriented programs using static class hierarchy

lel programs. ACM Transactions on Programming Lan-

analysis. In Proceedings of the 9th European Confer-

guages and Systems, 183:268{299, May 1996.

ence on Object-Oriented Programming, Aarhus, Den-

mark, August 1995.

[30] W. Landi and B. Ryder. A safe approximation algo-

rithm for interpro cedural p ointer aliasing. In Proceed-

[18] A. Deutsch. On determining lifetime and aliasing of

ings of the SIGPLAN '92 ConferenceonProgram Lan-

dynamically allo cated data in higher-order functional

guage Design and Implementation, San Francisco, CA,

sp eci cations. In Proceedings of the 17th Annual ACM

June 1992.

Symposium on the Principles of Programming Lan-

guages, pages 157{168, San Francisco, CA, January

[31] S. Midki and D. Padua. Issues in the optimization of

1990. ACM, ACM, New York.

parallel programs. In Proceedings of the 1990 Interna-

tional Conference on Paral lel Processing, pages I I{105{

[19] A. Deutsch. On the complexity of escap e analysis. In

113, 1990.

Proceedings of the 24th Annual ACM Symposium on the

Principles of Programming Languages, Paris, France,

[32] R. O'Callahan and D. Jackson. Lackwit: A program un-

January 1997. ACM, ACM, New York.

derstanding to ol based on typ e inference. In 1997 Inter-

national Conference on Software Engineering, Boston,

[20] P. Diniz and M. Rinard. Lo ck coarsening: Eliminat-

MA, May 1997.

ing lo ckoverhead in automatically parallelized ob ject-

based programs. In Proceedings of the Ninth Workshop

[33] J. Plevyak, X. Zhang, and A. Chien. Obtaining sequen-

on Languages and Compilers for Paral lel Computing,

tial eciency for concurrent ob ject-oriented languages.

pages 285{299, San Jose, CA, August 1996. Springer-

In Proceedings of the 22nd Annual ACM Symposium on

Verlag.

the Principles of Programming Languages.ACM, Jan-

uary 1995.

[21] P. Diniz and M. Rinard. Synchronization transforma-

tions for parallel computing. In Proceedings of the 24th

[34] J. Plevyak, X. Zhang, and A. Chien. Obtaining sequen-

tial eciency for concurrent ob ject-oriented languages.

In Proceedings of the 22nd Annual ACM Symposium on

the Principles of Programming Languages, San Fran-

cisco, CA, January 1995. ACM, New York.

[35] E. Ruf. Context-insensitive reconsidered.

In Proceedings of the SIGPLAN '95 ConferenceonPro-

gram Language Design and Implementation, La Jolla,

CA, June 1995.

[36] R. Rugina and M. Rinard. Symb olic analysis of divide

and conquer programs. In Submitted to PLDI '00.

[37] R. Rugina and M. Rinard. Pointer analysis for mul-

tithreaded programs. In Proceedings of the SIGPLAN

'99 ConferenceonProgram Language Design and Im-

plementation,Atlanta, GA, May 1999.

[38] P. Sathyanathan and M. Lam. Context-sensitive in-

terpro cedural p ointer analysis in the presence of dy-

namic aliasing. In Proceedings of the Ninth Workshop

on Languages and Compilers for Paral lel Computing,

San Jose, CA, August 1996. Springer-Verlag.

[39] M. Shapiro and S. Horwitz. Fast and accurate ow-

insensitive p oints-to analysis. In Proceedings of the 24th

Annual ACM Symposium on the Principles of Program-

ming Languages,Paris, France, January 1997.

[40] Bjarne Steensgaard. Points-to analysis in almost linear

time. In Proceedings of the 23rdAnnual ACM Sympo-

sium on the Principles of Programming Languages, St.

Petersburg Beach, FL, January 1996.

[41] J. Whaley and M. Rinard. Comp ositional p ointer and

escap e analysis for java programs. In Proceedings of

the 14th Annual Conference on Object-Oriented Pro-

gramming Systems, Languages and Applications, Den-

ver, CO, Novemb er 1999.

[42] R. Wilson and M. Lam. Ecient context-sensitive

p ointer analysis for C programs. In Proceedings of the

SIGPLAN '95 Conference on Program Language De-

sign and Implementation, La Jolla, CA, June 1995.

ACM, New York.

[43] Y. Tang and P. Jouvelot. Control- ow e ects for escap e

analysis. In Workshop on Static Analysis, pages 313{

321, Septemb er 1992.

[44] H. Zhu and L. Hendren. Communication optimizations

for parallel C programs. In Proceedings of the SIG-

PLAN '98 Conference on Program Language Design

and Implementation, Montreal, Canada, June 1998.