Parallelizing Elimination Orders with Linear Fill

Claudson Bornstein Bruce Maggs Gary Miller R Ravi

;;

School of Computer Science and

Graduate School of Industrial Administration

Carnegie Mellon University

Pittsburgh PA

Abstract pivoting step variable x is eliminated from equa

i

This paper presents an algorithm for nding parallel

tions i i n

elimination orders for Gaussian elimination Viewing

The system of equations is typically represented as

a system of equations as a graph the algorithm can be

a matrix and as the pivots are p erformed some entries

applied directly to interval graphs and chordal graphs

in the matrix that were originally zero may b ecome

For general graphs the algorithm can be used to paral

nonzero The number of new nonzeros pro duced in

lelize the order produced by some other heuristic such

solving the system is called the l l Among the many

as minimum degree In this case the algorithm is ap

dierent orders of the variables one is typically chosen

plied to the chordal completion that the heuristic gen

so as to minimize the ll Minimizing the ll is desir

erates from the input graph In general the input to

able b ecause it limits the amount of storage needed to

the algorithm is a G with n nodes and

solve the problem and also b ecause the ll is strongly

m edges The algorithm produces an order with height

correlated with the total number of op erations work

p erformed

at most O log n times optimal l l at most O m

 

and work at most O W G where W G is the

Gaussian elimination can also b e viewed as an algo

minimum possible work over al l elimination orders for

rithm that is p erformed on the graph whose adjacency

G Experimental results show that when applied after

matrix is the matrix representing the system of equa

some other heuristic the increase in work and l l is

tions Pivoting on a variable corresp onds to

usual ly smal l In some instances the algorithm obtains

removing a vertex from the graph and forming a

an order that is actual ly better in terms of work and

of its neighbors The number of new edges added to

l l than the original one We also present an algo

the graph in this pro cess constitutes the ll

rithm that produces an order with a factor of log n less

Throughout this pap er we assume that our matrices

p

are symmetric p ositive denite so that our graphs are

log n more l l height but with a factor of O

undirected our pivots are always nonzero and we can

Introduction

ignore the issue of numerical stability

One of the most p opular metho ds for solving a sys

Heuristics for sparse Gaussian elimi

tem of linear equations is Gaussian elimination The

nation

crux of this metho d is to pivot on the variables of the

A number of heuristics for minimizing ll for sparse

system oneatatime according to some order For ex

matrices are available the most p opular b eing nested

ample supp ose that the variables x x are to b e

n

dissection and minimumdegree

eliminated according to an order Then in the ith

Nested dissection as the name suggests is a recur

1

Emailcfbcscmuedu

sive elimination pro cedure It identies a balanced

2

Bruce Maggs is supp orted in part by the Air Force Materiel

separator in the graph and sets the no des in the sepa

Command AFMC and ARPA under Contract FC

rator apart for elimination at the very end The com

by ARPA Contracts F and N

p onents resulting from removing the separator are re

and by an NSF National Young Investigator Award No

cursively ordered one after the other and placed b e

CCR with matching funds provided by NEC Research

fore the separator in the elimination order George

Institute and Sun Microsystems Emailbmmcscmuedu

rst prop osed this metho d for eliminating no des

3

Gary Miller is supp orted in part by NSF Grant CCR

in a mesh and later generalized it in a pap er with Liu

and ARPA under Contract N

for eliminating the no des in an arbitrary graph

Emailglmillercscmuedu

Bounds on the ll induced by nested dissection orders

4

R Ravi is supp orted in part by an NSF CAREER Award

are known for planar graphs and arbitrary graphs with

CCR Emailraviandrewcmuedu

b ounded degree

The views and conclusions contained here are those of the

The minimumdegree heuristic rep eatedly nds a

authors and should not b e interpreted as necessarily represent

vertex of minimum degree and eliminates it This

ing the ocial p olicies or endorsements either express or im

plied of AFMC ARPA CMU or the US Government heuristic originated with the work of Markowitz in the

late s and has undergone several enhancements in ing the space overhead minimal corresp onds to nd

the years since Its p opularity is attested to by its ing an order that has simultaneously low height and

inclusion in various publicly available co des such as low ll Gilb ert conjectured the existence of a par

MA YALESMP and SPARSPAK In contrast to allel elimination order that has the minimum p ossible

nested dissection no p erformance guarantee is known height among all orders and ll that is only a constant

for the ll induced by this heuristic In fact there ex factor more than the number of edges in a minimum

ist graphs for which the ll induced by the minimum ll order see The hop e was that a small increase

degree order can b e very high in ll could b e traded for faster parallel solutions Asp

vall disproved this conjecture however by exhibit

Recently some hybrid algorithms have b een ex

ing a graph for which any order that has the minimum

p erimentally shown to pro duce orders that compare

p ossible height requires a p olynomial factor more ll

favorably with those pro duced by either minimum

than the minimum p ossible

degree or nested dissection alone Hendrickson and

Rothberg and indep endently Liu and Ashcraft

Given an a sub class of chordal

prop osed algorithms that rst nd a separa

graphs with n vertices and m edges can we nd an

tor that partitions the graph into small comp onents

order with O m ll and height close to the minimum

The minimumdegree heuristic is then used to order

p ossible In particular do es a nested dissection order

the vertices within each comp onent and also within

accomplish this We show that the classical nested

the separators In practice b oth algorithms pro

dissection pro cedure applied to interval graphs pro

p

duce orders that compare favorably with stateofthe

log n m ll and with height duces an order with O

art minimumdegree and nesteddissection algorithms

at most O log n times the optimum for that graph In

but no b ounds on the amount of ll that they intro

fact the b ound on ll is tight if the nested dissection

duce are known

algorithm is forced to choose a balanced separa

tor thus providing a negative answer to our question

Our results

Even in this very restricted class of graphs nested dis

In this pap er we fo cus on parallel elimination or

section may generate undesirable ll

ders for chordal graphs Chordal graphs are a natural

On the p ositive side we show that an extension

choice b ecause they are rich in structure and b ecause

of nested dissection provides go o d orders We ex

they are intimately related to Gaussian elimination or

hibit an interval graph algorithm that pro duces orders

ders in fact chordal graphs are exactly the class of

with O m ll and height within a factor of O log n

graphs that have zeroll elimination orders More

times the optimum This same algorithm can then

over since any elimination order constructs a chordal

b e generalized to chordal graphs and pro duces orders

completion of a graph that is adds edges to the graph

that also have O m ll while having height within

so as to make it chordal we can apply our algorithm to

an O log n factor of the minimum p ossible While

the chordal completion induced by any go o d ordering

this guarantee is worse in terms of height than the

algorithm

one nested dissection provides it is signicantly b et

Chordal graphs already have zeroll elimination

ter than that given by either a p erfect elimination or

orders so how can we p ossibly improve on that We

der or by the minimumdegree heuristic In addition

prop ose that some extra ll might b e tolerable if paral

to the balanced separators used in nested dissection

lelism can b e exp osed Although we have thus far de

we utilize another kind of separator that we call sen

scrib ed Gaussian elimination as if vertices were elim

tinels Sentinels help lo calize the ll induced by our

inated oneatatime in fact a set of vertices can b e

orders without increasing the height by much In ad

eliminated in parallel if they are indep endent ie no

dition to the b ound on the ll we also show that the

two vertices in the set are neighbors Thus in a par

total work as well as the front size of our orders are

allel elimination order we allow indep endent sets to

within a constant factor of the minimum p ossible

b e eliminated in one step and we dene the height to

Preliminary exp eriments show that the overheads

b e the total number of steps A lower b ound on the

in ll and height are much b etter than predicted by

height of any elimination order is the size of the maxi

our theoretical analysis For instance we observed

mum clique in the graph since the vertices in a clique

the following interesting b ehavior in twodimensional

cannot b e eliminated in parallel

grids Minimumdegree p erforms b etter than nested

Although a strictly sequential order has height n

dissection in terms of ll and work on grids with high

it is often p ossible to exp ose some parallelism in a

asp ect ratio In this case however the orders

sequential order The idea is to view a sequential or

pro duced by minimumdegree are very sequential in

der as a partial order that constrains each vertex to

nature and exhibit large height Our algorithm ap

b e eliminated b efore any of its neighbors that app ear

plied to the chordal completion obtained from the

later in the sequential order Thus we can dene the

minimumdegree order generates an order that has

height of a sequential order to b e the minimumheight

go o d height and low ll and work In fact when com

parallel order that is consistent with the partial or

pared to a nested dissection order of the original grid

der Nested dissection is known to pro duce lowheight

our order exhibits worse height but slightly b etter ll

orders in particular within a p olylogarithmic factor

and work See Table

of the minimum p ossible On the other hand

minimumdegree orders can have a p olynomial factor

1

more height than the minimum p ossible eg a path

The front size corresp onds roughly to the size of a maximum

clique in the lled graph Trying to achieve fast parallel solutions while keep

The pro ofs of the lemmas contained here have b een elimination orders for chordal graphs Some exp eri

omitted and can b e found in mental results obtained for an implementation of the

algorithm in Section can b e found in Section We

Related work Fill

conclude with some remarks in Section

The problem of nding an elimination order that

minimizes the ll for arbitrary graphs is known to b e

Denitions

NPhard

In order to pro ceed we need to establish some no

The rst analysis for a variant of nested dissection

tation concerning matrices and graphs

p

for graphs with small separators of size O n in an

Each step of Gaussian elimination on a symmetric

nno de graph was given by Lipton Rose and Tarjan

matrix M corresp onds to choosing a vertex v in G

The ll introduced by this variant is O n log n

adding edges to G if necessary to make v s neighbor

on an nno de graph Subsequently Gilb ert and Tar

ho o d a clique and then removing v from G v is said

jan analyzed the original nested dissection algo

to have b een eliminated from G Any new edges in

rithm of George and Liu for planar graphs and showed

tro duced by the elimination of a vertex are called l l

that using small separators in the recursive pro cedure

edges or simply l l A vertex v is simplicial in G if

yields a ll of O n log n They also p oint out

its neighborho o d N v is a clique of G Simplicial ver

that this metho d do es not work in general for graphs

tices are of sp ecial interest since the elimination of a

with small separators by constructing a counterexam

simplicial vertex do es not introduce any ll edges

ple Both of these pap ers also show a b ound

Alternatively we can think of Gaussian elimination

3

2

as simply inserting the ll edges in a graph without

of O n on the work of the orders It is interesting

ever really removing any vertices In this case the

to note that there are nno de planar graphs square

elimination of a vertex corresp onds to the introduc

grids in particular for which any elimination order

tion of edges b etween any pair of its neighbors that

introduces ll n log n

are not connected and are later in the elimination

Agrawal Klein and Ravi gave the rst approxi

order than the vertex b eing considered The graph

mation algorithms for nding elimination orders that

augmented with all the ll edges is referred to as the

simultaneously minimize the ll height and the work

updated graph Given two nonadjacent vertices v and

all to within a p olylogarithmic factor of optimal when

w in a graph G there exists a ll edge v w in the

the degree of the input graph is b ounded Their al

up dated graph i there exists a path from v to w that

gorithm is essentially the nested dissection algorithm

only go es through vertices that are eliminated b efore

using approximately minimumsize balanced no de sep

b oth v and w

arators to construct the recursive decomp osition

An order v v v of the vertices of G is a per

They also analyze the ll and the height of their order

n

fect elimination order if it do es not introduce any ll

when the degree of the graph is not b ounded The key

edges ie if each v is simplicial in G fv v g

dierence b etween our work and these results is that

i i

A graph is said to b e chordal if it has a p erfect elim

we start with a chordal completion of a graph and

ination order Equivalently a graph is chordal if ev

fo cus our eorts on nding parallel elimination orders

ery simple cycle with more than three vertices has a

with linear ll

chord ie no subset of vertices of G induces a

Related work Height

subgraph isomorphic to a cycle with more than three

Ignoring ll computing an elimination order for a

vertices

given graph with minimum height is NPhard and

The intersection graph of a family F of sets S is

i

remains so even if an additive error in the estimate of

the graph obtained by asso ciating a vertex v with

i

the height is allowed Pan and Reif give one of the

each set S and edges v v whenever S intersects

i i j i

rst analyses of the parallel height of nested dissection

S One characterization of chordal graphs that has

j

orders and show how nested dissection can b e used for

proven particularly useful is as the intersection graphs

solving the shortest path problem in graphs

of subtrees that is connected subgraphs of a tree

Bo dlaender et al uses an approach similar to to

We call the tree in question a skeleton of the chordal

nd elimination orders with b ounds on the height and

graph G Along with the subtrees it forms a tree rep

several related parameters Both these pap ers

resentation of G A tree representation of a graph G

is said to b e minimal if the asso ciated skeleton has the

give elimination orders with height at most O log n

minimum number of no des p ossible Gavril and

times the minimum p ossible for any nno de graph

Buneman showed that in a minimal representation

Numerous heuristics without p erformance guarantees

there is a onetoone corresp ondence b etween vertices

are also known for this problem

of T and maximal cliques of G Alternatively we can

Outline

consider the no des of T to b e formed by sets of ver

tices of G so that for each vertex v of G a subtree T

The remainder of this pap er is organized as follows

v

induced in T by the no des that contain v can b e used

In the next section we introduce some denitions In

to represent v T is said to b e the representative sub

Section we present two algorithms for nding par

v

tree of v A minimal tree representation of G is called

allel elimination orders for interval graphs The rst

a clique tree of G

algorithm which is based on nested dissection is de

scrib ed in Section The second which has lin Throughout this pap er we refer to vertices in a

ear ll is presented in Section We then show in graph but we reserve the term node to refer to vertices

Section how these algorithms can b e used to nd in the skeleton of a chordal graph and to vertices in

separator trees that is trees whose no des corresp ond dierent sizes Every separator is a clique in the cor

to separators in the original graph In b oth cases resp onding graph and hence its size is a lower b ound

no des typically corresp ond to sets of one or more ver on the height of any elimination order If the depth

tices Similarly we reserve the term link to refer to of the separator tree is d we get the following lemma

edges b etween no des in a skeleton of a chordal graph which assures that this unbalance can b e at most a

A subtree T is said to cover a no delink of the skele logarithmic factor for our nested dissection algorithm

v

ton if that no delink is in T A terminal branch of T

v

is a maximal path from a leaf v to a no de w in T that

Lemma Let G be a graph A depth d mini

except for v and w contains only degree no des

mal balanced separator tree for G produces an order of

An imp ortant sub class of chordal graphs are the

depth within a factor of d of the optimal

interval graphs which are chordal graphs that have a

skeleton that is a path A tree representation with

We now pro ceed to b ound the total number of ll

a path for a skeleton is also called an interval repre

edges introduced by the nested dissection algorithm

sentation for the representative subtrees are also just

In the lemmas that follow we only consider nontrivial

paths and can b e interpreted as intervals

interval graphs that is we assume that the graphs in

question have at least two distinct maximal cliques

Parallel elimination orders for inter

We also assume that any preexisting simplicial ver

tices have b een eliminated from the graph so that af

val graphs

ter this prepro cessing step no representative subtree

Nested dissection

consists of a single no de

In this section we analyze a simple nested dissec

tion algorithm that chooses a balanced separator at each step thus pro ducing a logarithmic depth separa

p A

log n We show an upp er b ound of O m

tor tree v

on the amount of ll for the orders pro duced

Given a graph G an balanced separator of G

V E is a set of no des S V such that no connected

comp onent of V S has more than jV j vertices for

some An balanced separator tree is one whose

are balanced separators of the subgraphs of

no des B

G The ro ot of the tree is an balanced separator of

G and we build a tree recursively for each comp onent

and attach them as subtrees of the ro ot From now on

whenever we use the term balanced separator we sim

ply mean an balanced separator for some constant

Nested dissection on an interval graph I builds a

balanced separator tree whose no des are minimal sep

arators of subgraphs of I For interval graphs every

minimal separator corresp onds to the set of vertices

that cover some link of its skeleton P We order this

Figure Fill among vertices in a separator tree of an

tree so that an inorder traversal of the separator tree

interval graph

corresp onds to a lefttoright traversal of the links of

P When necessary we will refer to an ordered sepa

Figure shows part of a separator tree of an in

rator tree to make it clear that we are considering a

terval graph Each no de in the tree corresp onds to a

separator tree whose children are ordered as describ ed

minimal separator of the graph The elimination or

here

der sp ecied by the separator tree might introduce ll

As long as the algorithm chooses balanced sep

edges b etween a vertex v in A and vertices in B s right

arators the depth of the separator tree is O log n

subtree as depicted by the dotted edges In this case

since b oth the left and right subtrees of each no de

v must b e adjacent to some vertex in B s right subtree

have no more than an fraction of the vertices in the

as depicted by the edge in the gure

subtree ro oted at that no de

We dene an inner path of a no de A in an ordered

binary tree as the path that starts with the edge to

the left or right child of A and go es all the way to the

Analysis

inorder predecessor or successor of A resp ectively

The next lemma states that amounts of ll in excess

Given a separator tree we obtain a corresp onding

of O m must b e b etween a no de and its inner paths

elimination order by recursively eliminating the ro ot

separators subtrees in parallel and then eliminating

Lemma Let I be a connected interval graph The the ro ot separator itself Even though a nested dis

total amount of l l between vertices in separator nodes section separator tree has depth O log n the corre

of an ordered balanced separator tree of I and vertices sp onding elimination order p otentially can b e fairly

not in the corresponding inner paths is O m unbalanced since the separators will probably have

Lemma allows us to concentrate on ll involving unused In choosing sentinels we incur an extra log n

vertices in inner paths Consider one such inner path factor in the height of our order since we select up to

log n sentinels p er Kill step

Lemma Let I be a connected interval graph and

We also analyze a chordal graph version of our in

let V be a node in an ordered balanced separator tree

terval graph algorithm which is describ ed in more de

of I Let V V V be the nodes in V s right inner

tail in Section The algorithms are essentially the

k

path The total amount of l l between V and vertices

same except that when dealing with chordal graphs

p

P

k

we have a skeleton that is a tree not a path We can

k jV j in its right inner path is at most O

i

i

apply the interval graph algorithms to eliminate each

of the branches paths of the skeleton that lead to

A separator is in at most inner paths two starting

leaves of the tree We rep eat this step until the whole

at itself one starting at its parent and one starting at

tree has b een pro cessed

some other ancestor Since a balanced separator tree

has O log n depth Lemmas and give the following

corollary

Denitions used in the algorithm

Corollary Let I be a connected interval graph with

An edge is said to b e an extremity of a tree if it is an

a balanced separator tree The total amount of l l in

edge to some leaf of the tree To insert a link e of the

p

duced by the order specied by the tree is O m log n

skeleton into the separator tree consists of adding to

the tree the separator formed by those vertices of the

It is not hard to nd examples of interval graphs on

chordal graph whose representative subtrees cover e

which a nested dissection algorithm that is forced to

A link of the skeleton is said to b e a root if it has b een

choose balanced separators pro duces an order

inserted into the separator tree A ro oted skeleton is

p

a skeleton whose extremities are ro ots

log n showing that the b ound ing with ll m

A vertex v and its representative subtree T are said

derived in the previous section is tight

v

to b e pinned at a link r if r is a ro ot and either T

v

A linearll O log ndepth algorithm

covers r or T is a singleton covering one of the no des

v

In this section we present a recursive algorithm

of r The term depleted is used to help us keep track

that given an interval graph nds an O log ndepth

of which edges in the input graph have b een used to

separator tree that represents the elimination order for

pay for ll and can no longer b e used Some subset of

the vertices in the graph Unlike traditional separator

the ro ots may b e said to b e depleted in G A vertex

trees vertices can app ear multiple times in the tree

is said to b e depleted if it is pinned at a depleted

and should b e eliminated at the top level separator it

link Whenever we refer to a pinned vertex it is not

app ears in

depleted unless explicitly stated Edges b etween pairs

The algorithm is comp osed of three steps which

of depleted vertices are also said to b e depleted Edges

op erate on a skeleton path of an interval graph The

b etween pinned vertices are said to b e pinned edges

analysis of the algorithm uses a p otential function

Unless otherwise stated the term pinned edges only

whose value is O m initially and is used to account

refers to nondepleted pinned edges A vertex is said

for the ll edges The last of the three steps is carried

to b e internal to a graph if it is not pinned Edges to

out to ensure that we do not charge to the same part

internal vertices are also said to b e internal regardless

of the p otential function multiple times This is done

of whether the other endp oint of the edge is internal

by keeping track of which edges are depleted of their

Let G b e an interval graph with skeleton T P

lr

contribution to the p otential and dividing the graph

denotes the path b etween and including the two links

into subgraphs on which to recurse each of which has

l and r in T P l r denotes the interval graph ob

no depleted edges

tained by restricting G to P and eliminating any

lr

The rst step Homogenize nds a sequence of up

single vertex representative subtrees Given a repre

to k O log n separators that divide the graph into

sentative subtree T asso ciated with a vertex v of G

v

k comp onents The sizes of these separators are 0

let T b e the path induced in T by P P l r is

v lr

v

geometrically decreasing a prop erty that is useful in

0

the interval graph that has those T with two or more

v

accounting for ll induced to any of these separator

no des in their representative subtrees on the skeleton

vertices The next step which we call Halving is anal

P

lr

ogous to a regular nested dissection iteration we sim

The ply p of a link e of T is the number of subtrees

e

ply select a separator that divides the skeleton of the

T that cover that link

v

interval graph we are currently working with in half

As in nested dissection this ensures that the algo

rithm nishes within O log n iterations of the steps

The algorithm

thus pro ducing an order with go o d height Finally

the algorithm p erforms the Kil l step As mentioned Given an interval graph I and an interval representa

earlier the purp ose of this step is to ensure that the tion of I with skeleton P whose extremities are l and

p otential function is used correctly to account for the r we remove any simplicial vertices of I insert b oth

ll The Kill step accomplishes this by choosing sp e l and r into the separator tree and apply the step

cial kinds of separators called sentinels that lo calize Homogenize to it Eliminating simplicial vertices do es

subsequent ll to subgraphs that contain only non not change the skeleton of the graph but rather elim

depleted edges whose p otential contribution is as yet inates any single no de representative subtrees Since

none of the steps of the algorithm creates these single KillI P l r

no de subpaths later steps do not have to deal with

them

Assume the ply at l is at least the ply at r ie

Each step of the algorithm inserts one or more links

p p Otherwise swap the roles of l and r

l r

of the skeleton into the separator tree resulting in

in this algorithm From r to l in P select the

a number of subgraphs Let K e e denote the

l i i

rst link that covers a no de that is pinned at

interval graph obtained from P e e by removing

i i

l Keep scanning P towards l and selecting the

all vertices pinned at l as well as those that cover b oth

rst link that covers at least twice as many ver

e and e

i i

tices pinned at l as did the last selected link Call

The algorithm consists of three ma jor steps that

these milestone links Also select the links adja

call each other recursively Each step inserts some ver cent to the milestones which are closer to r than

tices into the separator tree thus creating subgraphs

the corresp onding milestone and call those sen

to which the next step is applied provided there ex tinels Insert the links that were selected into the

ists at least one vertex internal to the subgraph Each

separator tree in order from l to r Call those

step op erates on an interval graph I a skeleton path P links e including l e and r e Ap

i k

of I and P s two extremities l and r The rst step

e e ply HomogenizeK e e P

i i l i i e e

i i+1

Homogenize divides the interval graph in a number of

for all i such that K e e has at least one

l i i

subgraphs and then applies the Halving step to each

vertex that hasnt b een inserted into the sepa

one of them The ply of the links for each subgraph is

rator tree Note that when e and e are a

i i

relatively homogeneous in the sense that internal links

milestone and its corresp onding sentinel all ver

have ply at least equal to half of the ply of the ro ots

have already b een ordered tices in P

e e

i i+1

of that subgraph

HomogenizeI P l r

Fill Analysis

We now dene a p otential function that will b e used to

Select l and traverse P towards r selecting

b ound the amount of ll introduced by the algorithms

the rst link whose ply is at most half the

Let G b e a chordal graph and let T b e a skeleton of G

ply of the last selected link Rep eat until r

at some stage in the algorithm Let s b e the number

is reached Apply the same algorithm from r

of links of T that are not ro ots Let x b e the number

to l Let e i k b e the links

i

of internal edges of G and y the number of pinned

that were selected ordered from l to r includ

nondepleted edges The p otential G is given by

ing l e and r e Insert all the se

k

lected links except l and r into the separator tree

G x y s

e e for and do HalvingP e e P

i i i i e e

i i+1

each i b etween and k for which some vertex in

Given an initial connected graph we take the skeleton

P e e is not in the separator tree yet

i i

T to b e a clique tree of the graph Since the graph

is chordal T can only have as many vertices as the

The Halving step simply do es what its name suggests

original graph and thus the p otential of a graph is

that is divides the problem in two subproblems that

O m n This p otential is used to pay for all the ll

are no more than half the size of the original one

introduced by our algorithm To do this at each step

in the recursion we ensure that some xed constant

times the dierence in p otential b etween the initial

HalvingI P l r

graph and the subgraphs into which that step divides

the graph is enough to pay for any ll edges that are

introduced b ecause of that step see condition iii

Cho ose a link m in P such that when m is in

later in this section

serted into the separator tree b oth P and

lm

By examining the dierent kinds of edges in the

P have no more than half as many skeleton

mr

interval graph we can determine the following change

links as P did Then do KillP l m P l m

lm

in the p otential function when a new no de is inserted

and KillP m r P m r provided that

mr

in the separator tree

P l m P m r has some vertex that has not

b een inserted in the separator tree

Lemma Let I be an interval graph and let P be a

rooted skeleton path of I with roots l and r such that

only l is depleted Let m be a link in the skeleton

Finally the Kill step divides an interval graph into a

distinct from l and r If m and l are roots in P l m

number of subgraphs and from each one it removes

l being depleted in P l m and m and r are roots

vertices that have either b een depleted or cover the

in P m r m being depleted in P m r but not in

entire piece of the skeleton corresp onding to that sub

P l m then I P l m P m r

graph This reestablishes the preconditions of the Ho

0 0 0 0 0

mogenize step as describ ed in Lemma which is m m m p m where m is the number

m

applied to each of the resulting subgraphs of internal vertices that cover m in P

We now present an analysis of the p erformance of Pro of Sketch The whole selection pro cess selects

our algorithms in terms of ll The next lemma says no more than k log p log p O log p links of

l r l

that ll is lo cal to the subgraphs dened by the sepa the skeleton aside from l and r The depletion of

rators in the separator tree the endp oints can b e seen as making l depleted and

then applying Lemma rep eatedly Making l depleted

reduces the p otential of the graph by p p while

l l

Lemma Let I be an interval graph and let P be a

according to Lemma each insertion pro duces two

rooted skeleton path of I with extremity roots l and r

subgraphs further reducing the p otential by at least

Then any link m in P distinct from l and r can be in

unit By considering where ll might b e induced we

serted into the separator tree thus dividing I into two

can verify the amount of ll introduced can b e payed

subgraphs P l m with roots l and m and P m r

for by the decrease in p otential

with roots m and r so that the vertices internal to each

Sp ecic orders for the links selected by the Homog

of the subgraphs can only have l l to other vertices in

enize algorithm can result in smaller constants when

that same subgraph

computing ll and in b etter or worse orders in terms

of the depth of the separator tree

Let P l r b e an interval graph at the b eginning of

As long as the skeleton of a subgraph generated

a step either Homogenize Halving or Kill and let e

by the algorithm describ ed in this lemma has internal

b e a link of its skeleton path P distinct from its ro ots

vertices it satises all the preconditions for Lemma

l and r When inserting e into the graphs separator

regarding the next step in the algorithm

tree we require that

Lemma Halving Step Let I be an interval graph

i ll b etween vertices ro oted at l and r must have

and let P be a rooted skeleton path of I with at least

b een accounted for in some previous step if ei

links Let l and r be its roots l being possibly depleted

ther l or r is depleted and otherwise must b e

Moreover let p maxp p for al l links e in

e l r

accounted for in the current step

P Then HalvingIPlr inserts a link m into the

separator tree such that if l is depleted in P l m and

ii ll b etween vertices pinned at e and those pinned

m is depleted in P m r conditions i ii and iii

at l and r must b e accounted for in the current

are satised

step if at least one of e or l resp ectively r is

depleted

Pro of Sketch As long as P has more than two links

the Halving step will nd such a link m All that needs

iii A constant times the dierence in p otential b e

to b e shown is that the ll describ ed in conditions i

tween the original graph and the parts must b e

and ii can b e paid for with a constant times the

enough to pay in that step for any ll

dierence in p otential b etween I and the subgraphs

P l m and P m r as prescrib ed in iii

Since condition iii refers to the dierence in p o

For the last step in the algorithm the depleted

tential the total amount of ll allowed by iii is a

ro ot must b e the ro ot with largest ply The follow

constant times the p otential of the initial graph and

ing lemma makes sure that is the case by allowing us

thus O m In the next lemmas we show that the

to relab el them if needed

ab ove three invariants are maintained as we p erform

each of the three steps describ ed in Section

Lemma Let P be a rooted skeleton of an interval

Given an interval graph I the Homogenize step di

graph I Let l and r be the extremities of P r being

vides the problem of nding an order for the vertices

depleted Then if p p the depleted status of l and

l r

of I into a number of subproblems each of which has

r can be exchanged without increasing the potential of

the same desirable prop erty namely except for the ex

I

tremities the ply of every link in the skeleton of the

Coming into the Halving step we know that p was

e

subgraphs is at least half of that of the extremity of

larger than or equal to maxp p Now since p

l r m

the skeleton with the largest ply If this condition is

could b e larger than either of those plies we dont

already met the Homogenize step do es not insert any

have that condition anymore However we can still

links into the separator tree and we go into the next

use the fact that p minfp p g where l and r are

e l r

step of the algorithm with no depleted links

the extremities of the interval graph b eing considered

Lemma Homogenize Step Let I be an interval

Lemma Kil l Step Let I be an interval graph with

graph with a rooted skeleton path P Let l and r be

a rooted skeleton path P whose roots are l and r l be

roots of P neither being depleted and let p p

l r

ing depleted in I Let p p and p p for al l

l r e r

Then HomogenizeIPlr nds k O log p links of

l

links e in P Then Kil lIPlr nds k O log p

l

the skeleton P If k then the selected links can be

links that when inserted into the separator tree in or

inserted into the separator tree in any order dening

der from l to r satisfy conditions i and iii while

subgraphs P e e rooted at e and e e being

i i i i i

producing subgraphs K e e where e and e

l i i i i

depleted Moreover except for possibly e or e the

i i

are nondepleted roots i k and also paying

have ply greater than or links of the skeleton P

e e

i i+1

for l l between vertices in P e e and those in

i i

and conditions and p equal to half the larger of p

e e

P e e K e e which includes but is not

i+1 i

i i l i i

restricted to the l l described in ii i ii and iii can be satised

Pro of Sketch The pro of involves examining all p os present in the chordal graph These should b e elimi

sible sources of ll and verifying that they can b e paid nated b efore any of the other vertices of P l r thus

for with a constant times the decrease in p otential not introducing any ll

The lemmas that follow allow us to extend the anal

caused by this step

ysis we have for each of the interval graph algorithms

A chordal graph and in particular an interval graph

to the chordal case

I has at most n maximal cliques and thus at most n

no des in its clique tree Just b efore the rst recur

Lemma Let G be a chordal graph with a skeleton

sive invocation of Homogenize the three steps have

tree T Let r be a link to a leaf of T and l be the other

partitioned I into subgraphs that have skeletons with

extremity of the terminal branch of T that includes

no more than half the number of links in the skeleton

0

r Let l be a depleted root in P l r and let G be

of I Since the three steps add O log n no des to the

the graph obtained from G by removing al l vertices

separator tree and log n iterations of those three steps

of G that are internal to P l r Then the potential

might b e necessary the depth of the tree generated by

0

G G P l r

this algorithm is O log n Since every separator is

a clique of I the order pro duced has height at most

0

Let G G and P l r b e as describ ed in Lemma

O C log n where C is the size of the maximum

clique of I

Lemma If l is the only root in P l r then any

We can obtain similar b ounds on the amount of

order for the remaining vertices of P l r does not

work that is induced by the orders pro duced by our

induce any l l in G to vertices of G P l r

algorithm Our results are summarized in the next

lemma The work analysis can b e found in

Given a graph G with n vertices and m edges and

whose largest clique has size C we can show the fol

lowing results

Lemma Let I be an interval graph with n vertices

and m edges When applied to I the algorithm pre

sented in Section produces an order with height

The chordal version of the nested dissection al



gorithm Section pro duces an order with

O C log n l l O m and work O W I where C

p



is the size of the largest clique in I and W I is the

log n ll O C log n height and O m

minimum amount of work required to perform Gaus

The chordal version of our linear ll algorithm

sian elimination on I according to any order

Section pro duces an order with O C log n



height O m ll and O W G work where

Chordal graphs



W G is the minimum amount of work over all

In this section we show how to nd an elimina

p ossible orders for G

tion order for a chordal graph G by rep eatedly ap

plying one of the interval graph algorithms weve al

Exp erimental results

ready describ ed to branches of Gs skeleton This

We implemented the chordal version of the algo

treecontraction like approach was also used by Naor

rithm presented in Section and p erformed some ex

Naor and Schaer when developing algorithms for

p eriments

chordal graphs

Using either the algorithm in Section or Sec

tion we can derive an algorithm for chordal

Matrix Vertices Edges

graphs Consider a chordal graph G and its skele

b csstk

ton tree T The chordal graph algorithm consists of

b csstk

nding terminal branches of the skeleton T and apply

b csstk

ing an interval graph algorithm to the branches All

b csstk

terminal branches can b e pro cessed in parallel By re

b csstk

b csstk

iterating this pro cess we order the tree with O log n

nasasrb

calls to an interval graph algorithm

sf

The actual algorithm that is to b e applied to the

sf

path P l r is essentially the Kill step The only dif

Gx

ferences are that r is not a ro ot and we do not have

Gx

Gx

any sort of b ounds on the plies of the links in the

skeleton path This algorithm will divide P l r in a

number of subgraphs that will then b e used as input

Table Number of vertices and edges of the input

to an ordering algorithm for interval graphs

0

graphs

We add an imaginary link r to the skeleton of

0

P l r so that r is now the last link of the skele

0 0

The b csstk matrices used in our exp eriments come ton as opp osed to r Call r a ro ot Since p

r

from structural engineering problems and were ob the ply conditions in Lemma are trivially satised

tained from the HarwellBoeing collection and from and we can apply the Kill step to partition P l r in

Timothy Daviss University of Florida ro oted subgraphs with no depleted ro ots P l r do es

Collection the matrices were provided to Davis by not include any singlevertex subtrees that might b e

Nonzeros Nonzeros AMD Nonzeros Height Height AMD Height

Matrix AMD Post METIS Post Hybrid Post AMD Post METIS Post Hybrid Post

AMD METIS hybrid AMD METIS Hybrid

b csstk

b csstk

b csstk

b csstk

b csstk

b csstk

nasasrb

sf

sf

Gx

Gx

Gx

Table Results obtained by p ostpro cessing go o d existing orders relative to an AMD order

AMD Work Work AMD Work

Matrix AMD Post METIS Post Hybrid Post

AMD METIS Hybrid

b csstk e

b csstk e

b csstk e

b csstk e

b csstk e

b csstk e

nasasrb e

sf e

sf e

Gx e

Gx e

Gx e

Table Work results obtained by p ostpro cessing go o d existing orders relative to an AMD order

Roger Grimes at Bo eing The nasasrb matrix mo d p oint op erations involved in p erforming Gaussian

els the structure of the NASA Langley shuttle ro cket elimination according to each of the orders

b o oster while the sf matrices are used in the simula

Our results indicate that our algorithm usually pro

tion of an earthquake in the San Fernando Valley

duces an order that has a small amount of extra ll

The G matrices are h w grids The number of ver

when compared to the chordal completion it starts

tices and edges in each graph can b e found in Table

with In some cases our p ostpro cessing actually pro

duces small improvements on the number of nonzeros

We applied our algorithm as a p ost pro cessing step

Contrary to what we exp ected for most graphs the

to the orders pro duced by a version of the approximate

AMD orders were very parallel thus making it harder

minimumdegree heuristic AMD to the nested

for us to obtain signicant improvements in height It

dissection orders pro duced by METIS and to the

is interesting to notice that for grids with large asp ect

orders pro duced by a hybrid of mindegree and nested

ratio the AMD orders cannot b e directly parallelized

dissection obtained from Rothberg

In those test cases our orders induce a slightly lower

Table shows the results we obtained as a func

number of nonzero entries than the nested dissection

tion of the numbers obtained for an AMD order The

orders and a small constant factor higher than the

number of nonzeros includes entries that are in the

number of nonzeros induced by the AMD orders In

original graph as well as any ll entries ab ove and in

these cases the orders our algorithm pro duced are sig

cluding the diagonal The height entries corresp ond to

nicantly more parallel then the original AMD orders

the height of a minimum height order whose up dated

and only slightly worse than the nested dissection or

graph is identical to the chordal completion b eing con

ders The hybrid algorithm pro duces orders that are

sidered Given a chordal completion such a minimum

usually go o d in terms of b oth ll and height

height order and its height can b e easily computed

see

Concluding remarks

Table shows the amount of work that is oating

A number of interesting questions remain op en

2

co de from ftpftpciseuedupubumfpackAMD

3

co de from ftpftpcsumnedudeptuserskumarmetis Can we prove any b ounds on the p erformance of

J Jess and H Kees A data structure for parallel LU the minimumdegree heuristic on chordal or even in

decomp osition IEEE Transactions on Computers

terval graphs

G Karypis and V Kumar Metis Unstructured

Are there algorithms that obtain parallel orders

graph partitioning and sparse matrix ordering system

with ll within a constant factor of optimal for gen

httpwwwcsumnedu karypismetismetismainhtml

eral graphs

T Leighton and S Rao An approximate maxow mincut

Can we incorp orate the idea of sentinels into other

theorem for uniform multicommodity ow problems with

applications to approximation algorithms In Proceedings heuristics such as minimumdegree and nested dissec

of the th Annual Symposium on Foundations of Com

tion If so what ll and work b ounds can b e proved

puter Science pages Octob er

References

C Leiserson and J Lewis Orderings for parallel sparse

symmetric factorization In G Ro driguez editor Parallel

A Agrawal P Klein and R Ravi Cutting down on ll

Processing for Scientic Computing pages SIAM

using nested dissection provably go o d elimination order

Philadelphia PA

ings In A George J Gilb ert and J W H Liu editors

Graph Theory and Sparse Matrix Computation volume

J Lewis B Peyton and A Pothen A fast algorithm for

of the IMA Volumes in Mathematics and its Applications

reordering sparse matrices for parallel factorization SIAM

pages SpringerVerlag New York NY

Journal on Scientic and Statistical Computing

C Ashcraft and J Liu Robust ordering of sparse matri

ces using multisection Technical Rep ort ISSTECH

R J Lipton D J Rose and R E Tarjan Generalized

Bo eing information and supp ort services

nested dissection SIAM Journal of Numerical Analysis

B Aspvall Minimizing elimination tree height can increase

ll more than linearly Information Processing Letters

R J Lipton and R E Tarjan Applications of a planar

separator theorem SIAM Journal on Computing

P Berman and G Schnitger On the p erformance of the

minimum degree ordering for Gaussian elimination SIAM

J W Liu Mo dication of the minimum degree algorithm

Journal of Matrix Analysis and Applications

by multiple elimination ACM Transactions on Mathemat

January

ical Software

Hans L Bo dlaender John R Gilb ert HjalmtyrHafsteins

J W Liu Reordering sparse matrices for parallel elimina

son and Ton Kloks Approximating

tion Parallel Computing

frontsize and shortest elimination tree Journal of Algo

J W Liu and A Mirzaian A linear reo dering algorithm

rithms March

for parallel pivoting of chordal graphs SIAM Journal of

C Bornstein B Maggs G Miller and R Ravi Paral

Discrete Mathematics

lel Gaussian elimination with linear work and ll Techni

J Naor M Naor and A A Schaer Fast parallel al cal Rep ort CMUCS School of Computer Science

gorithms for chordal graphs In Proceedings of the th Carnegie Mellon University Pittsburgh PA May

Annual ACM Symposium on Theory of Computing pages

P Buneman A characterization of rigid circuit graphs

Discrete Mathematics

D OHallaron and J Shewchuk Prop erties of a family of

F R K Chung and D Mumford Chordal completions

parallel nite element simulations Technical Rep ort CMU

of planar graphs Journal of Combinatorial Theory B

CS School of Computer Science Carnegie Mellon

University Pittsburgh PA

T Davis P Amestoy and I Du An approximate min

V Pan Parallel solutions of sparse linear and path sys

imum degree ordering algorithm Technical Rep ort TR

tems In John Reif editor Synthesis of Parallel Algo

Computer and Information Sciences Department

rithms pages Morgan Kaufmann San Mateo

University of Florida Gainesville FL December

CA

G A Dirac On rigid circuit graphs Abh Math Sem

V Pan and J Reif Ecient parallel solution of linear sys

Univ Hamburgh

tems In Proceedings of the th Annual ACM Symposium

K A Gallivan et al Parallel algorithms for matrix com

on Theory of Computing pages Providence RI

putations SIAM Philadelphia PA

May ACM

F Gavril The intersection graph of subtrees of a tree

S Parter The use of linear graphs in Gaussian elimination

are exactly the chordal graphs Journal of Combinatorial

SIAM Review

Theory B

A Pothen The complexity of optimal elimination trees

J A George Nested dissection of a regular nite element

Technical Rep ort CS Department of Computer Sci

mesh SIAM Journal of Numerical Analysis

ence The Pennsylvania State University University Park

PA

J A George and J W Liu An automatic nested dissec

D Rose Triangulated graphs and the elimination pro

tion algorithm for irregular nite element problems SIAM

cess Journal of Mathematical Analysis and Applications

Journal of Numerical Analysis

J R Gilb ert and R E Tarjan The analysis of a nested dis

M Yannakakis Computing the minimum llin is NP

section algorithm Numerische Mathematik

complete SIAM Journal of Algebraic and Discrete Meth

ods

B Hendrickson and E Rothberg Improving the runtime

and quality of nested dissection ordering Technical Rep ort

SAND Sandia National Labs Albuquerque NM

March To app ear in the SIAM Journal on Scientic

and Statistical Computing

A Homann M Martin and D Rose Complexity b ounds

for regular nite dierence and nite element grids SIAM

Journal of Numerical Analysis