Technische Universität Chemnitz

Ulrich Elsner

Graph Partitioning

A survey


SFB 393

Sonderforschungsbereich

Numerische Simulation

auf massiv parallelen Rechnern

Preprint SFB 393, December

Contents

Contents
Notation

The problem
  Introduction
  Graph Partitioning
  Examples
    Partial Differential Equations
    Sparse Matrix-Vector Multiplication
    Other Applications
  Time vs. Quality

Miscellaneous Algorithms
  Introduction
  Recursive Bisection
  Partitioning with geometric information
    Coordinate Bisection
    Inertial Bisection
    Geometric Partitioning
  Partitioning without geometric information
    Introduction
    Graph Growing and Greedy Algorithms
    Kernighan-Lin Algorithm
    Spectral Bisection
    Spectral Bisection for weighted graphs
    Spectral Quadri- and Octasection
    Multilevel Spectral Bisection
    Algebraic Multilevel
  Multilevel Partitioning
    Coarsening the graph
    Partitioning the smallest graph
    Projecting up and refining the partition
  Other methods

Bibliography

Notation

Below we list most of the symbols used in the text together with a brief explanation of their meaning.

(V, E)  an undirected graph with vertices V and edges E.

E = {e_ij | there is an edge between v_i and v_j}, the set of edges.

V = {v_i | 1 ≤ i ≤ n}, the set of vertices.

W_E = {w_e(e_ij) ∈ N | e_ij ∈ E}, the weights of the edges in E.

W_V = {w_v(v_i) ∈ N | v_i ∈ V}, the weights of the vertices in V.

deg(v_i)  the degree of vertex v_i: deg(v_i) = |{v_j ∈ V | e_ij ∈ E}|, the number of adjacent vertices.

L(G)  the Laplacian matrix of the graph G = (V, E):
      l_ij = -1         if i ≠ j and e_ij ∈ E,
      l_ij = 0          if i ≠ j and e_ij ∉ E,
      l_ij = deg(v_i)   if i = j.

A(G)  the adjacency matrix of the graph G = (V, E):
      a_ij = 1  if i ≠ j and e_ij ∈ E,
      a_ij = 0  otherwise.

∅  the empty set.

|·|  for X a set, |X| is the number of elements in X.

‖·‖_p  for x ∈ C^n a vector, ‖x‖_p is the p-norm of x:
       ‖x‖_p = ( Σ_{i=1}^{n} |x_i|^p )^{1/p}.

‖·‖ = ‖·‖_2, the Euclidean norm.

e  the vector (1, ..., 1)^T.

O(f(n)) = { g(n) | there exist C and n_0 with |g(n)| ≤ C·f(n) for all n ≥ n_0 }, the Landau symbol.

A ∪̇ B = A ∪ B with A ∩ B = ∅, the disjoint union. The dot just emphasizes that the two sets are disjoint.

I, I_n  the identity matrix of size n. If the size is obvious from the context, I is used.

σ(A)  the spectrum of A.

diag(w)  a diagonal matrix with the components of the vector w on the diagonal.

The problem

Introduction

While the performance of classical von Neumann computers has grown tremendously, there is always demand for more than the current technology can deliver. At any given moment it is only possible to put so many transistors on a chip, and while that number increases every year, certain physical barriers loom on the horizon: simply put, an electron has to fit through each data path, and the information has to travel a certain distance in each cycle. These limits cannot be surpassed by the technology known today.

But while it is certainly important to pursue new methods and technologies that can push these barriers a bit further, or even break them, what can be done to get bigger performance now?

Mankind has for eons used teamwork to achieve things that a single human being is incapable of, from hunting as a group to building the pyramids. The same idea can be used with computers: parallel computing. Invented, or rather revived, decades ago, parallel computers achieve performances that are impossible or much too expensive to achieve with single-processor machines. Therefore, parallel computers of one architecture or another have become increasingly popular with the scientific and technical community.

But just as working together as a team does not always work efficiently (ten people cannot dig a posthole ten times as fast as one) and creates new problems that do not occur when one is working alone (how to coordinate the peasants working on the pyramid?), obstacles unknown to single-processor programmers occur when working on a parallel machine.

First, not all parts of a problem can be parallelized; some have to be executed serially. This limits the total possible speedup (Amdahl's law) for a given problem of constant size. Quite often, though, the sequential part of the work tends to be independent of the size of the input, and so for a bigger problem, adequate for sharing the work among more processors, a higher speedup is possible. Problems of this kind are called scalable.

Second, the amount of work done has to be distributed evenly amongst the processors. This is even more necessary if, during the course of the computation, some data has to be exchanged or some process synchronized: in this case some of the processors have to wait for others to finish. This load balancing might be easy for some problems but quite complicated for others. The amount of work per processor might vary during the course of the computation, for example when some critical point needs to be examined more closely. So perfect balance in one step does not have to mean balance in the next.

This could easily be avoided by moving some of the work to another, underutilized processor. But this always means communication between different processors. And communication is, alas, time-consuming.

This brings us to the third obstacle. In almost all real-life problems, the processors regularly need information from each other to continue their work. This has two effects. As mentioned above, these exchanges entail synchronization points, making it necessary for some of the processors to wait for others. And, of course, communication itself takes some time. Depending on the architecture, exchanging even one bit of information between two processors might be many times slower than accessing a bit in the processor itself. There might be a high startup time, such that exchanging hundreds of bytes takes almost the same amount of time as exchanging one byte. It might or might not be possible for exchanges between two disjoint pairs of processors to take place at the same time.

While all this is highly machine-dependent, it is evident that it is desirable to partition the problem (or the data) in such a way that not only is the work distributed evenly but, at the same time, communication is minimized. As we will see, for certain problems this leads to the problem of partitioning a graph. In its simplest form this means that we divide the vertices of a graph into equal-sized parts such that the number of edges connecting these parts is minimal.

Graph Partitioning

First, some notation (cf. the references).

A graph (V, E) consists of a number of vertices V = {v_i | 1 ≤ i ≤ n}, some of which are connected by edges in E = {e_ij | there is an edge between v_i and v_j}. Other names for vertices include nodes, grid points or mesh points. In order to avoid excessive subscripts we will sometimes equate i with v_i and (i, j) with e_ij if there is no danger of confusion.

Given a subset V' ⊆ V of vertices, the induced subgraph (V', E') contains all those edges E' ⊆ E that join two vertices in V'.

Both vertices and edges can have weights associated with them:

W_E = {w_e(e_ij) ∈ N | e_ij ∈ E} are the weights of the edges, and W_V = {w_v(v_i) ∈ N | v_i ∈ V} are the weights of the vertices.

The weights are nonnegative integers. This often allows for a more efficient analysis or implementation of algorithms; for example, sorting a list of integers of restricted size can be done using the bucket-sort algorithm with its O(n) performance. But this restriction is not as big as it seems. For example, positive rational weights can easily be mapped to N by scaling them with the smallest common multiple of the denominators. If no weights are given, all weights should be considered to be 1.

The simplest form of graph partitioning, unweighted graph bisection, is this: given a graph consisting of vertices and edges, divide the vertices into two sets of equal size such that the number of edges between these two parts is minimized. Or, more formally:

Let G = (V, E) be a graph with |V|, the number of elements in V, even. Find a partition V_1, V_2 of V, i.e., V_1 ∪ V_2 = V and V_1 ∩ V_2 = ∅ (this disjoint union is sometimes expressed by V_1 ∪̇ V_2 = V), with

    |V_1| = |V_2|

such that

    |{e_ij ∈ E | v_i ∈ V_1 and v_j ∈ V_2}|

is minimized among all possible partitions of V with equal size.
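Both quantities in this definition are straightforward to evaluate for a given partition; a minimal sketch (function names are ours, not from the survey):

```python
def cut_size(edges, part1):
    # Number of edges with exactly one endpoint in part1 (the set V1).
    return sum(1 for (i, j) in edges if (i in part1) != (j in part1))

def is_bisection(vertices, part1):
    # Checks |V1| = |V2| for the partition (part1, vertices - part1).
    return len(part1) * 2 == len(vertices)

# A 4-cycle 1-2-3-4-1: splitting it into {1, 2} and {3, 4} cuts two edges.
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(cut_size(edges, {1, 2}))             # 2
print(is_bisection({1, 2, 3, 4}, {1, 2}))  # True
```

Evaluating the objective is cheap; the hard part, as discussed later, is searching the exponential number of balanced partitions.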

A more general formulation involves weighted edges, weighted vertices and p disjoint subsets. Then the sum of the weights of the vertices in each of the subsets should be equal, while the sum of the weights of the edges between these subsets is minimized. Again, more formally:

Let G = (V, E) be a graph with weights W_V and W_E, with Σ_{v_i ∈ V} w_v(v_i) divisible by p. Find a p-partition V_1, V_2, ..., V_p of V, i.e.,

    ∪_{i=1}^{p} V_i = V  and  V_i ∩ V_j = ∅ for all i ≠ j,

with

    Σ_{v_i ∈ V_j} w_v(v_i) equal for all j ∈ {1, ..., p},

such that

    Σ w_e(e_ij),  summed over all e_ij ∈ E with v_i ∈ V_q, v_j ∈ V_r and q ≠ r,

is minimized among all possible partitions of V.
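The weighted objective and the balance constraint can be evaluated the same way; a sketch under our own data conventions (edge weights as a dict keyed by vertex pairs, the partition as a vertex-to-part map):

```python
def weighted_cut(edge_weights, part_of):
    # Sum of w_e over edges whose endpoints lie in different partitions.
    return sum(w for (i, j), w in edge_weights.items()
               if part_of[i] != part_of[j])

def partition_weights(vertex_weights, part_of, p):
    # Sum of vertex weights per partition; balanced iff all entries are equal.
    totals = [0] * p
    for v, w in vertex_weights.items():
        totals[part_of[v]] += w
    return totals

part_of = {1: 0, 2: 0, 3: 1, 4: 1}
print(weighted_cut({(1, 2): 5, (2, 3): 1, (3, 4): 2, (4, 1): 1}, part_of))  # 2
print(partition_weights({1: 2, 2: 1, 3: 1, 4: 2}, part_of, 2))              # [3, 3]
```

Note that the heavy edge (1, 2) stays inside a part, so only the two light crossing edges contribute to the cut.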

The quantities minimized in the two formulations above are also referred to as the cut size, implying the picture that one takes the whole graph, cuts these edges, and is left with the subgraphs induced by the subsets of the vertices. Other names used include edge-cut or cost of the partition.

As described above, we are looking for a small subset of edges that, when taken away, separates the graph into two disjoint subsets. This is called finding an edge separator.

Instead of edges, one can also look for a subset of vertices, the so-called vertex separator, that separates the graph:

Given a graph G = (V, E), find three disjoint subsets V_1, V_2 and V_S with

    V_1 ∪ V_2 ∪ V_S = V,
    |V_S| small,
    |V_1| ≈ |V_2|,
    no edge joins V_1 and V_2.

Depending on the application, either the edge separator or the vertex separator is needed. Fortunately, it is easy to convert these two kinds of separators into each other.

For example, assume we have an edge separator but need a vertex separator. Consider the graph G' consisting only of the separating edges and their endpoints. Any vertex cover of G' (that is, any set of vertices that includes at least one endpoint of every edge in G') is a vertex separator for the graph. Since G' is bipartite, we can compute the smallest vertex cover efficiently by bipartite matching.
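The conversion in the last paragraph can be sketched as follows: compute a maximum matching of the bipartite separator graph with simple augmenting paths, then extract a minimum vertex cover via König's theorem. This is our own illustration, not code from the survey; `sep_edges` holds the separator edges as pairs (u, v) with u ∈ V_1 and v ∈ V_2:

```python
def vertex_separator(sep_edges):
    # sep_edges: separator edges (u, v) with u in V1 and v in V2.
    left = {u for u, _ in sep_edges}
    adj = {u: [] for u in left}
    for u, v in sep_edges:
        adj[u].append(v)

    match_l, match_r = {}, {}  # matched partner of each left / right vertex

    def augment(u, seen):
        # Try to find an augmenting path starting at the left vertex u.
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if v not in match_r or augment(match_r[v], seen):
                    match_l[u], match_r[v] = v, u
                    return True
        return False

    for u in left:
        augment(u, set())

    # Koenig's theorem: alternate from the unmatched left vertices;
    # the cover is (unreached left vertices) + (reached right vertices).
    reached_l = {u for u in left if u not in match_l}
    reached_r = set()
    stack = list(reached_l)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in reached_r:
                reached_r.add(v)
                w = match_r.get(v)
                if w is not None and w not in reached_l:
                    reached_l.add(w)
                    stack.append(w)
    return (left - reached_l) | reached_r

print(vertex_separator([(1, 3), (2, 3)]))  # {3}
```

The returned set touches every separator edge, so removing it disconnects V_1 from V_2, and by König's theorem its size equals that of the maximum matching and is therefore minimal.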

Examples

In the following examples we assume that we have a parallel computer with distributed memory, that is, each processor has its own memory. If it needs to access data in some other processor's memory, communication is necessary. Many of the parallel computers of today are of this kind.

Partial Differential Equations

Many of the methods for solving partial differential equations (PDEs) are grid-oriented, i.e., the data is defined on a discrete grid of points, finite elements or finite volumes, and the calculation consists of applying certain operations to the data associated with all the points (elements, volumes) of the grid. These calculations usually also involve some of the data of the neighboring elements.

Figure: Two partitions (a good partition and a bad partition).


Figure: A finite element mesh and the corresponding dependency graph.

Distributing the problem to parallel processors is done by partitioning the grid into subgrids and solving the associated subproblems on the different processors. Since the elements on the borders of the subgrids need to access data on the other processors, communication is necessary. Under the reasonable assumption that the amount of communication between two neighbors is constant, one has to minimize the border length, i.e., the number of neighbors in different partitions (cf. the figure above).

To formulate the problem as a graph partitioning problem, we look at the so-called dependency graph of the grid (cf. the figure above).

Every grid point (or finite element) has a corresponding vertex in this graph. The weight of the vertex is proportional to the amount of computational work done on this grid point. Quite often the weights of all vertices are equal.

For each pair of neighboring (or otherwise dependent) grid points, the corresponding vertices are connected by an edge. The weight of this edge is proportional to the amount of communication between these two grid points. Again, quite often the weights of all edges are equal.

For finite element meshes (a mesh is a graph embedded into, usually, two- or three-dimensional space, so that the coordinates of the vertices are known), the unweighted dependency graph is equal to the dual graph of the mesh.

Now the optimal grid partitioning can simply be found by solving the graph partitioning problem on the dependency graph and distributing a grid point (or finite element) to the j-th processor if its corresponding vertex is in the j-th partition of the graph.

Sparse Matrix-Vector Multiplication

A sparse matrix is a matrix containing so many zero entries that savings in space or computational work can be achieved by taking them into consideration in the algorithm.

A fundamental operation in many matrix-related algorithms is the multiplication of a matrix A with a vector x. For example, in iterative methods for the solution of linear systems (like CG, QMR or GMRES), the amount of work spent on this operation can dominate the total amount of work done.

Now we want to distribute the matrix on a parallel computer with p processors. One of the ways that a sparse matrix can be distributed is row-wise, i.e., each row of the matrix is assigned to one processor. If the i-th row of A is kept on a processor, then so is the i-th element x_i of the vector x.

How should the rows be distributed amongst the processors for the workload to be balanced and the necessary communication minimized?

Consider the workload: to calculate the i-th element of the vector y = Ax we need to compute

    y_i = Σ_{j=1}^{n} a_ij x_j.

But since A is sparse, it suffices to calculate

    y_i = Σ_{j : a_ij ≠ 0} a_ij x_j.

So the amount of work to calculate y_i is proportional to the number of nonzero elements in the i-th row of A.

But as we see in the sum above, we not only need the values of the i-th row of A but also some of the x_j, namely those where a_ij ≠ 0. Now if those values are stored on the same processor, we are in luck. Otherwise we have to communicate with the processor where those values are stored.
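This bookkeeping is easy to make concrete. The sketch below counts, for a given row distribution, the per-processor work (nonzeros handled) and the number of accesses to vector entries owned by another processor; the data layout (sparsity pattern as a dict of column sets, distribution as a row-to-processor map) is our own convention:

```python
def spmv_cost(rows, owner):
    # rows: sparsity pattern, row index -> set of column indices j with a_ij != 0
    # owner: row index -> processor storing that row (and the entry x_i)
    work = {}  # nonzeros (multiply-adds) handled per processor
    comm = 0   # accesses to an x_j that lives on another processor
    for i, cols in rows.items():
        p = owner[i]
        work[p] = work.get(p, 0) + len(cols)
        comm += sum(1 for j in cols if owner[j] != p)
    return work, comm

# Two tightly coupled row pairs (0, 1) and (2, 3):
rows = {0: {0, 1}, 1: {0, 1}, 2: {2, 3}, 3: {2, 3}}
print(spmv_cost(rows, {0: 0, 1: 0, 2: 1, 3: 1}))  # ({0: 4, 1: 4}, 0)
print(spmv_cost(rows, {0: 0, 1: 1, 2: 0, 3: 1}))  # ({0: 4, 1: 4}, 4)
```

Both distributions balance the work perfectly, but splitting the coupled pairs across processors (second call) forces four remote accesses where the first needs none.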


Figure: Distribution of matrix rows to processors.

To minimize communication, we have to try to store those values, and consequently the associated rows, on the same processor. Thus we want to minimize the occurrence of the following: the i-th row of A is stored on one processor, a_ij ≠ 0, but the j-th row of A is stored on another processor.

The figure shows a bad (on the left side) and a good (on the right) distribution of matrix rows to processors. In each matrix the first four rows are distributed to one processor, the second four to the next, and so on. Elements that cause communication are marked; note that there are many more in the left matrix. But note also that the right matrix is just the left matrix with permuted rows and columns and a different row-processor assignment. Permuting rows and columns (together with the matching processor distribution) does not change the amount of work and communication necessary; it is just easier to see consecutive rows as belonging together. So the smaller number of marked elements in the right matrix demonstrates how much communication can be saved by a proper distribution.

Again, the problem can be reformulated as a graph partitioning problem. In order to simplify the example, we will look at a structurally symmetric matrix, i.e., if a_ij ≠ 0 then so is a_ji ≠ 0.

The graph of a matrix A ∈ R^{n×n} is defined as follows. It consists of n vertices v_1, ..., v_n. There is an edge between v_i and v_j (with i ≠ j) if a_ij ≠ 0.

The degree of a vertex is the number of direct neighbors it has. For our purpose we define the weight w_v(v_i) of a vertex v_i as the degree of this vertex plus 1. Then the weight of v_i is equal to the number of nonzero elements of the i-th row of the matrix, and therefore proportional to the amount of work to calculate y_i.

The edges all have unit weight. Each edge e_ij symbolizes a data dependency between row i and row j (and the corresponding vector element x_j), and thus a possible communication.


Figure: Partitioned graphs of the matrices in the figure above.

Partitioning the weighted graph into p parts with the minimal number of edges cut allows us to find a good distribution of the sparse matrix on the parallel processors: if v_i is in the j-th subpartition, we store the i-th row of the matrix on processor j.

Since the weight of the vertices is proportional to the work, and the partitions are of equal weight, perfect load balance is achieved. And since each cut edge stands for necessary communication between two processors, a minimized cut size means that communication is minimized.

The figure shows the partitioned graphs of the matrices above. The numbers at the vertices correspond to the row/column number in the matrix, and each cut edge corresponds to a marked element in the matrix. It is now easy to see that both matrices are essentially the same, just with permuted rows/columns, leading to a different row-processor assignment.

Other Applications

While both detailed examples above stem from the area of parallelizing mathematical algorithms, graph partitioning is applicable to many other, quite different problems and areas as well. One of the first algorithms, the Kernighan-Lin algorithm, was developed to assign the components of electronic circuits to circuit boards such that the number of connections between boards is minimized. This algorithm is, with some modifications, still much in use and will be discussed in more detail in a later section.

Other applications include VLSI (Very Large Scale Integration) design, CAD (Computer Aided Design), decomposition or envelope reduction of sparse matrices, hypertext browsing, geographic information services, and the physical mapping of DNA (DeoxyriboNucleic Acid, basic constituent of the gene); see, e.g., the literature and references therein.

Time vs. Quality

In the sections above we have always talked about minimizing the cut size under certain conditions. But for all but some well-structured or small graphs, real minimization is infeasible because it would take too much time. Finding the optimal solution of the graph partitioning problem is, for all nontrivial cases, known to be NP-complete.

Very simply put, this means that there is no known algorithm that is much faster than trying all possible combinations, and there is little hope that one will be found. A little more exactly (but still ignoring some fundamentals of NP theory), it means that graph bisection falls into a huge class of problems, all of which can be transformed into each other, and for which there are no known algorithms that solve the problem in time polynomial in the size of the input: given a graph (V, E) without special properties and k = |V| + |E|, it is suspected that there exists no algorithm for solving the graph bisection problem which runs in O(k^m) time for any given m ∈ N.

So instead of finding the optimal solution, we resort to heuristics. That is, we try to use algorithms which may not deliver the optimal solution every time, but which will give a good solution at least most of the time.

As we will see, there often is a tradeoff between the execution time and the quality of the solution. Some algorithms run quite fast but find only a solution of medium quality, while others take a long time but deliver excellent solutions, and yet others can be tuned between both extremes.

The choice of time vs. quality depends on the intended application. For VLSI design or network layout it might be acceptable to wait for a very long time, because an even slightly better solution can save real money.

On the other hand, in the context of sparse matrix-vector multiplication we are only interested in the total time. Therefore the execution time for the graph partitioning has to be less than the time saved by the faster matrix-vector multiplication. If we only use a certain matrix once, a fast algorithm delivering only medium-quality partitions might overall be faster than a slower algorithm with better-quality partitions. But if we use the same matrix, or different matrices with the same graph, often, the slower algorithm might well be preferable.

In fact, there are even more factors to consider. Up to now we wanted the different partitions to be of the same weight. As we will see in a later section, it might be advantageous to accept partitions of slightly different size in order to achieve a better cut size.

All this should demonstrate that there is no single best algorithm for all situations, and that the different algorithms described in the following chapters all have their applications.

Miscellaneous Algorithms

Introduction

There are many different approaches to graph bisection. Some of them depend on the geometrical properties of the mesh and are thus only applicable to some problems, while others only need the graph itself and no additional information. Some take a local view of the graph and only try to improve a given partition, while others treat the problem globally. Some are strictly deterministic, always giving the same result, while others use quite a lot of random decisions. Some apply graph-theoretic methods, while others just treat the problem as a special instance of some other problem, like nonlinear optimization.

In this chapter we will survey some of these algorithms.

The selection of the algorithms in this chapter is somewhat arbitrary, but we have tried to include both the more commonly used ones and those that are of a certain historical interest. For some other summaries of algorithms, see the references.

In the interest of simplicity, the algorithms will be presented for unweighted graphs wherever the generalization to weighted graphs is easy. We also tend to ignore technicalities like dealing with unconnected graphs, which have to be considered in real implementations of these algorithms.

Recursive Bisection

As we will see, most of the algorithms are (at least originally) designed for bisection only. But the general formulation of the graph partitioning problem involves finding a p-partition.

It turns out that in many real-life applications p is a power of 2, i.e., p = 2^k; for example, most parallel computers have 2^k processors. So a natural idea is to work recursively: first partition the graph into two partitions, then partition each of these two into two subpartitions, and so on (see the figure).
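The recursive scheme can be written down generically, with any bisection routine plugged in. A sketch (ours, not from the survey; `bisect` is a stand-in for one of the algorithms discussed in this chapter):

```python
def recursive_partition(vertices, bisect, k):
    # Produce p = 2**k parts by recursively bisecting each half.
    if k == 0:
        return [vertices]
    half1, half2 = bisect(vertices)
    return (recursive_partition(half1, bisect, k - 1)
            + recursive_partition(half2, bisect, k - 1))

# Toy bisector: split a vertex list in the middle.
split = lambda vs: (vs[:len(vs) // 2], vs[len(vs) // 2:])
print(recursive_partition(list(range(8)), split, 2))  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

The driver is trivial; the quality of the result depends entirely on the bisector and, as shown next, on whether recursive bisection is adequate at all.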

But even if we could find a perfect bisection, will this give a partition with nearly the same cut size as a proper p-way partition? As H. Simon and S. Teng have shown, the answer is: "No, but...". In this section we will recapitulate their results.

Figure: Recursive Bisection.

Figure: Example of a graph with a good 4-way partition but a bad recursive bisection.

Recursive bisection can deliver nearly arbitrarily bad cut sizes compared to p-partitions. But that might not be as bad as it sounds. For some important classes of graphs, recursive bisection works quite well. And if one does not insist on partitions of exactly equal size, it is possible to use recursive bisection to find partitions that are nearly as good as the ones found by p-way partitioning.

First the bad news, which we will demonstrate by an example: a graph with n vertices that has a constant cut size under a 4-way partition, but for which recursive bisection produces a cut size of O(n²).

Consider the graph sketched in the figure above. Its eight subgraphs A_i, B_i (i = 1, ..., 4) are cliques, i.e., totally connected graphs, with A_i having (1/8 + ε_i)n vertices and B_i having (1/8 − ε_i)n vertices, where the fixed ε_i ≠ 0 have the following properties:

Figure: A real-life counterexample for recursive bisection: (a) partition by recursive bisection, (b) optimal partition.

ε_i ≠ ε_j for all i ≠ j ∈ {1, 2, 3, 4}, and ε_i n ∈ N for i ∈ {1, ..., 4}. One possible combination is to choose suitable ε_1, ε_2, ε_3 and to set ε_4 = −(ε_1 + ε_2 + ε_3).

The optimal 4-way partition decomposes the graph into the A_i ∪ B_i, i = 1, ..., 4. The total cut size is obviously constant, independent of n.

In contrast, recursive bisection first decomposes the graph into ∪_{i=1}^{4} A_i and ∪_{i=1}^{4} B_i. But then, at the next recursion level, at least one of the A_i and one of the B_i will have to be cut: the conditions on the ε_i ensure that it is not possible to combine any of the A_i, A_j, B_i, B_j to exactly the proper size. Since the A_i and B_i are cliques, cutting even one vertex out of one of them increases the cut size by roughly the number of vertices in that A_i resp. B_i, and rebalancing requires moving a constant fraction of the vertices. So the cost of the partition is O(n²).

This example can easily be adjusted for general p-partitions. And the figure above shows that this problem does not only occur in contrived examples: partitioning a regular grid into equal parts using perfect recursive bisection (Figure a) leads to a noticeably larger cut size than the optimal partition (Figure b), even though the intermediate cuts in Figure (a) are optimal.

Now for the good news. For some of the more common classes of graphs, it is proved in the literature that recursive bisection is not much worse than real p-partitioning. In this section we assume that p-partitioning and bisection deliver the optimal solution and are not mere heuristics.

A planar graph is a graph that can be drawn on a piece of paper so that no edges cross each other. Discretizations of two-dimensional structures are often planar.

In the finite element method, the domain of the problem is subdivided into a mesh of polyhedral elements. A common choice for an element is a d-dimensional simplex, i.e., a triangle in two dimensions. In classical finite element methods it is usually a requirement for numerical accuracy that the triangulation is uniformly regular, that is, that the simplices are well shaped. A common shape criterion used in mesh generation is an upper bound on the aspect ratio of the simplices. This term has many definitions, which are roughly equivalent. For the following theorem, the aspect ratio of a simplex T is the radius of the smallest sphere containing T divided by the radius of the largest sphere that can be inscribed in T.

For planar graphs it is shown in the cited work that the solution found by recursive bisection has a cut size that is, in the worst case, only O(√(n/p)) times bigger than that of the optimal p-partition (with n = |V|). With a factor of O((n/p)^{1−1/d}), the same statement holds for well-shaped meshes in d dimensions.

Now what happens if we do not insist on partitions of exactly the same size, but allow for some imbalance?

Call a p-partition relaxed if we allow the subpartitions V_i to be a little bigger than |V|/p, e.g., |V_i| ≤ (1 + ε)|V|/p.

Then a theorem in the cited work states, among other things, that if the cut size of the optimal p-partition is C, then one can find a relaxed p-partition with a cut size of O(C log p) using recursive partitioning. This is a more theoretical result, since the execution time may be exponential in n. But the theorem also shows that there is an algorithm using recursive partitioning that runs in polynomial time and that will find a relaxed p-partition with a cut size of O(C log p log n).

This shows that, by allowing a little imbalance in the partition sizes, one can continue to use recursive bisection without paying too much of a penalty in the cut size compared to real p-partitioning.

Partitioning with geometric information

Sometimes there is additional information available that may help to solve the problem. For example, in many cases, like the solution of PDEs (cf. the section above), the graph is derived from some structure in two- or three-dimensional space, so geometric coordinates can be attached to the vertices, and therefore the geometric layout of the graph is known. In this case the graph is also called a mesh.


Figure: Problems with Coordinate Bisection.

The algorithms described in this section implicitly assume that vertices that are close together physically are also close together in the mesh, i.e., connected by a short path. In fact, they make no use of the information about the edges at all. This limits their applicability to problems where such geometric information is both available and meaningful. It also implies that these algorithms cannot easily be extended to graphs with weighted edges.

But for these problems, geometric partitioning techniques are cheap methods that nevertheless produce acceptable partitions.

All of these methods work by finding some structure that divides the underlying space into two parts in such a way that each part contains half of the points (or, for weighted vertices, half of the weight). If some of the points lie on the structure, some tie-breaking mechanism will have to be used. Often this structure is a hyperplane (i.e., a plane in three dimensions, a line in two), but it can also be some other structure, like a sphere.

Coordinate Bisection

Coordinate bisection is the simplest coordinate-based method. It simply consists of finding the hyperplane orthogonal to a chosen coordinate axis (e.g., the y-axis) that divides the points into two equal parts. This is done by looking only at the y-coordinates and finding a value ȳ such that half of the points have a y-value less than ȳ and half of the points have a y-value bigger than ȳ. The separating structure in this case is the hyperplane orthogonal to the y-axis containing the points with y = ȳ.

Recursive application of this technique can be done in two ways. Always choosing the same axis will lead to thin strips and is usually undesirable, since it leads to long boundaries and thus probably to a high cut size (see Figure a).

Figure: Inertial bisection in two dimensions.

A better way is to alternately bisect along the x-, y- (and z-) coordinates. This leads to a better aspect ratio and hopefully to a lower cut size (see Figure b).

One of the problems of coordinate bisection is that it is coordinate-dependent: in another coordinate system, the same structure may be partitioned quite differently (see Figure c). The so-called inertial bisection, described in the next section, tries to remedy this problem by choosing its own coordinate system.
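Recursive coordinate bisection with alternating axes fits in a few lines. A sketch for 2D point sets (our own illustration; ties at the median are broken by the sort order):

```python
def coordinate_bisection(points, depth, axis=0):
    # Split `points` into 2**depth groups, alternating the splitting axis.
    if depth == 0:
        return [points]
    pts = sorted(points, key=lambda p: p[axis])  # order along the chosen axis
    mid = len(pts) // 2                          # median split: equal halves
    nxt = (axis + 1) % 2                         # alternate between x and y
    return (coordinate_bisection(pts[:mid], depth - 1, nxt)
            + coordinate_bisection(pts[mid:], depth - 1, nxt))

grid = [(x, y) for x in range(4) for y in range(4)]
parts = coordinate_bisection(grid, 2)
print([len(p) for p in parts])  # [4, 4, 4, 4]
```

On this regular 4x4 grid, the alternating splits recover the four quadrants; a full sort is used here for brevity, though a linear-time median selection would suffice.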

Inertial Bisection

The basic idea of inertial bisection (see, e.g., the references) is the following: instead of choosing a hyperplane orthogonal to some fixed coordinate axis, one chooses an axis that runs through the "middle" of the points.

Mathematically, we choose the line L in such a way that the sum of the squares of the distances of the mesh points to the line is minimized.

The physical interpretation is that the line is the axis of minimal rotational inertia. If the domain is nearly convex, this axis will align itself with the overall shape of the mesh, and the mesh will have a small spatial extent in the directions orthogonal to the axis. This hopefully minimizes the footprint of the domain in the hyperplane and therefore the cut size.

We state the algorithm in two dimensions. The generalization to three dimensions and to weighted vertices is easy.

Let P x y i n b e the p osition of the p oints

i i i

Let the line L b e given by a p oint P x y and a direction u v w with

p

2 2

v w such that L fP u j Rg see also Figure kuk

2

Since L is supposed to be the line that minimizes the sum of the squares of the distances from the mesh points to the line, it is easy to see that the center of mass (i.e., the point that minimizes the sum of the squares of the distances from the mesh points to a single point) lies on L. So we can simply choose

  x̄ = (1/n) Σ_{i=1}^n x_i,   ȳ = (1/n) Σ_{i=1}^n y_i.

It remains to choose u = (v, w) so that the following quantity is minimized:

  Σ_{i=1}^n d_i^2
    = Σ_{i=1}^n [ (x_i − x̄)^2 + (y_i − ȳ)^2 − ( v(x_i − x̄) + w(y_i − ȳ) )^2 ]
    = (1 − v^2) Σ_i (x_i − x̄)^2 + (1 − w^2) Σ_i (y_i − ȳ)^2 − 2vw Σ_i (x_i − x̄)(y_i − ȳ)
    = w^2 S_xx + v^2 S_yy − 2vw S_xy
    = u^T [  S_yy  −S_xy
            −S_xy   S_xx ] u
    = u^T M u,

where S_xx, S_yy, and S_xy denote the summations in the previous line.

M is symmetric, and hence we can choose u to be the normalized eigenvector corresponding to the smallest eigenvalue of M in order to minimize this expression: the minimum of the Rayleigh quotient u^T M u over unit vectors u is the smallest eigenvalue of M.
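The derivation above translates directly into a small routine: assemble M from the second moments, take the eigenvector of its smallest eigenvalue as the axis u, and split the points at the median of their projections onto u. This is a sketch for the 2D case; the function name is illustrative.

```python
import numpy as np

def inertial_bisect(points):
    """Inertial bisection in 2D; returns a boolean mask for one half."""
    P = np.asarray(points, dtype=float)
    c = P.mean(axis=0)                       # center of mass lies on L
    d = P - c
    Sxx = np.sum(d[:, 0] ** 2)
    Syy = np.sum(d[:, 1] ** 2)
    Sxy = np.sum(d[:, 0] * d[:, 1])
    M = np.array([[Syy, -Sxy], [-Sxy, Sxx]])
    _, V = np.linalg.eigh(M)                 # eigenvalues in ascending order
    u = V[:, 0]                              # axis of minimal rotational inertia
    t = d @ u                                # projections onto the axis
    return t <= np.median(t)
```

For points lying along a diagonal, the axis aligns with the diagonal and the split separates the two ends.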

Geometric Partitioning

Another, rather more complicated, method to partition meshes is geometric partitioning. In its original form, and for certain graphs, it will find a vertex separator that with high probability has an asymptotically optimal size and divides the graph in such a way that the sizes of the two partitions are not too different.

Definition: A k-ply neighborhood system in d dimensions is a set {D_1, D_2, ..., D_n} of closed disks in R^d such that no point of R^d is strictly interior to more than k disks.

[Figure: (a) A regular mesh as an overlap graph. (b) Bisecting a cube.]

Definition: An (α, k)-overlap graph is a graph defined in terms of a k-ply neighborhood system {D_1, D_2, ..., D_n} and a constant α ≥ 1. There is a vertex for each disk D_i. The vertices i and j are connected by an edge if expanding the radius of the smaller of D_i and D_j by a factor α causes the disks to overlap, i.e.,

  e_ij ∈ E  ⟺  D_i ∩ (α · D_j) ≠ ∅  and  (α · D_i) ∩ D_j ≠ ∅.

Overlap graphs are good models of computational meshes because every well-shaped mesh in two or three dimensions and every planar graph is contained in some (α, k)-overlap graph (for suitable α and k). For example, a regular n-by-n mesh is a (1, 1)-overlap graph (see Figure (a)).

We have the following theorem.

Theorem: Let G = (V, E) be an (α, k)-overlap graph in d dimensions with n nodes. Then there is a vertex separator V_S, so that V = V_1 ∪ V_S ∪ V_2, with the following properties:

1. V_1 and V_2 each have at most n(d + 1)/(d + 2) vertices, and
2. V_S has at most O( α k^{1/d} n^{(d−1)/d} ) vertices.

It is easy to see that for regular meshes in simple geometric shapes (like balls, squares, or cubes, cf. Figure (b)) the separator size given in the theorem is asymptotically optimal.

[Figure: Geometric Bisection I — (a) the mesh; (b) the scaled points.]

The proof of this theorem is rather long and involved, but it leads to a randomized algorithm, running in linear time, that will find with high probability a separator of the size given in the theorem.

The separator is defined by a d-dimensional sphere. The algorithm randomly chooses the separating circle from a distribution that is constructed so that the separator will satisfy the conclusions of the theorem with high probability.

To describe the algorithm we will introduce some terminology.

A stereographic projection is a mapping of points in R^d to the unit sphere centered at the origin in R^{d+1}. First, a point p ∈ R^d is embedded in R^{d+1} by setting the (d+1)-st coordinate to 0. It is then projected onto the surface of the sphere along a line passing through p and the north pole.

The centerpoint of a given set of points in R^d has the property that every hyperplane through the centerpoint divides the set into two subsets whose sizes differ by at most a ratio of d : 1. Note that the centerpoint does not have to be one of the original points.

With these definitions the algorithm is as follows (see also the figures).

Algorithm (Geometric Bisection):

1. Project up: Stereographically project the vertices from R^d onto the unit sphere in R^{d+1}.

[Figure: Geometric Bisection II — (a) projected mesh points with centerpoint; (b) conformally mapped points with great circle.]

2. Find centerpoint: Find the centerpoint z of the projected points.

3. Conformal map: Conformally map the points on the sphere such that the centerpoint now lies at the origin. This is done in the following way. First, rotate the sphere around the origin so that the centerpoint has the coordinates (0, ..., 0, r) for some r. Second, dilate the points so that the new centerpoint lies at the origin. This can be done by projecting the rotated points back onto the R^d plane (using the inverse of the stereographic projection), scaling the points in R^d by a factor of sqrt((1 − r)/(1 + r)), and then stereographically projecting the points back onto the sphere.

4. Find a great circle by intersecting the sphere in R^{d+1} with a random d-dimensional hyperplane through the origin.

5. Unmap: Map the great circle to a circle C in R^d by inverting the dilation, the rotation, and the stereographic projection.

6. Find separator: A vertex separator is found by choosing all those vertices whose corresponding disk (cf. the definition above), magnified by a factor of α, intersects C.

When implementing this algorithm, the following simplifications can be made.

Instead of finding a vertex separator, it is easier to find an edge separator by choosing all those edges cut by the circle C. This has the advantage that one does not have to know the neighborhood system for the graph.

[Figure: Geometric Bisection III — (a) separating circle projected back onto the plane; (b) the edge separator induced by the separating circle.]

The theorem only guarantees a ratio of d : 1 between the two partitions, while for bisection we want the two partitions to be of the same size (possibly up to a difference of one vertex). To achieve this, the hyperplane is moved along its normal vector until it divides the points evenly. Thus the separator is a circle, but not a great circle, on the unit sphere in R^{d+1}; its projection onto the R^d plane is still a circle (or possibly a line).

The centerpoint can be found by linear programming, and although this is done in polynomial time, it is still a slow process. So instead of using the real centerpoint, one can use a heuristic to find an approximate centerpoint. This can be done in linear time using a randomized algorithm.

To speed up the computation even more, only a randomly chosen subset of the points is used to compute the centerpoint.

A random great circle has a good chance of inducing a good partition. But experiments have shown that it is worthwhile to generate different circles and choose the one that delivers the best result. One can further improve the algorithm by modifying this selection such that the normal vectors of the randomly generated great circles are biased in the direction of the moment of inertia of the points (see also the section on Inertial Bisection).
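A much-simplified sketch of the randomized procedure is shown below: project the points up stereographically, pick a random hyperplane, and shift it to split the points evenly. To stay short, it replaces the centerpoint and conformal-map steps by simple mean-centering, so it keeps the geometry but not the theoretical guarantees; all names are illustrative.

```python
import numpy as np

def stereographic_project(P):
    """Map points in R^d onto the unit sphere in R^(d+1)."""
    n, d = P.shape
    norm2 = np.sum(P ** 2, axis=1)
    S = np.empty((n, d + 1))
    S[:, :d] = 2 * P / (norm2 + 1)[:, None]
    S[:, d] = (norm2 - 1) / (norm2 + 1)   # the north pole is (0, ..., 0, 1)
    return S

def geometric_bisect(P, rng):
    """Split points by a random hyperplane through the projected point set."""
    S = stereographic_project(P - P.mean(axis=0))   # center, then project up
    t = S @ rng.standard_normal(S.shape[1])         # random hyperplane normal
    return t <= np.median(t)                        # shift plane to split evenly
```

Generating several random normals and keeping the best resulting cut mirrors the multiple-circles improvement mentioned above.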

Partitioning without geometric information

Introduction

All of the methods mentioned above work quite well on some meshes, but they have a couple of disadvantages. First, they assume that connected vertices are in some sense geometrically adjacent. While this is certainly the case for some applications, it does not hold in general. Secondly, these methods require geometric information about the mesh points. Even in cases where the first condition is satisfied, the layout of the mesh may not be available.

Consequently, methods that do not use geometric information but only consider the connectivity information of the graph are applicable to a wider range of problems.

Graph Growing and Greedy Algorithms

When thinking about partitioning algorithms, one obvious idea is the following: choose one starting vertex and in some way add vertices to it until the partition is big enough. These algorithms can be classified as greedy and graph-growing type algorithms.

In greedy algorithms, the next vertex added is one that is "best" in some sense, e.g., it increases the cutsize by the least amount, while in graph-growing algorithms the vertices are added in such a way that the resulting subgraph grows in a certain way, e.g., breadth-first. Other graph-growing algorithms may, for example, start with several starting vertices at the same time and grow the subgraphs in parallel, or may have some preference to grow in a certain direction.

Since the above describes a whole family of algorithms, we will describe some examples a bit more closely.

First consider the following greedy approach.

Algorithm (Greedy Partitioning):

1. Unmark all vertices.
2. Choose a pseudo-peripheral vertex (one of a pair of vertices that are approximately at the greatest distance from each other in the graph) as starting vertex, mark it, and add it to the first partition.
3. For the desired number of partitions do:
     repeat
       Among all the unmarked vertices adjacent to the current partition, choose the one with the least number of unmarked neighbors; mark it and add it to the current partition.
     until the current partition is big enough.
     If there are unmarked vertices left, choose one adjacent to the current partition with the least number of unmarked neighbors as starting vertex for the next partition.

Note that by choosing new vertices only adjacent to the current partition, this algorithm tries to keep the subpartitions connected and thus also does some kind of graph growing.
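On an adjacency-list graph, the greedy scheme above can be sketched as follows. For brevity the pseudo-peripheral start is replaced by an arbitrary vertex, and a partition that runs out of adjacent unmarked vertices falls back to any unmarked vertex; names are illustrative.

```python
def greedy_partition(adj, nparts):
    """Greedy partitioning of a graph given as a list of neighbor lists."""
    n = len(adj)
    part = [-1] * n                           # -1 means "unmarked"

    def unmarked_neighbors(v):
        return sum(1 for u in adj[v] if part[u] == -1)

    start = 0                                 # a pseudo-peripheral vertex, ideally
    for p in range(nparts):
        goal = n // nparts if p < nparts - 1 else n - (nparts - 1) * (n // nparts)
        part[start] = p
        size = 1
        while size < goal:
            frontier = [u for v in range(n) if part[v] == p
                        for u in adj[v] if part[u] == -1]
            if not frontier:                  # partition became isolated
                frontier = [v for v in range(n) if part[v] == -1]
            v = min(frontier, key=unmarked_neighbors)
            part[v] = p
            size += 1
        if p < nparts - 1:                    # greedy start vertex for next part
            cand = [u for v in range(n) if part[v] == p
                    for u in adj[v] if part[u] == -1]
            start = min(cand, key=unmarked_neighbors) if cand else part.index(-1)
    return part
```

On a path graph, for example, this grows each partition as a contiguous segment.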

Another well-known algorithm that is often called "greedy" is the Farhat algorithm. A closer look reveals that it is a graph-growing algorithm which only chooses the starting vertices of each partition in a greedy way.

Adapted to work on a graph (the original works on nodes and elements of a FEM mesh), it can be described as follows.

Algorithm (Farhat's Algorithm):

1. Unmark all vertices.
2. For the desired number of partitions do:
     Among all the vertices chosen for the last partition, choose the one with the smallest nonzero number of unmarked neighbors; mark all these neighbors and add them to the current partition. (For the first partition one could again choose a pseudo-peripheral vertex, mark it, and add it to the current partition.)
     repeat
       For each vertex v in the current partition do:
         For each unmarked neighbor v_n of v do:
           add v_n to the current partition and mark v_n.
     until the current partition is big enough.

These algorithms, in particular the graph-growing ones, are very fast. They also have the additional advantage of being able to divide the graph into the desired number of partitions directly, thereby avoiding recursive bisection. Therefore their running time is essentially independent of the desired number of subpartitions. The ability to divide the graph into any number of partitions is also attractive if the desired number is not a power of 2.

On the downside, the quality of the partitions (measured in cutsize) is not always great, and when partitioning complicated graphs into several subpartitions, the last subpartitions tend to be disconnected. They also tend to be sensitive to the choice of the starting vertex, but since they are quite fast, one can run the algorithms with different starting vertices and simply choose the best result.

Kernighan-Lin Algorithm

The Kernighan-Lin algorithm (often abbreviated as KL) is one of the earliest graph partitioning algorithms. It was originally developed to optimize the placement of electronic circuits onto printed circuit cards so as to minimize the number of connections between cards.

The KL algorithm does not create partitions but rather improves them iteratively; there is no need for the partitions to be of equal size. The original idea was to take a random partition and apply Kernighan-Lin to it. This would be repeated several times and the best result chosen. While for small graphs this delivers reasonable results, it is quite inefficient for larger problem sizes.

Nowadays the algorithm is used to improve partitions found by other algorithms. As we will see, KL has a somewhat local view of the problem, trying to improve partitions by exchanging neighboring nodes. So it nicely complements algorithms which have a more global view of the problem but tend to ignore local characteristics. Examples of these kinds of algorithms are inertial partitioning and spectral partitioning (see the respective sections).

Important practical advances were made by C. Fiduccia and R. Mattheyses, who implemented the algorithm in such a way that one iteration runs in O(|E|) instead of O(|V|^2 log |V|). We will first describe the original algorithm and later discuss these and other improvements.

Let us introduce some notation. It is actually easier to describe the algorithm for a graph with weighted edges, so let (V, E, W_E) be a graph with given subpartitions V = V_A ∪ V_B.

The d-value of a vertex v is the amount the cutsize will decrease if v is moved to the other partition, i.e., for v ∈ V_A:

  d(v) = Σ_{b ∈ V_B} w_e(e_vb) − Σ_{a ∈ V_A} w_e(e_va),

and analogously for v ∈ V_B. Note that if a vertex is moved from one subpartition to the other, only its d-value and the d-values of its neighbors change.

The gain-value of a pair a ∈ V_A, b ∈ V_B of vertices is the amount the cutsize changes if we swap a into subpartition V_B and b into V_A. One can easily see that

  gain(a, b) = d(a) + d(b) − 2 w_e(e_ab),

where w_e(e_ab) = 0 if there is no edge between a and b.

With this we will now describe one pass of the algorithm.

Algorithm (Kernighan-Lin):

Given two subpartitions V_A and V_B:

1. Compute the d-values of all vertices.
2. Unmark all vertices.
3. Let k_0 = cutsize.
4. For i = 1 to min(|V_A|, |V_B|) do:
     Among all unmarked vertices, find the pair (a_i, b_i) with the biggest gain (which might be negative).
     Mark a_i and b_i.
     For each neighbor v of a_i or b_i, update d(v) as if a_i and b_i had been swapped, i.e.,
       d(v) ← d(v) + 2 w_e(e_{v a_i}) − 2 w_e(e_{v b_i})   for v ∈ V_A,
       d(v) ← d(v) + 2 w_e(e_{v b_i}) − 2 w_e(e_{v a_i})   for v ∈ V_B.
     Set k_i = k_{i−1} − gain(a_i, b_i), i.e., k_i would be the cutsize if a_1, ..., a_i and b_1, ..., b_i had been swapped.
5. Find the smallest j such that k_j = min_i k_i.
6. Swap the first j pairs, i.e.,
     V_A ← (V_A \ {a_1, ..., a_j}) ∪ {b_1, ..., b_j},
     V_B ← (V_B \ {b_1, ..., b_j}) ∪ {a_1, ..., a_j}.

Continue these passes until no further cutsize improvement is achieved.

So in each phase KL swaps pairs of vertices chosen to maximize the gain. It continues doing so until all vertices of the smaller subpartition are swapped. The important part is that the algorithm does not stop as soon as there are no more improvements to be made. Instead it continues, even accepting negative gains, in the hope that later gains will be large and that the overall cutsize will later be reduced. This ability to climb out of local minima is a crucial feature. See also the following figure.

[Figure: The Kernighan-Lin algorithm at work — original graph, first swap, second swap, result, with the corresponding cutsizes k_i; legend: unmarked, marked, and selected vertices.]
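One pass of the scheme can be sketched as follows, for an unweighted graph where adj maps each vertex to the set of its neighbors. This is the textbook form with quadratic cost per step, not an O(|E|) implementation; names are illustrative.

```python
def kl_pass(adj, A, B):
    """One Kernighan-Lin pass; returns the improved partition (A, B)."""
    A, B = set(A), set(B)

    def dval(v):
        own = A if v in A else B
        ext = sum(1 for u in adj[v] if u not in own)
        return ext - (len(adj[v]) - ext)          # d(v) = external - internal

    marked, swaps = set(), []
    running, best_gain, best_j = 0, 0, 0
    for step in range(min(len(A), len(B))):
        # best unmarked pair, gain(a, b) = d(a) + d(b) - 2*w(e_ab)
        g, a, b = max((dval(a) + dval(b) - 2 * (b in adj[a]), a, b)
                      for a in A - marked for b in B - marked)
        A.remove(a); B.add(a); B.remove(b); A.add(b)   # tentative swap
        marked.update((a, b))
        swaps.append((a, b))
        running += g                              # cumulative cutsize reduction
        if running > best_gain:
            best_gain, best_j = running, step + 1
    for a, b in swaps[best_j:]:                   # undo swaps past the best prefix
        A.remove(b); B.add(b); B.remove(a); A.add(a)
    return A, B
```

On two triangles joined by a bridge with a deliberately bad initial partition, one pass recovers the optimal split.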

As mentioned above, the Fiduccia-Mattheyses implementation of the algorithm takes O(|E|) time for one iteration. The reduction is partly achieved by choosing single nodes to be swapped instead of pairs. Furthermore, the d-values of the vertices are bucket-sorted, and the moves associated with each gain value are stored in a doubly linked list. Choosing a move with the highest d-value involves finding a non-empty list with the highest d-value, while updating the d-value of a vertex is done by deleting it from one doubly linked list and inserting it into another; this can be done in constant time.
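The gain-bucket idea just described can be sketched with an OrderedDict standing in for each doubly linked list; a full Fiduccia-Mattheyses pass (balance constraint, tie-breaking, rollback) is omitted, and all names are illustrative.

```python
from collections import OrderedDict

class GainBuckets:
    """Moves bucketed by gain; best-move selection and updates in O(1) amortized."""

    def __init__(self, max_gain):
        self.buckets = {g: OrderedDict() for g in range(-max_gain, max_gain + 1)}
        self.gain = {}
        self.best = -max_gain

    def insert(self, v, g):
        self.buckets[g][v] = None            # append to the gain-g list
        self.gain[v] = g
        self.best = max(self.best, g)

    def update(self, v, new_g):
        del self.buckets[self.gain[v]][v]    # unlink from the old list
        self.insert(v, new_g)

    def pop_best(self):
        while not self.buckets[self.best]:   # skip now-empty buckets
            self.best -= 1
        v, _ = self.buckets[self.best].popitem(last=False)
        del self.gain[v]
        return v, self.best
```

The `best` pointer only moves downward between insertions, which is what makes a whole pass linear in the number of moves.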

There are many variations of the Kernighan-Lin algorithm, often trading execution time against quality or generalizing the algorithm to more than two subpartitions. Some examples:

Instead of exchanging all vertices, limit the exchange to some fixed number, or allow only a fixed number of exchanges with negative gain. This is based on the observation that often the largest gain is achieved early on.

Use only a fixed number of inner iterations (often only one).

Only evaluate d-values of vertices near the boundary, since it is unlikely that moving inner nodes will be helpful.

Spectral Bisection

The spectral bisection algorithm described in this section is quite different from all the other algorithms described so far. It does not use geometric information, yet it also does not operate on the graph itself but on a mathematical representation of it. While the other algorithms use almost no floating point calculations at all, the main computational effort of the spectral bisection algorithm consists of standard vector operations on floating point numbers. The decision about each vertex is based on the whole graph, so it takes a more global view of the problem.

The method itself is quite easy to explain, but why the heuristic works is not at all obvious.

Definition: Let G = (V, E) be a graph with n = |V| vertices, numbered in some way, and with deg(v_i) the degree of vertex v_i, i.e., the number of adjacent vertices. The Laplacian matrix L(G) (or just L if the context is obvious) is an n × n symmetric matrix with one row and column for each vertex. Its entries are defined as follows:

  l_ij = −1          if i ≠ j and e_ij ∈ E,
  l_ij = 0           if i ≠ j and e_ij ∉ E,
  l_ii = deg(v_i).

The related matrix with entries a_ij = 1 if e_ij ∈ E and a_ij = 0 otherwise (in particular, with zero diagonal) is called the adjacency matrix A(G) of G.

Since a permutation of the vertex numbering will lead to a different matrix, it is sloppy to speak of "the" Laplacian of a graph. But since matrices induced by different vertex numberings can be derived from each other by permuting rows and columns, they share most properties, so we will continue to speak of "the" Laplacian unless a closer distinction is necessary.

The Laplacian has some nice properties.

Theorem:
1. L(G) is real and symmetric; its eigenvalues λ_1 ≤ λ_2 ≤ ... ≤ λ_n are therefore real, and its eigenvectors are real and orthogonal.
2. All eigenvalues are non-negative, and hence L(G) is positive semidefinite.
3. With e = (1, ..., 1)^T we have L(G) e = 0, so λ_1 = 0 with e an associated eigenvector.
4. The multiplicity of the zero eigenvalue is equal to the number of connected components of G. In particular: G is connected ⟺ λ_2 > 0.

Proof: 1 and 3 are trivial, and 2 follows from the Gershgorin Circle Theorem together with the simple fact, from the definition of L, that

  Σ_{i=1, i≠j}^n |l_ij| = l_jj   for all j ∈ {1, ..., n},

so all Gershgorin circles have non-negative real parts. Since all eigenvalues are real, they are all non-negative. This also shows that λ_n ≤ 2 max_i deg(v_i) < 2n.

For part 4, we first show that λ_2 > 0 if G is connected. Note that G is connected if and only if L(G) is irreducible, and if L is irreducible then so is the non-negative matrix L' = nI − L. Also, λ_i is an eigenvalue of L if and only if n − λ_i is an eigenvalue of L'. In particular, n − λ_1 = n is the largest eigenvalue of L' in modulus, since λ_n < 2n.

The Perron-Frobenius theorem applied to L' implies that if G is connected, this largest eigenvalue n is a simple eigenvalue, and hence so is λ_1 = 0. Therefore λ_2 > 0.

Now assume that G is not connected but consists of k connected components G_1, ..., G_k. By first numbering the vertices of G_1, then of G_2, and so on, L(G) has block diagonal form with L(G_1), ..., L(G_k) as diagonal blocks. So σ(L(G)) = ∪_{i=1}^k σ(L(G_i)), and since each of the σ(L(G_i)) contains one zero eigenvalue, the first k eigenvalues of L(G) are 0.

We have thus shown that G is connected if and only if λ_2 > 0.

But what if 0 = λ_1 = λ_2 = ... = λ_k? Then G cannot be connected and consequently consists of at least two unconnected components G_1 and G_2. As above we see that σ(L(G)) = σ(L(G_1)) ∪ σ(L(G_2)), and if k > 2 at least one of the submatrices has more than one zero eigenvalue and is therefore disconnected. Repeating this argument, we see that G has k connected components.

The characterization "G is connected ⟺ λ_2 > 0" also motivates the following definition.

Definition: λ_2(L(G)) is called the algebraic connectivity of G.

The corresponding eigenvector is often called the Fiedler vector, named in honor of Miroslav Fiedler's pioneering work on these objects. One of his discoveries, which serves as a first (very small) motivation for the following algorithm, is the following.

Theorem:
Let G = (V, E) with V = {v_1, v_2, ..., v_n} be a connected graph and let u be its Fiedler vector. For any given real number r ≥ 0, define V_1 = {v_i ∈ V | u_i ≥ −r}. Then the subgraph induced by V_1 is connected. Similarly, for any r ≥ 0, the subgraph induced by V_2 = {v_i ∈ V | u_i ≤ r} is also connected.

Note that the theorem is independent of the length and sign of the Fiedler vector. This theorem does not imply anything about a good bisection, but it does imply that the Fiedler vector does somehow divide the vertices in a sensible way. After all, it guarantees at least one connected partition. This is much more than we would achieve by choosing just a random subset.

This leads us to the idea of using the Fiedler vector to bisect the graph. We will first describe the algorithm and then later try to explain why it works.

Algorithm (Spectral Bisection):

1. Given a connected graph G, number the vertices in some way and form the Laplacian matrix L(G).
2. Calculate the second-smallest eigenvalue λ_2 and its eigenvector u.
3. Calculate the median m_u of all the components of u.
4. Choose V_1 = {v_i ∈ V | u_i < m_u} and V_2 = {v_i ∈ V | u_i > m_u}, and if some elements of u equal m_u, distribute the corresponding vertices so that the partition is balanced.
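For small graphs, the algorithm can be sketched directly with a dense eigensolver (a real implementation would use a sparse Lanczos-type solver; names are illustrative):

```python
import numpy as np

def spectral_bisect(edges, n):
    """Spectral bisection of a graph on n vertices given as an edge list."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, j] -= 1; L[j, i] -= 1     # off-diagonal Laplacian entries
        L[i, i] += 1; L[j, j] += 1     # degrees on the diagonal
    _, V = np.linalg.eigh(L)           # eigenvalues in ascending order
    u = V[:, 1]                        # Fiedler vector
    return u <= np.median(u)           # boolean mask for one half
```

On two triangles joined by a single edge, for instance, the Fiedler vector cleanly separates the two triangles.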

But why does this algorithm work? There are some very different approaches that try to answer this question. In addition to the graph-theoretic approach (cf. the theorem above), one can look at graph bisection as a discrete optimization problem or as a quadratic assignment problem, look at several physical models (like a vibrating string or a net of connected weights), or examine the output of a certain neural net trained to bisect a graph.

Among all of these, the following approach via the discrete optimization formulation is, in the author's opinion, the most convincing derivation of the spectral bisection algorithm.

Sp ectral Bisection

We have a graph G V E with n jV j even which we want to partition into

two equal sized parts V and V To designate which vertex b elongs to which part we

1 2

n

use an index vector x fg such that

if v V

i 1

x

i

if v V

i 2

Notice that the function

X

2

f x x x

i j

(ij )E

denotes the number of cut edges since x x if b oth v and v lie in the same

i j i j

subpartition On the other hand if v and v lie in dierent partitions they have a

i j

2

dierent sign and thus x x

i j

We rewrite

  Σ_{(i,j) ∈ E} (x_i − x_j)^2 = Σ_{(i,j) ∈ E} (x_i^2 + x_j^2) − 2 Σ_{(i,j) ∈ E} x_i x_j

and note that

  Σ_{(i,j) ∈ E} x_i x_j = (1/2) Σ_{v_i} Σ_{v_j} x_i a_ij x_j = (1/2) x^T A x

with A the adjacency matrix, and

  Σ_{(i,j) ∈ E} (x_i^2 + x_j^2) = Σ_{i=1}^n deg(v_i) x_i^2 = x^T D x = 2|E|

with D = L + A, a diagonal matrix with the vertex degrees on the diagonal, i.e., D = diag(deg(v_i)), and so (cf. the definition of L)

  f(x) = (1/4) x^T L x.
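The identity f(x) = (1/4) x^T L x can be checked numerically on a small example (the graph and partition are illustrative):

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # a 4-cycle plus one chord
n = 4
L = np.zeros((n, n))
for i, j in edges:
    L[i, j] -= 1; L[j, i] -= 1
    L[i, i] += 1; L[j, j] += 1

x = np.array([1, 1, -1, -1])                       # V1 = {0, 1}, V2 = {2, 3}
cut = sum(1 for i, j in edges if x[i] != x[j])     # cut edges: (1,2), (3,0), (0,2)
assert cut == x @ L @ x / 4                        # the quadratic form counts them
```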

Now the graph bisection problem can be formulated as

  Minimize f(x) = (1/4) x^T L x
  subject to x ∈ {−1, +1}^n,
             x^T e = 0,

where again e = (1, ..., 1)^T. The third condition, x^T e = 0, forces the sets to be of equal size.

Partitioning without geometric information

Graph partitioning is NPcomplete and is just another formulation so we

cannot really solve this problem either But we can relax the constraints a bit instead

of choosing discrete values for the x we lo ok for a real vector z which has the same

i

Euclidean length as x and fullls the other conditions

T

1

z Lz Minimize f z

4

T

sub ject to z z n

T

z e

As we will see this problem is easy to solve Since the feasible set of the continuous

problem is a subset of the feasible set of the discrete problem every solution

of will give us a lower b ound for and we hop e that the continuous solution

will give us a go o d approximation to the solution of

From the theorem above we know that the eigenvectors u_1, u_2, ..., u_n of L are orthonormal and therefore span R^n, and that u_1 = (1/sqrt(n)) e. We can thus decompose every z ∈ R^n into a linear combination z = Σ_{i=1}^n α_i u_i of these vectors.

In order to meet the restrictions, we have to place some conditions on the α_i. We want z^T e = 0, and since

  z^T e = ( Σ_{i=1}^n α_i u_i )^T sqrt(n) u_1 = sqrt(n) Σ_{i=1}^n α_i u_i^T u_1 = sqrt(n) α_1,

we need to have α_1 = 0. For z^T z = n we need

  z^T z = ( Σ_i α_i u_i )^T ( Σ_j α_j u_j ) = Σ_{i=1}^n α_i^2 = n.

Furthermore, we have

  4 f(z) = z^T L z = ( Σ_i α_i u_i )^T L ( Σ_i α_i u_i )
         = ( Σ_j α_j u_j )^T ( Σ_i α_i λ_i u_i )
         = Σ_{i=1}^n λ_i α_i^2
         = Σ_{i=2}^n λ_i α_i^2          (since α_1 = 0)
         ≥ λ_2 Σ_{i=2}^n α_i^2 = n λ_2   (since λ_i ≥ λ_2 for i ≥ 2).

Since we can achieve f(z) = (n/4) λ_2 by choosing z = sqrt(n) u_2, we find that the correctly scaled Fiedler vector sqrt(n) u_2, which satisfies both constraints, is a solution of the continuous optimization problem. In fact, if λ_2 < λ_3, the solution is unique up to the factor −1. In addition, (n/4) λ_2 is a lower bound for the cutsize of any bisection of V.

Having found the solution of the continuous problem, we need to map it back to a discrete partition and hope that the result is a good approximation to the solution of the discrete problem.

There seem to be two natural ways to do this. The first method places more importance on the sign of the result and just maps x_i = sign(z_i), dealing with zero elements of z in some way, e.g., assigning them all to one partition. Its drawback is that the partitions need not be balanced; consequently, some additional work is necessary.

The second, and more often used, method places its emphasis on the balance and assigns the largest n/2 elements of z to one partition and the rest to the other partition. This can easily be done by computing the median z̄ of the z_i, setting

  x_i = +1 if z_i > z̄,
  x_i = −1 if z_i < z̄,

and distributing the elements with z_i = z̄ in such a way that the balance is preserved. T. Chan, P. Ciarlet, and W. Szeto showed that the vector x formed in this way is, among all balanced ±1 vectors, closest to z in any p-norm, i.e.,

  x = arg min { ||p − z|| : p ∈ {−1, +1}^n, p^T e = 0 }.

Note that in both cases it is not necessary to compute the eigenvector to a high degree of accuracy. If we use an iterative method like the Lanczos algorithm or the Rayleigh quotient iteration (RQI), we can stop iterating once the result is sufficiently accurate to either decide on the sign of the elements or to recognize the n/2 largest elements. This can be much cheaper than computing the exact eigenvector.

Spectral Bisection for weighted graphs

It is easy to generalize the spectral bisection algorithm to handle weighted graphs.

Given a graph G = (V, E) with edge weights W_E, one can define the weighted Laplacian L = (l_ij) with

  l_ij = −w_e(e_ij)            if i ≠ j and e_ij ∈ E,
  l_ij = 0                     if i ≠ j and e_ij ∉ E,
  l_ii = Σ_{k=1}^n w_e(e_ik).

With this definition and x as above, (1/4) x^T L x still denotes the weight of the cut edges, and so we can proceed as above.

If the graph has weighted vertices W_V, the subsets have equal weights if w^T x = 0, with w = (w_v(v_i))_{i=1}^n the vector of the vertex weights. Using V = diag(w), a diagonal matrix with the components of w on the diagonal, we reformulate this condition as x^T V e = 0. We therefore have to solve the problem

  Minimize f(x) = (1/4) x^T L x
  subject to x ∈ {−1, +1}^n,
             x^T V e = 0.

Again we relax the constraints a bit: instead of choosing discrete values for the x_i, we now look for a real vector z which has the same weighted length as x, i.e., z^T V z = e^T V e = Σ_{i=1}^n w_v(v_i) (note that the feasible solutions of the discrete problem satisfy this condition), and fulfills the other conditions:

  Minimize f(z) = (1/4) z^T L z
  subject to z^T V z = e^T V e,
             z^T V e = 0.

A solution of the relaxed problem is the eigenvector corresponding to the second-smallest eigenvalue of the generalized eigenproblem L z = λ V z. But since V is diagonal with positive elements on the diagonal, the problem can be reduced to an ordinary eigenproblem A y = λ y by substituting y = V^{1/2} z, with A = V^{−1/2} L V^{−1/2}. The new matrix A is also positive semidefinite, with the same structure as L.
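The reduction to an ordinary eigenproblem can be sketched as follows (dense linear algebra for brevity; names are illustrative):

```python
import numpy as np

def weighted_fiedler(L, weights):
    """Solve L z = lambda V z via A = V^(-1/2) L V^(-1/2) and y = V^(1/2) z."""
    s = 1.0 / np.sqrt(np.asarray(weights, dtype=float))
    A = s[:, None] * L * s[None, :]      # A = V^(-1/2) L V^(-1/2), still symmetric
    vals, vecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    y = vecs[:, 1]                       # second-smallest eigenpair of A
    return vals[1], s * y                # back-substitute z = V^(-1/2) y
```

The returned pair (λ, z) satisfies the generalized relation L z = λ V z, which can be verified directly.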

Spectral Quadri- and Octasection

As seen earlier, it can be advantageous to divide a graph into more than two parts at once, so it would be interesting to find a spectral algorithm doing this. Rendl and Wolkowicz describe one such technique. While their algorithm allows the decomposition of a graph into p parts, it requires the calculation of p eigenvectors and is thus expensive. But as we will see, the spectral bisection idea described above can also be adapted to divide a graph into more than two parts at once. Hendrickson and Leland developed a way to partition a graph into four or eight partitions of equal size at once, using two or three eigenvectors, i.e., spectral quadrisection resp. octasection. R. Van Driessche later generalized the quadrisection algorithm to divide the graph into partitions of unequal size.

Contrary to spectral bisection, these algorithms do not minimize the cutsize but a closely related quantity: the hypercube hop (or Manhattan) metric. In this metric, the weight of each cut edge is multiplied by the number w of bits that differ in the binary representation of the subset numbers (see the figure). This models the performance of parallel communication on hypercube and 2- or 3-dimensional mesh topologies. In these topologies, w can be interpreted as the number of wires a message must take while being transferred between the respective processors.

[Figure: Hypercube hop metric in 2D.]

We will describe only the derivation of the quadrisection. The octasection case is similar but slightly more complicated.

To distinguish between the four subpartitions V_0, ..., V_3, we will use two index vectors x, y ∈ {−1, +1}^n with

  x_i = −1 and y_i = −1  ⟺  v_i ∈ V_0,
  x_i = −1 and y_i = +1  ⟺  v_i ∈ V_1,
  x_i = +1 and y_i = −1  ⟺  v_i ∈ V_2,
  x_i = +1 and y_i = +1  ⟺  v_i ∈ V_3.
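With subset numbers 0..3 encoded in two bits, the hop metric is simply a population count over the cut edges; a minimal sketch (names are illustrative):

```python
def hop_cost(edges, part):
    """Hypercube-hop cost: each cut edge weighted by the number of differing
    bits between the binary subset numbers of its endpoints."""
    return sum(bin(part[i] ^ part[j]).count("1") for i, j in edges)
```

An edge between subsets 0 and 3 (binary 00 and 11) costs 2 hops, while an edge between subsets 0 and 1 costs 1.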

Using considerations similar to the ones that led to the bisection formulation, we find that

  f(x, y) = (1/4) ( x^T L x + y^T L y )

gives us the hop weight of the partition induced by x and y. Assuming that n is a multiple of 4, the constraints that all V_i are to be of the same size can be expressed as

  Σ_i (1 − x_i)(1 − y_i) = n,
  Σ_i (1 − x_i)(1 + y_i) = n,
  Σ_i (1 + x_i)(1 − y_i) = n,
  Σ_i (1 + x_i)(1 + y_i) = n,

which, after expanding and solving a small linear system, simplifies to

  e^T x = e^T y = x^T y = 0.

Therefore we want to solve the following discrete minimization problem:

  Minimize f(x, y) = (1/4) ( x^T L x + y^T L y )
  subject to e^T x = e^T y = x^T y = 0,
             x, y ∈ {−1, +1}^n.

Again, since this problem is too difficult to solve, we relax the discreteness condition and only demand that the solutions z, w ∈ R^n have the same Euclidean length sqrt(n) as x and y. We then get

  Minimize f(z, w) = (1/4) ( z^T L z + w^T L w )
  subject to e^T z = e^T w = z^T w = 0,
             z^T z = w^T w = n.

Using an argument similar to the one in the bisection case, we see that the scaled eigenvectors z = sqrt(n) u_2 and w = sqrt(n) u_3, belonging to the second- and third-smallest eigenvalues λ_2 and λ_3, are a solution of the continuous problem, and (n/4)(λ_2 + λ_3) is a lower bound on the number of hops induced by any quadrisection of G.

But any orthonormal pair z(φ), w(φ) from the subspace spanned by u_2 and u_3, i.e.,

  z(φ) = sqrt(n) ( u_2 cos φ + u_3 sin φ ),   w(φ) = sqrt(n) ( −u_2 sin φ + u_3 cos φ ),

is also a solution.

The question now remains which of these solutions to choose. One idea is to choose φ in such a way that the resulting values are closest to the discrete values that we were originally looking for, i.e., choose a φ ∈ [0, π/2) that minimizes

  g(φ) = Σ_i ( 1 − z(φ)_i^2 )^2 + ( 1 − w(φ)_i^2 )^2.

g reduces to a quartic expression in sin φ and cos φ. After constructing g at a cost of O(n), the cost of evaluating (and thus minimizing) g is independent of n and usually insignificant compared to calculating the eigenvectors. It is recommended to use a local minimizer with some random starting points.

But of course, originally we were looking for an approximate solution of the discrete problem. So the last hurdle is the assignment of the continuous points (z_i, w_i) to discrete points. Since we need to adhere to the restrictions given above (i.e. the partitions need to have the same size), we cannot simply use the median method from the bisection case. If we define a distance function from the continuous points (z_i, w_i) to the discrete points, distributing the points evenly while minimizing the sum of the distances is known as the minimum cost assignment problem, for which efficient algorithms exist.
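As a toy illustration of this assignment step (the coordinates are made up, and with n = 4 each corner receives exactly one vertex, so brute force over permutations suffices; a real code would use an efficient minimum cost assignment algorithm):

```python
from itertools import permutations
import math

# hypothetical continuous coordinates (z_i, w_i) for four vertices
pts = [(-0.9, -1.2), (0.8, -0.7), (-1.1, 1.0), (0.6, 1.3)]
corners = [(-1, -1), (1, -1), (-1, 1), (1, 1)]   # the discrete targets

def total_distance(assign):
    """Sum of distances when vertex i is sent to corner assign[i]."""
    return sum(math.dist(pts[i], corners[c]) for i, c in enumerate(assign))

# with one vertex per corner, every permutation is a balanced assignment;
# minimizing over them solves this (tiny) minimum cost assignment problem
best = min(permutations(range(4)), key=total_distance)
# here each point already sits near its own corner, so best == (0, 1, 2, 3)
```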

Multilevel Spectral Bisection

For the above algorithms we have to compute the Fiedler vector, or the eigenvector associated with the third-smallest eigenvalue (and, for octasection, the fourth-smallest), of a sparse symmetric matrix. The first choice for such a computation is some variation of the Lanczos algorithm. But it turns out that, compared to other graph partitioning algorithms, spectral bisection using the Lanczos algorithm is extremely slow.

But how can we speed up the calculation of the eigenvectors, and thus the spectral bisection algorithm? Luckily, we do not need to calculate the eigenvectors of just any sparse matrix, but only those of the Laplacian of a graph. S. Barnard and H. Simon found a clever way to profit from the connections between the eigenvectors and the graph. We will present their idea in this section.

The basic idea is quite simple. Given a graph G^(0), we try to contract it to a new graph G^(1) that is in some sense similar to G^(0) but has fewer vertices. Since the new graph, and consequently its Laplacian, is smaller, it is cheaper to calculate the Fiedler vector u_2^(1) of the new graph. We then use u_2^(1) as a starting point to calculate the Fiedler vector u_2^(0) of the original graph. This process has also been termed coarsening the graph, and the smaller graph is also called coarser.

Of course, we can use this idea recursively, constructing ever smaller graphs G^(2), G^(3), ... to calculate u_2^(1), u_2^(2), and so on. These different-sized graphs lead to the name of this algorithm: Multilevel Spectral Bisection (MSB).

So the first rough draft of the eigenvector calculation for MSB looks like this (again, we present it only for an unweighted graph; the generalization to the case with weighted edges is quite straightforward):

Algorithm: Multilevel Fiedler vector calculation

Function Fiedler(G)
  If G is small enough then
    Calculate f = u_2 using the Lanczos algorithm
  else
    Construct the smaller graph G' (and, implicitly, the corresponding Laplacian L')
    f' = Fiedler(G')
    Use f' to find an initial guess f^(0) for f
    Calculate f from the initial guess f^(0)
  endif
  Return f

Of course, we have left out three rather important details: how to construct the smaller graph, how to find an initial guess f^(0) for f, and how to efficiently calculate f using f^(0).

Constructing the smaller graph

First we need the following definitions.

Definition: Given a graph G = (V, E), a subset V' ⊂ V is called an independent set with respect to V if no two elements of V' are connected by an edge.

Definition: V' is called a maximal independent set with respect to V if adding any vertex from V \ V' to V' would make it no longer an independent set.

Note that there is another definition of a maximal independent set that requires the independent set to be of maximal size, i.e. V' is a maximal independent set with respect to V if there does not exist an independent subset V'' of V with |V''| > |V'|. The two definitions are not equivalent.

It is easy to see that an independent set V' ⊂ V is maximal if and only if every node from V \ V' is adjacent to a node in V'.

A maximal independent set can easily and efficiently be found using a greedy algorithm, and while such sets are by no means unique, different maximal independent sets typically have a similar size.
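The greedy construction can be sketched as follows (the iteration order and all names are our own choices):

```python
def maximal_independent_set(vertices, edges):
    """Greedy maximal independent set: scan the vertices in order and keep
    each one unless it is adjacent to an already chosen vertex."""
    adj = {v: set() for v in vertices}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    chosen = set()
    for v in vertices:
        if adj[v].isdisjoint(chosen):   # no chosen neighbor yet
            chosen.add(v)
    return chosen

# path 0-1-2-3-4: scanning in this order picks {0, 2, 4}
mis = maximal_independent_set(range(5), [(0, 1), (1, 2), (2, 3), (3, 4)])
```

Every vertex outside the result has a chosen neighbor, so the set is maximal in the sense defined above.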

So, to coarsen a graph G = (V, E) to a new, smaller graph G' = (V', E'), we do the following:

1. Choose a maximal independent subset V' of V and use it as the vertex set of the new graph. (Any subset of V would do, but a maximal independent set is guaranteed to be evenly distributed over G.)

2. Grow domains in G around the vertices in V' and add an edge e'_ij to E' whenever the domains around v_i and v_j intersect.

Growing a domain around a vertex is done by repeatedly adding all neighbors of vertices already in the domain. If the starting set is maximal independent, two cycles of domain growth are enough to cover the whole graph and therefore to determine E'. In that case two vertices in V' are connected exactly if there exists a sufficiently short path connecting the corresponding vertices in V. It is also easy to see that the resulting graph is connected if and only if the original one was connected as well. Note that, contrary to what is stated in the original paper, two growth cycles are enough, since (in the notation used there) G^[k] shortcuts all paths of length 2^k instead of the claimed k.

Finding an initial guess

Having calculated the Fiedler vector f' of the smaller graph, we now face the problem of how to use it to approximate the Fiedler vector f of the bigger graph. Here we use our knowledge about the relationship between G and G'. First we assign the values f'_i to the appropriate elements of f. In mathematical terms: let m be a mapping between the vertices of the condensed graph and the vertices of G, such that m(i) is the index of the vertex in V that corresponds to the i-th vertex in V'. Then let

    f_{m(i)} = f'_i   for i = 1, ..., |V'|.

Finally, for the components j of f corresponding to vertices in V \ V', we assign the average of the values corresponding to those vertices in V' adjacent to v_j. The mathematical formulation of this is

    f_j = ( Σ_{i : v_i ∈ V', e_ij ∈ E} f_i ) / ( Σ_{i : v_i ∈ V', e_ij ∈ E} 1 )   for all j with v_j ∈ V \ V'.

Since V' is maximal independent, every element in V \ V' has at least one neighbor in V'. With the above we have thus assigned a value to each component of f^(0).

One should note that this operation is similar to the prolongation operation in classical multigrid methods for linear systems.
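The two assignment rules above can be sketched as follows (the list-based representation and all names are our own):

```python
def initial_guess(f_coarse, m, n, adjacency):
    """Prolongate the coarse Fiedler vector: copy values for the vertices
    kept in the coarse graph, then average over coarse neighbors for the
    rest. m[i] is the fine-graph index of coarse vertex i; adjacency[j]
    lists the fine-graph neighbors of vertex j."""
    f = [None] * n
    coarse = set()
    for i, fi in enumerate(f_coarse):
        f[m[i]] = fi                       # f_{m(i)} = f'_i
        coarse.add(m[i])
    for j in range(n):
        if f[j] is None:                   # vertex dropped during coarsening
            vals = [f[k] for k in adjacency[j] if k in coarse]
            f[j] = sum(vals) / len(vals)   # nonempty since V' is maximal
    return f

# path 0-1-2-3-4 coarsened to the independent set {0, 2, 4}
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
f0 = initial_guess([-1.0, 0.0, 1.0], m=[0, 2, 4], n=5, adjacency=adj)
# f0 == [-1.0, -0.5, 0.0, 0.5, 1.0]
```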

Calculating the Fiedler vector using the estimate f^(0)

The last detail to settle is how to calculate the Fiedler vector f using the estimate f^(0). While the Lanczos algorithm is the usual favorite for extreme eigenpairs of symmetric sparse matrices, it does not effectively take advantage of an initial guess. Instead, the Rayleigh quotient iteration (RQI) was chosen to refine the initial guess f^(0). Since we already have a good approximation, and our accuracy demands are rather low, RQI will usually take only very few iterations to converge.

Algorithm: Rayleigh quotient iteration

Given an initial guess f^(0) for an eigenpair (λ, f), set v = f^(0) / ||f^(0)|| and repeat:
  θ = v^T L v
  Solve (L - θI) x = v
  v = x / ||x||
until θ is close enough to λ.
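A minimal dense sketch of RQI (we use a direct solve where a real implementation would use an iterative solver such as SYMMLQ; the example graph and all names are ours):

```python
import numpy as np

def rqi(L, f0, tol=1e-10, maxit=50):
    """Rayleigh quotient iteration for a symmetric matrix L, started
    from the initial guess f0."""
    v = f0 / np.linalg.norm(f0)
    theta = v @ L @ v
    for _ in range(maxit):
        try:
            x = np.linalg.solve(L - theta * np.eye(len(v)), v)
        except np.linalg.LinAlgError:
            break                          # theta hit an eigenvalue exactly
        v = x / np.linalg.norm(x)
        theta_new = v @ L @ v
        if abs(theta_new - theta) < tol:   # Rayleigh quotient has settled
            theta = theta_new
            break
        theta = theta_new
    return theta, v

# Laplacian of the path graph 0-1-2-3 and a rough Fiedler-vector guess
L = np.array([[ 1, -1,  0,  0],
              [-1,  2, -1,  0],
              [ 0, -1,  2, -1],
              [ 0,  0, -1,  1]], dtype=float)
theta, v = rqi(L, np.array([-1.5, -0.5, 0.5, 1.5]))
# theta converges to the Fiedler value 2 - sqrt(2)
```

Because the starting guess is already close, only a handful of iterations are needed, which is exactly the property the multilevel scheme exploits.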

A closer look at the RQI algorithm reveals that we have to solve a linear equation with L - θI in each step. Since this matrix is sparse and symmetric, but probably indefinite and ill-conditioned, the method of choice is the SYMMLQ iteration. The total number of matrix-vector operations in all RQI steps combined is generally quite small.

In the experiments described in the literature, MSB with RQI and SYMMLQ has turned out to be very efficient. Compared with normal spectral bisection using the Lanczos algorithm, it usually is at least an order of magnitude faster.

Algebraic Multilevel

In the MSB algorithm described above, the smaller graph was constructed solely out of the larger graph, and only then was the corresponding Laplacian matrix implicitly built and its Fiedler vector computed.

But the problem can also be approached in another way, more in tune with the classical multilevel algorithms for linear systems. That is, given a Laplacian matrix L ∈ R^{n×n}, we construct a restriction operator R ∈ R^{n'×n} and a prolongation operator P ∈ R^{n×n'} and study the relation between the prolongated solution of the smaller, restricted problem and the original problem. In his thesis, R. Van Driessche did just that. We will report some of his results in this section.

Again choosing a maximal independent subset of V, the matrix equivalent of the interpolation method above was used as prolongation operator P. Further, two restriction operators were chosen: R^P = P^T, and R^I, a simple injection operator with (R^I)_{ij} = δ_{m(i),j} for i = 1, ..., n' and j = 1, ..., n.

It turns out that using R^I leads to another eigenproblem with a matrix L' = R^I L P. L' can be interpreted as the Laplacian matrix of a new, smaller graph with weighted edges. This new graph can be shown to be connected, so we can repeat the procedure recursively.

R^P does not have these nice properties. Using it leads to a generalized eigenproblem. L' = R^P L P is again the Laplacian of a weighted graph, but that graph cannot be guaranteed to be connected, and the weights of the edges might not be positive. A repeated application may therefore not be possible. Finally, the solution of the generalized eigenproblem is computationally more expensive.

In the experiments, the prolongations v^I_i and v^P_i of the second, third, and fourth eigenvectors of the contracted graph were compared with the corresponding eigenvectors u_j of the original graph. This was done by examining the absolute values of the inner products of the normalized vectors. Notable results were:

- The prolongations v^I_i, v^P_i were quite rich in the corresponding u_i; that is, the absolute value of the inner product was quite big.

- The contributions of the other eigenvectors (i.e. u_j for j ≠ i) were much smaller most of the time.

This means that the prolonged eigenvectors are good approximations of the eigenvectors of the original graph and are thus good starting vectors for the Rayleigh quotient iteration.

Further findings included:

- The method using R^P delivered better eigenvector approximations. But this does not offset the more expensive solution of the generalized eigenproblem, and so the other method, using R^I, may often be faster in delivering the final result.

- The quality of the eigenvector approximation is not only determined by the eigenvalue approximation but also by the gap between its eigenvalue and the neighboring eigenvalues.

Other implementations of MSB use yet other methods of constructing the smaller meshes. Chaco, for example, uses one of the coarsening algorithms described in the next section, but reports similar final results, so it seems that the choice of the coarsening method is not critical.

Figure: Multilevel graph partitioning (coarsen graph, partition graph, project and refine partition)

Multilevel Partitioning

The multilevel spectral bisection algorithm described in the previous section has turned out to be very efficient. Since the multilevel approach obviously has a lot of merit, can it be applied in some other way as well?

In MSB we transfer approximate information about the second eigenvector of the original graph between levels, improve this information on each level, and finally use this eigenvector to partition the graph. A more direct way would be to simply transfer information about the partitions themselves between levels and improve the partitions on each level. And this is, in rough outline, the description of one of the most successful algorithms: multilevel graph partitioning.

Algorithm: Multilevel graph bisection

Function MLPartition(G)
  If G is small enough then
    Find a partition V_1, V_2 of G in some way
  else
    Construct a smaller approximation G'
    (V_1', V_2') = MLPartition(G')
    (V_1, V_2) = Projectpartitionup(V_1', V_2')
    (V_1, V_2) = Refinepartition(V_1, V_2)
  endif
  Return V_1, V_2

Figure: Coarsening a graph: (a) original graph with weights, (b) contracted graph

In the algorithm above we bisect the graph, but the same idea can be used to partition the graph into p parts at once. Actually, there is a whole family of slightly different multilevel graph partitioning algorithms, which differ in the three generic parts (construct smaller approximation, project partition up, and refine projected partition) and in the way the smallest graph is partitioned. We will describe some of the solutions implemented in software packages like Chaco, METIS, or WGPP.

All of the following coarsening algorithms lead to graphs with edge and vertex weights, so we might as well start with a weighted graph G = (V, E).

Coarsening the graph

In all of the following methods, the coarser graph is constructed in similar ways. First we find a maximal matching. A matching is a subset M ⊂ E of edges, no two of which share an endpoint. A matching is maximal if no other edge from E can be added.

Given the matching M, the new graph is constructed in the following way: for each edge e_ij ∈ M we contract its endpoints v_i and v_j into a new vertex v_n. Its weight equals the sum w_v(v_i) + w_v(v_j) of the weights of v_i and v_j. Its neighbors are the combined neighbors of v_i and v_j, with accumulated edge weights, i.e. w_e(e_nk) = w_e(e_ik) + w_e(e_jk), where the weight of nonexistent edges is assumed to be zero. The edge e_ij disappears. (See also the coarsening figure above.)
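The contraction step can be sketched as follows (the dictionary representation and all names are our own choices, not taken from any of the packages mentioned):

```python
def contract(vertex_w, edge_w, matching):
    """Contract each matched edge (i, j): j is merged into i, vertex and
    parallel-edge weights are accumulated, and matched edges disappear."""
    rep = {v: v for v in vertex_w}        # fine vertex -> coarse vertex
    for i, j in matching:
        rep[j] = i
    cv = {}
    for v, w in vertex_w.items():
        cv[rep[v]] = cv.get(rep[v], 0) + w          # w_v(v_i) + w_v(v_j)
    ce = {}
    for (i, j), w in edge_w.items():
        a, b = rep[i], rep[j]
        if a != b:                        # edges inside a matched pair vanish
            key = (min(a, b), max(a, b))
            ce[key] = ce.get(key, 0) + w  # accumulate parallel edges
    return cv, ce

# unit vertex weights and made-up edge weights on a 4-cycle
vw = {0: 1, 1: 1, 2: 1, 3: 1}
ew = {(0, 1): 1, (1, 2): 2, (2, 3): 3, (0, 3): 4}
cv, ce = contract(vw, ew, matching=[(0, 1), (2, 3)])
# cv == {0: 2, 2: 2} and ce == {(0, 2): 6}
```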

The main difference between the various implementations is the construction of the maximal matching. The simplest method, suggested by B. Hendrickson and R. Leland, is randomized matching (RM), which randomly selects fitting edges until the matching is maximal.

Algorithm: Randomized Matching (RM)

Unmark all vertices and set M = ∅
While there are unmarked vertices left:
  Randomly select an unmarked vertex v_i
  Mark v_i
  If v_i has unmarked neighbors:
    Randomly choose an unmarked neighbor v_j of v_i
    Mark v_j
    Add edge e_ij to M
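A runnable sketch of RM (adjacency lists and all names are our own representation choices):

```python
import random

def randomized_matching(vertices, adjacency, seed=0):
    """Randomized matching (RM): pick a random unmarked vertex, match it
    with a random unmarked neighbor (if any), repeat until all marked."""
    rng = random.Random(seed)
    unmarked = set(vertices)
    matching = []
    while unmarked:
        v = rng.choice(sorted(unmarked))
        unmarked.remove(v)                # mark v
        free = [u for u in adjacency[v] if u in unmarked]
        if free:
            u = rng.choice(free)
            unmarked.remove(u)            # mark v's partner
            matching.append((v, u))
    return matching

# path 0-1-2-3: whatever the random choices, the result is a matching
# (no shared endpoints) and it is maximal (every edge touches it)
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
m = randomized_matching(range(4), adj)
ends = [v for e in m for v in e]
```

Maximality is guaranteed: if both endpoints of some edge ended up unmatched, the one selected first would still have had an unmarked neighbor and would have been matched.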

If we look at the way the coarser graph is constructed, we notice that the edges in the matching disappear from the lower levels. The total edge weight of the coarser graph equals the edge weight of the original graph minus the weight of the edges in the matching. It has been suggested that, by decreasing the total edge weight of the coarser graph, one might lower its cutsize, and with it the cutsize of the original graph. Experiments confirmed this assumption and also showed that this choice accelerated the refinement process.

To achieve this, G. Karypis and V. Kumar use a slight modification of the RM algorithm called heavy edge matching (HEM). Instead of choosing a random neighbor in the algorithm above, they simply select the unmarked neighbor which is connected to v_i by the heaviest edge.
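A minimal HEM sketch (adjacency lists and edge weights keyed by sorted vertex pairs are our own representation choices):

```python
import random

def heavy_edge_matching(vertices, adjacency, edge_w, seed=0):
    """Heavy edge matching (HEM): visit unmarked vertices in random order,
    but match each one with the unmarked neighbor reached through the
    heaviest edge instead of a random neighbor."""
    rng = random.Random(seed)
    unmarked = set(vertices)
    matching = []
    while unmarked:
        v = rng.choice(sorted(unmarked))
        unmarked.remove(v)
        free = [u for u in adjacency[v] if u in unmarked]
        if free:
            u = max(free, key=lambda u: edge_w[(min(v, u), max(v, u))])
            unmarked.remove(u)
            matching.append((v, u))
    return matching

# path 0-1-2 with one light and one heavy edge
adj = {0: [1], 1: [0, 2], 2: [1]}
ew = {(0, 1): 1, (1, 2): 5}
m = heavy_edge_matching(range(3), adj, ew)
# on this path, exactly one edge can be matched
```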

The problem with this approach is that it might miss some heavy edges: if, in the figure below, vertex v_i is selected first, the lighter of the two edges will be selected, and consequently the heavier edge will not be. To avoid this, A. Gupta suggests sorting the edges by weight and then choosing the heaviest permissible edge first (ties are broken randomly). He calls this method heaviest edge matching. It is usually applied only in the later stages of the coarsening process, when the graphs are considerably smaller and thus the sorting is cheap. Furthermore, heaviest edge matching is particularly efficient when the graph has edges with very different weights; this weight imbalance will often appear after some coarsening steps.

These basic strategies can be coupled with other methods to further improve the coarser graph.

G. Karypis and V. Kumar try to decrease the average degree of the coarser graph, arguing that fewer edges lead to a higher average edge weight, which in turn will help the HEM to find heavier edges. This modified heavy edge matching is done by choosing the appropriate edge amongst all permissible edges with maximal weight.

Using his heavy-triangle matching (HTM), A. Gupta also tries to contract three vertices into one, again trying to maximize the weight of the connecting edges. Additionally, he tries to keep the weights of the coarsened vertices balanced by using a modified edge weight

    w_e*(e_ij) = w_e(e_ij) / (w_v(v_i) w_v(v_j)),

which also takes the vertex weights into account. For this, the graph is regarded as completely connected, with the additional edges having zero weight. This simplifies the bisection of the final graph.

Figure: Missing a heavy edge (vertices v_i, v_j, v_m)

Partitioning the smallest graph

Once the graph is coarse enough (usually when |V| falls below some small threshold, or when further coarsening steps only reduce the graph a little), it needs to be partitioned. To do this, one can use any of the other methods mentioned above. Since the graph to be partitioned is very small, efficiency is not important; one can even try several methods (or the same randomized method multiple times) and choose the best result. Methods normally used include graph growing, spectral bisection, or just Kernighan-Lin with random starting partitions.

But this very small graph will have vertices with sometimes very different weights, so it might be impossible to divide the vertices into p partitions of equal weight. And even if it is possible, doing so might cause a high cutsize. So it is advisable to allow some imbalance in order to improve the quality of the cut. This imbalance can then be reduced during the refinement steps.

The WGPP program uses a clever way to eliminate the need for additional code to partition the smallest graph. Since its coarsening method tries to keep the weights of the coarsened vertices balanced, it simply coarsens the graph down to p vertices and then uncoarsens and refines it up to q vertices (with p < q ≪ n). It then saves this partition, coarsens the graph again down to p vertices (remember that this coarsening is randomized), and repeats this process, always saving the currently best partition. (The quality of a partition should also take the partition-size imbalance into consideration.) Since q is small, the process can be cheaply repeated a couple of times, and the best result is chosen.

Projecting up and rening the partition

Given the partition of the coarser graph we now need to nd a go o d partition of the

ner graph First we pro ject up the partition of the coarser graph to the ner one

Each vertex of the ner graph can b e traced to exactly one vertex in the coarser graph

either it together with one or more other vertices is contracted into a new coarser

level vertex or it is simply copied to the coarser level So having partitioned the

coarser graph we can simply assign the vertices on the ner graph to the appropriate

partitions

But on the ner graph we now have more degrees of freedom to nd a b etter

partition with a lower cutsize so we need to rene the graph improving b oth the

cutsize and the balance since the partitions might b e of dierent weight

The renement is usually done by using one of the variants of the KerniganLin

algorithm describ ed in section The FiducciaMattheyses variant in par

ticular can easily b e mo died to improve the balance at the same time by adding

some term describing the imbalance into the costfunction used to nd the next ver

tex to move Other changes include using only one inner pass and considering only

b oundaryvertices
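To make the cost function concrete, here is an illustrative sketch of the Fiduccia-Mattheyses-style gain of moving a single vertex (the data layout and names are our own; the balance-aware variant described above would add an imbalance penalty to this gain):

```python
def cut_gain(v, side, adjacency, edge_w):
    """Fiduccia-Mattheyses-style gain: the decrease in cutsize if v moves
    to the other partition (external minus internal edge weight)."""
    g = 0
    for u in adjacency[v]:
        w = edge_w[(min(v, u), max(v, u))]
        g += w if side[u] != side[v] else -w
    return g

# 4-cycle 0-1-2-3 with one heavy edge crossing the cut {0, 1} | {2, 3}
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
ew = {(0, 1): 1, (1, 2): 3, (2, 3): 1, (0, 3): 1}
side = {0: 0, 1: 0, 2: 1, 3: 1}
g1 = cut_gain(1, side, adj, ew)   # moving 1 cuts (0,1) but uncuts (1,2): gain 2
g0 = cut_gain(0, side, adj, ew)   # moving 0 trades one cut edge for another: gain 0
```

A refinement pass would repeatedly move the boundary vertex with the best (combined) gain, which is why only boundary vertices need to be considered.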

Other methods

In the sections above we have described the more common methods for graph partitioning. But there are many other methods as well, both geometrically oriented and graph based, which we will only cite here. Some of the methods are specialized for finite element meshes for PDEs. Others have tried to apply general optimization techniques like simulated annealing or genetic algorithms to the partitioning problem. And finally, in addition to the software libraries mentioned above, there are other programs available as well, for example Jostle, Party, and Scotch.

Bibliography

G. Amdahl. The validity of the single processor approach to achieving large scale computing capabilities. In AFIPS Proc. of the SJCC.

S. T. Barnard, A. Pothen, and H. Simon. A spectral algorithm for envelope reduction of sparse matrices. Numerical Linear Algebra with Applications.

S. T. Barnard and H. D. Simon. A fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems. Concurrency: Practice and Experience, Feb.

S. T. Barnard. PMRSB: parallel multilevel recursive spectral bisection. In Proceedings of the ACM/IEEE Supercomputing Conference. ACM/IEEE. Published online and as CD.

A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. Academic Press.

M. W. Berry, B. Hendrickson, and P. Raghavan. Sparse matrix reordering schemes for browsing hypertext. In J. Renegar, M. Shub, and S. Smale, editors, The Mathematics of Numerical Analysis, Lectures in Applied Mathematics. AMS.

W. L. Briggs. A Multigrid Tutorial. SIAM.

T. F. Chan, J. R. Gilbert, and S.-H. Teng. Geometric spectral partitioning. Technical Report, Xerox PARC, Jan.

T. F. Chan, J. P. Ciarlet, and W. K. Szeto. On the optimality of the median cut spectral bisection graph partitioning method. SIAM Journal on Scientific Computing, May.

T. F. Chan and W. K. Szeto. A sign cut version of the recursive spectral bisection graph partitioning algorithm. In J. G. Lewis, editor, Proceedings of the SIAM Conference on Applied Linear Algebra, Snowbird, UT. SIAM.

G. Chartrand and O. R. Oellermann. Applied and Algorithmic Graph Theory. International Series in Pure and Applied Mathematics. McGraw-Hill, New York.

P. G. Ciarlet and J. L. Lions. Handbook of Numerical Analysis: Finite Element Methods. North-Holland, Amsterdam.

H. L. de Cougny, D. D. Devine, J. E. Flaherty, R. M. Loy, C. Özturan, and M. S. Shephard. Load balancing for the parallel adaptive solution of partial differential equations. Applied Numerical Mathematics.

R. Diekmann, B. Monien, and R. Preis. Using helpful sets to improve graph bisections. Technical Report, Fak. f. Informatik, Universität-GH Paderborn, June.

R. Diekmann, B. Monien, and R. Preis. Load balancing strategies for distributed memory machines. Technical Report, SFB, Universität-GH Paderborn, Sept.

J. Falkner, F. Rendl, and H. Wolkowicz. A computational study of graph partitioning. Mathematical Programming. Also University of Waterloo Tech-Report CORR.

C. Farhat. A simple and efficient automatic FEM domain decomposer. Computers and Structures.

C. M. Fiduccia and R. M. Mattheyses. A linear time heuristic for improving network partitions. Technical Report, General Electric Co., Corporate Research and Development Center, Schenectady, NY, May.

M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical Journal.

M. Fiedler. A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslovak Mathematical Journal.

G. C. Fox, R. D. Williams, and P. C. Messina. Parallel Computing Works. Morgan Kaufmann Publishers.

R. Freund and N. Nachtigal. QMR: a quasi-minimal residual method for non-Hermitian linear systems. Numerische Mathematik.

M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman.

J. R. Gilbert, G. L. Miller, and S.-H. Teng. Geometric mesh partitioning: implementation and experiments. Technical Report, Xerox PARC. To appear in SIAM J. Sci. Comp.

G. H. Golub and C. F. van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore and London, third edition.

A. Gupta. WGPP: Watson Graph Partitioning (and sparse matrix ordering) Package. IBM Research Division, T. J. Watson Research Center, Yorktown Heights, New York.

A. Gupta. Graph partitioning based sparse matrix orderings for interior point algorithms. IBM Journal of Research and Development.

S. Hammond. Mapping unstructured grid computations to massively parallel computers. PhD thesis, Rensselaer Polytechnic Institute, Dept. of Computer Science, Rensselaer, NY.

B. Hendrickson and R. Leland. Multidimensional spectral load balancing. In Proc. SIAM Conf. Par. Proc. Sci. Comput.

B. Hendrickson and R. Leland. Multidimensional spectral load balancing. Technical Report, Sandia National Laboratories. Also in Proc. SIAM Conf. Par. Proc. Sci. Comput.

B. Hendrickson and R. Leland. A multilevel algorithm for partitioning graphs. Technical Report, Sandia National Laboratories. Appeared in Proc. Supercomputing.

B. Hendrickson and R. Leland. An empirical study of static load balancing algorithms. In Proceedings of the Scalable High Performance Computing Conference. IEEE.

B. Hendrickson and R. Leland. The Chaco User's Guide. Sandia National Laboratories, Albuquerque, NM.

B. Hendrickson and R. Leland. An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM Journal on Scientific Computing.

R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press.

M. Jerrum and G. Sorkin. Simulated annealing for graph bisection. Technical Report ECS-LFCS, Dept. of Computer Science, University of Edinburgh.

D. S. Johnson, C. R. Aragon, L. A. McGeoch, and C. Schevon. Optimization by simulated annealing: an experimental evaluation; part: graph partitioning. Operations Research, Nov.

G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar. Multilevel partitioning: applications in VLSI domain. Technical report, University of Minnesota, Department of Computer Science. Short version in Design Automation Conference.

G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. Technical Report TR, University of Minnesota, Department of Computer Science, July. To appear in SIAM J. Sci. Comp.

G. Karypis and V. Kumar. METIS: Unstructured Graph Partitioning and Sparse Matrix Ordering System. Aug.

G. Karypis and V. Kumar. Parallel multilevel k-way partitioning scheme for irregular graphs. Technical Report TR, University of Minnesota, Department of Computer Science.

B. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal.

V. Kumar, A. Grama, A. Gupta, and G. Karypis. Introduction to Parallel Computing. Benjamin/Cummings, Redwood City, CA.

L.-T. Liu, M.-T. Kuo, S.-C. Huang, and C.-K. Cheng. A gradient method on the initial partition of Fiduccia-Mattheyses algorithm. In IEEE/ACM Int. Conf. on Computer-Aided Design. IEEE/ACM, Nov.

R. Merris. Laplacian matrices of graphs: a survey. Linear Algebra and its Applications.

G. L. Miller, S.-H. Teng, W. Thurston, and S. A. Vavasis. Automatic mesh partitioning. In A. George, J. R. Gilbert, and J. W. Liu, editors, Graph Theory and Sparse Matrix Computation, IMA Volumes in Mathematics and its Applications. Springer.

G. L. Miller, S.-H. Teng, W. Thurston, and S. A. Vavasis. Geometric separators for finite element meshes. Technical report, University of Minnesota. To appear in SIAM J. Sci. Comp.

C. C. Paige and M. A. Saunders. Solution of sparse indefinite systems of linear equations. SIAM Journal on Numerical Analysis.

F. Pellegrini. SCOTCH User's Guide. Université Bordeaux I, Aug.

A. Pothen, E. Rothberg, H. Simon, and L. Wang. Parallel sparse Cholesky factorization with spectral nested bisection ordering. Technical Report RNR, Numerical Aerospace Simulation Facility, NASA Ames Research Center, May.

A. Pothen, H. Simon, and K.-P. Liou. Partitioning sparse matrices with eigenvectors of graphs. SIAM Journal on Matrix Analysis and Applications, July.

A. Pothen, H. Simon, and L. Wang. Spectral nested dissection. Technical Report RNR, NASA Ames Research Center. Short version in Supercomputing, Nov.

A. Pothen. Graph partitioning algorithms with applications to scientific computing. In D. E. Keyes, A. H. Sameh, and V. Venkatakrishnan, editors, Parallel Numerical Algorithms. Kluwer Academic Press.

R. Preis and R. Diekmann. The PARTY Partitioning Library: User Guide. SFB, Univ. Paderborn.

F. Rendl and H. Wolkowicz. A projection technique for partitioning the nodes of a graph. Annals of Operations Research.

E. Rothberg and B. Hendrickson. Sparse matrix ordering methods for interior point linear programming. Technical Report, Sandia National Laboratories, Jan.

Y. Saad and M. Schultz. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing.

Y. Saad. Iterative Methods for Sparse Linear Systems. PWS Publishing Company, Boston, MA.

H. Simon and S.-H. Teng. How good is recursive bisection? SIAM Journal on Scientific Computing, Sept.

Computing Sept

D Vanderstraeten and R Keunings Optimized partitioning of unstructured nite

element meshes International Journal for Numerical Methods in Engineering

R Van Driessche Algorithms for static and dynamic Load Balancing on Parallel Com

puters PhD thesis KU Leuven Nov

C Walshaw and M Berzins Dynamic load balancing for PDE solvers on adaptive

unstructured meshes Research Rep ort Series University of Leeds School of Com

puter Studies Dec

C Walshaw MCross M G Everett SJohnson and K McManus Partitioning

mapping of unstructured meshes to parallel machine top ologies In A Ferreira and

J Rolim editors Irregular Parallel Algorithms for Irregularly Structured Problems

volume of LNCS pages Springer

C. Walshaw, M. Cross, M. G. Everett, and S. Johnson. JOSTLE: partitioning of unstructured meshes for massively parallel machines. In Parallel Computational Fluid Dynamics: New Algorithms and Applications. Elsevier, Amsterdam.

C. Walshaw, M. Cross, and M. G. Everett. Dynamic mesh partitioning: a unified optimisation and load-balancing algorithm. Technical Report IM, School of Mathematics, Statistics & Computer Science, University of Greenwich.

C. Walshaw, M. Cross, and M. G. Everett. A localised algorithm for optimising unstructured mesh partitions. International Journal of Supercomputer Applications.

C. Walshaw, M. Cross, and M. G. Everett. A parallelisable algorithm for optimising unstructured mesh partitions. Technical Report IM, School of Mathematics, Statistics & Computer Science, University of Greenwich, Jan.

R. D. Williams. Performance of dynamic load balancing algorithms for unstructured mesh calculations. Concurrency.