Random Sampling and Greedy Sparsification for

Matroid Optimization Problems

David R. Karger

MIT

Laboratory for Computer Science

karger@lcs.mit.edu

Abstract

Random sampling is a powerful tool for gathering information about a group by considering only a small part of it. We discuss some broadly applicable paradigms for using random sampling in combinatorial optimization, and demonstrate the effectiveness of these paradigms for two optimization problems on matroids: finding an optimum matroid basis and packing disjoint matroid bases. Applications of these ideas to graphs led to fast algorithms for minimum spanning trees and minimum cuts.

An optimum matroid basis is typically found by a greedy algorithm that grows an independent set into the optimum basis one element at a time. This continuous change in the independent set can make it hard to perform the independence tests needed by the greedy algorithm. We simplify matters by using sampling to reduce the problem of finding an optimum matroid basis to the problem of verifying that a given fixed basis is optimum, showing that the two problems can be solved in roughly the same time.

Another application of sampling is to packing matroid bases, also known as matroid partitioning. Sampling reduces the number of bases that must be packed. We combine sampling with a greedy packing strategy that reduces the size of the matroid. Together, these techniques give accelerated packing algorithms. We give particular attention to the problem of packing spanning trees in graphs, which has applications in network reliability analysis. Our results can be seen as generalizing certain results from random graph theory. The techniques have also been effective for other packing problems.

Introduction

Arguably the central concept of statistics is that of a representative sample. It is often possible to gather a great deal of information about a large population by examining a small sample randomly drawn from it. This has obvious advantages in reducing the investigator's work, both in gathering and in analyzing the data.

We apply the concept of a representative sample to combinatorial optimization. Given an optimization problem, it may be possible to generate a small representative subproblem by random sampling. Intuitively, such a subproblem may form a microcosm of the larger problem. In particular, an optimum solution to the subproblem may be a nearly optimum solution to the problem as a whole. In some situations, such an approximation might be sufficient. In other situations, it may be relatively easy to improve this good solution to a truly optimum solution.



Some of this work was done at Stanford University, supported by National Science Foundation and Hertz Foundation Graduate Fellowships and an NSF Young Investigator Award CCR, with matching funds from IBM, Schlumberger Foundation, Shell Foundation, and Xerox Corporation. Research supported in part by ARPA contract N and NSF contract CCR.

We show that these desirable properties apply to two problems involving matroids: matroid optimization, which aims to find a minimum cost matroid basis, and basis packing, which aims to find a maximum set of pairwise-disjoint matroid bases.

An early algorithmic use of sampling was in a median-finding algorithm due to Floyd and Rivest [FR]. Their algorithm selects a small random sample of elements from the set; they prove that inspecting this sample gives a very accurate approximation to the value of the median. It is then easy to find the actual median by examining only those elements close to the estimate. This algorithm uses fewer comparisons in expectation than any other known median-finding algorithm.

Examination of this algorithm reveals a paradigm involving three parts. The first is a definition of a randomly sampled subproblem. The second is an approximation theorem that describes how a solution to the subproblem approximates a solution to the original problem. These two components by themselves will typically yield an obvious approximation algorithm with a speed-accuracy tradeoff. The third component is a refinement algorithm that takes the approximate solution and turns it into an exact solution. Combining these three components can yield an algorithm whose running time will be determined by the refinement algorithm; intuitively, refinement should be easier than computing a solution from scratch.

This approach has been applied successfully to several graph optimization problems. A preliminary version of this paper [Kar] gave a sampling-based algorithm for the minimum spanning tree problem, after which Klein and Tarjan [KT] showed that a version of it ran in linear time (a merged journal version appeared in [KKT]). This author [Kara, Kar, BK, Karb] used these sampling techniques in fast algorithms for finding and approximating minimum cuts and maximum flows in undirected graphs.

In this paper we extend our techniques from graphs to matroids. Besides increasing our understanding of what is important in the graph sampling results, our matroid sampling techniques have additional applications, in particular to packing spanning trees.

Matroids

We give a brief discussion of matroids and the ramifications of our approach to them. An extensive discussion of matroids can be found in [Wel].

The matroid is an abstraction that generalizes both graphs and vector spaces. A matroid M consists of a finite ground set M, of which some subsets are declared to be independent. The independent sets must satisfy three properties:

The empty set is independent.

All subsets of an independent set are independent.

If U and V are independent and ‖U‖ > ‖V‖, then some element of U can be added to V to yield an independent set.

This definition clearly generalizes the notion of linear independence in vector spaces [VDW]. However, it was noted [Whi] that matroids also generalize graphs: in the graphic matroid, the edges of the graph form the ground set, and the independent sets are the acyclic sets of edges (forests). Maximal independent sets of a matroid are called bases: bases in a vector space are the standard ones, while the bases in a graph are the spanning forests (spanning trees if the graph is connected). In the matching matroid of a graph [Law], bases correspond to the endpoints of maximum matchings.
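As a concrete illustration of the graphic matroid (our own sketch, not taken from the paper), independence of an edge set can be tested with a union-find structure, since a set of edges is independent exactly when it is acyclic:

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # endpoints already connected: the edge closes a cycle
        self.parent[rx] = ry
        return True

def is_independent(edges, n):
    """True iff `edges` (pairs of vertices in range(n)) form a forest."""
    uf = UnionFind(n)
    return all(uf.union(u, v) for (u, v) in edges)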

Matroids have a rich structure and are the subject of much study in their own right [Wel]. Matroid theory is used to solve problems in electrical circuit analysis and structural rigidity [Rec]. A discussion of many optimization problems that turn out to be special cases of matroid problems can be found in [LR].

Matroid Optimization

In computer science, perhaps the most natural problem involving matroids is matroid optimization. If a weight is assigned to each element of a matroid, and the weight of a set is defined as the sum of its elements' weights, the optimization problem is to find a basis of minimum weight. The minimum spanning tree problem is the matroid optimization problem on the graphic matroid. Several other problems can also be formulated as instances of matroid optimization [CLR, Law, LR].

Edmonds [Edm] observed that the matroid optimization problem can be solved by the following natural greedy algorithm: begin with an empty independent set I and consider the matroid elements in order of increasing weight, adding each element to I if doing so will keep I independent. Applying the greedy algorithm to the graphic matroid yields Kruskal's algorithm [Kru] for minimum spanning trees: grow a forest by repeatedly adding to the forest the minimum weight edge that does not form a cycle with edges already in the forest. An interesting converse result [Wel] is that if a family of sets does not satisfy the matroid independent set properties, then there is an assignment of weights to the elements for which the greedy algorithm will fail to find an optimum set in the family.
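The generic greedy algorithm is short enough to state in code. The following sketch is ours and assumes an independence oracle on element sets; instantiated with the graphic-matroid oracle above, it is exactly Kruskal's algorithm:

def greedy_optimum_basis(elements, weight, is_independent):
    """Generic matroid greedy algorithm: scan elements by increasing
    weight, keeping each one that preserves independence.
    `is_independent` is an assumed oracle on sets of elements."""
    basis = []
    for e in sorted(elements, key=weight):
        if is_independent(basis + [e]):
            basis.append(e)
    return basis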

The greedy algorithm has two drawbacks. First, the elements of the matroid must be examined in order of weight. Thus the matroid elements must be sorted, adding an O(m log m) term to the running time of the greedy algorithm on an m-element matroid. Second, the independent set under construction is constantly changing, so that the problem of determining independence of elements is a dynamic one.

Contrast the optimization problem with that of verifying the optimality of a given basis. For matroids, all that is necessary is to verify that no single element of the matroid M improves the basis. Thus in verification, the elements of M can be examined in any order. Furthermore, the basis that must be verified is static. Extensive study of dynamic algorithms has demonstrated that they tend to be significantly more complicated than their static counterparts; in particular, algorithms on a static input can preprocess the input so as to accelerate queries against it.

Applying the sampling paradigm, we show below how an algorithm for verifying basis optimality can be used to construct an optimum basis. The reduction is very simple, and suggests that the best way to develop a good optimization algorithm for a matroid is to focus attention on developing a good verification algorithm.

Basis Packing

Packing problems are common in combinatorial optimization. Some well studied problems that can be seen as packing problems include the maximum flow problem (packing paths between a source and a sink), computing graph connectivity (equivalent to packing directed spanning trees, as discussed by Gabow [Gab]), matching (packing disjoint edges), and bin packing. Schrijver [Sch] discusses packing problems in numerous settings.

Many packing problems have the following form: given a collection of implicitly defined feasible sets over some universe, find a maximum collection of disjoint feasible sets. Random sampling applies to packing problems. A natural random subproblem is defined by specifying a random subset of the universe and asking for a packing of feasible sets within the random subset. Furthermore, if we partition the universe randomly into two sets, defining two subproblems, the solutions to the two subproblems can be merged to give an approximate solution to the whole problem.

Below we discuss the problem of packing matroid bases, i.e., finding a maximum set of disjoint bases in a matroid. This problem arises in the analysis of electrical networks and in the analysis of the rigidity of physical structures (see [LR] for details). Edmonds [Edm] gave an early algorithm for the problem. A simpler algorithm was given by Knuth [Knu]. Faster algorithms exist for the special case of the graphic matroid [GW, Bar], where the problem is to find a maximum collection of disjoint spanning trees; this particular problem is important in network reliability analysis (see, for example, Colbourn's book [Col]).

We then apply random sampling to the basis packing problem. Let the packing number of a matroid be the maximum number of disjoint bases in it. We show that a random sample of a 1/k fraction of the elements from a matroid with packing number n has a packing number tightly distributed around n/k. This provides the approximation results needed to apply sampling.

Random sampling lets us reduce algorithms' dependence on the packing number k. A complementary greedy packing scheme, generalizing Nagamochi and Ibaraki's sparse certificate technique [NI] for graphs, lets us reduce the running time dependence on the size of the matroid; we describe this scheme in a later section.

We then show how combined sampling and greedy packing yields approximate basis packing algorithms whose running times, aside from a linear term, depend only on the rank of the input matroid and not its size, and we give an application of this combination to the problem of packing spanning trees in a graph.

Related Work

Polesskii [Pol] showed a bound on the likelihood that a random sample of the elements of a matroid contains a basis, as a function of the number of disjoint bases in the matroid. His result is related to our study of matroid basis packing below.

Our packing techniques have had some applications to min-cuts and max-flows in undirected graphs. Among our sampling-based results for an m-edge, n-vertex undirected graph are:

A randomized algorithm to find minimum cuts in Õ(m) time [Kar].

A randomized algorithm to find a packing of c directed spanning trees in Õ(m√c) time, extending Gabow's deterministic O(mc)-time algorithm, together with some faster maximum flow algorithms [Kara].

A randomized algorithm for approximating s-t min-cuts in Õ(n²) time [BK].

Our matroid algorithms generalize some of the graph algorithms presented there.

Reif and Spirakis [RS] studied random matroids; their results generalized existence proofs and algorithms for Hamiltonian paths and perfect matchings in random graphs. Their approach was to analyze the average case behavior of matroid algorithms on random inputs, while our goal is to develop randomized algorithms that work well on all inputs.

Unlike the problem of counting or constructing disjoint bases, the problem of counting the total number of bases in a matroid is hard. This #P-complete generalization of the problem of counting perfect matchings in a graph has been addressed recently; see for example [FM].

Preliminaries and Definitions

Throughout this paper, log m denotes log₂ m.

For any set A, ‖A‖ denotes the cardinality of A. The complement of A with respect to a known universe is denoted by Ā.

Our algorithms rely on random sampling, typically from some universe U.

Definition: For a fixed universe U, U(p) is a set generated from U by including each element independently with probability p. For A ⊆ U, A(p) = A ∩ U(p).

Note that we are overloading the definition of A(p): it reflects the outcome of the original sampling experiment on U, and not of a new sampling experiment performed only on the elements of A. Which meaning is intended will be clear from context.

Consider a statement that refers to a variable n. We say that the statement holds with high probability (w.h.p.) in n if, for any constant d, there is a setting of the constants in the statement (typically hidden by O-notation) such that the probability that the statement fails to hold is O(n^{-d}).

Õ(f) denotes O(f polylog f).

Randomized Algorithms

Many algorithms described in this paper are randomized. We assume these algorithms can call a subroutine RBIT that in O(1) time returns a random bit that is independent of all other bits and unbiased (0 or 1 with equal probability). The outcomes of calls to RBIT define a probability space over which we measure the probability of events related to the algorithm's execution. When we say that an algorithm works with high probability without reference to a variable, the implicit parameter is the size of our problem input: in our case, the number of elements in the matroid.

Although the model allows us to generate only unbiased random bits, a standard construction [KY] lets us use RBIT to generate random bits that are 1 with any specified probability. The expected time to generate such a bit is constant, and the time to generate a sequence of more than log n such bits is amortized linear in the number of bits, with high probability in n. All algorithms in this paper use enough random bits to allow this amortization to take place, so we assume throughout that generating a biased random bit takes constant time.
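A minimal sketch of one standard way to produce a biased bit from fair coin flips, in the spirit of (though not necessarily identical to) the [KY] construction; the helper names are ours:

import random

def rbit():
    # Stand-in for the paper's RBIT primitive: one unbiased random bit.
    return random.getrandbits(1)

def biased_bit(p):
    """Return 1 with probability p using only fair coin flips.

    Compares the binary expansion of p against a stream of fair bits;
    the expected number of flips is 2, illustrating the constant
    expected time claimed in the text."""
    while True:
        p *= 2
        bit = 1 if p >= 1 else 0   # next binary digit of p
        if p >= 1:
            p -= 1
        r = rbit()
        if r < bit:                # r = 0, digit = 1: below p, output 1
            return 1
        if r > bit:                # r = 1, digit = 0: above p, output 0
            return 0
        # r equals the digit: undecided, examine the next digit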

Randomized algorithms can fail to be correct in two ways. A time-T(n) Monte Carlo (MC) algorithm runs in time T(n) with certainty on problems of size n, and gives the correct answer with high probability in n. A time-T(n) Las Vegas (LV) algorithm always gives the correct answer, but only runs in time T(n) with high probability. Any Las Vegas algorithm can be turned into a Monte Carlo algorithm with the same time bounds: simply give an arbitrary answer if the algorithm takes too long. However, there is no general scheme for turning a Monte Carlo algorithm into a Las Vegas one, since it may not be possible to check whether the algorithm's answer is correct. Our theorems claiming time bounds will append MC or LV to clarify which type of algorithm is being used.

The Chernoff Bound

A basic tool in our analysis is the Chernoff bound [Che].

Lemma (Chernoff [Che, MR]): Let X be a sum of Bernoulli random variables on n independent trials with success probabilities p₁, …, pₙ and expected value μ = Σᵢ pᵢ. Then for ε ≤ 1,

Pr[X < (1 - ε)μ] ≤ e^{-ε²μ/2}

and

Pr[X > (1 + ε)μ] ≤ e^{-ε²μ/3}.

In particular, if ε²μ = Ω(log n), then X = (1 ± ε)μ with high probability in n.

Matroids

We use several standard matroid definitions; details can be found in [Wel].

Definition: The rank of S ⊆ M, written rk(S), is the size of any largest independent set in S.

Definition: A set of elements X spans an element e if rk(X ∪ {e}) = rk(X). Otherwise, e is independent of X.

Definition: The closure of X in M, cl(X, M), is the set of all elements in M spanned by X. If M is understood from context, we write cl(X).

Definition: A set is closed if it contains all the elements it spans.

Matroid Constructions

Any subset S ⊆ M can itself be thought of as a matroid: its independent sets are all the independent sets of M contained in S. Its rank as a matroid is the same as its rank as a subset.

Definition: Let A be any independent set of M. The quotient matroid M/A is defined as follows:

Its elements are all elements of M independent of A.

A set S is independent in M/A if and only if S ∪ A is independent in M.

Note that the elements of M/A are simply the complement of cl(A). The rank of M/A, denoted rk^C(M/A), is rk(M) - rk(A). In general, the rank of a set in M/A can be different from its rank in M.

Definition: If B is any subset of M, M/B denotes the quotient M/A, where A is any basis of B (this is independent of the choice of A).

Finding a Minimum Cost Basis

In this section we address the problem of finding a minimum cost basis in a matroid. We consider sampling at random from a weighted matroid, that is, a matroid in which each element has been assigned a real-valued weight or cost, and formalize the following intuition: a random sample is likely to contain a nearly minimum cost basis. This shows that if we need to find a good, but not necessarily optimum, basis, it suffices to choose a small random sample from the matroid and construct its optimum basis. This gives an obvious improvement in the running time; we will make the speed-accuracy tradeoff precise. Extending this idea, once we have a good basis, we can use a verification algorithm to identify the elements that improve it, preventing it from actually being optimum. We then show how this information can be used to quickly identify the optimum basis, thus reducing the problem of constructing an optimum basis to that of verifying one.

Definitions

We proceed to formalize our ideas. Fix attention on an m-element, rank-r matroid M. Given S ⊆ M, let OPT(S) denote the optimum basis of the set S. By ordering same-weight elements lexicographically, we can assume without loss of generality that all element weights are unique, implying that the optimum basis is unique. Suppose we have a (not necessarily minimum weight or full rank) independent set T of M. We say that an element e improves T if e ∈ OPT(T ∪ {e}). In other words, an element improves an independent set if adding it allows one to decrease the cost of, or enlarge, that independent set. We have an equivalent alternative formulation: e improves T if and only if the elements of T weighing less than e do not span e. The proof that these definitions are equivalent can be found in any discussion of optimum bases (e.g., [Wel]).
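The alternative formulation translates directly into an improvement test. The sketch below is ours and assumes a rank oracle `rank` on sets of matroid elements:

def improves(e, T, weight, rank):
    """Does element e improve independent set T?
    Per the alternative formulation: e improves T iff the elements of T
    lighter than e do not span e."""
    lighter = [t for t in T if weight(t) < weight(e)]
    return rank(lighter + [e]) > rank(lighter)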

It is convenient for the discussion, and follows from the definition, that the elements of any independent set T improve T. It is easy to prove, but not necessary for our discussion, that OPT(M) is the only independent set with no improving elements other than its own. Furthermore, the optimum basis elements are the only elements that improve all bases. This allows a natural concept of approximation of minimum bases.

Definition: An independent set is k-optimum if at most kr elements improve it.

Our definition diverges from the more common approach of measuring approximation by the ratio of the output value to the optimum. This change is natural since, in the comparison model of computation in which all of our results apply, any approximation algorithm with a bounded relative-cost performance guarantee will necessarily be an exact algorithm. This is because the optimum basis depends only on the order, and not on the values, of the elements of the matroid.

Note that our definitions apply to independent sets of any rank, not just bases.

A Sampling Theorem

In this section we determine the quality of an approximately optimum basis found by random sampling. Let M(p) denote a random submatroid of M constructed by including each element of M independently with probability p. We proceed to prove that, with high probability, OPT(M(p)) is O(1/p)-optimum in M. This is a generalization of a result for graphs appearing in [KKT]. In preliminary versions of this paper [Kar] we proved the weaker claim that OPT(M(p)) is O((log n)/p)-optimum; Klein and Tarjan [KT] gave an improved analysis for minimum spanning trees that immediately generalizes to matroids, as follows.

Theorem: The expected number of elements improving OPT(M(p)) is at most r/p. For any constant β ≥ 1, the probability that more than βr/p elements improve OPT(M(p)) is at most

e^{-(β-1)²r/(2β)}.

Corollary: OPT(M(p)) is O(1/p)-optimum in M, with high probability in r.

Proof: Given a matroid M and fixing p, let T denote the random set OPT(M(p)). Let V denote the random set of elements of M that improve T but are not in T. Intuitively (and in our algorithms), we determine V by constructing M(p), finding its optimum basis, and then identifying the improving elements. However, to prove the theorem, we interleave the sampling process that constructs M(p) with the process of finding T and V, using the standard greedy algorithm for matroids.

The proof uses a variant of the standard greedy algorithm to find the optimum basis and the improving elements of the sample. Recall that the greedy algorithm examines elements in order of increasing weight, and adds an element to the growing optimum basis if and only if that element is not spanned by the smaller elements that have already been examined. We modify this algorithm to interleave the sampling process, determining T and V simultaneously. Consider the algorithm GreedySampling in the figure below.

Input: A matroid M with m elements.
Output: An instance of M(p),
  its optimum basis T, and
  the elements V of M that improve T (but are not in T).

Order the elements e_i by increasing weight.
Let T_0 = V_0 = ∅.
for i = 1 to m
  if e_i is spanned by T_{i-1} then
    discard e_i (set T_i = T_{i-1} and V_i = V_{i-1})
  else
    Flip a coin to sample e_i with probability p.
    if e_i is sampled then
      V_i = V_{i-1}
      T_i = T_{i-1} ∪ {e_i}
    else
      V_i = V_{i-1} ∪ {e_i}
      T_i = T_{i-1}
return T = T_m and V = V_m.

Figure: GreedySampling
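For concreteness, here is a direct Python transcription of GreedySampling (our sketch; `is_independent` is an assumed independence oracle, and since elements are scanned in weight order, "spanned by the lighter sampled elements" is the same as "spanned by T"):

import random

def greedy_sampling(elements, weight, is_independent, p):
    """Interleaves the p-sampling of M with the greedy algorithm,
    returning the sample's optimum basis T and the improving elements V.
    Coins are flipped only for elements not already spanned by T."""
    T, V = [], []
    for e in sorted(elements, key=weight):
        if not is_independent(T + [e]):
            continue              # spanned by lighter sampled elements: discard
        if random.random() < p:
            T.append(e)           # sampled: joins the optimum basis of M(p)
        else:
            V.append(e)           # not sampled, but improves T
    return T, V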

Note first that if we delete the lines involving V from the algorithm, we are left with the standard greedy algorithm running on the random matroid M(p), except that we do not bother flipping coins for elements that we know cannot be in the optimum basis. The only change is that, instead of performing all the coin flips for the elements before running the algorithm, we flip the coins only when they are needed. This does not affect the outcome, since we could just as easily flip all the coins in advance but keep the results secret until we needed them (this deferred decision principle is discussed in [MR]). Thus we know that the output T is indeed the optimum basis of the sampled matroid.

We next argue that the output V ∪ T is indeed the set of improving elements. To see this, observe that every element e_i is examined by the algorithm. There are two cases, enumerated in the algorithm, to consider. If e_i is spanned by T_{i-1}, then, since all elements of T_{i-1} are smaller than e_i, and since T_{i-1} ⊆ T, by definition e_i does not improve T. If e_i is not spanned by T_{i-1}, then e_i is placed in one of T_i or V_i and is thus part of the output. Thus all improving elements are in V ∪ T. It is also clear that only improving elements are in V ∪ T: this follows from the fact that any element in V ∪ T is not spanned by the elements smaller than itself in the sample.

It remains to bound the value of s = ‖V ∪ T‖. Consider the s elements for which we flip coins in the algorithm. Call each coin flip a success if it makes us put an element in T (probability p) and a failure if it makes us put an element in V. The number of improving elements is simply s, the number of coins flipped. Note as well that, since T is independent, the number of successes is at most r. The question is therefore: if the success probability is p, how many coins will be flipped before we have r successes? This formulation defines the well-known negative binomial distribution [Fel, Mul]. The expected number of coin flips needed is r/p. Furthermore, the probability that more than βr/p coin flips are required is equal to the probability that, of the first βr/p coin flips, fewer than r successes occur. This probability is exponentially small in r by the Chernoff bound.

Remark: Note that we cannot say that the expected number of improving elements is exactly r/p, since our experiment above may terminate before we have r successes; this will occur if M(p) does not have full rank. Thus we can only give an upper bound.

Optimizing by Verifying

We use the results of the previous section to reduce the problem of constructing the optimum basis to the problem of verifying a basis, i.e., determining which elements improve it. Suppose that we have two algorithms available: a construction algorithm that takes a matroid with m elements and rank r and constructs its optimum basis in time C(m, r), and a verification algorithm that takes an m-element, rank-r matroid M and an independent set T and determines which elements of M improve T in time V(m, r). We show how to combine these two algorithms to yield a more efficient construction algorithm when V is faster than C.

We begin with a simple observation we will use to simplify some bounds in our sampling-based algorithm.

Lemma: Without loss of generality, C(m, r) = O(m log m + m V(r, r)).

Proof: We give a construction algorithm that uses the verifier as a subroutine; it is an implementation of the standard greedy algorithm. First we sort the elements of M in O(m log m) time. Then we build the optimum basis incrementally. We maintain an independent set B, initially empty. We go through the elements e of M in increasing order, asking the verifier to verify the independent set B against the matroid B ∪ {e}. If the verifier says that e improves B, it means that B ∪ {e} is independent, so we add e to B. After all m elements have been tested, B will contain the optimum basis.

We now give a sampling-based algorithm for finding the optimum basis. Suppose we sample each element of M with probability 1/k and construct the optimum basis OPT(M(1/k)) of the sample using C. With high probability the sample has size O(m/k), so this takes C(O(m/k), r) time. Use the verification algorithm to find the set V of elements of M that improve OPT(M(1/k)); this takes time V(m, r). Construct OPT(V): since OPT(M) ⊆ V, we know OPT(V) = OPT(M). By the sampling theorem above, V has size O(kr) with high probability, so this construction takes C(O(kr), r) time. The overall running time is thus V(m, r) + C(O(m/k), r) + C(O(kr), r). If we set m/k = kr, that is, k = √(m/r), the running time becomes

V(m, r) + O(C(√(mr), r)).

This is a clear improvement when r is significantly less than m. This new algorithm is just as simple as the original construction and verification algorithms were, since it only consists of two construction calls and a verification call.
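The reduction is short enough to state in code. In this sketch (ours), `construct` and `verify` play the roles of C and V, and the balance point k = √(m/r) is taken from the analysis above:

import math
import random

def optimum_basis_by_sampling(M, m, r, construct, verify):
    """One-shot sample-and-refine reduction: build the sample's optimum
    basis, find the improving elements with the verifier, and construct
    the answer from those.  `construct` returns OPT of an element list;
    `verify(M, T)` returns the elements of M improving T (including T)."""
    k = max(1.0, math.sqrt(m / r))
    sample = [e for e in M if random.random() < 1.0 / k]
    T = construct(sample)      # optimum basis of M(1/k)
    U = verify(M, T)           # improving elements; contains OPT(M)
    return construct(U)        # OPT(U) = OPT(M)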

At the cost of some additional complexity, we can improve the running time further by applying the reduction recursively. We apply the algorithm RecursiveRefinement shown in the figure below.

Algorithm RecursiveRefinement(M, C, V)
Input: A matroid M with m elements and rank r,
  a construction algorithm C, and
  a verification algorithm V.
Output: The optimum basis of M.

if m ≤ r then
  return C(M)
else
  S = M(1/2)
  T = RecursiveRefinement(S, C, V)
  U = elements of (M - S) ∪ T improving T (found using verifier V)
  return C(U), which is the optimum basis of M

Figure: RecursiveRefinement

Algorithm RecursiveRefinement is clearly correct by induction, since at each level of the recursion T will contain all optimum elements of M that are in S. We now analyze its running time.

We begin with an intuitive argument. Observe that the expected size of M(1/2) is m/2. Furthermore, the sampling theorem says that U has expected size at most 2r. This suggests the following recurrence for the running time C′ of the new construction algorithm:

C′(r, r) = C(r, r)

C′(m, r) = C′(m/2, r) + V(m, r) + C(2r, r)

If we now make the natural assumption that V(m, r) = Ω(m), this recurrence solves to

C′(m, r) = O(V(m, r) + C(2r, r) log(m/r)).

Unfortunately, this argument is not rigorous. A rigorous argument must be careful about passing expectations through the recurrence functions, and must allow for the possibility that V and/or C have exponential running times. We now give a rigorous high-probability result; an expected-time result follows similarly. The statement of the theorem is simplified significantly if we assume that V is at least linear and at most polynomial in m. After proving the theorem for this case, we state the messier general case.

Theorem: Suppose an m-element, rank-r matroid has algorithms for constructing an optimum basis in C(m, r) time and for determining the elements that improve a given independent set in V(m, r) time, where V is at least linear and at most polynomial in m. Then with high probability in m, an optimum basis for the matroid can be found in O(V(m, r) + C(3r, r) log(m/r)) time (LV).

Proof: We prove an O(1/m) probability of failure; a general high-probability bound follows with a different choice of constants.

We will need different proofs depending on whether or not r = o(log m). We begin with some observations that apply to both cases. We first determine the size of all problems passed to the verifier. Consider the recursion tree produced by the algorithm. Let M_i be the set of elements input to level i of the recursion tree; thus M_i - M_{i+1} is the set of elements not sampled at level i. Let T_{i+1} ⊆ M_{i+1} be the set of elements returned to level i by the (i+1)st recursive call. Then the set of elements passed to the verifier at level i is just (M_i - M_{i+1}) ∪ T_{i+1}. A given element e is in M_i - M_{i+1} for exactly one value of i, and each T_i has size at most r. It follows that if the recursion reaches depth d, then the total size of the problems passed to the verifier is at most m + rd. Assuming V is superlinear, it follows that the total time spent verifying is O(V(m + rd, r)).

We now prove the time bound for the case r ≥ 24 ln m. Consider the recursion tree produced by the algorithm. In the recursive step, each element of M is pushed down to the next level of the recursion with probability 1/2. It follows that the probability that an element survives to recursion depth log(m/r) is O(r/m). Therefore the expected number of elements at level log(m/r) in the recursion tree is O(r). By the Chernoff bound, the number of elements at this level is at most 2r with probability 1 - e^{-Ω(r)} ≥ 1 - 1/m², so the recursion terminates, with high probability in m, at depth log(m/r) + O(1). It follows from the previous paragraph (assuming V is polynomial, and since r log(m/r) = O(m)) that the time spent verifying is O(V(m + r log(m/r), r)) = O(V(m, r)) with high probability.

We now determine the time spent in calls to C. Each set passed to C is known, by the sampling theorem, to have size upper bounded by a negative binomial distribution with mean 2r. That theorem also shows that the probability that a passed set has size exceeding 3r is at most e^{-r/12} ≤ 1/m². We have already argued that the recursion terminates at depth O(log(m/r)), so there are only O(log(m/r)) calls to C. Thus the probability that any one of the calls involves a set whose size exceeds 3r is O(1/m).

It remains to consider the case r < 24 ln m. We again begin by bounding the recursion depth. Arguing as in the previous case, we note that the expected number of elements at recursion depth 2 log m is 1/m; it follows from Markov's inequality that the recursion terminates at this depth with probability 1 - 1/m. Thus the bound on the verification time follows as in the previous case.

To bound the time spent in calls to C, note as above that the number of elements passed to C at one level of the recursion is upper bounded by a negative binomial distribution with mean 2r. Therefore the number of elements passed to C over all recursion levels is upper bounded, with high probability, by a negative binomial distribution with mean O(r log m), and is therefore O(r log m) with high probability in m. Recall from the Lemma above that C(n, r) = O(n log n + n V(r, r)). Therefore the total time spent on construction is upper bounded by O(r log m log(r log m) + r log m · V(r, r)) = O(m + V(m, r)), and can therefore be absorbed by the bound on verification time.

We now give a generalization that holds without restriction on C and V.

Corollary: An optimum basis can be found in time

O( Σᵢ V(mᵢ, r) + C(3r, r) log(m/r) + r log m (log(r log m) + V(r, r)) )

with high probability (LV), for some mᵢ such that Σᵢ mᵢ = O(m + r log m).

Proof: The first term is from the verification time analysis in the theorem; the second is the construction time for the case of large r; the third is the construction time for the case of small r.

Since there is, for example, a linear-time algorithm for verifying minimum spanning trees [DRT], and a simple O(m log n)-time algorithm for constructing minimum spanning trees [Kru], the above lemma immediately yields an O(m + n log n log(m/n))-time algorithm for constructing minimum spanning trees in n-vertex, m-edge graphs. The nonlinear term reflects the need to solve O(log(m/n)) base cases on O(n) edges. Karger, Klein, and Tarjan [KKT] observed that this can be improved: by merging vertices connected by minimum spanning tree edges as those edges are discovered, we can reduce the rank of the graphic matroid in the recursive calls. This leads to a linear-time recursive algorithm for the minimum spanning tree problem. A similar technique does not seem to exist for the general matroid optimization problem.

In their survey paper [LR], Lee and Ryan discuss several applications of matroid optimization, including job sequencing and finding a minimum cycle basis of a graph. In the applications they discuss, attention has apparently been focused on the construction, rather than the verification, of the optimum basis. This work suggests that examining verification would be productive. In particular, applying the Lemma above to the main theorem, we have:

Corollary: Given a matroid verification algorithm with running time V(m, r), an optimum basis can be found in O(V(m, r) + r(log r + V(r, r)) log(m/r)) time.

Packing Matroid Bases

We now turn to the basis packing problem. The objective is to find a maximum collection of pairwise disjoint bases in a matroid M. It is closely related to the problem of finding a basis in a k-fold matroid sum of M, which by definition is a matroid whose independent sets are the sets that can be partitioned into k independent sets of M. A rank-r matroid contains k disjoint bases precisely when its k-fold sum has rank kr. An early algorithm for this problem was given by Edmonds [Edm, Law].

Definition: The packing number P(M) of a matroid M is the maximum number of disjoint bases in it.

Many matroid algorithms access the matroid via calls to an independence oracle that tells whether a given set is independent in the matroid. The running time for such an oracle call can be quite large, but special implementations for, e.g., the graphic matroid [GW] run quickly. For convenience, we will state running times in terms of the number of primitive operations plus the number of oracle calls performed. If we say that an algorithm runs in time T, we are actually saying that, given an oracle with running time T′, the algorithm will run in time TT′.

Many matroid packing algorithms use a concept of augmenting paths: the algorithms repeatedly augment a collection of independent sets by a single element until they form the desired number of bases. The augmentation steps generally have running times dependent on the matroid size m, the rank r, and the number k of bases already found. For example, Knuth [Knu] gives a generic augmenting path algorithm that finds an augmenting path in O(mrk) calls to an independence oracle, and can therefore find a set of k disjoint bases of total size rk, if they exist, in O(mr²k²) time. (Knuth claims a smaller augmentation time, but an examination of his algorithm's inner loop shows that each of the at most rk elements of the current packing is tested against each element of the matroid, yielding the O(mrk) bound per augmentation.)

In the following sections we give two techniques for improving the running times of basis packing algorithms. Random sampling lets us reduce or eliminate the running time dependence on k. A greedy packing technique lets us replace m by O(kr log r), or in some cases by O(kr). Combining the two techniques yields algorithms whose running time depends almost entirely on r. We apply them to the problem of packing spanning trees in a graph, previously investigated by Gabow and Westermann [GW] and Barahona [Bar].

We believe that the paradigms developed here will be useful for a wide variety of packing problems. We have used similar ideas [Kara, BK, Kar] in fast new algorithms for maximum flows, packings of paths, and connectivity computations using a tree packing algorithm developed by Gabow [Gab].

A Quotient Formulation

We first present some of the tools we will use in our development. An alternative characterization of a matroid's packing number is given in the following theorem of Edmonds [Edm]. Recall (from the section on matroid constructions) that any subset A of M has a matroid structure induced by the independent sets of M contained in A. Let Ā denote M - A.

Theorem (Edmonds): A matroid on M with rank r has k disjoint bases if and only if for every A ⊆ M,

‖Ā‖ ≥ k (r - rk(A)).

Recall the definitions of closures and quotients given earlier. In particular, recall that a quotient is any complement of a closed set, and that rk^C(M/A) = rk(M) - rk(A) is the rank of the quotient matroid M/A. Edmonds' Theorem can be rephrased in quotient language as follows.

Theorem: P(M) ≥ k if and only if for every quotient Q of M,

‖Q‖ ≥ k rk^C(Q).

Proof: The elements of each quotient Q are a subset of M, and each subset that is a complement of a closed set defines exactly one quotient. Since any independent set of the quotient is independent in M as well, we have rk^C(Q) ≤ rk(Q). Therefore Edmonds' original formulation implies this one, trivially, by taking A = M - Q. For the other direction, suppose the above statement holds, and consider any set A ⊆ M. Let B = M - cl(A), so B is a quotient with rk^C(B) = rk(M) - rk(A). Then

‖Ā‖ ≥ ‖B‖ ≥ k rk^C(B) = k (rk(M) - rk(A)).

Motivated by the above theorem, we make the following definitions.

Definition: The density of a quotient Q is

ρ(Q) = ‖Q‖ / rk^C(Q).

Definition: A sparsest quotient of M is a quotient of minimum density.

Corollary: P(M) is equal to the density of a sparsest quotient, rounded down.

As an example of quotients, consider the graphic matroid. Given a set of edges, its closure corresponds to the connected components of the edge set: an edge is in the closure if both endpoints are in the same connected component. The complement is the set of edges with endpoints in different components; thus it corresponds to a multiway cut of the graph. The size of the quotient is just the value of the cut. It is worth distinguishing the minimum density quotient from the minimum cut: the minimum density quotient can have many more edges than the minimum cut if it partitions the graph into many components.
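As a small illustration (ours) of the graphic-matroid case, the density of the quotient induced by a vertex partition of a connected graph can be computed directly. Here `component_of` is a hypothetical map from vertices to component indices, and the partition must be nontrivial (at least two components):

def quotient_density(edges, component_of, num_components):
    """Density of the quotient induced by a vertex partition: the
    quotient is the multiway cut, and the quotient rank of a connected
    graph partitioned into c components is c - 1."""
    cross = sum(1 for (u, v) in edges
                if component_of[u] != component_of[v])
    return cross / (num_components - 1)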

Sampling for Packing

We now show that randomly sampling from the elements of a matroid scales the densities of all the quotients in the matroid. It will immediately follow that sampling also scales the packing number in a predictable fashion. We write x = (1 ± ε)y when we mean (1 - ε)y ≤ x ≤ (1 + ε)y.

Theorem: Let P(M) = k. Suppose each element of M is sampled with probability p ≥ c(ln m)/(ε²k), for a suitable constant c, yielding a matroid M(p). Then with high probability in m, P(M(p)) = (1 ± ε)pk.

Clearly this theorem is only interesting when P(M) = Ω((ln m)/ε²), since obviously we want p ≤ 1 and ε < 1. We will prove it below; a second proof is given in the appendix.

This theorem gives an obvious scheme for approximating P(M): find P(M(p)) for some small sampling probability p. We elaborate on this approach, extending it to exactly compute the optimum packing, in the section on packing by sampling.

To prove the theorem, we consider a correspondence between the quotients in M and the quotients in a random sample M(p). We use it to prove that no quotient of M(p) has low density. The result then follows from the corollary above. We now give this correspondence.

Consider a quotient Q of M(p), and let A = M(p) - Q. It is easy to verify that Q = M(p)/A = (M/A)(p). In other words, every quotient in M(p) is the sampled part of a corresponding quotient of M. In a lemma below, we show that with high probability every quotient in M(p) contains (1 ± ε)p times as many elements as its corresponding quotient in M. It will follow that:

Every quotient of M(p) has density at least (1 - ε)pk.

The minimum-density quotient in M, with density k, induces a quotient of density at most (1 + ε)pk in M(p).

Combining these facts with the corollary above will prove the theorem.

Related Results

Before proving the main theorem, we note a few related results.

Existence of a Basis

The above theorem proves that the number of bases is predictable. We might settle for knowing that at least one basis gets sampled.

Theorem: Suppose M contains ak ln r disjoint bases. Then M(1/k) contains a basis for M with probability 1 - O(e^{-a}).

A proof of this theorem is given in the appendix. Note that Polesskii [Pol] proved the same fact by giving an explicit construction of the least reliable (least likely to contain a basis) matroid containing k disjoint bases, for any k.

This theorem is, in a sense, orthogonal to the main sampling theorem. That theorem shows that, regardless of the number of bases, few elements are likely to be independent of the sample; however, it does not prove that the sample contains a basis. This theorem shows that if the number of bases is large enough, no elements will be independent of the sample, because the sample will contain a basis.

Note also that the sampling probability in the theorem is significantly less than that used in our main theorem. At these lower sampling probabilities we can guarantee one basis, but the number of bases appears to have a large variance. This conforms to our intuition about simpler sampling experiments: when the expected number of sampled bases is about ln r, we expect to find at least one but cannot predict how many; when it becomes slightly larger, about ln m, we begin to see tight concentration about the mean.

The Graphic Matroid

We also remark on some particular restrictions to the graphic matroid. Erdős and Rényi [ER] (see also [Bol]) proved that if a random graph on n vertices is generated by including each edge independently with probability exceeding (ln n)/n, then the graph is likely to be connected. Rephrased in the language of graphic matroids: if the edges of a complete graph on n vertices are sampled with the given probability, then the sampled edges are likely to contain a spanning tree, i.e., a basis. The basis-existence theorem above says the same thing, though with weaker constants.

In [Kara] we prove the following result, analogous to the main sampling theorem, for graphs.

Theorem: If a graph G has minimum cut k, then with p and ε as in the main sampling theorem, every cut in G(p) has value (1 ± ε)p times its value in the original graph.

Corollary: If a graph G has minimum cut k, then with p and ε as in the main sampling theorem, G(p) has minimum cut (1 ± ε)pk.

A theorem of Nash-Williams [NW] gives a connection between minimum cuts and spanning tree packings, one closely tied to Edmonds' Theorem for matroids. Our sampling theorems for matroids generalize our sampling theorems for graphs. Just as our matroid sampling theorem leads to algorithms for approximating and finding basis packings, the graph sampling theorem on cuts leads to fast algorithms for approximating and exactly finding minimum cuts, s-t minimum cuts, and maximum flows [Kara].

Proof of Theorem

This section is devoted to proving the sampling theorem for packing. The reader interested only in its applications can skip to the next section. An alternative, constructive proof is given in the appendix. We actually prove the theorem for the sampling probability p = c(ln(mr))/(ε²k); since r ≤ m, this is at most 2c(ln m)/(ε²k), and it will be clear that using the larger "correct" p can only increase our success probability.

As discussed above, we aim to prove that every quotient in M(p) has roughly p times the density of its corresponding quotient in M. To do so, we prove that every quotient has size roughly p times that of its corresponding quotient, and then note that, with high probability, the rank of M(p) is the same as the rank of M. Density scaling then follows from the definitions.

We first prove that the number of elements sampled from every quotient in M is near its expectation, with high probability. For any one quotient this follows immediately from the Chernoff bound. However, a matroid can have exponentially many quotients, implying that even the exponentially unlikely event of a Chernoff bound violation might occur. To surmount this problem, we show that almost all quotients in a matroid are large. Since the likelihood of deviation from expectation decays exponentially with the size of the quotient, this suffices to prove our result.

Lemma: For integer t, the number of quotients of size less than tk is at most (mr)^t.

Proof: First suppose t ≥ r. Each quotient corresponds to a closed set, which is the closure of some independent set that spans it, so there is a surjective mapping from independent sets to closures to quotients. Thus the number of quotients is at most the number of independent sets. There are at most (m choose 0) + (m choose 1) + ··· + (m choose r) ≤ m^r ≤ (mr)^t such independent sets, proving the lemma for t ≥ r.

Now assume t < r. We use a proof that essentially generalizes the Contraction Algorithm of [KS]. We give an algorithm for randomly choosing a small quotient and show that every small quotient has a large chance of being chosen. This proves that there are only a few such quotients.

Consider the following randomized algorithm for selecting a quotient of M:

Let M_r = M.
For i = r downto t + 1:
  Let x_i be an element selected uniformly at random from M_i.
  Let M_{i-1} = M_i / x_i.
Pick a quotient uniformly at random from M_t.

Consider a particular quotient Q of size less than tk. We determine the probability that this quotient is found by the algorithm. We first find the probability that this quotient is contained in M_t, i.e., contains only elements independent of x_r, …, x_{t+1}. Consider the selection of x_r. By hypothesis, Q has size at most tk. However, since M_r has k disjoint bases of rank r, it has at least rk elements. Therefore Pr[x_r ∈ Q] ≤ tk/(rk) = t/r. However, if x_r ∉ Q, then, since Q is a quotient, we know that Q ⊆ M_r / x_r = M_{r-1}. Now note that M_{r-1} still has k disjoint bases (the quotients by x_r of the bases in M_r) but has rank r - 1. Thus Pr[x_{r-1} ∈ Q | x_r ∉ Q] ≤ t/(r - 1). Continuing, we find that

Pr[Q ⊆ M_t] ≥ (1 - t/r)(1 - t/(r-1)) ··· (1 - t/(t+1))
  = ((r-t)/r) · ((r-1-t)/(r-1)) ··· (1/(t+1))
  = 1/(r choose t)
  ≥ r^{-t}.

Now suppose we select a quotient uniformly at random from M_t. The number of quotients of M_t is at most the number of independent sets of M_t. But an independent set in M_t can have at most t elements in it, since M_t has rank t. Thus the number of quotients is at most Σ_{i ≤ t} (m choose i) ≤ m^t, so the probability that we pick our particular quotient, given that it is a subset of M_t, is at least m^{-t}.

Combining the above two arguments, we find that the probability of selecting a particular quotient of size less than tk is at least (mr)^{-t}. Therefore, since each quotient of size less than tk has at least this probability of being chosen, and the events of these quotients being chosen are disjoint, we know that the number of such quotients is at most (mr)^t.

Lemma: With high probability, for all quotients Q of M we have ‖Q(p)‖ = (1 ± ε)p‖Q‖.

Proof: Consider a particular quotient Q of size q in M. The probability that Q(p) has size outside the range (1 ± ε)pq is, by the Chernoff bound, at most e^{-ε²pq/3} ≤ (mr)^{-3q/k}, for a suitable choice of the constant c in the sampling probability.

Let q₁ ≤ q₂ ≤ ··· be the sizes of M's quotients in increasing order, so that k ≤ q₁. Let p_j be the probability that the jth quotient diverges by more than ε from its expected size. Then the probability that some quotient diverges by more than ε is at most Σ p_j, which we proceed to bound from above.

We have just argued that p_j ≤ (mr)^{-3q_j/k}. According to the previous lemma, for integer t there are at most (mr)^t quotients of size less than tk. We now proceed in two steps. First consider the (mr)² smallest quotients. Each of them has q_j ≥ k, and thus p_j ≤ (mr)^{-3}, so that

Σ_{j ≤ (mr)²} p_j ≤ (mr)² (mr)^{-3} = O(1/(mr)).

Next consider the remaining, larger quotients. Since we have numbered the quotients in increasing order, quotients q_{(mr)^t + 1}, …, q_{(mr)^{t+1}} all have size at least tk. Thus p_{(mr)^t + 1}, …, p_{(mr)^{t+1}} ≤ (mr)^{-3t}. It follows that

Σ_{j > (mr)²} p_j ≤ Σ_{t ≥ 2} (mr)^{t+1} (mr)^{-3t}
  ≤ Σ_{t ≥ 2} (mr)^{1-2t}
  = O(1/(mr)³).

Lemma: With high probability, rk(M(p)) = rk(M).

Proof: This follows from the basis-existence theorem above. For a direct proof, suppose that rk(M(p)) < rk(M). This implies that M(p) does not span M, meaning that Q = M / M(p) is a nonempty quotient. Edmonds' Theorem implies that, in fact, such a quotient has at least k elements. From the previous lemma we deduce that Q(p) is nonempty. But from the definition of Q we know that Q ∩ M(p) = ∅, a contradiction.

Remark: The basis-existence theorem shows that a significantly smaller value of p than that used here is sufficient to ensure full rank.

Lemma: With high probability, the density of every quotient in M(p) is (1 ± ε)p times the density of the corresponding quotient in M.

Proof: Consider a quotient M(p)/A, where A ⊆ M(p). As discussed at the start of the section, it has the form Q(p) for the quotient Q = M/A. By the size lemma above we know that ‖Q(p)‖ = (1 ± ε)p‖Q‖. At the same time, we know from the rank lemma that rk(M(p)) = rk(M). It follows that the density ρ(Q(p)) satisfies

ρ(Q(p)) = ‖Q(p)‖ / (rk(M(p)) - rk(A))
  = (1 ± ε)p‖Q‖ / (rk(M) - rk(A))
  = (1 ± ε)p ρ(Q).

Corollary: With high probability, P(M(p)) ≥ (1 - ε)p P(M).

Proof: Every quotient in M(p) corresponds to a quotient of density at least P(M) in M.

This gives the lower bound; to prove the upper bound, we just need to show that some quotient in M(p) corresponds to the minimum density quotient of M.

Lemma: With high probability, P(M(p)) ≤ (1 + ε)p P(M).

Proof: Let Q = M/A be the quotient of minimum density in M. Then consider Q(p) and A(p). Clearly rk(A(p)) ≤ rk(A); thus, by the rank lemma (or the basis-existence theorem), we have rk^C(M(p)/A(p)) ≥ rk^C(M/A). By the Chernoff bound we have ‖Q(p)‖ ≤ (1 + ε)p‖Q‖. It follows that

ρ(Q(p)) = ‖Q(p)‖ / rk^C(M(p)/A(p))
  ≤ (1 + ε)p‖Q‖ / rk^C(M/A)
  = (1 + ε)p ρ(Q).

This completes the proof of the sampling theorem for packing. We have shown that, with high probability, every quotient of M(p) is dense, so P(M(p)) ≥ (1 - ε)pk, and that some quotient is sparse, so that P(M(p)) ≤ (1 + ε)pk.

Capacitated Matroids

In some cases, such as the tree packing problem we will address later, it is natural to assign a capacity to each element of the matroid. A basis packing is now one that uses each element no more times than the capacity of that element. We can even allow fractional capacities and talk about the maximum fractional packing, in which each basis is given a weight, and the total weight of bases using an element can be no more than the capacity of the element.

We would like to apply the sampling theorem to capacitated matroids. One approach is to represent an element of capacity c as a set of c identical uncapacitated matroid elements and apply the theorem directly. This has the drawback of putting into the error bound a term dependent on the total capacity of the matroid elements. In fact, the theorem holds for capacitated matroids even when m refers to the number of distinct matroid elements rather than their total capacity.

To see this, note that the m term in the error bound arose from only one place: the O((mr)^t) bound on the number of quotients of size less than tk, which we would like to turn into a bound on the number of quotients of total capacity less than tk. The factor of m in this bound arose, in turn, from the bound on the number of quotients in an m-element matroid, which is at most the number of independent sets in the matroid. But now observe that two independent sets with different copies of the same elements yield the same quotient. This means that replicating elements does not increase the number of distinct independent sets or quotients that need to be counted. Thus the number of quotients of capacity less than tk is bounded by (mr)^t, where m is the number of distinct elements of M, regardless of their capacity.

Corollary: Let M be a capacitated matroid with m distinct elements and with P(M) = k. Suppose each copy of a capacitated element is sampled with probability p ≥ c(ln m)/(ε²k), yielding a matroid M(p). Then with high probability in m, P(M(p)) = (1 ± ε)pk.

To apply this argument to fractional capacities, consider multiplying each capacity by a large integer T, rounding down to the nearest integer, and applying the previous corollary. In the limit as T becomes large, we reduce to the integer capacity case. Note that multiplying by T also scales the packing number by T, which scales the sampling probability of any one element down by a factor of T. The limiting distribution on the number of copies sampled from a given capacitated element becomes a Poisson process.

Corollary: Suppose M has fractional packing number k. Build a new matroid M′ by sampling from each capacity-w element of M according to a Poisson process with mean pw, for p ≥ c(ln m)/(ε²k). Then with high probability, P(M′) ≥ (1 - ε)pk.

Further details of this approach to sampling capacitated elements can be found in [Kara].
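A sketch (ours) of the Poisson sampling step for capacitated elements; it uses the simple multiplicative Poisson generator, which is adequate when the means p·w are small:

import math
import random

def poisson_sample(capacities, p):
    """Sample each capacity-w element according to a Poisson process
    with mean p*w, returning the number of copies of each element that
    enter the sampled matroid.  `capacities` maps element -> capacity."""
    def poisson(mean):
        L, k, prod = math.exp(-mean), 0, 1.0
        while True:
            prod *= random.random()
            if prod <= L:
                return k
            k += 1
    return {e: poisson(p * w) for (e, w) in capacities.items()}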

Packing by Sampling

Having developed approximation theorems for basis packing, we now use sampling with refinement to develop algorithms for exact and approximate basis packing. We first note an obvious approximation scheme. To approximate the packing number, we can sample matroid elements with some probability p and compute the packing number of M(p) using Knuth's algorithm. By the sampling theorem, P(M(p)) will be roughly pk, so Knuth's algorithm will run substantially faster on the sample (both the matroid size and the packing number scale down by a factor of p). Dividing the sample's packing number by p will give an estimate for P(M). Before giving the details of this scheme, we present a speedup of Knuth's algorithm for finding an exact packing. Using this exact algorithm instead of Knuth's will yield a faster approximation algorithm based on the above scheme.

Theorem: A maximum packing of k matroid bases in an m-element, rank-r matroid can be found, with high probability, in O(mr²k^{3/2}√(log m)) time (LV).

Recall that the suffix LV denotes a Las Vegas algorithm. This result accelerates Knuth's algorithm [Knu] by a factor of √(k/log m).

Proof: Start by running Knuth's algorithm to find out if the packing number is less than log m; return the packing if so. Otherwise, randomly partition the matroid elements into 2 groups by flipping an unbiased coin to decide which group each element goes in. Find a maximum basis packing in each of the two groups recursively. Combining these two packings gives a (not necessarily maximum) packing of bases in M. Use Knuth's algorithm to augment this packing to a maximum one.
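The recursive structure of this algorithm can be sketched as follows (our skeleton; `pack_small` stands in for Knuth's algorithm, stopping early once `limit` bases are found, and `augment` performs the clean-up augmentations; both are assumed oracles):

import math
import random

def pack_bases(M, pack_small, augment):
    """Divide-and-conquer basis packing over an element list M."""
    m = len(M)
    if m < 4:
        return pack_small(M, limit=None)      # tiny instance: pack directly
    threshold = int(math.log2(m))
    packing = pack_small(M, limit=threshold)  # the log m pre-test
    if len(packing) < threshold:
        return packing                        # fewer than log m bases: done
    left, right = [], []
    for e in M:                               # random halving of the elements
        (left if random.getrandbits(1) else right).append(e)
    partial = (pack_bases(left, pack_small, augment)
               + pack_bases(right, pack_small, augment))
    return augment(M, partial)                # clean-up augmentations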

The algorithm is clearly correct; we now analyze its running time. First, suppose that the matroid contains k ≤ log m bases. Then the first test takes O(mr²k²) = O(mr²k^{3/2}√(log m)) time, and we then halt; our proof is done for this case. Otherwise, the first test takes O(mr² log² m) = O(mr²k^{3/2}√(log m)) time, and we then go on to the recursive phase. If the recursive phase executes, we can assume k ≥ log m.

We begin with an intuitive argument. Consider a particular node in the recursion tree. Each of the two submatroids we construct looks like a random sample with p = 1/2 (though the two groups are not independent). We can apply the sampling theorem with p = 1/2 to each submatroid; solving for ε yields ε = O(√((ln m)/k)), implying that with high probability in m each group contains (1 - ε)k/2 = k/2 - O(√(k log m)) disjoint bases. Thus Knuth's algorithm needs to augment the union of the recursive calls by an additional O(r√(k log m)) elements, which takes O(mr²k√(k log m)) = O(mr²k^{3/2}√(log m)) time. This gives a recurrence in which the final cleanup phase dominates the running time.

Unfortunately there is a aw in this argument the many leaves in the recursion tree are small

and might fail to ob ey the high probability arguments We therefore unroll the recursion to prove

the theorem We add up the work cleanup augmentations p erformed at each node of the recursion

tree We separately b ound two dierent kinds of work at a given no de useful augmentations are

augmentations that eventually contribute to an increase in the numb er of bases returned useless

augmentations are those last few which do not so contribute At any given no de there can b e at

most r useless augmentations since a sequence of r augmentations adds a basis

We rst analyze the shap e of the recursion tree Note that each no de at depth d of the recursion

d

tree contains a random subset of M in which each element app ears with probability It follows

that at depth d log m every no de has at most element with high probability this can b e seen

by considering throwing the m elements randomly into the m recursion no des at this depth This

means the recursion depth is O log m and thus that the numb er of recursion no des is p olynomial

in m so that events that are high probability in m can b e assumed to happ en at every no de

Now we separately analyze shallow and deep recursion tree nodes. First consider the shallow nodes at depth $d$ with $2^d = O(k/\log m)$. Applying the sampling theorem with $p = 2^{-d}$ to such a node, we deduce that with high probability in $m$ any such node contains $(1-\epsilon_d)k/2^d$ bases, where $\epsilon_d = O(\sqrt{2^d\log m/k})$. In other words, the node contains $k/2^d - O(\sqrt{2^{-d}k\log m})$ bases. Since the number of nodes in the recursion tree is $m^{O(1)}$ with high probability, this argument holds with high probability at every sufficiently shallow node.

It follows that, after merging the bases from its two children, each shallow node at depth $d$ must augment at most
$$\left(\frac{k}{2^d} + \sqrt{\frac{k}{2^d}\ln m}\right) - 2\left(\frac{k}{2^{d+1}} - \sqrt{\frac{k}{2^{d+1}}\ln m}\right) = O\left(\sqrt{2^{-d}k\log m}\right)$$
additional bases to finish. Since with high probability in $m$ each of the nodes has $O(m/2^d)$ elements, this takes Knuth's augmentation algorithm
$$O\left(\frac{m}{2^d}\,r\sqrt{2^{-d}k\log m}\right)$$
time. Since there are $2^d$ nodes at level $d$, the total time spent at level $d$ is
$$O\left(mr\sqrt{2^{-d}k\log m}\right).$$

To find the time spent at all relevant levels, we sum the times at each. This is a geometric series, summing to
$$O\left(mr\sqrt{k\log m}\right).$$

The above argument bounded the number of useful augmentations. It can also cover the time spent in useless augmentations: since every recursion node is assumed in the above analysis to augment by at least one basis, we can charge the $r$ useless augmentations to the last $r$ useful ones.

It remains to bound the time at deep levels, where $2^d = \Omega(k/\log m)$. To bound useful augmentations, note that each node at this level or below has $O(m\log m/k)$ elements and $O(\log m)$ bases with high probability. Furthermore, among all the nodes at or below this level, there can be at most $k$ bases added, since each added basis contributes to the total output of bases. Therefore the total cost of adding bases below this level is
$$O\left(k\cdot r\cdot\frac{m\log m}{k}\right) = O(mr\log m) = O\left(mr\sqrt{k\log m}\right),$$
where the last equality uses $k \ge \log m$. Finally, we must account for the useless augmentations. Consider a node with $m_i$ elements. It can perform at most $r$ useless augmentations, each costing at most $O(m_i)$ time. Thus the time spent at a given level is $O(\sum_i m_i r) = O(mr)$, and the total time spent over all $O(\log m)$ deep levels is $O(mr\log m)$, which is negligible since $k \ge \log m$.

We now use this faster exact packing algorithm to implement a faster approximation algorithm.

Theorem. For any $\epsilon$, the packing number $k$ of an $m$-element matroid of rank $r$ can be estimated to within $(1\pm\epsilon)$ in $O((mr\log^2 m)/(\epsilon^3 k))$ time (MC).

Proof. Let $z = (\ln m)/\epsilon^2$, and consider sampling with probability $p^* = z/k$. Our sampling theorem tells us that with high probability $M(p^*)$ has packing number between $(1-\epsilon)z$ and $(1+\epsilon)z$. Thus if we compute the packing number $z'$ of the sampled matroid, $z'/p^*$ provides an estimate of $k$ with the desired accuracy.

As described, the algorithm requires knowledge of $k$. But this is easy to simulate with a repeated doubling scheme. Start with $p = 1/m$. Repeatedly double $p$ and pack $M(p)$, until $M(p)$ has packing number at least $(1-\epsilon)z$. The probability that this happens while $p < p^*/2$ is negligible. To see this, consider the following scheme for sampling elements with probability $p < p^*/2$. First, tentatively sample each element with probability $p^*/2$. Second, sample each tentatively sampled element with probability $2p/p^*$. As required, each element gets sampled with probability $p$. But consider the tentatively sampled elements: by the sampling theorem, the packing number of the tentative sample $M(p^*/2)$ is at most $(1+\epsilon)z/2 < (1-\epsilon)z$. Sampling a subset of these elements clearly cannot increase the packing number, so the claim holds. It follows that, with high probability, when the doubling process terminates we have $p \ge p^*/2$, so $z'/p$ estimates $k$ to the desired accuracy.

To determine the running time, note that at trial $i$ of doubling $p$ we are working with a matroid that with high probability has size $O(pm)$ and packing number $O(z)$. Thus the time to find its optimum packing is $O(pmr\sqrt{z\log m})$ by the previous theorem. We stop before $p$ exceeds $2z/k$. Thus the work over all trials is a geometric series, summing to $O((mrz/k)\sqrt{z\log m}) = O((mr\log^2 m)/(\epsilon^3 k))$.

As a technical detail, note that if $p$ is much too small, we might get a sample with smaller rank than the original matroid. Bases in this sample will not be bases of $M$, and the sampling theorem will not apply. But this case is easy to avoid. To start with, we compute the rank of $M$ in $O(m)$ time by greedily constructing a basis. Before packing a sample, we make sure that it has the same rank, taking time linear in the sample size. The total work is negligible.

As another technical detail, note that if $p \ge 1$, we just run Knuth's algorithm on the original matroid. Since $p \ge 1$ implies $z \ge k$, meaning $k = O((\ln m)/\epsilon^2)$, the original algorithm's running time of $O(mrk)$ is better than the one we claim. Thus the claimed bound is certainly achievable.
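The doubling loop itself is simple to sketch. Here `pack_exact` is a hypothetical stand-in for the exact packing algorithm above; the names and interfaces are illustrative only.

```python
import random

def estimate_packing_number(elements, pack_exact, z, eps):
    """Estimate P(M) by repeatedly doubling the sampling probability.

    pack_exact(sample) is an assumed oracle returning a list of
    disjoint bases of the sampled matroid.
    """
    p = 1.0 / len(elements)
    while p < 1.0:
        sample = [e for e in elements if random.random() < p]
        packing = pack_exact(sample)
        # Stop once the sample packs at least (1 - eps) * z bases; by the
        # two-stage sampling argument, this almost never happens while p
        # is still below z / (2k).
        if len(packing) >= (1 - eps) * z:
            return len(packing) / p
        p *= 2
    # p reached 1: the "sample" is M itself, so the count is exact.
    return len(pack_exact(elements))
```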

Corollary. In the same time bounds as above, we can identify a quotient of $M$ with density at most $(1+\epsilon)$ times the minimum.

Proof. The above algorithm will also identify a sparse quotient in the sample. By our sampling theorem, this quotient corresponds to a sparse quotient in the original matroid.

Theorem. A packing of $(1-\epsilon)k$ disjoint bases can be found in $O((mr\log m)/\epsilon)$ time (LV).

Proof. After approximating $k$ with the previous algorithm, partition the elements of $M$ randomly into $N = \epsilon^2 k/\ln m$ groups. Each looks like a sample of the kind discussed in the sampling theorem, and therefore contains $(1-\epsilon)k/N$ bases. Run the basis packing algorithm on each group and take the union of the results. The running time is the time for $N$ executions of the previous exact packing algorithm.

The algorithm as stated is Monte Carlo, in that there is a small probability that the various groups will not contain $(1-\epsilon)k$ bases among them. But if we combine it with the previous algorithm, we can guarantee the correctness of the solution. By the corollary above, we can find a sparse quotient in the original matroid; this provides an upper bound on the value of the packing number. By the theorem just proven, we can find a nearly maximum packing, thus lower bounding the packing number. So long as the upper and lower bounds disagree by more than a $(1+O(\epsilon))$ factor, we can rerun both algorithms. Once they agree this closely, we will have pinned down the correct value to within $(1+O(\epsilon))$.
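This Monte Carlo to Las Vegas conversion is just a retry loop. A minimal sketch, with `find_sparse_quotient` and `find_packing` as hypothetical stand-ins for the two Monte Carlo routines just described:

```python
def certified_packing_value(find_sparse_quotient, find_packing, eps):
    """Rerun two (1 +/- eps)-accurate Monte Carlo routines until their
    certificates agree, yielding a Las Vegas guarantee."""
    while True:
        upper = find_sparse_quotient()  # quotient density: upper bound on P(M)
        lower = find_packing()          # explicit packing: lower bound on P(M)
        # When the two certificates agree to within the accuracy both
        # routines promise, the answer is pinned down.
        if upper <= (1 + eps) ** 2 * lower:
            return lower, upper
```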

Greedy Packings

The sampling techniques we have just described let us effectively reduce the dependence of a packing algorithm on the packing number $k$, partially replacing $k$ with $O(\log m)$. We now give a different, deterministic scheme that lets us reduce the dependence of these algorithms on the matroid size $m$: we effectively replace $m$ by $rk$ for basis packing algorithms. In particular, we give a deterministic basis packing algorithm with running time
$$O(m + r^2k^2\log r).$$
Sampling then lets us nearly eliminate the newly introduced factor of $k$, so we are left only with a dependence on the rank $r$, the error parameter $\epsilon$, and $\log k$.

The idea behind this approach is a connection between a maximum packing and a maximal packing. Maximal packings are much easier to construct, but can be made to contain as many bases as the maximum packing. The maximum packing algorithms can then extract those bases from the maximal packing instead of from the entire matroid. This approach was explored for graphs by Nagamochi and Ibaraki [NI], who showed how a maximal packing of forests could be used in minimum cut computations.

Definition. A greedy packing of $M$ is a partition of the elements of $M$ into independent sets $B_1, B_2, \ldots$ such that each $B_i$ is a basis of $M - \bigcup_{j<i}B_j$. The prefix $B_1,\ldots,B_t$ is a greedy $t$-packing.

A greedy $t$-packing has at most $tr$ elements and may therefore be smaller than the original matroid. However, it still contains a large number of disjoint bases.

Lemma. Let $S$ be a greedy $(k\ln r + k)$-packing of $M$. If $P(M) \ge k$, then $P(S) \ge k$.

Proof. Let $S = \{B_i\}$ be the greedy packing. We show that for every quotient $Q$ of $S$, its density $\rho(Q) \ge k$. The result then follows from the quotient version of Edmonds' Theorem. We use the same correspondence as we did for the random sampling proofs: a quotient $S/A$ has the same elements as $S \cap (M/A)$, and since $S$ contains a basis $B_1$ of $M$, we know that $\mathrm{rk}(S/A) = \mathrm{rk}(M/A)$. Therefore, to show the density claim, we need only prove that the necessary number of elements is selected in $S$ from each quotient $Q$ of $M$.

We begin with $Q = M$, and show that $\|S\| = \sum_j\|B_j\| \ge kr$. Write $d_i = kr - \sum_{j\le i}\|B_j\|$; $d_i$ is the deficit in the greedy packing after we have added $i$ sets to it. Once the deficit reaches $0$, the greedy packing contains $kr$ elements. We derive a recurrence for the $d_i$. After removing $B_1,\ldots,B_i$, we have removed a total of $kr - d_i$ elements. By Edmonds' Theorem (taking $A$ to be the set of removed elements), the rank of the remaining elements is at least $d_i/k$. Therefore $\|B_{i+1}\| \ge d_i/k$, meaning that $d_{i+1} \le d_i - d_i/k$. In other words, we have the following recurrence:

$$d_0 = kr, \qquad d_{i+1} \le \left(1 - \frac{1}{k}\right)d_i.$$
It follows that $d_i \le (1-1/k)^i kr$, meaning that $d_{k\ln r} \le k$. Now consider the next $k$ greedy bases. Each of them has size at least $1$, unless we have exhausted all the elements of $M$. Therefore, after adding these $k$ bases, we will either have $kr$ elements or else $\|M\|$ elements. But $\|M\| \ge kr$, which proves the claim for $Q = M$.
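To spell out the step from the recurrence to these bounds:
$$d_i \le \left(1-\frac{1}{k}\right)^i kr \le e^{-i/k}\,kr, \qquad\text{so}\qquad d_{k\ln r} \le e^{-\ln r}\,kr = k \quad\text{and, more generally,}\quad d_{k\ln(r/\epsilon)} \le \epsilon k.$$
The second bound is the one used in the corollary below.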

Now consider any other quotient $Q$ of $M$. We know that $Q$ contains $k$ disjoint bases, contained in the quotients of the $k$ disjoint bases of $M$. We therefore aim to apply the previous argument to the matroid $Q$. To do so, we show that any greedy $t$-packing $\{B_i\}$ constructed in $M$ contains a greedy $t$-packing of $Q$. This is almost trivial, as it is immediate that for all $i < j$, $B_i \cap Q$ spans $B_j \cap Q$. The only detail to fix is that possibly $B_i \cap Q$ is not independent in $Q$, meaning that it is not a basis.

We fix this as follows. Begin with $B_1 \cap Q$, which certainly spans $Q$. If this set is not independent in $Q$, we remove spanned elements from it one at a time until the set becomes independent. For each element $e$ that we remove, we find the smallest $i$ such that $B_i \cap Q$ does not span $e$, and place $e$ in $B_i$. This preserves the invariant that $B_i \cap Q$ spans $B_j \cap Q$ whenever $i < j$. Eventually we will have made $B_1 \cap Q$ independent while preserving the invariant; we then move on to do the same with the remaining $B_i$. When we finish, we have a greedy packing of $Q$.

This construction shows that our greedy packing of $M$ contains a greedy packing of every quotient of $M$. But every quotient of $M$ is itself a matroid, and we started by proving that a greedy packing of such a matroid contains the claimed number of elements. Thus the greedy packing of $M$ contains the required numbers of elements of every quotient of $M$.

Corollary. A greedy $(k\ln(r/\epsilon))$-packing of $M$ contains $(1-\epsilon)k$ disjoint bases.

Proof. Follows from the value of $d_{k\ln(r/\epsilon)}$ in the previous proof.

Using Greedy Packings: First Attempt

The most obvious way to build a greedy packing is greedily.

Lemma. A greedy $t$-packing can be constructed in $O(tm)$ time. A greedy packing can be found in $O(m^2)$ time.

Proof. Construct one $B_i$ at a time by scanning all as-yet-unused elements of $M$. Constructing a basis requires one independence test per unused element. Each independent set we extract contains at least one element, so we use at most $m$ bases.
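A sketch of this naive construction, with `is_independent` as an assumed independence oracle (the oracle interface is an illustration, not the paper's notation):

```python
def greedy_t_packing_naive(elements, is_independent, t):
    """Build B_1, ..., B_t one basis at a time: the O(tm) construction.

    is_independent(S) is an assumed oracle answering whether the
    element set S is independent in the matroid.
    """
    unused = list(elements)
    packing = []
    for _ in range(t):
        basis, rest = [], []
        for e in unused:
            # One independence test per as-yet-unused element.
            if is_independent(basis + [e]):
                basis.append(e)   # e extends the current independent set
            else:
                rest.append(e)    # e stays for a later basis
        if not basis:             # matroid exhausted: full greedy packing built
            break
        packing.append(basis)     # a basis of M minus the earlier sets
        unused = rest
    return packing
```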

Lemma. Given a greedy $(k\ln r + k)$-packing of $M$, a maximum packing of $k$ matroid bases can be found in $O(k^2r^2\log r)$ time.

Proof. Run Knuth's algorithm [Knu], which takes $O(mrk)$ time on an $m$-element matroid, on the $O(rk\log r)$ elements of the greedy packing.

Corollary. A maximum packing of $k$ bases can be found in $O(km\log r + k^2r^2\log r)$ time.

Proof. We need only determine $k$. To do so, use repeated doubling. Start by guessing $k' = 1$: find a greedy $(k'\ln r + k')$-packing in $O(k'm\log r)$ time, and use the previous lemma. If we find fewer than $k'$ bases in the maximum packing, we know that the maximum packing has value $k < k'$. It follows that our greedy packing contains $k$ disjoint bases, and that the packing we found in it is a maximum packing. If, on the other hand, we find $k'$ or more bases, we double $k'$ and try again.

Clearly we will stop at some value $k'$ with $k \le k' \le 2k$. The time spent is a geometric sum dominated by that spent on the final guess, which is at most $2k$. The claimed time bound follows.

Faster Greedy Packing

To improve the previous algorithm, we perform a more efficient construction of greedy $t$-packings. Instead of building each basis in turn, we grow all of the bases at once: start with all $B_i$ empty, consider each element of $M$ in turn, and add it to the first $B_i$ (that with the smallest index $i$) that does not span the element. This produces exactly the same greedy packing as the naive algorithm. Checking each $B_i$ in turn could take $t$ independence tests, yielding the same time bound as before. But we can do better by observing the following.

Lemma. For any $i < j$, at all times during the new construction, $B_i$ spans all elements of $B_j$.

Proof. Since an element is added to the first $B_j$ that does not span it, all $B_i$ with $i < j$ must span the new element. The claim follows by induction.

Lemma. During the new construction, when we consider adding a new element $x$, there exists a $j$ for which $x$ is spanned by $B_1,\ldots,B_j$ but independent of $B_{j+1},\ldots,B_t$.

Proof. Let $j$ be the largest index for which $B_j$ spans $x$. For any $i < j$, we have just argued that $B_i$ spans each element of $B_j$. It is an elementary fact of matroid theory that if $B_i$ spans $B_j$ and $B_j$ spans $x$, then $B_i$ spans $x$. Thus $B_i$ spans $x$ for every $i \le j$, but by the choice of $j$ the bases $B_{j+1},\ldots,B_t$ do not span $x$.

Corollary A greedy tpacking can be constructed in O m r t log t time A greedy packing can

be constructed in O m log m time

Proof. As we add each element $x$ to the greedy packing, as a first step we check whether $B_t$ spans $x$. If so, we can immediately discard $x$, since it cannot be in any $B_i$. Otherwise, we can perform a binary search on $B_1,\ldots,B_t$ to find the largest $j$ for which $B_j$ spans $x$, and add $x$ to $B_{j+1}$. The binary search works because if we test $B_i$ and it spans $x$ then $j \ge i$, and otherwise $j < i$. We need to perform $\log t$ independence tests for this binary search. We perform at most $tr$ binary searches, since each adds one element to the greedy packing. For each element that is not added to the $t$-packing, we perform only the one independence test against $B_t$.
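The construction can be sketched as follows (again with a hypothetical `is_independent` oracle; a spanning test is implemented as a failed independent extension):

```python
def greedy_t_packing_fast(elements, is_independent, t):
    """The O(m + rt log t) construction: one test against B_t per element,
    plus a binary search over B_1..B_t for each element actually added."""
    packing = [[] for _ in range(t)]

    def spans(basis, e):
        # A set spans e exactly when e does not extend it independently.
        return not is_independent(basis + [e])

    for e in elements:
        # Cheap filter: if B_t spans e, no B_i can accept it.
        if spans(packing[t - 1], e):
            continue
        # Invariant: B_1..B_j span e and B_{j+1}..B_t do not.  Binary
        # search for the first index that does not span e; e goes there.
        lo, hi = 0, t - 1          # packing[hi] is known not to span e
        while lo < hi:
            mid = (lo + hi) // 2
            if spans(packing[mid], e):
                lo = mid + 1       # the first non-spanning index is past mid
            else:
                hi = mid           # mid itself does not span e
        packing[lo].append(e)
    return [b for b in packing if b]
```

Each added element costs about $\log t$ spanning tests, matching the bound in the corollary; every discarded element costs exactly one.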

Using the Improved Algorithm

Corollary A maximum packing of k matroid bases can be found deterministical ly in O m log k

r k log r time

Proof If we knew k we could compute the greedy k ln r packing we need and run Knuths

O

algorithm on it The greedy packing time is O m r k log r log k log r O m r k log r k

Of course k is not known But this is easy to rectify by rep eatedly doubling our guess for k until

we fail to pack a numb er of bases exceeding our guess At this p oint we will clearly have built a

largeenough greedy packing to nd the maximum

Overall the running time is dominated by the time to run the algorithm with the nal guess

which will b e at most k and yield the claimed time b ound There is one exception the O m term in

the time for a greedy packing Since we recompute a greedy packing each time our estimate doubles

we compute O log k greedy packings and pick up an O m log k term in our running time

Another Small Improvement

We can make one more minor improvement. The $O(m\log k)$ term in the above time bounds arises because we perform $\log k$ greedy packing constructions to reach the correct value of $k$ for our greedy packing. We now discuss a scheme for reducing this term to $O(m)$. To start with, we set
$$t = \frac{m}{r\log m}$$
and construct a greedy $t$-packing in $O(m + rt\log t) = O(m)$ time. This provides us with free access to greedy $t'$-packings, in the algorithm of the previous corollary, for all $t' \le t$. If our algorithm terminates before needing a larger greedy packing, which happens so long as $k\ln r + k \le t$, all is well. Otherwise, $t = O(k\ln r)$. We argue that in this case the $O(m\log k)$ time bound is dominated by the $O(r^2k^2\log r)$ term in our running time above, and can therefore be ignored. For $t = O(k\ln r)$ implies that
$$\frac{m}{r\log m} = O(k\log r), \qquad \frac{m}{\log m} = O(kr\log r).$$
Now it is straightforward that $m/\log m = O(z)$ implies $m = O(z\log z)$. In other words,
$$m = O(kr\log r\,\log(kr\log r)),$$
$$m\log k = O(kr\log r\,\log(kr\log r)\,\log k) = O(kr\log^{O(1)}kr).$$

Corollary A maximum packing can be found in O m r k log r time The total time spent

O

constructing greedy packings is O m r k log r k

Interleaving Greedy Packings with Augmentation

We now show that if we interleave the greedy packing process with the augmentations, we can shave an additional $\log r$ factor from the running time. This small improvement is not used later and can be skipped. Our first step in removing the $\log r$ factor will temporarily introduce an extra factor of $k$ into our running time, but we then show how to get rid of that.

Lemma. Let $S \subseteq M$ contain $b < k$ disjoint bases, and let $T$ be a greedy $\lceil k/(k-b-1)\rceil$-packing of $M - S$. Then $S \cup T$ contains $b+1$ disjoint bases of $M$. If $b = k-1$, then taking $T$ to be a greedy $(k\ln(r/k) + k)$-packing achieves the same goal.

Proof. As in the initial analysis of greedy packings, we show that every quotient of $S \cup T$ has density at least $b+1$. Consider a quotient $Q$ with $\mathrm{rk}(Q) = q$; we need to show that $\|Q \cap (S\cup T)\| \ge q(b+1)$. As we construct a greedy packing of $M - S$, we apply Edmonds' Theorem as before to see that, until $\|Q\cap(S\cup T)\|$ increases to $q(b+1)$, the rank of the remaining elements will be at least
$$\frac{kq - q(b+1)}{k} = q\,\frac{k-b-1}{k},$$
so that each basis of the greedy packing will contain at least $q(k-b-1)/k$ elements of $Q - S$. Since $S$ contains $b$ disjoint bases of $M$, it contains at least $qb$ elements of $Q$. Thus we only need to add $q$ elements of $M - S$ to reach our goal of $q(b+1)$ elements. Since we add $q(k-b-1)/k$ with each basis of $T$, we need only add $k/(k-b-1)$ bases.

If $b = k-1$, the above bound is infinite. However, we can use the original deficit-reduction argument. When $b = k-1$, the quotient's deficit is at most $r$. Since the deficit reduces by a $(1-1/k)$ factor with each greedy basis, we only need $k\ln(r/k)$ bases to reduce the deficit to $k$, and another $k$ bases to reduce it to $0$.

Lemma. A packing of $k$ matroid bases can be found deterministically, if one exists, in $O(mk\log k + r^2k^2 + r^2k\log r)$ time.

Proof. We maintain a set $S$ of disjoint bases. We operate in phases: in phase $b$, we increment the number of disjoint bases in $S$ from $b$ to $b+1$. To do so, we compute a greedy $\lceil k/(k-b-1)\rceil$-packing $T$ of $M - S$. By the previous lemma, $S \cup T$ contains $b+1$ bases, of which we already have $b$ in $S$. Therefore we can find the $(b+1)$st basis by running $r$ iterations of Knuth's augmentation algorithm on the $rb + r\lceil k/(k-b-1)\rceil$ elements of $S \cup T$. Since one augmentation takes $O(m + rb)$ time in an $m$-element matroid with $b$ bases, and we perform $r$ augmentations, the running time of phase $b$ is
$$O\left(m + r\,\frac{k}{k-b-1}\log\frac{k}{k-b-1} + r\left(rb + r\,\frac{k}{k-b-1}\right)\right).$$
Since $k/(k-b-1) \le k$ for every $b < k-1$, we may worsen this time bound to $O(m + rk\log k + r^2k)$, as doing so will not affect our analysis. Thus the time for the $k$ phases is $k$ times the above, namely
$$O(mk + rk^2\log k + r^2k^2) = O(mk\log k + r^2k^2),$$
the last step using $m \ge rk$, which holds whenever a packing of $k$ bases exists. In the final ($b = k-1$) phase, when the above bound is infinite, we use a greedy $(k\ln(r/k) + k)$-packing for $T$; the $r$ augmentations then take $O(r(rk\ln(r/k) + rk)) = O(r^2k\log r)$ time.

Theorem. A maximum packing of $k$ bases can be found in $O(m + r^2k^2 + r^2k\log r) = O(m + r^2k\max(k, \log r))$ time. The time spent on greedy packing computations is $O(m + rk\log^{O(1)}rk)$.

Proof. Assuming that we know $k$, build a greedy $(k\ln r + k)$-packing $G$ that contains $k$ bases in $O(m + rk\log r\log(k\log r))$ time. Run the fast packing algorithm of the previous lemma on $G$. Since $G$ has $O(rk\log r)$ elements, this takes $O(rk\log r\cdot k\log k + r^2k^2 + r^2k\log r) = O(r^2k^2 + r^2k\log r)$ time.

As before, repeatedly doubling a guess for $k$ allows us to assume that we know $k$. As with the corollary above, the total time spent on greedy packing can be limited to $O(m + rk\log^{O(1)}rk)$. The claimed time bound follows.

Combining Greedy Packing with Sampling

Greedy packings have reduced our algorithms' running-time dependence on $m$. Using sampling, we can simultaneously reduce or eliminate the running-time dependence on $k$.

Corollary. A maximum packing of $k$ matroid bases can be found in $O(m + r^2k^{3/2}\log^{O(1)}kr)$ time with high probability (LV).

Proof. As before, repeated doubling lets us assume we know $k$, so we can run the $O(mr\sqrt{k\log m})$ Las Vegas packing algorithm on a greedy $(k\ln r + k)$-packing. As before, the time to construct the greedy packing is dominated, aside from the linear term, by the time to construct the packing.

Corollary. A $(1+\epsilon)$-approximation to the value of a matroid's packing number can be computed in $O(m + (r^2/\epsilon^3)\log^{O(1)}kr)$ time (MC).

Proof. Use a greedy packing in the approximation algorithm given earlier.

Greedy packings can also be used to find approximately sparsest quotients. We take care to avoid finding a quotient that is sparse in the greedy packing but not in the original matroid.

Lemma. Let $S = B_1 \cup\cdots\cup B_t$ be a greedy $t$-packing of $M$ with $t = k\ln r + 2k/\epsilon$. Suppose $Q$ is a quotient of $S$ with density at most $k$. Then $M/(B_t \cup (S - Q))$ is a quotient of $M$ with density at most $k + \epsilon k$ in $M$.

Proof. Recall $t = k\ln r + 2k/\epsilon$, and write $A = S - Q$. Note that $B_t$ spans $M - S$, so the quotients $M/B_t$ and $S/B_t$ agree; it follows that we need to prove the theorem only for the case $S = M$. That is, we must prove that if $S/A$ has density at most $k$, then $S/(A \cup B_t) = (S/B_t)/A$ has density at most $(1+\epsilon)k$.

We first suppose $A = \emptyset$, so $S/A = S$. We have already shown that the first $k\ln r$ bases of the greedy packing contain $kr - k$ elements of $S$. Now consider the remaining $2k/\epsilon$ bases. If $\mathrm{rk}(B_t) > \epsilon r/2$, then the last $2k/\epsilon$ bases of our greedy packing, all at least this large, contribute an additional $kr$ elements to $S$, implying that $S$ has size exceeding $(kr - k) + kr > kr$, and thus density exceeding $k$, a contradiction. So $\mathrm{rk}(B_t) \le \epsilon r/2$. It follows that $\mathrm{rk}(S/B_t) = r - \mathrm{rk}(B_t) \ge (1-\epsilon/2)r$. Thus the density of $S/B_t$ is less than $kr/((1-\epsilon/2)r) \le (1+\epsilon)k$.

Now suppose $A \ne \emptyset$. We must bound the density of $S/(A \cup B_t)$. We consider $S' = S/A$ as a matroid with rank $r'$. As in the earlier lemma, we transform $\{B_i \cap S'\}$ into a greedy packing $\{B'_i\}$ of $S'$ by shifting elements to larger-indexed sets until each set is independent. That construction has two important properties. First, since elements are shifted to larger-indexed sets, we know that for any $j$,
$$\sum_{i\le j}\|B'_i\| \le \sum_{i\le j}\|B_i\|.$$
Second, since we only shift spanned elements out of a set, we never reduce its rank: that is, in the matroid $S'$ we have $\mathrm{rk}(B'_i) \ge \mathrm{rk}(B_i)$. It follows that
$$\rho(S'/B_t) = \frac{\|S' - B_t\|}{\mathrm{rk}(S') - \mathrm{rk}(B_t)} \le \frac{\|S' - B'_t\|}{\mathrm{rk}(S') - \mathrm{rk}(B'_t)} = \rho(S'/B'_t) \le (1+\epsilon)k,$$
where the last inequality follows by applying our argument above for the $A = \emptyset$ case to the matroid $S'$.

Corollary. For any $\epsilon$, a $(1+\epsilon)$-approximation to the sparsest quotient can be found in $O(m + (r^2/\epsilon^{O(1)})\log^{O(1)}kr)$ time (MC).

Proof. Construct a greedy $(k\ln r + 2k/\epsilon)$-packing $S$ in $M$. Note that $S$ has $s = O(rk(\log r + 1/\epsilon))$ elements. By the above lemma, if we find a quotient of density less than $(1+\epsilon)k$ in $S$, it will yield a quotient of density $(1+\epsilon)^2k = (1+O(\epsilon))k$ in $M$. To find an approximately sparsest quotient of $S$, sample each element with probability $O((\log s)/(\epsilon^2 k))$ and find a sparsest quotient of the sample using the algorithm given earlier. The running time follows.

Corollary. A $(1-\epsilon)$-times maximum packing of disjoint bases can be found in $O(m + (r^2k/\epsilon)\log^{O(1)}rk)$ time.

Proof. Apply the approximate packing algorithm above to a greedy packing.

Application: Packing Spanning Trees

A particular instance of matroid basis packing is the problem of packing spanning trees in an undirected graph. This problem has applications to fault-tolerant communication [IR, Gus]. It can be formulated on uncapacitated graphs, where each edge can be in at most one tree, or on capacitated graphs, where the number of trees using a given edge must be at most the capacity of that edge. The corresponding quotient problem is to find a multiway partition of the graph's vertices that minimizes the ratio between the number (or capacity) of edges cut by the partition and one less than the number of sets in the partition; a variant of Edmonds' Theorem on the duality of these quantities in graphs was first proven by Nash-Williams [NW].

Gabow and Westermann [GW] give an algorithm that solves the tree packing problem in $O(\min(mn, m^2/\sqrt{n}))$ time on $m$-edge, $n$-vertex uncapacitated graphs. Barahona [Bar] gave an $O(mn)$-time algorithm for finding the sparsest graph quotient in capacitated graphs, and later followed [Bar] with a tree packing algorithm that runs in $O(mn)$ time.

Theorem A sparsest quotient can be found in O m n time on uncapacitated

or capacitated graphs

Proof We use the fact that Nagamo chi and Ibaraki NI give an algorithm that constructs a greedy

packing of the graphic matroid in O m time on uncapacitated graphs and in O m n log n time

on capacitated graphs

First consider uncapacitated graphs. As with general basis packing, we use sampling to produce a subgraph of $G$, with packing number $z = O(\log m)$, that accurately approximates the quotients of $G$. We then use Nagamochi and Ibaraki's algorithm to extract a greedy $(z\ln n)$-packing $G'$ from the sampled graph. This packing has $m' = O(nz\ln n)$ edges. Therefore Gabow and Westermann's algorithm [GW] finds the optimum packing and identifies a sparsest quotient in $O(m'^2/\sqrt{n}) = O(n^{3/2}\log^{O(1)}n)$ time.

As noted earlier, the algorithm for capacitated graphs is the same, except that we sample from each capacitated edge by generating a Poisson random variable with the appropriate parameters.
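A sketch of this capacitated sampling step (the Poisson draw approximates the binomial number of sampled parallel copies for small $p$; the function names are illustrative, not the paper's):

```python
import math
import random

def sample_capacitated_edges(edges, p):
    """Sample a capacitated graph: one Poisson draw per edge replaces a
    coin flip per unit of capacity (Poisson(c*p) approximating
    Binomial(c, p) when p is small)."""
    sampled = []
    for (u, v, capacity) in edges:
        copies = poisson(capacity * p)   # sampled parallel copies of the edge
        if copies > 0:
            sampled.append((u, v, copies))
    return sampled

def poisson(lam):
    # Product-of-uniforms method (due to Knuth); adequate for modest lam.
    threshold, k, prod = math.exp(-lam), 0, random.random()
    while prod > threshold:
        prod *= random.random()
        k += 1
    return k
```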

We have ignored the determination of the correct sampling probability. By Nash-Williams' theorem on graph quotients [NW], any graph with minimum cut $c$ has packing number between $c/2$ and $c$. Therefore we can determine a good sampling probability by using the linear-time minimum cut approximation algorithms of Matula [Mat] or Karger [Kara].

Lemma. A maximum tree packing of $k$ trees can be found in $O(kn^{3/2}\log^{O(1)}n)$ time (LV).

Proof. After computing a greedy packing, we apply the random-grouping theorem above, but use Gabow and Westermann's algorithm [GW] to solve each of the random subproblems.

Remark. Gabow and Westermann's algorithm is based on augmentations, but does not benefit from having already found a large collection of bases. It is analogous to the blocking flow method for maximum flows: it cheaply finds a near-optimal solution, and then spends most of its time augmenting to an exact solution. Therefore, the recursive algorithm given earlier does not accelerate their exact algorithm. An interesting open question is whether sampling can be used to speed up the blocking-flow type algorithms, for example by proving that only a small number of additional blocking flows is needed to clean up the output of the approximation algorithm.

Conclusion

This paper has suggested an approach to random sampling for optimization, and given results that apply to matroids as models for greedy algorithms and as combinatorial objects. Two future directions suggest themselves.

In the realm of combinatorics, how much of the theory of random graphs can be extended to the more general matroid model? There is a well-defined notion of connectivity in matroids [Wel]; is this relevant to the basis packing results presented here? What further insight into random graphs can be gained by examining them from a matroid perspective? Erdős and Rényi showed a tight threshold of $p = (\ln n)/n$ for connectivity in random graphs, whereas our result gives a looser bound of $p = \Theta((\log n)/n)$ for the existence of sampled matroid bases. Is there a zero-one law for bases in a random submatroid?

In the realm of optimization, we have investigated a natural paradigm that works particularly well for matroid problems: generate a small random representative subproblem, solve it quickly, and use the information gained to home in on the solution to the entire problem. In particular, an optimum solution to the subproblem may be a good solution to the original problem, which can quickly be improved to an optimum solution. The obvious opportunity: apply this paradigm to other optimization problems.

Acknowledgments

Thanks to Don Knuth for help with a troublesome summation, and to Daphne Koller for her extensive assistance in clarifying this paper.

A Alternative Sampling Theorems for Packing

This section presents alternative proofs of our theorems about sampling matroid bases. We begin by proving a theorem on the existence of a basis in the sample, and we then generalize this theorem by estimating the number of disjoint bases we will find in the sample.

A.1 Finding a Basis

We begin with some definitions needed in the proof. We use slightly unusual terminology: for the following definitions, fix some independent set $T$ in $M$.

Definition A. A set $A$ is $T$-independent, or independent of $T$, if $A \cup T$ is independent in $M$.

Definition A. If $A$ is an independent set, then $A/T$ is any maximal $T$-independent subset of $A$.

Definition A. $B(n, p)$ is a binomial random variable:
$$\Pr[B(n,p) = k] = \binom{n}{k}p^k(1-p)^{n-k}.$$

Lemma A. If $A$ is a basis of $M$, then $A/T$ is a basis for $M/T$.

Lemma A. If $B$ is a basis of $M/T$, then $B \cup T$ is a basis of $M$.

Theorem A Suppose M contains a k ln r disjoint bases Then M k contains a basis

ak

for M with probability O e

ak ln r

Proof. Let $p = 1/k$, and let $\{B_i\}$ be the disjoint bases of $M$. We construct the basis in $M(p)$ by examining the sets $B_i \cap M(p)$ one at a time and adding some of their elements to an independent set $I$, initially empty, until $I$ is large enough to be a basis. We invert the problem by asking how many bases must be examined before $I$ becomes a basis. Suppose we determine $U = B_1 \cap M(p)$, the set of elements of $B_1$ contained in $M(p)$. Note that the size $u$ of $U$ is distributed as $B(r, p)$; thus $E[u] = rp$. Consider the contraction $M/U$. By Lemma A, this matroid contains the disjoint bases $B_2/U, B_3/U, \ldots$, and has rank $r - u$. We ask recursively how many of these bases we need to examine to construct a basis $B$ for the contracted matroid. Once we have done so, we know from Lemma A that $U \cup B$ is a basis for $M$. This gives a probabilistic recurrence for the number of bases we need to examine:
$$T(r) = 1 + T(r - u), \qquad u \sim B(r, p).$$

If we replaced the random variables by their expected values, we would get a recurrence of the form $S(r) = 1 + S((1-p)r)$, which solves to $S(r) = \log_b r$ where $b = 1/(1-p)$. Probabilistic recurrences are studied by Karp [Kar]. His first theorem exactly describes our recurrence, and proves that for any $a$,
$$\Pr[T(r) > \lfloor\log_b r\rfloor + ak] \le (1 - 1/k)^{ak} \le e^{-a}.$$
In our case, $\log_b r \le k\ln r$.
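The recurrence is easy to simulate as a sanity check (a toy experiment under the assumptions above, not part of the proof):

```python
import random

def simulate_recurrence(r, p, trials=1000):
    """Simulate T(r) = 1 + T(r - u), u ~ B(r, p): the number of bases
    examined before the growing independent set reaches full rank."""
    total = 0
    for _ in range(trials):
        rank_left, steps = r, 0
        while rank_left > 0:
            # u ~ Binomial(rank_left, p): newly sampled elements that
            # survive contraction and extend the independent set.
            u = sum(random.random() < p for _ in range(rank_left))
            rank_left -= u
            steps += 1
        total += steps
    return total / trials

# With p = 1/k, the average settles near k * ln(r), matching log_b(r).
```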

b

A.2 Counting Bases

Theorem A. If $P(M) = n$, then the probability that $M(p)$ fails to contain $k$ disjoint bases of $M$ is at most $r\Pr[B(n, p) < k]$.

Proof. We generalize the technique of the previous section. We line up the bases $\{B_n\}$ and pass through them one by one, adding some of the sampled elements from each basis to an independent set $I$ that grows until it is itself a basis. For each $B_n$ we set aside some of the elements, because they may be dependent on elements already added to $I$; we then examine the remaining elements of $B_n$ to find out which ones were actually sampled, and add those sampled elements to $I$. The change in the procedure is that we do this more than once: to construct the next basis, we examine those elements set aside the previous time.

Consider a series of phases; in each phase we construct one basis. At the beginning of phase $k$, there will be a remaining portion $R_n^k$ of basis $B_n$; the elements of $R_n^k$ are those elements of $B_n$ that were not examined in any of the previous phases. We construct an independent set $I^k$ by processing each of the $R_n^k$ in order. Let $I_n^k$ be the portion of $I^k$ that we have constructed before processing $R_n^k$. To process $R_n^k$, we split it into two sets: $R_n^{k+1}$ holds those elements that are set aside until the next phase, while $E_n^k = R_n^k - R_n^{k+1}$ is the set of elements we examine in this phase. The elements of $E_n^k$ are chosen to be independent of $I_n^k$. Thus, as in the single-basis case, we simply check which elements of $E_n^k$ are in the sampled set, identifying the set $U_n^k = E_n^k \cap M(p)$ of elements we use, and add them to our growing basis. Formally, we let $I_{n+1}^k = I_n^k \cup U_n^k$; by construction, $I_{n+1}^k$ will be independent.

I Indep endent set so far

n

th k

Remainder of n basis R

n

k

Elements examined for use E

n

k k k

p namely E Elements actually used from E U

n n n

th th

Figure Variables describing n basis in k phase

We now explain precisely how we determine the split of $R_n^k$ into $R_n^{k+1}$ and $E_n^k$. Let $r_n^k$, $i_n^k$, $e_n^k$, and $u_n^k$ respectively be the sizes of $R_n^k$, $I_n^k$, $E_n^k$, and $U_n^k$. Suppose that we have $I_n^k$ and $R_n^k$ in hand and wish to extend $I_n^k$ by examining elements of $R_n^k$. We assume by induction that $i_n^k \ge r - r_n^k$. It follows from the definition of matroids that there must exist a set $E_n^k \subseteq R_n^k$ such that $I_n^k \cup E_n^k$ is independent and has size $r$. Defining $E_n^k$ this way determines $R_n^{k+1} = R_n^k - E_n^k$. We then set $U_n^k = E_n^k \cap M(p)$ and $I_{n+1}^k = I_n^k \cup U_n^k$.

To justify our inductive assumption, we use induction on $k$. For $k = 1$ the assumption is immediate, since $R_n^1 = B_n$ has size $r$ and $i_n^1 \ge 0$. For the inductive step, note that the construction makes $r_n^{k+1} = r_n^k - e_n^k$; since $i_{n+1}^k = i_n^k + u_n^k$, the construction forces $i_{n+1}^k \ge r - r_n^{k+1}$, and thus $i_n^{k+1} \ge r - r_n^{k+1}$, as desired.

We now use the just-noted invariant $i_n^k \ge r - r_n^k$ to derive recurrences for the sizes of the various sets. As before, the recurrences will be probabilistic in nature. Recall that $u_n^k$ is the size of $U_n^k$, so $u_n^k \sim B(e_n^k, p)$. Thus
$$i_{n+1}^k = i_n^k + u_n^k = i_n^k + B(e_n^k, p).$$
It follows that
$$e_{n+1}^{k+1} = u_n^k + \left(e_n^{k+1} - u_n^{k+1}\right) = B(e_n^k, p) + B(e_n^{k+1}, 1-p).$$

Now let $f_n^k = E[e_n^k]$. Linearity of expectation applied to the recurrence shows that
$$f_{n+1}^{k+1} = p f_n^k + (1-p) f_n^{k+1}.$$
Since we examine the entire first basis in the first phase, $e_1^1 = r$ and $e_1^k = 0$ for $k > 1$. Therefore this recurrence is solved by
$$f_n^k = \binom{n-1}{k-1}p^{k-1}(1-p)^{n-k}\,r.$$

We now ask how big $n$ needs to be to give us a basis in the $k$th phase. As in Section A.1, it simplifies matters to assume that we begin with an infinite set of disjoint bases, and to ask for the value of $n$ such that in the $k$th phase we finish constructing the $k$th sampled basis $I^k$ before we reach the $n$th original basis $B_n$. Recall the variable $u_n^k$ denoting the number of items from $B_n$ used in $I^k$.

Suppose that in the $k$th phase we use no elements from any basis after $B_n$. One way this might happen is if we never finish constructing $I^k$; however, this is a probability-zero event. The only other possibility is that we have finished constructing $I^k$ by the time we reach $B_n$, so that we examine no more bases.

It follows that if $u_j^k = 0$ for every $j \ge n$, then we must have finished constructing $I^k$ before we examined $B_n$. Since the $u_j^k$ are nonnegative, this is equivalent to saying that $\sum_{j\ge n}u_j^k = 0$. It follows that our problem can be solved by determining the value $n$ such that $\sum_{j\ge n}u_j^k = 0$ with high probability.

From the Markov inequality [MR], which says that for positive integer random variables $\Pr[X > 0] \le E[X]$, and from the fact that $E[u_j^k] = pE[e_j^k] = pf_j^k$, we deduce that the probability that we fail to construct $I^k$ before reaching $B_n$ is at most
$$s_n^k = \sum_{j\ge n}E[u_j^k] = \sum_{j\ge n}p f_j^k.$$

One way to bound $s_n^k$ is to sum by parts. However, an easier method is to consider an infinite sequence of coin flips with heads probability $p$. The quantity $pf_j^k$ is then $r$ times the probability that the $k$th head occurs on the $j$th flip. Then $s_n^k$ is just $r$ times the probability that the $k$th head appears on or after the $n$th flip, which is equivalent to the probability that the first $n$ flips had fewer than $k$ heads, i.e., $\Pr[B(n, p) < k]$. This yields the theorem.

The probability of finding no basis is thus at most $s_n^1 = r(1-p)^n \le re^{-np}$; this is exactly the result proven in Section A.1.

We also consider the converse problem, namely to upper bound the number of bases that survive.

Corollary A. If $P(M) = n$ and $k \ge np$, then the probability that $M(p)$ contains more than $k$ disjoint bases of $M$ is at most $\Pr[B(n, p) \ge k]$.

Proof. By Edmonds' Theorem, there must exist some $A \subseteq M$ such that
$$\|A^C\| < (n+1)(r - \mathrm{rk}(A)).$$
If $\mathrm{rk}(A) = r$, the above statement cannot be true; thus $\mathrm{rk}(A) < r$. Now consider what happens in $M(p)$. The rank of $A(p)$ is certainly at most $\mathrm{rk}(A)$. To bound $\|A^C(p)\|$, consider two cases. If $\|A^C\| \ge n$, then $\Pr[\|A^C(p)\| \ge (k/n)\|A^C\|] \le \Pr[B(n,p) \ge k]$. On the other hand, if $\|A^C\| < n$, then $\Pr[\|A^C(p)\| \ge k] \le \Pr[B(n,p) \ge k]$. In other words, with the desired probability,
$$\|A^C(p)\| \le \max\left(k, \frac{k}{n}\|A^C\|\right).$$
It follows that, with the desired probability,
$$\|A^C(p)\| \le \max\left(k, \frac{k}{n}(n+1)(r - \mathrm{rk}(A))\right) < (k+1)(r - \mathrm{rk}(A)) \le (k+1)(r - \mathrm{rk}(A(p))).$$
If $M(p)$ contains no basis of $M$, then certainly it contains fewer than $k$ bases. On the other hand, if $M(p)$ does contain a basis, then $M(p)$ has rank $r$, in which case $A(p)$ demonstrates, through Edmonds' Theorem, that $M(p)$ contains at most $k$ bases.

Applying the Chernoff bound [Che] to the previous two theorems yields our main sampling theorem.

References

[ACM] ACM. Proceedings of the ACM Symposium on Theory of Computing. ACM Press, May.

[ACM] ACM. Proceedings of the ACM Symposium on Theory of Computing. ACM Press, May.

[ACM] ACM. Proceedings of the ACM Symposium on Theory of Computing. ACM Press, May.

[Bar] Francisco Barahona. Separating from the dominant of the spanning tree polytope. Operations Research Letters.

[Bar] Francisco Barahona. Packing spanning trees. Mathematics of Operations Research, February.

[BK] András A. Benczúr and David R. Karger. Approximate s-t min-cuts in Õ(n²) time. In Proceedings of the ACM Symposium on Theory of Computing. ACM.

[Bol] Béla Bollobás. Random Graphs. Harcourt Brace Jovanovich.

[Che] H. Chernoff. A measure of the asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics.

[CLR] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, MA.

[Col] Charles J. Colbourn. The Combinatorics of Network Reliability. The International Series of Monographs on Computer Science. Oxford University Press.

[DRT] Brandon Dixon, Monika Rauch, and Robert E. Tarjan. Verification and sensitivity analysis of minimum spanning trees in linear time. SIAM Journal on Computing.

[Edm] Jack Edmonds. Minimum partition of a matroid into independent subsets. Journal of Research of the National Bureau of Standards.

[Edm] Jack Edmonds. Matroids and the greedy algorithm. Mathematical Programming.

[ER] Pál Erdős and A. Rényi. On random graphs I. Publ. Math. Debrecen.

[Fel] William Feller. An Introduction to Probability Theory and its Applications, volume 1. John Wiley & Sons.

[FM] Tomás Feder and Milena Mihail. Balanced matroids. In Proceedings of the ACM Symposium on Theory of Computing. ACM Press, May.

[FR] Robert W. Floyd and Ronald L. Rivest. Expected time bounds for selection. Communications of the ACM.

[Gab] Harold N. Gabow. A matroid approach to finding edge connectivity and packing arborescences. Journal of Computer and System Sciences, April. A preliminary version appeared in STOC.

[Gus] Dan Gusfield. Connectivity and edge disjoint spanning trees. Information Processing Letters.

[GW] Harold N. Gabow and Herbert H. Westermann. Forests, frames, and games: Algorithms for matroid sums and applications. Algorithmica.

[IR] A. Itai and M. Rodeh. The multi-tree approach to reliability in distributed networks. Information and Control.

[Kar] Richard M. Karp. Probabilistic recurrence relations. In Proceedings of the ACM Symposium on Theory of Computing. ACM.

[Kar] David R. Karger. Approximating, verifying, and constructing minimum spanning forests. Manuscript.

[Kar] David R. Karger. Random sampling in matroids, with applications to graph connectivity and minimum spanning trees. In Proceedings of the Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, November. Submitted for publication.

[Kar] David R. Karger. Minimum cuts in near-linear time. In Proceedings of the ACM Symposium on Theory of Computing. ACM.

[Kara] David R. Karger. Random sampling in cut, flow, and network design problems. Mathematics of Operations Research. To appear. A preliminary version appeared in STOC.

[Karb] David R. Karger. Using random sampling to find maximum flows in uncapacitated undirected graphs. In Proceedings of the ACM Symposium on Theory of Computing. ACM Press, May.

[KKT] David R. Karger, Philip N. Klein, and Robert E. Tarjan. A randomized linear-time algorithm to find minimum spanning trees. Journal of the ACM.

[Knu] Donald E. Knuth. Matroid partitioning. Technical Report STAN-CS, Stanford University.

[Kru] J. B. Kruskal, Jr. On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society.

[KS] David R. Karger and Clifford Stein. A new approach to the minimum cut problem. Journal of the ACM, July. Preliminary portions appeared in SODA and STOC.

[KT] Philip N. Klein and Robert E. Tarjan. A randomized linear-time algorithm for finding minimum spanning trees. In Proceedings of the ACM Symposium on Theory of Computing. ACM.

[KY] Donald E. Knuth and Andrew C. Yao. The complexity of nonuniform random number generation. In Joseph F. Traub, editor, Algorithms and Complexity: New Directions and Recent Results. Academic Press.

[Law] Eugene L. Lawler. Combinatorial Optimization: Networks and Matroids. Holt, Rinehart and Winston, New York.

[LR] Jon Lee and Jennifer Ryan. Matroid applications and algorithms. ORSA Journal on Computing.

[Mat] D. W. Matula. Determining edge connectivity in O(nm) time. In Proceedings of the Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press.

[MR] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, New York, NY.

[Mul] Ketan Mulmuley. Computational Geometry. Prentice Hall.

[NI] Hiroshi Nagamochi and Toshihide Ibaraki. Computing edge connectivity in multigraphs and capacitated graphs. SIAM Journal on Discrete Mathematics, February.

[NW] C. St. J. A. Nash-Williams. Edge disjoint spanning trees of finite graphs. Journal of the London Mathematical Society.

[Pol] V. P. Polesskii. Bounds on the connectedness probability of a random graph. Problems of Information Transmission.

[Rec] András Recski. Matroid Theory and its Applications in Electric Network Theory and in Statics. Algorithms and Combinatorics. Springer-Verlag.

[RS] John H. Reif and P. Spirakis. Random matroids. In Proceedings of the ACM Symposium on Theory of Computing.

[Sch] Alexander Schrijver, editor. Packing and Covering in Combinatorics. Mathematical Centre Tracts. Mathematical Centre.

[VDW] B. L. van der Waerden. Moderne Algebra. Springer.

[Wel] D. J. A. Welsh. Matroid Theory. London Mathematical Society Monographs. Academic Press.

[Whi] Hassler Whitney. On the abstract properties of linear dependence. American Journal of Mathematics.