A GEOMETRIC APPROACH TO BETWEENNESS

BENNY CHOR AND MADHU SUDAN

Abstract. An input to the betweenness problem contains $m$ constraints over $n$ real variables (points). Each constraint consists of three points, where one of the points is specified to lie inside the interval defined by the other two. The order of the other two points (i.e., which one is the largest and which one is the smallest) is not specified. This problem comes up in questions related to physical mapping in molecular biology. In 1979, Opatrny showed that the problem of deciding whether the $n$ points can be totally ordered while satisfying the $m$ betweenness constraints is NP-complete. Furthermore, the problem is MAX SNP complete, and for every $\alpha > 47/48$ finding a total order that satisfies at least $\alpha$ of the $m$ constraints is NP-hard (even if all the constraints are satisfiable). It is easy to find an ordering of the points that satisfies $1/3$ of the $m$ constraints (e.g., by choosing the ordering at random).

This paper presents a polynomial time algorithm that either determines that there is no feasible solution, or finds a total order that satisfies at least $1/2$ of the $m$ constraints. The algorithm translates the problem into a set of quadratic inequalities, and solves a semidefinite relaxation of them in $\mathbb{R}^n$. The $n$ solution points are then projected on a random line through the origin. The claimed performance guarantee is shown using simple geometric properties of the SDP solution.

Key words. Semidefinite programming, NP-completeness, Computational biology.

AMS subject classifications. Q Q

1. Introduction. An input to the betweenness problem consists of a finite set of $n$ elements (or points) $S = \{x_1, \ldots, x_n\}$ and a finite set of $m$ constraints. Each constraint consists of a triplet $(x_i, x_j, x_k) \in S \times S \times S$. A candidate solution to the betweenness problem is a total order $<$ on its points. A total order $x_{i_1} < x_{i_2} < \cdots < x_{i_n}$ satisfies the constraint $(x_i, x_j, x_k)$ if either $x_i < x_j < x_k$ or $x_k < x_j < x_i$. That is, each constraint forces the second variable $x_j$ to be between the two other variables $x_i$ and $x_k$, but does not specify the relative order of $x_i$ and $x_k$. The decision version of the betweenness problem is to decide if all constraints can be simultaneously satisfied by a total order of the variables.

In 1979, Opatrny showed that the decision version of the betweenness problem is NP-complete. This problem arises naturally when analyzing certain mapping problems in molecular biology. For example, it arises when trying to order markers on a chromosome, given the results of a radiation hybrid experiment. A computational task of practical significance in this context is to find a total ordering of the markers (the $x_i$ in our terminology) that maximizes the number of satisfied constraints. Indeed, betweenness is central in the recent software package RHMAPPER. At the heart of this package is a method for producing the order of framework markers, based on betweenness constraints obtained from a statistical analysis of the biological data. Slonim et al. successfully employ two greedy heuristics for solving the betweenness problem.

A preliminary version of this paper appeared in the proceedings of the third annual European Symposium on Algorithms (ESA), Lecture Notes in Computer Science, Paul Spirakis (Ed.), Springer, September, pp.

Benny Chor: Dept. of Computer Science, Technion, Haifa, Israel (benny@cs.technion.ac.il). Research partially supported by Technion VPR funds.

Madhu Sudan: Laboratory for Computer Science, MIT, Technology Square, Cambridge, MA (madhu@lcs.mit.edu). Part of the work was done when this author was at the IBM Thomas J. Watson Research Center.

Chor and Sudan

Opatrny gave two reductions in his proof of NP-completeness. One of these reductions is from 3SAT. Following his construction, we show in Section 2 an approximation preserving reduction from MAX 3SAT. This implies that there exists an $\epsilon > 0$ such that finding a total order that satisfies at least $(1-\epsilon)m$ of the constraints (even if they are all satisfiable) is NP-hard. In particular, this holds for every $\epsilon < 1/48$ (see the corollary following the MAX 3SAT reduction in Section 2). On the other hand, it is easy to find a total order that satisfies one third of the $m$ constraints (even if they are not all satisfiable): simply arrange the points in a random order along the line. The probability that a specific constraint $(x_i, x_j, x_k)$ is satisfied by such a randomly chosen order is $1/3$, since exactly two of the six permutations on $i, j, k$ have $j$ in the middle. Thus the expected number of constraints satisfied by a random order is at least a third of the $m$ constraints. On the other hand, it is easy to construct examples where at most $m/3$ constraints are satisfiable. Thus, to achieve better approximation factors, one needs to be able to recognize instances of the betweenness problem that are not satisfiable.
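The one-third figure can be checked by direct enumeration. The short Python sketch below is illustrative only (the helper name `satisfies` and the point labels are ours, not the paper's): it lists all six orders of three points and counts those in which the designated point lies in the middle.

```python
from fractions import Fraction
from itertools import permutations

def satisfies(order, constraint):
    """True if the middle element of `constraint` lies between the other two in `order`."""
    i, j, k = constraint
    pos = {p: t for t, p in enumerate(order)}
    return min(pos[i], pos[k]) < pos[j] < max(pos[i], pos[k])

# All total orders of the three points of a single constraint (xi, xj, xk).
orders = list(permutations(["xi", "xj", "xk"]))
good = [o for o in orders if satisfies(o, ("xi", "xj", "xk"))]

# Probability that a uniformly random order satisfies the constraint.
probability = Fraction(len(good), len(orders))
```

By linearity of expectation, the same fraction carries over to the expected number of satisfied constraints in an instance with $m$ constraints.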

We present a polynomial time algorithm that either determines that there is no feasible solution, or finds a total order that satisfies at least $1/2$ of the $m$ constraints. Our algorithm translates the problem into a set of quadratic inequalities, and solves a semidefinite programming relaxation (SDP) of them in $\mathbb{R}^n$. Let $v_1, \ldots, v_n \in \mathbb{R}^n$ be a feasible solution to the SDP, where each $v_i$ corresponds to the real variable $x_i$. The $n$ solution points are then projected on a random line through the origin. We show that if "$x_j$ between $x_i$ and $x_k$" is one of the betweenness constraints, then the angle between the lines $v_i v_j$ and $v_k v_j$ in $\mathbb{R}^n$ is obtuse. Using this property, we prove that the random projection satisfies each constraint with probability at least $1/2$. This gives a randomized algorithm with the claimed performance guarantee. Next, we show how to derandomize the algorithm. In addition, we demonstrate that our analysis of the semidefinite program is tight: there is an infinite family of inputs to the betweenness problem such that the resulting SDP is feasible, but any total order of the variables satisfies at most $1/2 + o(1)$ of the $m$ constraints.

Our use of semidefinite programming is inspired by the recent success in using this methodology to find improved approximation algorithms for several optimization problems. The applicability of SDP in combinatorial optimization was demonstrated by Grötschel, Lovász and Schrijver, to show that the Theta function of Lovász was polynomial time computable. This application was then turned into exact coloring and independent set finding algorithms for perfect graphs. The use of SDP in approximation algorithms was innovated by the work of Goemans and Williamson, who broke longstanding barriers in the approximability of MAX CUT and MAX 2SAT by their SDP based algorithm. Further evidence of the applicability of the SDP approach is provided by the works of Karger, Motwani and Sudan, who use it to approximate graph coloring, Alon and Kahale (independent set approximation), and by Feige and Goemans (improvements to MAX 2SAT).

Thus, the semidefinite programming method has now been used successfully to solve many optimization problems, exactly and approximately. However, all the cases where SDP has been used to find approximation algorithms seem to be essentially partition problems (MAX CUT, Coloring, Multicut, etc.). Our solution seems to be, to the best of our knowledge, the only case where SDP has been used to solve an ordering problem. This syntactic difference between ordered structures and unordered ones, and the ability of SDP to help optimize over both, offers critical additional evidence on the power of the SDP methodology.

The remainder of this paper is organized as follows. Section 2 presents the approximation preserving reduction from MAX 3SAT as well as other observations about the betweenness problem. Semidefinite programming is briefly reviewed in Section 3. The algorithm is presented in Section 4. Section 5 shows the tightness of our analysis. Finally, Section 6 contains some concluding remarks and open problems.

2. Preliminaries. We start this section with some preliminary observations about the betweenness problem. We begin by defining the notion of an approximate solution to the betweenness problem, and analyzing the complexity of finding such a solution.

Definition. Given an instance of the betweenness problem on $m$ constraints and $\alpha \in [0, 1]$, an $\alpha$-approximate solution is one that satisfies at least $\alpha k$ constraints, where $k$ is the maximum number of constraints satisfied by any solution. For $\alpha \in [0, 1]$, the $\alpha$-approximation version of the betweenness problem is the task of finding an $\alpha$-approximate solution for every instance. An algorithm which solves such a problem is said to be an $\alpha$-approximation algorithm. For $\alpha \in [0, 1]$, the $\alpha$-approximation problem for satisfiable instances is the task of finding a total order that satisfies $\alpha m$ constraints or determining that the instance is not satisfiable. An algorithm which solves this problem is an $\alpha$-approximation algorithm for satisfiable instances.

The complexity of solving the betweenness problem exactly, i.e., for $\alpha = 1$, is well-settled. Opatrny has shown that it is NP-hard to decide if a given instance of the betweenness problem is satisfiable. We now turn our attention to the complexity of the problem for $\alpha < 1$. We first present a hardness result based on a simple reduction from MAX CUT, due to Michel Goemans. An instance of the MAX CUT problem is an undirected graph. The goal of the problem is to find a partition $(S, \bar{S})$ of the vertex set so as to maximize the number of edges with one endpoint in $S$ and one in $\bar{S}$. This problem was shown to be hard to approximate to within some factor $\alpha < 1$ by Arora et al. The best result known to date, due to Håstad (see also Trevisan et al.), is that $\alpha$-approximating MAX CUT is NP-hard for every $\alpha > 16/17$.

Proposition. For every $\alpha \in [0, 1]$, the $\alpha$-approximation version of the MAX CUT problem reduces to the $\alpha$-approximation version of the betweenness problem.

Proof. Given an instance $G$ of the MAX CUT problem, we create an instance of the betweenness problem as follows. For every vertex $v_i$ in the graph, create a point $p_i$. In addition, we introduce one special point $s$. For every edge $(v_i, v_j)$ in the graph, we introduce the betweenness constraint $(p_i, s, p_j)$ (i.e., $s$ is between $p_i$ and $p_j$). Now, given a cut $(S, \bar{S})$ in the graph that has $k$ edges crossing the cut, any ordering that places the points corresponding to the vertices in $S$ to the left of $s$ and the rest of the points to the right of $s$ is an ordering that satisfies $k$ of the betweenness constraints. In the reverse direction, any ordering of the points that satisfies $k$ betweenness constraints can be converted into a cut with $k$ edges crossing the cut, by letting $S$ be the set of vertices corresponding to points to the left of $s$. Thus the optima of the two problems are exactly equal; furthermore, given an $\alpha$-approximate solution to the betweenness instance we can construct an $\alpha$-approximate solution to the MAX CUT instance. Thus an $\alpha$-approximation algorithm for the betweenness problem yields an $\alpha$-approximation algorithm for the MAX CUT problem.
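The reduction in the proof is easy to try out on a small graph. The following Python sketch is an illustration under our own naming (`maxcut_to_betweenness`, `max_cut`, `max_betweenness` are hypothetical helpers, and both optima are found by brute force, which is only sensible for tiny inputs). It builds the betweenness instance for a 5-cycle and compares the two optimum values.

```python
from itertools import permutations, product

def maxcut_to_betweenness(edges):
    """Map each edge (u, v) to the constraint 's lies between p_u and p_v'."""
    return [(("p", u), "s", ("p", v)) for u, v in edges]

def count_satisfied(order, constraints):
    """Number of constraints (a, b, c) with b strictly between a and c in `order`."""
    pos = {p: t for t, p in enumerate(order)}
    return sum(1 for a, b, c in constraints
               if min(pos[a], pos[c]) < pos[b] < max(pos[a], pos[c]))

def max_cut(vertices, edges):
    """Brute-force maximum cut value over all 2^n partitions."""
    best = 0
    for side in product([0, 1], repeat=len(vertices)):
        s = dict(zip(vertices, side))
        best = max(best, sum(1 for u, v in edges if s[u] != s[v]))
    return best

def max_betweenness(points, constraints):
    """Brute-force optimum of the betweenness instance over all total orders."""
    return max(count_satisfied(o, constraints) for o in permutations(points))

# A 5-cycle (an odd cycle, so not bipartite).
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
constraints = maxcut_to_betweenness(edges)
points = [("p", v) for v in vertices] + ["s"]
```

Because the 5-cycle is not bipartite, the resulting betweenness instance is not fully satisfiable, which matches the remark that only bipartite graphs map to satisfiable instances.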

Corollary. The $\alpha$-approximation version of the betweenness problem is NP-hard for $\alpha > 16/17$.

While the above reduction provides some insight about the hardness of the betweenness problem on general instances, it does not quite provide a hardness result for the problem of interest to us. This is because the instances of the betweenness problem that we typically consider are fully satisfiable. In the reduction above, the only instances of the MAX CUT problem that reduce to fully satisfiable instances of betweenness are those where the input graph is bipartite. But in such cases it is easy to find the MAX CUT, and thus the instances of betweenness produced are not necessarily hard.

In what follows we present an approximation preserving reduction from MAX 3SAT to the betweenness problem. This reduction follows Opatrny's original reduction and addresses the approximation problem for satisfiable instances. It is well-known that there exists a constant $\epsilon > 0$ such that the $(1-\epsilon)$-approximation version of the MAX 3SAT problem is NP-hard. The best results known to date, due to Håstad, show this is true for every $\epsilon < 1/8$. Based on our reduction we conclude that there exists a constant $\epsilon' > 0$ such that finding an ordering that satisfies a $1 - \epsilon'$ fraction of the constraints in a satisfiable instance of the betweenness problem is NP-hard.

Proposition. For every $\epsilon > 0$, the $(1-\epsilon)$-approximation version of the MAX 3SAT problem on satisfiable instances reduces to the $(1-\epsilon/6)$-approximation version of the betweenness problem on satisfiable instances.

Proof. Given a 3CNF formula $\phi$ on $n$ variables and $m$ clauses, we construct an instance $I$ of the betweenness problem on $n + 5m + 2$ points with $6m$ constraints, such that for every $\ell$, there exists a total order satisfying $5m + \ell$ of the betweenness constraints in $I$ if and only if there exists an assignment satisfying $\ell$ of the clauses in $\phi$. The reduction proceeds as follows. For each Boolean variable $x_i$ of $\phi$, we add a point $p_i$ to $I$. In addition, we create two special points $T$ and $F$. Without loss of generality, we consider orderings where $T$ is to the right of $F$. An ordering of the points $p_i$, $T$ and $F$ is supposed to imply a truth assignment as follows: if $p_i$ is to the left of $F$ then it is false; if it is to the right of $F$ then it is true. This interpretation will also apply to the additional (clause) points that are introduced in the rest of the construction.

Given a clause $C_j$, say $C_j = \bar{x}_1 \vee x_2 \vee x_3$, we create five points $q_j^1$, $q_j^2$ and $q_j^3$, and $r_j^1$ and $r_j^2$. The points $q_j^1, q_j^2, q_j^3$ are supposed to represent the assignment to the literals in the clause. For each literal in the clause, we include a constraint that forces the variable to be assigned consistently with the literal. We do so with the constraints "$F$ is between $p_1$ and $q_j^1$" (whereas) "$q_j^2$ is between $p_2$ and $F$" and "$q_j^3$ is between $p_3$ and $F$". Thus, for example, an assignment satisfies $q_j^1$ if and only if it falsifies $p_1$. The points $r_j^1$ and $r_j^2$ are supposed to represent the OR of the first two and three literals in the clause, respectively. This is enforced with the betweenness constraints "$r_j^1$ is between $q_j^1$ and $q_j^2$" and "$r_j^2$ is between $r_j^1$ and $q_j^3$". So, for example, if both literal points $q_j^1$ and $q_j^2$ are false and $r_j^1$ is between $q_j^1$ and $q_j^2$, then $r_j^1$ must be false, while if at least one of the literal points is true, then $r_j^1$ can be placed so that it is true (to the right side of $F$). Lastly, we add a betweenness constraint that attempts to ensure that a clause is assigned true. This is done with the constraint "$r_j^2$ is between $F$ and $T$".

Thus, corresponding to each clause, we have $6$ betweenness constraints. Consider an assignment to the variables in $\phi$ satisfying $\ell$ clauses out of $m$. Without loss of generality, assume that the assignment sets $x_1, \ldots, x_k$ false and $x_{k+1}, \ldots, x_n$ true. Order the points $p_i$ and $T$ and $F$ as follows:

$$p_1 < \cdots < p_k < F < p_{k+1} < \cdots < p_n < T.$$

For $j$ going from $1$ to $m$, the literal points $q_j^1$, $q_j^2$ and $q_j^3$ are then placed between $p_k$ and $F$, or between $F$ and $p_{k+1}$, depending on their truth value. A true literal is placed between $F$ and $p_{k+1}$, while a false literal is between $p_k$ and $F$. Finally, the points $r_j^1$ and $r_j^2$ are placed as far to the right as possible, subject to the betweenness constraints. This tends to make $r_j^2$ lie between $F$ and $T$ if any one of the literals in the $j$th clause is true. This arrangement always satisfies at least five of the betweenness constraints associated with the $j$th clause. The only constraint it may not satisfy is the constraint "$r_j^2$ is between $F$ and $T$", and this constraint is satisfied if and only if at least one of the literals in the $j$th clause is true. Thus this ordering satisfies $5m + \ell$ of the betweenness constraints. Conversely, it may be verified that if an arrangement of the points (again with $F$ left of $T$) satisfies $5m + \ell$ betweenness constraints, then the assignment that assigns true to all those variables whose corresponding points lie to the right of $F$ satisfies at least $\ell$ clauses in the formula: there must be at least $\ell$ values of $j$ for which the arrangement satisfies all betweenness constraints involving the $q_j$'s and $r_j$'s. For these values of $j$, the corresponding assignment satisfies the $j$th clause.

Thus, given a 3CNF formula $\phi$ with $m$ clauses, we have constructed a betweenness instance $I$ with $m' = 6m$ constraints. Furthermore, given an ordering satisfying $(1-\epsilon')m'$ constraints, we can reconstruct an assignment satisfying at least $m - \epsilon' m' = (1 - 6\epsilon')m$ clauses of $\phi$.
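The forward direction of the proof (an assignment satisfying $\ell$ clauses yields an arrangement satisfying $5m + \ell$ constraints) can be traced in code. The Python sketch below is a minimal illustration under stated assumptions: clauses are given as triples of nonzero integers with a negative sign for a negated variable, each clause uses three distinct variables, fewer than 90 variables are used, and the concrete coordinates ($F = 0$, $T = 100$, the offset `EPS`) are arbitrary choices of ours that realize "as far to the right as possible". The function `place_and_count` is a hypothetical helper, not part of the paper.

```python
from fractions import Fraction

EPS = Fraction(1, 10**6)  # small relative gap: "as far right as possible"

def between(a, b, c):
    """True if b lies strictly between a and c on the line."""
    return min(a, c) < b < max(a, c)

def place_and_count(clauses, assignment):
    """Place the points as in the proof; return (satisfied constraints, satisfied clauses).

    clauses: list of 3-tuples of nonzero ints (-i stands for the negation of x_i).
    assignment: dict mapping variable index -> bool.
    """
    m = len(clauses)
    F, T = Fraction(0), Fraction(100)
    # False variables go to the left of F, true variables between F and T.
    p = {i: (Fraction(10 + i) if val else Fraction(-10 - i))
         for i, val in assignment.items()}
    sat_constraints = sat_clauses = 0
    for j, clause in enumerate(clauses):
        q = []
        for t, lit in enumerate(clause):
            truth = assignment[abs(lit)] == (lit > 0)
            mag = Fraction(1 + 3 * j + t, 3 * m + 2)  # distinct values in (0, 1)
            q.append(mag if truth else -mag)          # true literal right of F
            if lit > 0:   # constraint "q is between p and F"
                sat_constraints += between(p[abs(lit)], q[t], F)
            else:         # constraint "F is between p and q"
                sat_constraints += between(p[abs(lit)], F, q[t])
        hi, lo = max(q[0], q[1]), min(q[0], q[1])
        r1 = hi - EPS * (hi - lo)      # just left of the larger of q1, q2
        hi, lo = max(r1, q[2]), min(r1, q[2])
        r2 = hi - EPS * (hi - lo)      # just left of the larger of r1, q3
        sat_constraints += between(q[0], r1, q[1])
        sat_constraints += between(r1, r2, q[2])
        sat_constraints += between(F, r2, T)
        sat_clauses += any(assignment[abs(l)] == (l > 0) for l in clause)
    return sat_constraints, sat_clauses
```

For every assignment, the first returned value should equal $5m$ plus the second, mirroring the count in the proof.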

Corollary. The $(1-\epsilon)$-approximation version of the betweenness problem on satisfiable instances is NP-hard for every $\epsilon < 1/48$.

Next we show what can be achieved by the obvious randomized algorithm for the betweenness problem.

The natural randomized algorithm for the betweenness problem arranges the points in a random order along the line. The probability that a specific constraint is satisfied by such a randomly chosen order is $1/3$. Thus the expected number of constraints satisfied by a random order is at least a third of all the constraints. By the method of conditional probabilities, one can find such an order in polynomial time. Since this order satisfies one third of all constraints, it is within a factor of $1/3$ of the optimal ordering. The result is summarized below.

Proposition. The $1/3$-approximation version of the betweenness problem can be solved in polynomial time.
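The text does not spell out the derandomization, so the following Python sketch shows one possible way to instantiate the method of conditional probabilities; it is our assumption, not necessarily the authors' procedure. Points are inserted one at a time. A constraint whose three points are not all placed yet is still satisfied with conditional probability $1/3$ (under uniformly random insertion of the remaining points), so it suffices to choose, at each step, the slot that maximizes the number of already-decided constraints that are satisfied. Constraints are assumed to involve three distinct points numbered $0, \ldots, n-1$.

```python
def count_satisfied(order, constraints):
    """Number of constraints (i, j, k) with j strictly between i and k in `order`."""
    pos = {p: t for t, p in enumerate(order)}
    return sum(1 for i, j, k in constraints
               if min(pos[i], pos[k]) < pos[j] < max(pos[i], pos[k]))

def greedy_order(n, constraints):
    """Insert points 0..n-1 one by one, never lowering the conditional expectation."""
    order = []
    for p in range(n):
        placed = set(order) | {p}
        # Constraints whose three points are all placed once p is inserted.
        decided = [c for c in constraints if set(c) <= placed]
        best, best_score = None, -1
        for slot in range(len(order) + 1):
            candidate = order[:slot] + [p] + order[slot:]
            score = count_satisfied(candidate, decided)
            if score > best_score:
                best, best_score = candidate, score
        order = best
    return order
```

Since the best slot is at least as good as the average over slots, the final order satisfies at least $m/3$ constraints.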

Before going on to more sophisticated techniques for solving this problem, let us examine the main weakness of the above algorithm. We first argue that no algorithm can do better than attempting to satisfy a third of all given constraints. Consider an instance of the betweenness problem on three points, with three constraints insisting that each point be between the other two. Clearly we can satisfy only one of the above three constraints, which proves the claim. Thus the primary weakness of the above algorithm is not in the absolute number of constraints it satisfies, but in the fact that it attempts to do so for every instance of the betweenness problem, even those that are obviously not satisfiable. Thus, to achieve better approximation factors, one needs to be able to recognize instances of the betweenness problem that are not satisfiable. But this is an NP-hard task. In fact, the corollary above indicates that one cannot even distinguish instances that are satisfiable from those for which an $\epsilon$ fraction of the constraints remain unsatisfied under any assignment. In what follows, we use a semidefinite relaxation of our problem to distinguish cases that are not satisfiable from cases where at least $1/2$ of the given constraints are satisfiable. We then go on to show that, using this relaxation, we can achieve a better approximation than the naive randomized algorithm.

3. Semidefinite Programming. In this section we briefly introduce the paradigm of semidefinite programming. We describe why it is solvable in polynomial time. A complementary technique to that of semidefinite programming is the incomplete Cholesky decomposition. We describe how the combination allows one to find embeddings of points in finite-dimensional Euclidean space subject to certain constraints.

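Definition. For positive integers $m$ and $n$, a semidefinite program is defined over a collection of $n^2$ real variables $\{x_{ij}\}_{i,j=1}^{n}$. The input consists of a set of $mn^2$ real numbers $\{a_{ij}^{(k)}\}$ ($i, j = 1, \ldots, n$; $k = 1, \ldots, m$), a vector of $m$ real numbers $\{b^{(k)}\}_{k=1}^{m}$, and a vector of $n^2$ real numbers $\{c_{ij}\}_{i,j=1}^{n}$. The objective is to find $\{x_{ij}\}_{i,j=1}^{n}$ so as to

$$\text{maximize } \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij}$$

subject to

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^{(k)} x_{ij} \le b^{(k)} \quad \forall k \in \{1, \ldots, m\},$$

and the matrix $X = \{x_{ij}\}$ is symmetric and positive semidefinite.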

Recall that the following are equivalent ways of defining when a symmetric matrix $X$ is positive semidefinite:

1. All the eigenvalues of $X$ are non-negative.

2. For all vectors $y \in \mathbb{R}^n$, $y^T X y \ge 0$.

3. There exists a real matrix $V$ such that $V^T V = X$.

It is well-known that the ellipsoid algorithm of Khaciyan can be used to solve any semidefinite program approximately in the following sense: given a parameter $\epsilon > 0$, the algorithm runs in time polynomial in the input size and $\log(1/\epsilon)$, and finds a feasible solution achieving an objective of at least optimum $- \epsilon$.

In order to use the semidefinite programming approach for solving combinatorial optimization problems, one more tool is useful. This is the ability to find a matrix $V$ as guaranteed to exist in Part 3 of the definition of positive semidefiniteness above. The method that yields such a matrix is the incomplete Cholesky decomposition. The matrix $V$ can be used to interpret the solution obtained by the semidefinite programming problem geometrically. Interpret the columns of the $n \times n$ matrix $V$ as $n$ vectors $v_1, \ldots, v_n$ in $\mathbb{R}^n$. Now the variables $x_{ij}$ of the matrix $X$ simply correspond to the inner product of $v_i$ and $v_j$. Thus a linear constraint on the $x_{ij}$'s is simply a linear constraint on the inner products of the vectors $v_i$'s. And the objective function is simply a linear function on the inner products.
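To make the passage from $X$ to the vectors $v_i$ concrete, here is a small pure-Python sketch. It uses a textbook Cholesky factorization that tolerates zero pivots; this is an assumption on our part and may differ in detail from the incomplete Cholesky decomposition the text refers to. The names `psd_factor` and `gram`, and the tolerance `tol`, are hypothetical.

```python
import math

def psd_factor(X, tol=1e-9):
    """Return V (list of rows) with V^T V = X for a symmetric PSD matrix X.

    Plain Cholesky, except that a numerically zero pivot produces a zero
    column instead of a division by zero. Raises ValueError if X is not PSD.
    """
    n = len(X)
    L = [[0.0] * n for _ in range(n)]  # lower triangular factor, X = L L^T
    for j in range(n):
        pivot = X[j][j] - sum(L[j][t] ** 2 for t in range(j))
        if pivot < -tol:
            raise ValueError("matrix is not positive semidefinite")
        for i in range(j + 1, n):
            rest = X[i][j] - sum(L[i][t] * L[j][t] for t in range(j))
            if pivot > tol:
                L[i][j] = rest / math.sqrt(pivot)
            elif abs(rest) > tol:
                # A zero pivot with a nonzero entry below it rules out PSD.
                raise ValueError("matrix is not positive semidefinite")
        if pivot > tol:
            L[j][j] = math.sqrt(pivot)
    # V = L^T, so column i of V (the vector v_i) is row i of L.
    return [[L[i][r] for i in range(n)] for r in range(n)]

def gram(V):
    """Matrix of inner products <v_i, v_j> of the columns of V."""
    n = len(V)
    cols = [[V[r][i] for r in range(n)] for i in range(n)]
    return [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
            for i in range(n)]
```

Calling `gram(psd_factor(X))` should reproduce `X` up to rounding, which is exactly the statement that $x_{ij}$ is the inner product of $v_i$ and $v_j$.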

Thus the following provides an equivalent geometric interpretation of SDP:

Find $n$ vectors $v_1, \ldots, v_n$ so as to maximize the quantity $\sum_{ij} c_{ij} \, v_i \cdot v_j$ subject to the constraints $\sum_{ij} a_{ij}^{(k)} \, v_i \cdot v_j \le b^{(k)}$ for every $k \in \{1, \ldots, m\}$.

Alternately, one can interpret SDP as solving an optimization problem that attempts to find $n$ points in $n$-dimensional Euclidean space, subject to linear constraints on the squares of the distance between the points. This is done by observing that the square of the distance between points $v_i$ and $v_j$, denoted $d_{ij}^2$, is simply

$$d_{ij}^2 = (v_i - v_j) \cdot (v_i - v_j) = v_i \cdot v_i + v_j \cdot v_j - 2\, v_i \cdot v_j.$$

Thus a linear inequality on the $d_{ij}^2$'s is also a linear inequality on the inner products of the $v_i$'s. Actually, the distance squared interpretation is equivalent to SDP, since we can express $v_i \cdot v_j$ as $(d_{0i}^2 + d_{0j}^2 - d_{ij}^2)/2$, where $d_{0i}$ denotes the distance of $v_i$ from the origin.

From this interpretation of SDP, we can solve any problem of the form:

Geometric SDP: Embed $n$ points in $\mathbb{R}^n$ such that the squares of the distance between the points, denoted $d_{ij}^2$, satisfy the constraints $\sum_{ij} a_{ij}^{(k)} d_{ij}^2 \le b^{(k)}$, while trying to maximize $\sum_{ij} c_{ij} d_{ij}^2$. In the $\epsilon$-additive approximation version, the algorithm is allowed to return, for every feasible input, a solution such that each constraint is violated by at most $\epsilon$ (i.e., $\sum_{ij} a_{ij}^{(k)} d_{ij}^2 \le b^{(k)} + \epsilon$) and the objective achieved is at least the optimum $- \epsilon$.

In what follows, we will use the last interpretation of SDP to solve the betweenness problem. In particular, we use the proposition below.

Proposition. For every $\epsilon > 0$, the $\epsilon$-additive approximation version of the geometric SDP can be solved in time polynomial in the input size and $\log(1/\epsilon)$.

4. The Algorithm. The general idea of our algorithm is to express the betweenness constraints as a set of real quadratic inequalities. By considering an $n$-dimensional relaxation of the problem, we get an instance of semidefinite programming, and can find a feasible solution in $\mathbb{R}^n$ (if one exists). We study simple geometric properties of this solution set. We use them to argue that a projection of the set on a random line satisfies at least half the betweenness constraints with high probability. Then we show how to derandomize the algorithm.

Consider a set of $m$ betweenness constraints on $n$ real variables $x_1, \ldots, x_n$. Suppose these constraints are satisfiable, and that $x_1 < x_2 < \cdots < x_n$ is a satisfying linear order. We can clearly embed the points in the unit interval, and assign $x_i = i/n$ ($i = 1, \ldots, n$). Let $(x_i, x_j, x_k)$ be a triplet such that $x_j$ is required to be between $x_i$ and $x_k$. For the assignment above, it is readily seen that $(x_i - x_j)^2 + (x_k - x_j)^2 \le (x_i - x_k)^2$. Furthermore, the $x$'s are at least $1/n$ apart, and at most $1$ apart. Thus for every pair of distinct indices $i, j$, the $x$'s satisfy the inequalities $1/n^2 \le (x_i - x_j)^2 \le 1$. This motivates the following geometric SDP relaxation for the betweenness problem:

Embed $n$ points in $\mathbb{R}^n$ subject to the constraints

$$1/n^2 \le d_{ij}^2 \le 1 \quad \forall i \ne j,$$

$$d_{ij}^2 + d_{jk}^2 \le d_{ik}^2 \quad \text{for every constraint } (x_i, x_j, x_k).$$

We strengthen this relaxation slightly before showing how to use it to find an approximate solution to the instance of the betweenness problem. Recall that the $x$'s are at least $1/n$ apart, and at most $1$ apart. Therefore, for any triple $x_i < x_j < x_k$, the ratio between $(x_i - x_j)^2 + (x_j - x_k)^2$ and $(x_i - x_k)^2$ is maximized when $x_i$ and $x_k$ are extreme points ($0$ and $1$), and $x_j$ is as close as possible to one of them ($1/n$ or $(n-1)/n$). For these values, the ratio is

$$(1/n)^2 + ((n-1)/n)^2 = 1 - 2/n + 2/n^2.$$

Denote this value by $\gamma_n$. Notice that $\gamma_n = 1 - 2/n + o(1/n)$ depends only on the number of variables.

Fig. 1. Possible location for the midpoint $v_j$ (a ball of radius $\sqrt{2\gamma_n - 1}\, r$ centered midway between $v_i$ and $v_k$).

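We are now ready to set up our final SDP relaxation:

Embed $n$ points in $\mathbb{R}^n$ subject to the constraints

$$1/n^2 \le d_{ij}^2 \le 1 \quad \forall i \ne j,$$

$$d_{ij}^2 + d_{jk}^2 \le \gamma_n d_{ik}^2 \quad \text{for every constraint } (x_i, x_j, x_k). \qquad \text{(SDP}(I)\text{)}$$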

The argument leading to the construction of the instance SDP$(I)$ says that the SDP is feasible if the instance $I$ is satisfiable (and in fact there exists an embedding of the points in one dimension satisfying all the constraints). We summarize this below.

Proposition. For every instance $I$ of the betweenness problem, if $I$ is satisfiable, then the semidefinite program SDP$(I)$ is feasible.
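The one-dimensional embedding behind this proposition can be checked mechanically. The Python sketch below is illustrative (the helper names `gamma`, `line_embedding` and `is_feasible` are ours): it places the points of a given order at $x_t = t/n$ and tests the distance bounds and the $\gamma_n$-strengthened betweenness inequalities with exact rational arithmetic.

```python
from fractions import Fraction

def gamma(n):
    """gamma_n = (1/n)^2 + ((n-1)/n)^2 = 1 - 2/n + 2/n^2."""
    return Fraction(1, n ** 2) + Fraction((n - 1) ** 2, n ** 2)

def line_embedding(order):
    """Place the t-th point of the order at t/n, for t = 1..n."""
    n = len(order)
    return {p: Fraction(t, n) for t, p in enumerate(order, start=1)}

def is_feasible(x, constraints):
    """Check the SDP(I) constraints for a one-dimensional embedding x."""
    n = len(x)
    pts = list(x)

    def d2(a, b):
        return (x[a] - x[b]) ** 2

    spread_ok = all(Fraction(1, n ** 2) <= d2(a, b) <= 1
                    for a in pts for b in pts if a != b)
    between_ok = all(d2(i, j) + d2(j, k) <= gamma(n) * d2(i, k)
                     for i, j, k in constraints)
    return spread_ok and between_ok
```

An order that satisfies every betweenness constraint passes the check, while an order that violates some constraint fails it, since then one of the two short distances exceeds the long one.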

As argued in Section 3 (see the proposition on the geometric SDP there), we can use the ellipsoid algorithm to test the feasibility of SDP$(I)$, and if it is feasible, to find an approximation of a feasible solution. Let $v_1, \ldots, v_n \in \mathbb{R}^n$ be an approximately feasible solution, and let $v_i, v_j, v_k \in \mathbb{R}^n$ be a triplet that corresponds to a betweenness constraint. We first prove some geometric facts about the points $v_i, v_j, v_k$ and then use this to design our approximation algorithm.

Consider any two dimensional plane through the points $v_i, v_j, v_k$. (If $v_i, v_j, v_k$ are not collinear, then this plane is unique, else we pick any such plane arbitrarily.) Let $2r$ be the distance between $v_i$ and $v_k$ ($1/n - \epsilon \le 2r \le 1 + \epsilon$). In what follows, we shall skip the $\epsilon$ term, since it can be made arbitrarily small (and in particular, exponentially small in $n$).

We now consider the angle $\theta_{ijk} = \angle v_i v_j v_k$. We claim that this angle is obtuse (i.e., at least $\pi/2$). To see this, we project the points down to the two dimensional plane containing $v_i$, $v_j$ and $v_k$. Furthermore, we rotate and translate the points so that $v_i = (-r, 0)$, $v_k = (r, 0)$ and $v_j = (x, y)$. Now we can use the explicit formulae $d_{ij}^2 = (x + r)^2 + y^2$, $d_{jk}^2 = (r - x)^2 + y^2$ and $d_{ik}^2 = 4r^2$. The constraint on these distances yields

$$(x + r)^2 + y^2 + (x - r)^2 + y^2 \le 4 r^2 \gamma_n,$$

which implies

$$x^2 + y^2 \le (2\gamma_n - 1)\, r^2.$$

This means that $v_j$ (the midpoint in the betweenness constraint) lies inside a ball of radius $r\sqrt{2\gamma_n - 1}$ whose center is the middle point $(v_i + v_k)/2$, and outside the two small balls of radius $1/n$ around $v_i$ and $v_k$ (see Figure 1). Since $2\gamma_n - 1 < 1$, this ball lies strictly inside the ball having the segment $v_i v_k$ as a diameter, which proves that the angle $\theta_{ijk} = \angle v_i v_j v_k$ is indeed obtuse. The following claim proves a tighter bound on $\theta_{ijk}$.

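Claim. The angle $\theta_{ijk}$ satisfies $\theta_{ijk} \ge \pi/2 + 2/n$.

Proof. We apply the cosine rule:

$$\cos\theta_{ijk} = \frac{d_{ij}^2 + d_{jk}^2 - d_{ik}^2}{2\, d_{ij} d_{jk}} = \frac{x^2 + y^2 - r^2}{\sqrt{(x^2 + y^2 + r^2)^2 - 4 r^2 x^2}} \le \frac{x^2 + y^2 - r^2}{x^2 + y^2 + r^2} \le \frac{2\gamma_n - 2}{2\gamma_n} = 1 - \frac{1}{\gamma_n}.$$

Denoting $\theta_{ijk} = \pi/2 + h$ and using the Taylor series expansion

$$\cos(\pi/2 + h) = -h + \frac{h^3}{3!} - \cdots \ge -h,$$

we get

$$-h \le 1 - \frac{1}{\gamma_n} \le -\frac{2}{n},$$

so $h \ge 2/n$, namely $\theta_{ijk} \ge \pi/2 + 2/n$.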
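As a numerical sanity check of the angle bound, the following Python sketch scans candidate positions for $v_j$ over a polar grid inside the ball $x^2 + y^2 \le (2\gamma_n - 1) r^2$ and reports the smallest angle found. It assumes $\gamma_n = 1 - 2/n + 2/n^2$ (the ratio computed earlier), ignores the $\epsilon$ slack, and uses hypothetical helper names and grid sizes of our choosing.

```python
import math

def gamma(n):
    """gamma_n = (1/n)^2 + ((n-1)/n)^2."""
    return 1 - 2 / n + 2 / n ** 2

def angle_at_vj(x, y, r):
    """Angle v_i v_j v_k for v_i = (-r, 0), v_k = (r, 0), v_j = (x, y)."""
    ax, ay = -r - x, -y          # vector from v_j to v_i
    bx, by = r - x, -y           # vector from v_j to v_k
    cosine = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.acos(max(-1.0, min(1.0, cosine)))

def smallest_angle(n, r=0.5, steps=60):
    """Minimum angle over a polar grid in the ball of radius r*sqrt(2*gamma_n - 1)."""
    radius = r * math.sqrt(2 * gamma(n) - 1)
    best = math.pi
    for a in range(steps + 1):
        rho = radius * a / steps
        for b in range(steps):
            phi = 2 * math.pi * b / steps
            best = min(best, angle_at_vj(rho * math.cos(phi), rho * math.sin(phi), r))
    return best
```

The grid includes the point on the boundary of the ball directly above the midpoint, which is where the bound $\cos\theta_{ijk} \le 1 - 1/\gamma_n$ is attained.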

We are now ready to describe our algorithm. The algorithm proceeds by picking uniformly at random a line through the origin, and projecting the $n$ points $v_1, \ldots, v_n$ on this random line. Let $x'_1, \ldots, x'_n$ be the $n$ resulting points.

Claim. Let $\theta_{ijk}$ denote the angle $\angle v_i v_j v_k$. Then the probability that $x'_j$ lies between $x'_i$ and $x'_k$ equals $\theta_{ijk}/\pi$.

Proof. Instead of considering an arbitrary line through the origin, we consider a parallel line that goes through the point $v_j$. This does not change the betweenness relation of the projections. Neither is this relation changed when considering the projection of this line on the two dimensional plane defined by $v_i, v_j, v_k$. Consider the section of the circle defined by the two lines that go through $v_j$ and are perpendicular to the lines $v_i v_j$ and $v_k v_j$. It is not hard to see that only lines going through this section violate the betweenness constraint of the projections. This section occupies an angle of $\pi - \theta_{ijk}$ (see Figure 2). The claim follows.
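The rounding step and the $\theta_{ijk}/\pi$ probability can be illustrated with a few lines of Python. This is a sketch under our own assumptions: a uniformly random direction is drawn by normalizing a vector of independent Gaussians (a standard technique, not something the paper specifies), and the planar check sweeps evenly spaced line directions instead of sampling. All function names are hypothetical.

```python
import math
import random

def random_direction(dim, rng=random):
    """Uniformly random unit vector in R^dim (normalized Gaussian vector)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def project_and_order(vectors, direction):
    """Indices of the points, sorted by their projection on `direction`."""
    proj = [sum(a * b for a, b in zip(v, direction)) for v in vectors]
    return sorted(range(len(vectors)), key=proj.__getitem__)

def fraction_between(vi, vj, vk, steps=3000):
    """Planar check: fraction of line directions that project vj between vi and vk."""
    hits = 0
    for t in range(steps):
        phi = math.pi * (t + 0.5) / steps      # directions of lines, angle in (0, pi)
        d = (math.cos(phi), math.sin(phi))
        a = (vi[0] - vj[0]) * d[0] + (vi[1] - vj[1]) * d[1]
        b = (vk[0] - vj[0]) * d[0] + (vk[1] - vj[1]) * d[1]
        hits += (a * b < 0)                    # vi and vk land on opposite sides of vj
    return hits / steps
```

A typical use would be `project_and_order(vectors, random_direction(len(vectors[0])))`, repeated a few times while keeping the best order found.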

Combining the two claims above, we get:

Corollary. Suppose SDP$(I)$ has a feasible solution. Then for any of the $m$ constraints, the probability that $x'_j$ lies between $x'_i$ and $x'_k$ is at least $1/2 + 2/(\pi n)$.

As a consequence, the expected number of betweenness constraints satisfied by $x'_1, \ldots, x'_n$ is at least $m/2 + 2m/(\pi n) = m(1/2 + 2/(\pi n))$. This yields the following lemma, which forms a weak converse to the feasibility proposition above.

Fig. 2. Lines going through the circular section violate the constraint.

Lemma. For any instance $I$ of the betweenness problem, if SDP$(I)$ is feasible, then there exists a total order satisfying at least $m/2 + 2m/(\pi n)$ of the betweenness constraints in $I$.

Thus we get a randomized polynomial time algorithm that either finds that the constraints are infeasible, or generates a linear order that satisfies at least half the constraints.

We now outline a method for derandomizing our algorithm. Given an embedding of the betweenness problem, we can define a graph and an embedding of the graph in $\mathbb{R}^n$ such that the expected size of the MAX CUT found for this embedding of the graph equals the expected number of betweenness constraints that are satisfied by a random projection.

For every ordered pair of points $(v_i, v_j)$ of the betweenness problem, introduce the vertex $w_{ij}$ with embedding $v_i - v_j$. If $(i, j, k)$ is a betweenness constraint, then put an edge between $w_{ij}$ and $w_{kj}$. This defines the graph and its embedding.

Now consider any hyperplane through the origin that cuts across the edge between $w_{ij}$ and $w_{kj}$. Let the slope of the normal to the hyperplane be the vector $r$. Assume w.l.o.g. that $r \cdot w_{ij} > 0$ and $r \cdot w_{kj} < 0$; then $r \cdot v_i > r \cdot v_j$ and $r \cdot v_j > r \cdot v_k$. Thus $j$ lies between $i$ and $k$. Conversely, if projection onto the vector $r$ satisfies the betweenness constraint for $(i, j, k)$, then the edge between $w_{ij}$ and $w_{kj}$ must be cut.
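The correspondence between cut edges and satisfied constraints is a one-line sign check, as the Python sketch below illustrates. The helper names are ours, and the vectors are arbitrary random test data rather than an SDP solution; exact ties between projections are ignored, since they occur with probability zero for random inputs.

```python
def dot(a, b):
    """Inner product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def edge_is_cut(v, constraint, r):
    """True if the hyperplane with normal r separates w_ij = v_i - v_j from w_kj = v_k - v_j."""
    i, j, k = constraint
    return dot(r, sub(v[i], v[j])) * dot(r, sub(v[k], v[j])) < 0

def projection_satisfies(v, constraint, r):
    """True if the projection of v_j on r lies strictly between those of v_i and v_k."""
    i, j, k = constraint
    xi, xj, xk = dot(r, v[i]), dot(r, v[j]), dot(r, v[k])
    return min(xi, xk) < xj < max(xi, xk)
```

For any embedding `v` and normal `r`, the two predicates should agree on every constraint, which is what lets a deterministic MAX CUT rounding be reused for betweenness.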

Mahajan and Ramesh give a method to deterministically find a vector $r$ whose cut value is at least the expected cut value. They use this algorithm to derandomize the MAX CUT and MAX 2SAT algorithm of Goemans and Williamson. By using their algorithm, we get a vector such that projection on to this vector satisfies at least as many constraints as the expected number satisfied by a random vector.

Remark. Observe that the above reduction is not a generic reduction from betweenness to MAX CUT. It uses the fact that the graph produced for the MAX CUT problem has a specified embedding, in order to map a solution of the MAX CUT problem to a solution of the betweenness problem.

We conclude the section by stating the main theorem of this paper.

Theorem. The $1/2$-approximation version of the betweenness problem can be solved in polynomial time. Specifically, there exists a polynomial time algorithm which takes as input an instance of the betweenness problem on $n$ points and $m$ constraints, and either outputs "not feasible" or outputs a total order satisfying at least $m/2 + 2m/(\pi n)$ constraints.

5. Tightness of our analysis. In this section we show that our analysis of the semidefinite program is almost tight. We do so by exhibiting two families of instances of the betweenness problem on $m$ constraints such that the optimum value is at most $m(1/2 + o(1))$, but a slight perturbation of the SDP is nevertheless feasible.

The first example is related to the $d$-dimensional hypercube. For every integer $d$, we construct the instance $I_d$ as follows: $I_d$ has $2^d$ points corresponding to the vertices of the $d$-dimensional hypercube. $I_d$ has $m = 2^d \binom{d}{2}$ constraints, one for every simple path of length $2$ in the hypercube, with the betweenness constraint expecting the middle vertex of the path to be between the endpoints.

Consider a small perturbation of our SDP where we set $d_{ij}^2 + d_{jk}^2 \le d_{ik}^2$ for each betweenness constraint. This SDP is clearly feasible: the natural embedding of the hypercube in $d$ dimensions as a hypercube ensures that every path of length $2$ subtends an angle of $90^\circ$ at its midpoint.

Now consider a linear ordering of the points. Consider any point $p$ and all the
paths that have $p$ as their midpoint. The number of such paths is $\binom{d}{2}$. Now let $d_1$ of the
neighbors of $p$ be on its left and $d_2$ of its neighbors be on its right, where $d_1 + d_2 = d$.
The number of betweenness constraints expecting $p$ to be in the middle that get
satisfied is $d_1 d_2 \le d^2/4$. Thus the fraction of betweenness constraints associated
with any point that get satisfied is at most
$\frac{d^2/4}{\binom{d}{2}} = \frac{d}{2(d-1)} = \frac{1}{2} + o(1)$.
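The counting argument can be tried out on small hypercubes. This sketch assumes the per-vertex count is the product of the numbers of left and right neighbours, and compares the resulting fraction with the bound $d/(2(d-1))$; all function names here are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations, product

def neighbours(p):
    """Vertices of the hypercube adjacent to p."""
    return [p[:c] + (1 - p[c],) + p[c + 1:] for c in range(len(p))]

def satisfied_fraction(d, order):
    """Fraction of hypercube constraints satisfied by `order` (a list of all 2^d vertices)."""
    pos = {p: r for r, p in enumerate(order)}
    satisfied = total = 0
    for mid in order:
        nbrs = neighbours(mid)
        left = sum(1 for q in nbrs if pos[q] < pos[mid])   # d_1
        right = d - left                                   # d_2
        # Direct count over pairs of neighbours on opposite sides of `mid`.
        direct = sum(1 for a, b in combinations(nbrs, 2)
                     if (pos[a] < pos[mid]) != (pos[b] < pos[mid]))
        assert direct == left * right
        satisfied += direct
        total += d * (d - 1) // 2
    return Fraction(satisfied, total)

def bound(d):
    """Upper bound (d^2 / 4) / C(d, 2) = d / (2 (d - 1))."""
    return Fraction(d, 2 * (d - 1))

def random_order(d, seed=0):
    """A reproducible random ordering of the hypercube vertices."""
    order = list(product((0, 1), repeat=d))
    random.Random(seed).shuffle(order)
    return order
```

Exact fractions are used so that the comparison with the bound is not affected by rounding.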

The second example, suggested to us by Michel Goemans, is related to the cuts
in the complete graph $K_n$ on $n$ vertices. For every integer $n$, we construct the
instance $C_n$ as follows. $C_n$ has $n + 1$ points: a center point $v_0$ and $n$ vertices
$v_1, \ldots, v_n$. $C_n$ has $m = \binom{n}{2}$ constraints, one for every edge in the complete graph.
For every $1 \le i < k \le n$ we have the betweenness constraint that $v_0$ is between $v_i$
and $v_k$.

We now consider the following perturbation of our SDP, where
$d_{ij}^2 + d_{jk}^2 \le (1 - \frac{1}{n}) d_{ik}^2$ for each betweenness constraint. To see that
this SDP is feasible, consider the following embedding. The vertex $v_i$ is embedded as
the point $(0, \ldots, 0, 1, 0, \ldots, 0)$, where the 1 occurs in the $i$th coordinate. The vertex
$v_0$ is embedded as the point $(\frac{1}{n}, \ldots, \frac{1}{n})$. Observe that the distance between $v_i$ and
$v_j$ is $\sqrt{2}$ and the distance between $v_0$ and $v_i$ is $\sqrt{1 - \frac{1}{n}}$. Thus for any two indices
$i, k$ the inequality $d_{0i}^2 + d_{0k}^2 \le (1 - \frac{1}{n}) d_{ik}^2$, which corresponds to the betweenness
constraints, is satisfied (in fact, equality holds). Now, in order to satisfy the SDP
(recall that it bounds all pairwise distances from above), we simply scale down the
simplex so that the distance between the vertices respects this bound, embed the
center $v_0$ in the origin, and embed each vertex $v_i$ in the corresponding simplex vertex.
This embedding satisfies all the SDP constraints.
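The equality claimed for the simplex embedding can be verified with exact arithmetic. The sketch rests on two assumptions about the passage: each vertex $v_i$ sits at the $i$th standard basis vector with the center at the centroid $(1/n, \ldots, 1/n)$, and the perturbed constraint carries the factor $1 - 1/n$. The function names are illustrative.

```python
from fractions import Fraction

def simplex_embedding(n):
    """Assumed embedding: v_1..v_n are the standard basis vectors, v_0 their centroid."""
    basis = [[Fraction(int(c == i)) for c in range(n)] for i in range(n)]
    center = [Fraction(1, n)] * n
    return center, basis

def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def equality_holds(n):
    """Check d_0i^2 + d_0k^2 == (1 - 1/n) * d_ik^2 for all pairs i < k."""
    center, basis = simplex_embedding(n)
    factor = 1 - Fraction(1, n)
    return all(
        sq_dist(center, basis[i]) + sq_dist(center, basis[k])
        == factor * sq_dist(basis[i], basis[k])
        for i in range(n) for k in range(i + 1, n)
    )
```

Uniformly scaling or translating the points multiplies both sides of the inequality by the same factor, so the check is unaffected by the final rescaling of the simplex.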

Again, any linear ordering of the $n + 1$ points induces a cut in the graph $K_n$
(vertices to the left of $v_0$, vertices to the right of $v_0$). An edge corresponds to a
satisfied betweenness constraint if and only if the edge is across the cut. Therefore
the maximum number of satisfiable constraints equals the size of a maximum cut in
$K_n$, namely $\lfloor n^2/4 \rfloor = m(\frac{1}{2} + o(1))$.
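The correspondence between orderings and cuts can be confirmed by brute force for small $n$. In this sketch point 0 plays the role of the center and points 1 to $n$ are the vertices of the complete graph; the function names are hypothetical.

```python
from itertools import permutations

def satisfied_by_order(order, n):
    """Count constraints 'v_0 between v_i and v_k' (1 <= i < k <= n) satisfied by `order`."""
    pos = {p: r for r, p in enumerate(order)}
    # A constraint holds exactly when v_i and v_k fall on opposite sides of v_0.
    return sum(1 for i in range(1, n + 1) for k in range(i + 1, n + 1)
               if (pos[i] < pos[0]) != (pos[k] < pos[0]))

def best_over_all_orders(n):
    """Maximum number of satisfied constraints over all orderings of the n + 1 points."""
    return max(satisfied_by_order(order, n)
               for order in permutations(range(n + 1)))
```

The exhaustive search visits $(n+1)!$ orderings, so it is only meant for very small instances.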

The advantage of this maximum cut example is that it shows tightness of the


analysis with respect to quadratic inequalities of the form

d d d

n

ij k j ik

where n on Our original SDP has the form

n

d d d

n

ij k j ik

where n on By starting with the complete graph example and

n

padding it with extra dummy variables that do not take part in any constraint, we
can construct an example where only $\frac{1}{2} + o(1)$ of the constraints are satisfiable, yet

the original SDP with is feasible in fact any o can work here It is not

n n

clear how to come up with a nonartificial construction (i.e., without padding) having
these properties.

Concluding Remarks. We remark that metric information can be easily
incorporated into our algorithm. As a simple example, suppose that for some of the
constraints we know not only that $x_j$ is between $x_i$ and $x_k$, but that it is exactly in
the middle, namely $x_j = (x_i + x_k)/2$. In this case we add the inequality

d d d

ij k j ik

instead of

d d d

n

ij k j ik

Any feasible solution will have $v_j$ exactly in the middle of $v_i$ and $v_k$, and the same
holds with respect to the final projections.

Finally, notice that our formulation of the problem as an SDP only tested for
feasibility of the constraints. It would be interesting to see if the inclusion of an
appropriate objective function, and possibly of additional inequalities, can be used to
improve the performance guarantee of the algorithm. Other approaches to the problem,
possibly purely combinatorial ones, are also of interest.

Acknowledgments. We are grateful to Michel Goemans for providing us with
the max cut example and for helpful discussions on semidefinite programming. Many
thanks to Amir Ben-Dor for numerous helpful discussions on the betweenness problem.

We would also like to thank Ron Shamir for acquainting us with reference and

for useful discussions, Oded Goldreich and the anonymous referee for their comments
on earlier versions of this paper, and Amos Beimel and Dan Pelleg for their expert
advice on xfig.

REFERENCES

N. Alon and N. Kahale, Approximating the independence number via the $\vartheta$-function,
Manuscript, August.

S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy, Proof Verification and
Hardness of Approximation Problems, Journal of the ACM, to appear. An extended abstract
appears in Proc. Annual IEEE Symposium on Foundations of Computer Science.

D. Cox, M. Burmeister, E. Price, S. Kim, and R. Myers, Radiation Hybrid Mapping: A
Somatic Cell Genetic Method for Constructing High Resolution Maps of Mammalian
Chromosomes, Science.

U. Feige and M. Goemans, Approximating the value of two prover proof systems, with
applications to MAX SAT and MAX DICUT, Proceedings of the Third Israel Symposium on
Theory and Computing Systems, Tel Aviv, Israel.

M. Goemans and D. Williamson, Improved approximation algorithms for maximum cut and
satisfiability problems using semidefinite programming, Journal of the ACM, November.

S. Goss and H. Harris, New Methods for Mapping Genes in Human Chromosomes, Nature.

M. Grötschel, L. Lovász, and A. Schrijver, The ellipsoid method and its consequences in
combinatorial optimization, Combinatorica.

M. Grötschel, L. Lovász, and A. Schrijver, Geometric Algorithms and Combinatorial
Optimization, Springer-Verlag, Berlin.

J. Håstad, Some optimal inapproximability results, Proceedings of the Twenty-Ninth Annual
ACM Symposium on Theory of Computing, El Paso, Texas.

D. Karger, R. Motwani, and M. Sudan, Approximate graph coloring via semidefinite
programming, Journal of the ACM, to appear. Extended abstract in Proc. of the Annual
Symposium on Foundations of Computer Science, Santa Fe, New Mexico.

L. Khachiyan, A polynomial algorithm in linear programming, English translation appears in
Soviet Mathematics Doklady.

L. Lovász, On the Shannon capacity of a graph, IEEE Transactions on Information Theory.

S. Mahajan and H. Ramesh, Derandomizing Semidefinite Programming Based Approximation
Algorithms, Proceedings of the Annual Symposium on Foundations of Computer Science.

J. Opatrny, Total Ordering Problem, SIAM Journal on Computing, February.

D. Slonim, L. Stein, L. Kruglyak, and E. Lander, RHMAPPER: An interactive computer
package for constructing radiation hybrids maps. Available at
http://www-genome.wi.mit.edu/ftp/pub/software/rhmapper

D. Slonim, L. Stein, L. Kruglyak, and E. Lander, Building Human Genome Maps with
Radiation Hybrids, Journal of Computational Biology, Winter.

L. Trevisan, G. B. Sorkin, M. Sudan, and D. P. Williamson, Gadgets, approximation, and
linear programming, Proceedings of the Annual Symposium on Foundations of Computer
Science, Burlington, Vermont.