Tel Aviv University

The Raymond and Beverly Sackler

Faculty of Exact Sciences

Scho ol of Mathematical Sciences

Complexity and Approximation of

Some Graph Mo dication Problems

Thesis submitted in partial fulllment

of graduate requirements for the degree

MSc in Tel Aviv University

Department of Computer Science

By

Assaf Natanzon Prepared under the sup ervision of Prof Ron Shamir

Acknowledgments

Iwould like to thank my advisor Prof Ron Shamir for directing me and giving me

a helping hand and for his clarifying comments and reformulations

Iwould liketothankmy b eloved Mom and Dad for their supp ort

Abstract

In an edge mo dication problem one has to change the edge set of a given graph as

little as p ossible so as to satisfy a certain prop ertyWe prove in this pap er the NP

hardness of a variety of edge mo dication problems with resp ect to some wellstudied

classes of graphs These include p erfect chordal comparability split threshold and

cluster graphs Weshow that some of these problems b ecome p olynomial when the

input graph has b ounded degree We give a general constant factor approximation

algorithm for deletion and editing problems on b ounded degree graphs with resp ect

to prop erties that can b e characterized by a nite set of forbidden induced subgraphs

We giveanapproximation algorithm for Cluster Deletion and Editing We also prove

some combinatorial b ounds on the sizes of the minimum completion and deletion sets of a graph

Contents

Intro duction and Summary

Intro duction

Denitions

Previous results

Summary of thesis results

Outline of the thesis

NP Hardness results

Chain and Chordal Deletion

Perfect Mo dications

Perfect Completion

Perfect Deletion

Perfect Editing

Comparability Editing

Cluster Deletion

Deletion and Completion

Split Graphs Deletion and Completion

Approximation Algorithms

Mo dication problems on b ounded degree graphs for families charac

terized by a nite set of forbidden induced subgraphs

Edge Deletion approximation algorithm

Edge Editing approximation algorithm

Cluster Deletion and Editing

Cluster Editing

Cluster Deletion

Polynomial Results On Bounded Degree Graphs

Combinatorial b ounds on the size of minimum completion sets

A concrete example for Chordal Completion

Random graphs

The minimum chordal completion of a random graph

The minimum p erfect graph containing a random graph

The minimum p erfect deletion in a random graph

Chapter

Intro duction and Summary

Intro duction

In this thesis we deal with graph mo dication problems Mo dication problems on

graphs play an imp ortant role in computer science and have applications in several

elds including computer algebra and molecular biology They include edge com

pletion problems edge deletion problems edge editing problems and vertex deletion

problems

Wenow dene the generic problems we shall study in this thesis All problems

are dened b elow with resp ect to a family of graphs F that is not a part of the input

For example F can b e the set of chordal graphs or the p erfect graphs etc

The Minimum Completion problem is dened as follows given an arbitrary graph

G V E nd a set of nonedges F such that jF j is minimum and G V E F

F ie nd a minimum set of edges whose addition to G will form a graph that

b elongs to F For a graph G V E a kcompletion set for a graph family F or

edges so that V E F F an F kcompletionisasetF of at most k

The Minimum Deletion problem is dened as follows given an arbitrary graph

G V E nd a set of edges H such that jH j is minimumand G V E n H F

ie nd a minimum set of edges F whose removal from G will form a graph that

b elongs to F For a graph G V E a kdeletion set for a graph family F or an

F kdeletionisasetF of at most k edges so that V E n F F

The Minimum Editing problem is dened as follows given an arbitrary graph

G V E nd a set of nonedges F and a set of edges H such that jF H j is

minimum and G V E F n H F ie nd a minimum set of edges you need

to edit add or delete such that editing of these edges in the graph G will form a

graph that b elongs to F For a graph G V E a kediting set for a graph family F

or an F keditingisasetK F H of at most k edges so that V E F n H F

The Minimum Vertex Deletion problem is dened as follows given an arbitrary

graph G V E nd a set of vertices F such that jF j is minimum and the induced

subgraph G V n F E V n F V n F b elongs to F

V nF

In the Maximum Problem one wishes to maximize jV n F j rather

then to minimize jF j The optim um solutions for b oth problems give the same graph

but they dier when approximation is sought

In the sequel we shall abbreviate minimum completion problem for prop erty F

to F Completion Names of other problems will b e abbreviated in the same manner

We shall study for example Chordal Completion Perfect Editing and Chain Deletion

The imp ortance of edge mo dication problems is b oth theoretical and practical

From the theoretical p oint of view edge mo dication problems are natural problems

in Moreover several fundamental problems in graph theory suchas

maximum matching and maxcut can b e formulated as edge mo dication problems

From the practical p oint of view edge mo dication problems have imp ortant

applications Chordal Completion which is also called the minimum l lin problem

arises in numerically p erforming a Gaussian elimination on a sparse symmetric matrix

such that a minimum numb er of new nonzero elements is intro duced see

Mo dication problems with resp ect to interval and prop er interval graphs have

application in physical mapping of DNA Cluster graph mo dication problems

arise in numerous application areas where clustering is called for see eg

In general mo dication problems arise when the input data that is supp osed to

b e represented byanF graph contains errors False negative errors corresp ond to

missing edges that should b e added and false p ositive errors corresp ond to edges that

must b e removed One seeks a solution minimizing the numb er of errors allowing

one or b oth kinds of errors

Denitions

All graphs in this study are nite and without parallel edges and self lo ops Let

G V E denote a graph For w V dene Adj w fxjw x E g Given a

subset A V of the vertices we dene the subgraph induced byA tobe G A E

A A

where E fx y E jx y Ag

A

graph of G is any graph A E obtained by taking A V and E E If A sub

A

G is a subgraph of H then we shall also say that H is a sup ergraph of G

The complement of G is the graph G V E E fx y V V jx y and

x y E g

A in a graph G is a complete induced subgraph K is a clique graph of size

l

l A clique is maximum if there is no clique of G of larger cardinality G is the

number of vertices in a maximum clique of G

wo of which A stable set or an independent set in a graph is a subset of vertices no t

are adjacent

A prop er ccoloring of a graph is a partition of its vertices V X X X

c

such that each X is an indep endent set The chromatic number of G denoted G

i

is the smallest p ossible c for which there exists a prop er ccoloring of G

A graph G is bipartite if G has a coloring ie its vertices can b e partitioned

into two indep endent sets

Asequenceofvertices v v is called a path of length k ifv v E for

k i i

i k A path v v is called simple if v v for i j A simple path

k i j

v v iscalledchord less if v v E ji j j P denotes a chordless path

k i j k

on k vertices

Asequenceofvertices v v is called a cycle of length l ifv v E

l i i

for i l and v v E A cycle v v is called simple if v v for i j

l l i j

A simple cycle v v is chord less if v v E ji j j or ji j j l C

l i j k

denotes a k vertex chordless cycle

For a graph G and an integer l l G or lG is the graph formed by l indep endent

copies of the graph G

Let H b e a graph A graph G is called H free if G do es not contain any induced

copyof H For example a graph is a clique i it is K freeFor a family F of

graphs wesay that a graph G is Ffree if it is F free for every F F

A family of graphs F is hereditary if for any graph G F every induced subgraph

of G b elongs to F

Agraph G V Eis perfect if G G for every A V

A A

A graph G V EisBerge if it contains no induced chordless cycle of o dd size

and no induced complementofa chordless cycle of o dd size

e will b e mainly interested in some graph sub classes of p erfect graphs classes or W

in closely related classes For a survey on p erfect graphs and for more information

ab out these prop erties see

A bipartite graph G P Q E is called a chain graph if the vertices of one of the

parts say P can b e ordered v v v so that Adj v Adj v Adj v

k k

recall that Adj v fxjv x E g This prop erty is called the chain property

This prop erty is hereditaryasifweremovea vertex from P or Q the chain of

containments do es not change For an example of a chain graph see Figure

Figure a chain graph

An undirected graph G is called chordal or triangulated if it contains no induced

chordless cycle of length greater than three

An orientation of an undirected graph G V E is an assignment of a single

directed arc uv or vu to each undirected edge u v E The resulting directed

graph V F is also called an orientation of G

An undirected graph G V Eisa comparability graphortransitively orientable

if there exists an orientation V F ofG satisfying if ab bc F then ac F In that

case F is also called a transitive orientation of G

A graph is called a complement reducible graph or a cographifitcontains no

induced P a path with four vertices

K I of V so A graph G V E is called a split graph if there is a partition

that K induces a clique and I induces an indep endentset

A graph G V E is called a threshold graph if there is a partition V V of

V such that G is a clique G is an indep endent set and the vertices of V can b e

V V



ordered w w w so that Adj w Adj w Adj w

k k

Agraph G is called a cluster graph if it is a disjoint union of cliques The cluster

prop erty is hereditary

A family of graphs F has a forbidden set characterization if there is a set H of

graphs such that a graph is in F i it do es not contain any graph of H as an induced

subgraph F has a nite forbidden set characterization if H is nite It is easy to see

that a family of graphs F is hereditary i it has a forbidden set characterization

A graph prop ertyisnon trivial if it is true for innitely many graphs and false for

innitely many graphs

Previous results

Yannakakis and Lewis showed that the vertex deletion problem for any non trivial

hereditary family of graphs is NP hard Furthermore Lund and Yannakakis

proved that for any nontrivial hereditary prop erty and for every the maximum



log n

induced subgraph problem cannot b e approximated with ratio in quasi

p olynomial time unless P NP

Most of the results obtained so far concerning edge mo dication problems are

NP hardness ones However unlike the situation for vertex deletion no general

hardness results for edge mo dication is known

In Yannakakis solved a central op en problem dealing with the complexityof

Chordal Completion He showed that this problem is NP complete using a reduction

ed to b e NP complete using a from Chain Completion The latter problem was prov

reduction from the optimal linear arrangement problem As noted in his pro of

implies that Interval Completion and Unit Interval Completion are also NP complete

Interval Completion was directly shown to b e NP complete by Garey et al see

problem GT in and by Kashiwabara and Fujisawa Deletion problems on

interval graphs and unit interval graphs were proved to b e NP complete by Goldb erg

et al Threshold Completion and Deletion were shown to b e NP complete

in The hardness of Comparability Completion and Deletion was shown in

and resp ectively

Variants of the completion problem in which the input graph is precolored and

the ob jective is to nd a sup ergraph satisfying a sp ecied prop erty suchthatitis

prop erly colored by the input coloring were also shown to b e NP complete Goldb erg

et al proved that Colored Unit Interval Completion is NP complete Golumbic et

al and Fellows et al proved indep endently that Colored Interval Completion

is NP complete Bo dlaender and de Fluiter strengthened this result by showing

that the latter problem is NP complete even if the numb er of colors is at most They

also gave a quadratic algorithm in the number of vertices for solving the Colored

Interval Completion problem on colored graphs Colored Chordal Completion

was proved by Bo dlaender et al to b e NP complete McMorris et al showed

that this problem is p olynomial when the numb er of colors is xed

A generalization of the colored graph completion problems is to nd a sup er

graph satisfying a given prop erty which do es not include forbidden edges which are

predened Problems of this typ e are called sandwich problems Golumbic and

Shamir proved that the interval sandwich problem is NP complete Their pro of

was mo died in to prove that the unit interval sandwich problem is also NP

complete Golumbic et al showed that sandwic h problems for chordal graphs

comparability graphs p ermutation graphs circulararc graphs and several other fam

ilies of graphs are NP complete They also proved that the sandwich problem is

p olynomial for split graphs threshold graphs this was rst shown by Hammer et al

and other families of graphs

As for editing problems the following results were obtained Chordal Editing was

proved to b e NP complete by BenDor using a reduction from Chordal Com

pletion Connected Bipartite Interval Caterpil lar Editing was proved to b e NP

complete by Cirino et al and Split Editing was solved in p olynomial time by

Hammer and Simeone

Since most edge mo dication problems discussed ab oveareNP complete it is

natural to investigate their parametric complexity Kaplan and Shamir gavea

p olynomial algorithm for the interval sandwich decision problem restricted to b ounded

degree input graphs whenever the solution has b ounded clique size or b ounded degree

The results in however imply that the problem of nding an interval sandwich

graph with a small clique is hard in the parametric sense if the parameter is the

size of the clique In Kaplan et al proved that Chordal Completion and Unit

Interval Completion are xed parameter tractable where the parameter is the number

of edges added The problem of altering a graph to one having a sp ecied prop erty

by deleting at most i vertices deleting at most j edges and adding at most k edges

where i j k are xed integers was proved by Cai to b e xed parameter tractable

for any hereditary prop erty that has a nite forbidden set characterization

Approximation results for edge mo dication problems are few In it was

proved that the minimum numb er of edit op erations needed to convert a graph to

a caterpillar cannot b e approximated in p olynomial time to within an additive term

of O n for unless P NP For Chordal Completion Agrawal

et al gave a p olynomial algorithm that guarantees an approximation factor of

p

O d log jV j for the case that the degree in G is b ounded by dTheyalsogavean

O jE j log jV j factor p olynomial approximation algorithm for the general case

For Interval Completion a p olynomial O log jV j approximation algorithm was given

by Agrawal et al in These algorithms approximate the total numb er of edges in

the sup ergraph containing G and not the numb er of mo dications

A p olynomial algorithm for Chordal Completion was recently given by Natanzon

et al The algorithm constructs a triangulation whose size is at most a constant

times the optimum size squared for the general case and an O d log kd factor

for the b ounded degree case A p olynomial approximation algorithm achieving an

O opt ratio is also given in for Chain Completion

Summary of thesis results

The thesis has four parts In the rst part weshow NP hardness of some problems

The results of this part as well as related results from the literature are summarized

in the following table

graph family deletion completion editing

problem typ e

chain graphs NP complete NP complete op en

chordal graphs NP complete NP complete NP complete

p erfect graphs NP hard new NP hard new NP hard new

Berge graphs NP hard new NP hard new NP hard new

comparability graphs NP complete NP complete NP complete new

split graphs NP completenew NP complete new op en

NP completenew NP complete new op en

cluster graphs NP complete new p olynomial trivial op en

Also NPhard to approximate to within a constant factornew

In the second part wegive some approximation algorithms Wegive a constant

factor algorithm for the edge deletionediting problem for families of graph with

nite forbidden set characterization on graphs with b ounded degrees We also give

an O opt approximation for the Cluster EditingDeletion problem

In the third part wegive p olynomial time algorithms for Chain Editing and

Threshold Editing for b ounded degree graphs

In the fourth part we prove some combinatorial b ounds on the sizes of the comple

tion and deletion of a graph We give a simple example of a graph with n vertices and

O n edges whose smallest chordal sup ergraph contains O n edges We show that



n

p

has the smallest chordal sup ergraph containing a random graph Gn



n log n





has n edges and the smallest p erfect graph containing a random graph Gn





n

edges We also show that the minimum p erfect deletion set of a random graph Gn

has n edges

Outline of the thesis

In chapter we deal with NP hardness of problems Chapter describ es the approx

imation algorithms In chapter we give the p olynomial time algorithms and chapter

deals with the combinatorial results

Chapter

NP Hardness results

In this chapter wegive hardness results for various graph mo dication problems We

shall omit in the pro ofs for the arguments for memb ership in NP and the p olynomi

ality of the reductions as they are standard

Chain and Chordal Deletion

The chordal and chain deletion results were rst obtained by Sharan They

are rep eated for completeness as the chain results are the basis to many of our

subsequent reductions

We need two simple characterizations of chain graphs

Claim If a bipartite graph is a chain graph both parts of the graph have

the chain property

Two edges i j k l are independent if the graph induced on fi j k l g is K

see Figure i j

kl

Figure i j and k l are indep endent edges Solid edges are present broken

lines denote missing edges

Theorem A bipartite graph is a chain graph i it does not contain two

independent edges

Theorem Yannakakis Chain Completion is NP complete

For a bipartite graph G P Q E we dene its complement bipartite graph by

G P Q P Q n E

Lemma Abipartite graph is a chain graph i its complement bipartite graph is

a chain graph

Pro of If G P Q E isachain graph the vertices in P can b e ordered p p

k

so that

Adj p Adj p Adj p

k

but this is true i the complement bipartite graph G is a chain graph since

Adj p Adj p Adj p

k k

in G

Lemma Chain Deletion is NP complete

Pro of Reduction from Chain Completion Given a bipartite graph G P Q E

and a constant k we will form its complement bipartite graph G and use same

constant k Let F b e a set of edges we remove from G giving the graph G

P Q P Q n E F By the previous lemma G isachain graph if and only if

G P Q E F isachain graph Therefore F is a chain k completion set for G i

it is a chain k deletion set for G

For a bipartite graph G P Q E we dene a new graph C GN E

where N P P Q Q where P Q are sets of k newvertices and E

E fu v ju v P P g fu v ju v Q Q g

Lemma G is a chain graph i C G is triangulated

We omit the pro of here A pro of of a similar lemma will b e given in a next chapter

Theorem Chordal Deletion is NP complete

Pro of Reduction from Chain Deletion Given a bipartite graph G P Q E and

a constant k we build the graph C G and give the same constant k By the previous

lemma if the deletion of a set of edges from G forms a chain graph then deletion of

the same set of edges in C G givesachordal graph

Conversely supp ose F is a chordal k deletion set in C G and supp ose G

P Q E n F isnota chain graph This means that G contains two indep endent

edges namely there are p p P q q Q st p q p q E n F and

p q p q E n F There are k edgedisjointpathsbetween p and p

passing through P Hence after deleting F there is still a vertexp P connected to

p and p Similarly there is still a vertexq Q connected to q and q The cycle

p p p q q q contains an induced cycle of length at least four a contradiction

Perfect Mo dications

In this chapter weprove that completion deletion and editing are all hard for p erfect

graphs We need a key prop erty of p erfect graphs

Theorem The perfect graph theorem Lovasz A graph is perfect if

and only if its complement is perfect

We will also use the simple fact that for anyodd k C and C are not p erfect

k k

Perfect Completion

Theorem Perfect Completion is NP Hard

Pro of Reduction from Chain Completion The input is a bipartite graph G

P Q E and a constant k We use the same constant k and form the following

graph P GN E N P Q C where

jq q Q q q i k g C fv v v

q i q q i q q i q

  

and E E E E where

E fu v ju v P g

and

v v v v v q jq q Q i k g E fq v

q q q i q q i q q i q q i q q i q i

     

For an example of a graph pro duced by the reduction see Figure

Lemma If G is a chain graph then P G is a perfect graph vertices of set C

q1 q2 q3 q4 vertices of set Q

p1 p2 p3 p4 vertices of set P

Figure A part of the graph P G in the pro of of Theorem

Pro of Let H V E b e an induced subgraph of P G We will show that H

H ie the coloring number of H is equal to the size of the maximum clique in

H Let V V V suchthatV P and V Q C We distinguish two cases

Case If there is a vertex in V which is connected to all the vertices in V the

clique number of H is at least jV j this vertex and the vertices of V form a

clique

We can color H with jV j colors in the following way

with jV j colors Color the vertices of V

Color the vertices that b elong to Q with color number jV j

Color all the vertices of typ e v with color number jV j

q q i



Color all the vertices of typ es v v with color number jV j

q q i q q i

 

This is a vertex coloring of H with jV j colors therefore H H but since

for every graph GG G wehave H H

Case If no vertex in V is connected to all vertices in V the maximum clique

size of H is at least jV j Since G is a chain graph there is an order q q q of

k

the vertices of V Q such that Adj q Adj q Adj q This means that

k

there is a vertex p in V such that no vertex in V Q is connected to pWe will color

the vertices of H with jV j colors in the following way

Color the vertices of V with jV j colors

Color the vertices of V Q with the color of p

Color the vertices of typ e v with the color of p

q q i



v with any color dierent from p Color the vertices of typ es v

q q i q q i

 

j we used colors so H H If jV j weused jV j colors If jV j orjV

if the graph has edges A graph with no edges is trivially p erfect with a clique size

and coloring number on any non empty induced subgraph Therefore for every

induced subgraph H of P G H H and by denition P G is p erfect

Lemma Thereisaperfect supergraph of P G QGN E F such

that jF jk i there is a chain kcompletion set of G Moreover every perfect

kcompletion set of P G is a chain kcompletion set of G

Pro of The previous lemma shows that QG is a p erfect graph if adding F

to G forms a chain graph

Supp ose F is a p erfect kcompletion set for P G Note rst that F

Q Q since if q q F q q Q then q and q are incidenton k

cycles fq v v v q g for i k that cannot all b e destroyed

q q i q q i q q i

  

by the other edges in F LetF F P Q Assume the contrary that G

P Q E F isnotachain graph Therefore G contains a pair of indep endent edges

p q p q suchthat p p P and q q Q But this means that the graph

QG contains an induced C There are k induced C subgraphs in P G F

g fp q v v v q p j i k

q q i q q i q q i

  

Since jF j k the only way to destroy all the k cycles by adding edges

from F is by the edges p q p q orq q The rst two edges are disallowed

by the indep endence assumption and the last are ruled out by the rst note ab ove

This concludes the pro of that Perfect Completion is NP hard

Note that since the complexity of p erfect graph recognition is op en this pro of do es

not imply NP completeness

Perfect Deletion

Theorem Perfect Deletion is NP Hard

Pro of Reduction from Perfect Completion Given a graph G and a constant k we

form its complement G and use the same constant k

By the p erfect graph theorem H is a p erfect k completion set of G i H is a p erfect

k deletion set of G

Perfect Editing

Theorem Perfect Editing is NP Hard

Pro of Reduction from Chain Completion The reduction is very similar to the

reduction for Perfect Completion

Given a bipartite graph G P Q E and a constant k we use the same constant k

and form a new graph P GN E N P Q C D where

jq q Q q q i k g v v C fv

q q i q q i q q i

  

and

D fw w w jp P q P Qp q i k g

qpi qpi q pi

E E E E E where

E fu v ju v P g

and

q jq q Q i k g v v v v v E fq v

q q i q q i q q i q q i q q i q q i

     

E fp w q w p w w w w qjp q E E i k g

pq i pq i pq i q q i q q i q q i

  

In words P is made a clique anytwo Q vertices are connected by k new

disjoint paths and any edge with b oth ends in P Q is made a chord in k new

indep endent cycles For an example of the graph generated by the reduction see

Figure

Lemma If G is a chain graph then P G is a perfect graph

Pro of The pro of is along the lines of the pro of Lemma for Perfect Completion

Lemma Every perfect kediting set F of P G satises F E Moreover

F P Q is a chain k completion set for G

Pro of Let F b e a p erfect k editing set for P G and let the mo died graph b e

P G Since each edge in E E is protected by k C s the removal of suchan

edge is imp ossible In particular F E Note also that adding an edge to Q Q

is also imp ossible since it would form k C s involving Q and C vertices Let

F F P Q By the ab ove arguments F P Q n E Supp ose P Q E F is

not a chain graph Then it contains indep endent edges p q and p q But then

p q p q are part of k C s which could not b e destroyed in P G contradicting

the assumption that P G is p erfect

In conclusion there is a p erfect k editing of P G i there is a chain k completion

of G

vertices of set C

q1 q2 q3 q4 vertices of set Q

vertices of set D

p1 p2 p3 p4 vertices of set P

vertices of set D

Figure A part of the graph P G in the pro of of Theorem

The pro ofs for the p erfect mo dication problems also apply for the Berge graph

mo dication problems therefore we can conclude

Theorem Berge Graph Deletion Completion and Editing are NP hard

Comparability Editing

In this chapter we show that Comparability Editing is NP complete and prove also

a hardness approximation result for it

A cut in a graph G V E is a partition V V of V ie V V V and

V V The weight of a cut V V is the numb er of edges v w E so that

v V and w V The MaxCut problem is to decide given a graph G and an

integer k whether there is a cut whose weightisatleast k For an example of a

graph with edges and a cut of size see Figure The MaxCut problem is

NP complete

a b

c

d

Figure MaxCut In this graph the cut fcg fa b dghasweight

Theorem Comparability Editing is NP complete

Pro of Reduction from MaxCut Given a graph G V E and a constant k

we use the constant k jE j k and form a new graph C GN E where

N V E W Q and

v

jv V i k g W fw

i

jp q E g Q fe

pq

and E E E where

v v

E fv w jv V w W i k g

i i

e e wjv w V v w E g e E fv e

vw vw

vw vw

for eachv w E the choice of which one of w v to connect to e is arbitrary

vw

Here we used e to denote the vertex in C G corresp onding to the edge a binG

ab

In other words we attach to each original vertex in G k private neighb ors and

replace each original edge by a path of length For an example of a graph obtained

by the reduction see Figure

abcd

1 edc e 1 1 bc e 1 ab eac

eeeeab ac bc dc

Figure The graph CG built by the reduction applied to the graph of Figure

with k

We shall prove that there is a comparability k editing H N RofC Gi

there is a cut of weight k in G

Supp ose V V is a cut of weight k in GFor each noncut edge

e v w V V V V E

we remove one edge of the corresp onding path in C G For the edge e v wwe

remove the edge e e We removed a total of jE j k edges from C G Let

vw

vw

the remaining graph b e CoGN E We will build a transitive orientation of

that graph which implies that this graph is a comparability graph Orienteach arc

incidenton v V resp v V out of v resp into v Orient the remaining edges

along the paths to resp ect transitivity The only orientation conicts may arise if a

path is intact no edge was deleted in it But then since the vertices at the ends

of the path b elong to dierent parts the orientations along that path are transitive

For an example see Figure

abcd 1 ead 1 ebc

e11e ab ac

eeee

ab ac bc ad

Figure Orientation of the graph after removal of one edge

Supp ose H N R is a comparability graph obtained by k editing of

C G Let F b e a transitive orientation of H

Lemma For a vertex v V either al l arcs in F are incoming into v or they

are al l outgoing from v

Pro of Supp ose H was formed by deleting l edges and adding l edges in C G By

assumption l l k

i i i i

Let W w W jvw F W w W jw v F Assume that W

v v v v

i

and W Since there are k of the typ e v w inC G and we deleted at most

v

k l k l k l

  

l edges either jW j or jW j Wlog assume jW j

and let u W Since uv F and for all w W vw F by transitivity of the

k l k l

 

j edges to H But l j j k a orientation we added at least j

contradiction

Dene now a partition V V of V by letting V contain all the vertices incidenton

incoming arcs in F We shall prove that there are at least k edges in the cut V V

For every edge v w E let us lo ok at the set of three edges in the corresp onding

path

fv e e e e wg

vw vw

vw vw

There are jE j such pairwise disjoint edge sets Since we mo died at most jE j k

edges from E there are at least k such paths that remained untouched ie with no

change to the graph induced by the path of vertices For each such path the edge

w are in the same side of the cut v wmustinthecutV V Otherwise v and

whichwould form a contradiction to transitivity

Note that the same reduction ab ove with a very similar pro of shows that Com

parability Completion is NP hard a fact that was already shown in The same

reduction also shows that Comparability Deletion is NP hard whichwas already

shown byYannakakis

Wenow show that the problem is also hard for approximation

Lemma If Comparability Editing can be approximatedinpolynomial time with

ratio then MAXCUT can be approximated within a factor of

Pro of Supp ose there is a p olynomial approximation algorithm A for Comparability

Editing that guarantees an approximation ratio of for some To approx

imate MaxCut on a graph G V E apply to G the reduction in Theorem

with k set to zero and obtain the graph C G Apply algorithm A to C G and obtain

an edited graph C G that has a transitive orientation F

We argue rst that for every v V G either all arcs in F are oriented into v

or all are oriented out of v For pro of observe that a maximum cut in G has value

jE j

since every graph has a cut with at least half its edges A minimum editing k

in C G has size k at most jE j delete one edge in each path so the approximation

algorithm gives a solution with value at most jE j The same argument as in Lemma

proves the claim

Dene a partition V V of V by letting V contain all vertices with incoming

arcs according to F The numb er of deleted edges in C Gisatmost k so

at least jE j k paths corresp onding to E edges remain unchanged and

the corresp onding cut in G has cardinality at least jE j k By the pro of

opt opt

of Theorem the value of the maximum cut in G k satises k jE jk

opt

jE j k

This implies that the approximation ratio of our solution is at least

opt

k

jE j jE j

opt

where the inequality follows since k

opt

k

Since approximation of MaxCut to within a factor of is NP hard we

conclude

Theorem It is NP hard to approximate Comparability Editing to within a fac

tor of

As the same reduction ab ove applies to Comparability Completion and Deletion

we also obtain the following result

Theorem It is NP hard to approximate Comparability Completion and Com

parability Deletion to within a factor of

Cluster Deletion

The Cluster Completion problem is trivially p olynomial we simply have to transform

every connected comp onentinto a clique

Theorem Cluster Deletion is NP complete

Pro of Reduction from Clique In the Clique problem given an input G V E

and a constant k the question is whether G contains an induced k clique This

problem is NP hard Let jV j nWe build a new graph H V WE F

jW j n and F fv wjv V Ww W g In other words we add to G an

n

n clique and fully connect it with V We set the constanttok n n k

Let K b e a clique of size k in G Deleting the edges D fv w E F jv

k

n k

V n K w V W gwe get a cluster graph and jD jn n k k

k

Let H b e a cluster graph obtained by deleting k edges from H where

k k Let R b e the subgraph of H induced byWLetXbea maximum clique in

R If jX jn n then at least nn n k edges were deleted from the induced

subgraph H SojX j n n Let jX j n t tn If all the cliques in G are

W

of size k or less then at least n k vertices do not b elong to same clique as

X in H Therefore we deleted at least

n k n ttn t n k n n k

edges a contradiction

Cograph Deletion and Completion

Theorem Cograph Deletion is NP complete

Pro of By reduction from Chain Deletion Let GP Q E k b e an instance

of Chain Deletion Build the following instance G V E k of Cograph

Deletion Dene V P Q W where W fw w g Dene E E

k

fp p jp p P W g

Let F b e a solution to the chain instance If G V E F is not a cograph

G contains an induced P v v v v Since P W is a clique and Q is an

v v Q and v v P W This means that v v and indep endent set

v v are two indep endent edges a contradiction

If F is a solution to the cograph instance and F E is not a solution to the chain

instance then there exist a pair of indep endent edges e e in P Q E n F

Let e p q e p q p p P q q QIfwe did not remove

the edge p p fromG q p p q is an induced P Otherwise for some

i k we did not remove the edges w p w p since we removed

i i

at most k edges and thus q p w p q is an induced P in V E n F a

contradiction

The complement of a cograph is a cograph therefore we conclude

Theorem Cograph Completion is NP complete

Split Graphs Deletion and Completion

Theorem Split Deletion in NP complete

Pro of Reduction from CLIQUE Let GV Ek b e an instance of

CLIQUE Build the following instance G V E k n n k



of Split Deletion Dene V V W where W fw w g and dene

n

E E V W

If G has a clique K of size at least k then denote K K fw g and partition V

into K V n K The numb er of edges that should b e deleted from G so that it

nk

b ecomes a split graph with resp ect to this partition is at most n n k

n n k

Assume that G has a k deletion set resulting in a split partition K I If

jK V j k then at least n n k k edges in V n K W n K

should have b een deleted from G jW n K jn since W is an indep endent set

and K is a clique jV n K jn k since jV K jk a contradiction

The complement of a split graph is a split graph therefore we conclude

Theorem Split Completion is NP complete

Chapter

Approximation Algorithms

Mo dication problems on b ounded degree graphs

for families characterized by a nite set of for

bidden induced subgraphs

We present b elow a p olynomial time constant factor approximation algorithm for the

Deletion and Editing problems on b ounded degree graphs The result applies to any

hereditary family that can b e characterized by a nite set of forbidden induced sub

graphs These include among others cographs trivially p erfect graphs split graphs

threshold graphs Meyniel graphs clawfree graphs sup erfragile graphs maxibrittle

and sup erbrittle graphs cf Note that for some of these families the complexity

of Editing and Deletion is op en and they may in fact b e p olynomial An analogous

result for vertex deletion problems was given byLundandYannakakis

Let b e an hereditary graph prop erty that can b e characterized by a nite set

F of forbidden induced subgraphs Let G V E b e the input graph We assume

that each forbidden subgraph contains at most t vertices and that G has maximum

degree at most d

Edge Deletion approximation algorithm

We distinguish three cases

Case No forbidden subgraph contains an isolated vertex We use V Gto

denote the vertex set of the graph G The approximation algorithm follows

Algorithm AG F

A

While G contains an induced subgraph H isomorphic to some F F do

V nA

A A V H

Remove all edges fv w E v A w V g from G

The algorithm is clearly p olynomial since a forbidden induced subgraph with t

t

vertices can b e found in O n time

Theorem The algorithm achieves an approximation ratio of td

Pro of Correctness After Step is completed G contains no forbidden induced

V nA

subgraph After Step is completed all vertices in A b ecome isolated Since no

forbidden induced subgraph contains an isolated vertex up on termination of the

algorithm the remaining subgraph has prop erty

Approximation ratio Let F b e an optimum solution set of size k For any

forbidden induced subgraph H found at Step of the algorithm F must contain an

edge incidenton H Hence at the end of the algorithm jAj kt and at most ktd

edges are deleted from G

Case The set F contains graphs with isolated vertices but no forbidden sub

graph is an indep endent set In case n dt the problem is solved exactly by

exhaustive search Otherwise we dene a new set of forbidden graphs F by remov

ing the isolated vertices from the original forbidden graphs Algorithm A is then

applied to the input G F

Let F b e a forbidden graph with some isolated vertices Let F b e the graph after

removing the isolated vertices

Lemma Let G beagraph with ndt vertices and boundeddegree d Either G

contains an inducedcopy of F or it does not contain an inducedcopy of F

Pro of Assume the contrary Let H b e an induced copyofF LetS denote the set

of vertices not adjacenttoanyvertex in H Since jH j t jS j n t ddt

Since the graph induced on S also has degree b ounded by d it contains an indep endent

set S of size t Hence S H contains an induced copyof F

Theorem The algorithm achieves an approximation ratio of td

Pro of Let G b e a solution to the edge deletion problem where jV G j dt By the

previous lemma G cannot contain a forbidden subgraph in F therefore by Theorem

the algorithm achieves an approximation ratio of td

Case The set F contain a graph which is an indep endent set If n d t

the problem is solved exactly by exhaustive search Otherwise there is no solution

for the deletion problem since a graph with d t vertices and b ounded degree d

contains an induced indep endent set of size t

Edge Editing approximation algorithm

Again we distinguish three cases

Case No forbidden subgraph contains an isolated vertex In this case we use

Algorithm AG F

Theorem The algorithm achieves an approximation ratio of td

The pro of is along the lines of the Pro of

Case The set F contains graphs with isolated vertices but no forbidden sub

graph is an indep endentsetWe dene a new set of forbidden graphs F byremoving

the isolated vertices from the original forbidden graphs The approximation algo

rithm solves the problem exactly by exhaustive searchifn dt and otherwise

applies Algorithm AG F In the analysis we concentrate on the situation when

ndt

Lemma If the minimum editing graph contains a copy of F F the minimum

ndt

editing size is at least

b e a minimum editing graph Supp ose Pro of Let G b e the original graph and let G

G contains an induced copy H of F Let N H fv G n H ju v E for some

G

v H g Then jH jt andjN H jt d Let S V n H N H

G G

Then jS j n t t d n t dn td In G at least

S

jS jtd vertices have new edges incident on them or else G contains a copyof F

G contains an indep endent set of size t since there are at least td vertices in S

with no added edges and a graph with td vertices and degree b ounded by d must

ntd

have an indep endent set of size t This implies that at least edges were

added

Theorem The algorithm achieves an approximation ratio of td

Pro of Let G b e the original graph and let G b e a minimum editing graph If G

contains no copyof F F by Theorem the approximation algorithm achieves an

ntd

approximation ratio of td Otherwise by the previous lemma at least edges must

b e edited in G Since Algorithm A only deletes edges the numb er of edges it deleted

nd

Hence the approximation is no more than all the edges in G which is at most

nd

nd d

td ratio is at most

ndt n

Case The set F contains a graph which is an indep endent set We rst claim

that in that case F cannot contain also another F that is a clique b ecause by Ram

seys Theorem cf a graph with no indep endent set of size l and no clique of size

k has at most a constantnumber of vertices For a graph smaller than that numb er

the problem is solved exactly otherwise the editing problem approximation solution

is to add all the edges ie to turn G into a clique

n

l

Lemma Agraph with no independent set of size l has at least l edges

Pro of Supp ose the largest indep endent set in any induced subgraph of G V E

is l Consider a largest indep endent set S of G By the maximalityofS eachof

the vertices of V n S is connected to a vertex is S Therefore there are at least

n l edges b etween S and V n S The graph G is an induced subgraph of G

V nS

therefore the same argument applies By induction the numb er of edges in G is at

n

n

P

l

l

least li l

i

n

l

edges and no indep endent set of size l is An example of a graph with l

n

l K

l

Theorem The algorithm achieves an approximation ratio of t

Pro of Let G b e the original graph Let G b e a solution to the minimum editing

n n

nd

t t

edges ie at least t problem By the previous lemma G contains at least t

n

edges were added Since the approximation algorithm adds at most edges it

achieves a ratio of t The algorithm generates a feasible solution since F cannot

contain a clique if n is large enough

We can now summarize the results of the two previous subsections

Theorem For the Deletion and Editing problems with respect to any graph prop

erty that is characterized by a nite set of forbidden inducedgraphs there exist a

polynomial time constant factor approximation algorithm on boundeddegreegraphs

Cluster Deletion and Editing

In this chapter we present approximation algorithms for Cluster Deletion and Cluster

Editing The family H of cluster graphs is characterized by the forbidden induced

subgraph P ie a path of length two Such a path can b e found in linear time using

breadth rst search

Cluster Editing

The algorithm for the editing problem is describ ed b elow The input is a graph

G V E

A

While there exist W V that induces a P set V V n WA A W

Denote by V the set V after Step is complete G is a cluster graph For

V

on G and determine the each x A compute a minimum editing set E

x

V fxg

clique in G asso ciated with x by this editing

V

Perform the editing for each x A indep endently

For each x y A if x y are adjacent and asso ciated with dierent cliques

removex y If x y are nonadjacent and asso ciated with the same clique add

x y

Claim For each x A a minimum edit set E can be found in polynomial time

x

in Step Moreover there is such a set that includes only edges incident on x

Pro of Let C C b e the cliques in an optimal editing of G Wlog assume

l

V fxg

that x C Let C C nfxgIfC contains vertices of two or more cliques in G

V

then C A A such that the vertices of A b elong to one clique and the vertices

of A b elong to other cliques Removing the edges we added b etween the vertices of

A we get a cluster graph with A and A and removing the edges b etween x and

no more edits we did at most jA j new edits and canceled at least jA jjA j added

edges Therefore we can assume that C contains vertices of only one clique in G

V

In the same waywe can show that C can contain all the vertices of that clique ie

we can assume C is a clique in G Since we can assume that C is a clique in G

V V

we can also assume that C C are cliques in G b ecause changing them will

l

V fxg

only increase the numb er of edits

Computing a minimum editing set E can b e done by calculating for each clique

x

in G the numb er of edits needed to b e done if we add x to it ie for each clique we

V

ha ve to add the missing edges b etween x and the clique and delete the edges b etween

x and other cliques This can b e done in linear time for all x A

For x Aletk jE j as dened in Step of the algorithm Dene k

x x

Maxfk jx Ag Let t b e the size of a minimum editing set for G

x

jAj

Claim t Maxf k g

Pro of Let y ar g maxfk jx Ag Since the subgraph induced by G must b e

x

V fy g

a cluster graph at least k edit op erations are needed Since on any induced P in A

jAj

an edit op eration must b e p erformed it follows that t

jAj

Claim The algorithm performs at most jAjk edge edit operations

jAj

Pro of Step requires at most k edits p er vertex in A Step requires at most

edits

Theorem The algorithm nds a cluster graph and guarantees an approximation

ratio of t

jAj

jAj

Pro of The algorithm do es at most jAjk edits Since t Maxf k g we

jAj jAj jAj

jAjk



jAjk

  



obtain the ratio jAj jAj t

t t t jAj

Cluster Deletion

The algorithm for the deletion problem is given b elow The input is a graph G

V E

A

While there exist W V that induces a P set V V n WA A W

Denote by V the set V after Step is complete For each x A compute

a minimum deletion set E on G and determine the clique C in G

x x

V fxg V

asso ciated with x by this deletion

Perform the deletion for each x A indep endently

For each x y A if x y are adjacent and asso ciated with dierent cliques remove

x y If x y are nonadjacent and asso ciated with the same clique remove all

edges incident on either x or y

Claim For each x A a minimum deletion set E can be found in polynomial

x

time in Step Moreover there is such a set that includes only edges incident on x

The pro of is along the lines of the pro of of Claim

For x Aletk jE j as dened in step of the algorithm Dene k

x x

Maxfk jx Ag A clique C is called split if there are x y A x y E and C

x x x

C In that case x and y are called splitting vertices for C Let k MaxfjC jjC

y x x x

is a split clique gLett b e the minimum numb er of edge deletion op erations needed

on G

jAj

Claim t Maxf k k g

Pro of Let C be a maximum split clique and let x y b e splitting vertices for C

x x

Since the induced subgraph G must b e transformed to a cluster graph at least

C fxy g

x

k edges must b e deleted The rest of the pro of is identical to Claim

Theorem The algorithm nds a cluster graph with guaranteed approximation

ratio of t

Pro of In Step at most k deletions are needed p er vertex in A Step requires

jAj

at most deletions b etween vertices of A and at most jAjk more deletions for

jAj

splitting vertices Therefore the algorithm do es at most jAjk k deletions

jAj

k



k k g the ratio is at most t jAj t t t Since t Maxf

t

Chapter

Polynomial Results On Bounded

Degree Graphs

We present b elow p olynomial algorithms for Chain Editing and Threshold Editing on

b ounded degree graphs

Lemma Let be a hereditary graph property such that if G V E satises

G fv g satises ie the property remains satisedifweremove al l the

V nfv g

edges incident on a vertex v Any optimal solution of Editing on a graph with

boundeddegree d is a graph with degreeboundedby d

Pro of Let G V E b e an instance to the Editing problem Assume that a

minimum Editing of G contains a vertex w with degree at least d This means

we added at least d edges incidenton w but if instead weremoved all the edges

incidenton w wewould have p erformed less edits and the resulting graph has the

prop ertyacontradiction

Lemma A chain graph G P Q E with boundeddegree d contains at most

d vertices with degreeatleast one

Pro of Let W b e the set of vertices with degree at least one Let P P W and let

Q Q W There is a vertex q Q q fAdj pjp P g such that q has degree

jP j Similarly there is a vertex p P with degree jQ j If jP Qj d either

jP j d or jQ j d so there is a vertex with degree at least d a contradiction

Theorem Chain Editing on boundeddegreegraphs is polynomial

Pro of Let G V E b e an instance of Chain Editing and let H V E b e the

graph obtained byaminimum chain editing of GIfG has degrees b ounded by d

by Lemma H has degree b ounded byd By Lemma there are at most d

vertices with degree at most one in H The p olynomial algorithm will enumerate all

sets of at most d vertices and for each such set it will solve exactly the Chain Editing

problem The value of each solution is the optimal value of the editing problem on

S plus the numb er of edges not incidenton S at b oth endp oints The b est value

d

obtained is the resulting solution The algorithms runs in O n time

Theorem Threshold Editing on boundeddegreegraphs is polynomial

Pro of T is a threshold graph if T K I E such that K induces a clique I

induces an indep endent set and the graph K I E K I is a chain graph Let

G V E b e an instance of Threshold Editing and let H K I E b e the graph

obtained by a minimum threshold editing of GIfG has degrees b ounded by dby

Lemma H has degree b ounded byd H containsavertex in K which is adjacent

to all nonisolated vertices in K I Since its degree is at most d there are at most

d nonisolated vertices in H The algorithm will enumerate the set K and the

vertices with degree at least one in I For each such set of at most d vertices the

d

algorithm calculates the minimum threshold editing The algorithm runs in O n

time

Chapter

Combinatorial b ounds on the size of

minimum completion sets

A natural combinatorial question arises when dealing with completion problems How

large can the completion set size b e compared to the original graph size As usual

for a graph G V Ewe denote jV j n jE j mFor a family of graphs F dene

m nmax minfjE j GV E E Fg In this notation our question is

F G

How large can m n b e as a function of m and n

F

A concrete example for Chordal Completion

Let F b e the family of chordal graphs We will build a graph G V E where





and m n n m n

F

Let G V WE b e the following bipartite graph V v v v W

n

w w w and E fe v w j i ngInwords G is a p erfect matching

n i i i

on a nvertex graph or nK

Lemma The smal lest chain graph containing G has n edges

Pro of Let H V W E b e the smallest chain graph containing G Wlog we can

assume that the bipartition of H is also V W Since H has the chain prop erty

the vertices of V can b e ordered v v v so that Adj v Adj v Adj v

n n

Wlog vertex numbers v v v satisfy this prop erty in the graph H Since v w

n i i

E wehave w Adj v Since Adj v Adj v ifji it follows that Adj v

i i i j j

w w Therefore jAdj v jn i In fact jAdj v j n i is a minimal

j n i i

P

nn

n

chain graph containing G and jE j n i n

i

For a bipartite graph G P Q E we dene a new graph C GN E

where N P Q E E fu v jv P gfu v j u v Qg ie we replace the

indep endent set on each part by a clique See Figure

Figure C G for the graph G K

We shall need the following lemma due to Yannakakis

Lemma G is a chain graph i C G is triangulated

We are now ready to construct our concrete family of examples Let G V E

n

W where b e the following graph V

i

i

n

W fw w w g

i

i i i

and

j

k i i

E fw w j i j k ng fw w j i j k ng

i

i j k

ie the graph

G W E W W

i i i i

is a clique and the subgraph

ij

G W W E W W

i j i j

is isomorphic to nK

The graph G V Ehas n vertices and

n

jE j n n n

For an example of the graph we build see Figure

w1,3 w2,3 w1,1 w2,1

w1,2 w2,2

w3,2 w3,3

w3,1

Figure the graph we construct in for n

Claim A smal lest chordal completion set of G contains n edges

Pro of Let H V E b e a smallest chordal graph containing G Since chordality

is hereditaryevery subgraph of H is chordal Let us lo ok at the subgraph H of

ij

ij

H induced by W W H is a chordal sup ergraph of G therefore by Lemma

i j ij

ij ij

H W W E W W is a chain sup ergraph of G By Lemma

i j i j

n

n edges must have b een added b etween W and W Since there are dierent

i j

subgraphs fH ji j g andineach such subgraph dierent edges must b e added

ij

n edges were added

We note that Kaplan and Szegedi have recently argued that for certain ex

pander graphs m nand m n n In contrast our example is elemen

F

tary and do es not rely on the deep expander theory

Random graphs

In this chapter we study probabilistically the size of a minimum completion set in

some random graphs For background on random graphs see We start with the

denition of a random graph

Let n b e a p ositiveinteger and let p A random graph Gn pisa

probability space over the set of graphs on the vertex set fng determined by

Pri j G p

indep endently for each pair i j

We give b elow some basic b ounds on large deviations which will b e used in this

chapter The results are based on the seminal pap er of H Cherno

Theorem Cherno bounds Let p X X be mutual ly inde

n

pendent random variables with Pr X p and PrX p and dene

i i

X X X Then

n

a a

aln apnln

pn pn

PrX np a e



a

n

n

If p then PrjX j a e

pn

we have a simple bound When p o a O pn and a

pn

pn



PrX np a PrX np e

Theorem Chebychev Inequality For any positive

q

Pr jX E X j VarX

An event A on a family of random graphs Gn p happ ens with high probability if

lim PrA

n

The minimum chordal completion of a random graph

p

In this chapter we study the random graph Gn We shall show that with high

n

probability G has O n edges but the smallest chordal graph that contains it has



n

edges



log n

p

Claim With high probability the random graph Gn has O n edges

n

Pro of Let m denote the random variable of the numb er of edges in G Dene a

random variable e for each edge i where e ifi E and e otherwise

i i i

n

P



p

Then m e Since the variables are indep endent and Pre

i i

i

n

n

p

E m By Theorem taking say a n PrmEmn

n



n

n

p



n e Prm So with high probability m O n

n

Claim Let X be the number of labeled cycles in G not necessarily induced

With high probability X n

n

Pro of The numb er of p ossible cycles in a graph is wehavetocho ose the

four vertices of the cycle and order them For every p ossible cycles we will dene

n

with X if that cycle is presentinG and X an indicator X i

i i i

P

otherwise Dene X X By linearity of exp ectation wehave

i

n

E X p

ie E X n

X X

CovX X VarX VarX

i j i

i j

since X is an indicator

i

VarX p p p E X

i i i i i

wehave

X

VarX E X CovX X

i j

i j

X X are not indep endent For indices i j we shall write i j if i j and the events

i j

If i j

CovX X E X X E X E X E X X PrX X

i j i j i j i j i j

and therefore

X

VarX E X PrX X

i j

ij

In our case i j if X and X are dierent and have common edges X and X may

i j i j

have one or two common edges There are O n sets with one common edge and the

probability for each such set is p and there are O n sets with two common edges

and the probabilityforeach such set is p

Therefore

VarX O n O n p O n p O n

p

Using Chebychev Inequality Theorem with n

q

p

Pr jX E X j n VarX

n

Hence with probability larger than

n

q

jX E X j nV ar X

Since VarX O n for n suciently large VarX Cn for some constant

C Since E X n for n suciently large An EX Bn for some

q

p

constants A B Since whp jX E X j nV ar X nC n it follows that

p p

for n suciently large An Cn X Bn Cn Therefore whp

X n

Let i j k V Wesay that k is a common neighbor of i and j if k i E and

k j E

Claim Let i j V and let Z be the number of their common neighbors With

high probability Zlnn

Pro of Wlog let i j For k n let Z if i k E and j k E

k

P

n

Z we will add more indep endent dummy and Z otherwise Hence Z

i k

i

variables Z Z to simplify the calculations and PrZ Therefore since

k

n

np by Theorem

lnnlnlnnlnnlnlnn lnn

PrZ lnn e e

n

The last inequality is true since lnn lnlnn lnnln lnn

lnn lnnln lnn lnniflnlnn ie if n

Claim With high probability no pair of vertices i j V has more than lnn

common neighbors

Pro of From the previous claim the probability for a pair i j to have more than

n

Since there are pairs the probability lnn common neighb ors is less than



n

n

that at least one of them has more than lnn such neighb ors is less than



n

ie the probability for this event is less than

n

Corollary With high probability no pair of vertices i j is incident on more

than lnn common cycles such that i j are not neighbors in the cycle

p

Theorem With high probability the random graph Gn has O n edges

n



n

and its smal lest chordal supergraph has O edges



log n

Pro of Let us lo ok at the graph G obtained by minimum chordal completion of a

p

random graph Gn Every cycle in G must havea chord By Claim whp

n

p

Gn hasn cycles not necessarily induced and by Corollary whp

n



n

an edge can b e a chord of at most ln n cycles Therefore G has edges



ln n



n

edges so wemust have added



log n

The minimum p erfect graph containing a random graph

In this chapter we study the random graph Gn We shall show that with high



n

probability G has O n edges but the smallest p erfect graph that contains it has

n edges

Claim For a random graph Gn with high probability m O n



n

The pro of is analogous to that of Claim

Claim With high probability the number of labeled cycles in G is n

Pro of The pro of is along the lines of Claim The numb er of p ossible lab eled

n

wehavetocho ose the vertices of the cycle and order cycles in a graph is

n

them For every cycle we will dene an indicator X i with X

i i

P

if that cycle is presentinG and X otherwise Dene X X By linearity

i i

of exp ectation wehave E X n As shown in Claim VarX E X

P

PrX X where X X if X and X refer to dierent cycles and have

i j i j i j

ij

common edges X and X mayhave one two or three common edges There are

i j

O n sets with one common edge and the probability for eachsuchsetisp there

wo common edges and the probability for eachsuch set is p and are O n sets with t

there are O n sets with three common edges and the probability for each such set

is p

Therefore

VarX O n O n p O n p O n p O n

Again using Equation since E X n then with probability larger than

q

nV ar X O n ie X n jX E X j

n

Claim With high probability the degree of every vertex in G is at most n

Pro of For i n let X if i E and X otherwise Dene

i i

P

n

X X by Theorem wehave

i

i



n



PrX n n e

n

the last inequality is true for large enough n Therefore the probability that there

is a vertex with degree higher than n is less than

n

Claim With high probability thereisnopair of vertices i j V for which there

exist n pairs of vertices k l i j such that i k k l l j E

Pro of For a pair of vertices i j let us lo ok at the set B Adj i Adj j n

ij

fi j g Since with high probability the degree of every vertex is less than n with

high probability jB j n for all i j Wenow argue that the probability that

ij

jE G To see this let us lo ok at the graph Gt j n is less than

B



ij

n

t

n n With probability larger than G has n n



n

edges the pro of is similar to that of Lemma The number of pairs k l such

that i k k l l j E is less than the numb er of edges in G therefore with

B

ij

probability less than there is a pair i j with n such paths

n

Claim Let X be the number common neighbors vertices of vertices i and j in

ij

V With high probability X for al l i j

ij

P

k l

Pro of Let X ifi k E and j k E and dene X X

ij

ij ij

n n

X X

n

i ni i i

PrX C n n n n

ij

i

iC iC

n

X

i C

n n n

n

iC

Therefore with high probability for all i j is X C

ij

Claim With high probability thereisnopair of vertices i j V that arecon

tainedinn cycles such that i j are not neighbors in those cycles

Pro of Supp ose i j are contained in a cycle such that i j are not neighb ors in

this cycle Then there exist three vertices k l m such that i k j k E and

i l l m j m E By the previous claims there are at most C such k s

and O n such pairs of l m Therefore there are O n such cycles

Theorem For the random graph Gn n with high probability the smal lest

perfect graph containing G has n edges

Pro of Consider a minimal p erfect graph G containing GEvery cycle in G has

achord From Claim G has n cycles not necessarily induced and from



n

Claim an edge can b e a chord of O n cycles Therefore G has



n

n edges so we added n edges

The minimum p erfect deletion in a random graph

We shall show that with high In this chapter we study the random graph Gn

probability G has n edges and the minimum p erfect deletion set also contains

n edges Let m be the numb er of edges in G

Claim For the random graph Gn with high probability m n

Pro of Dene a random variable e for each edge e ifi E and e

i i i

n

P



otherwise Then m e Since the variables are indep endent and pe

i i

i



n

n

p n



n



by Theorem Prjm j n n e e Therefore with high

n n

 

probability n m n

Claim Let X be the number of induced cycles in G With high probability

X n

Pro of We assign an indicator X for every set of vertices Let X if the graph

i i

induced by these vevertices is an induced cycle and X otherwise It can b e

i

easily checked that PrX

i



n

P

n



X and by linearity of exp ectation wehave E X n X

i

i

X

VarX E X PrX X

i j

ij

In our case i j if X and X are dierentandhave at least common vertices There

i j

are O n sets with two common vertices O n sets with three common vertices

and O n set with four common vertices The probability PrX X is less than

i j

therefore VarX O n Using Equation since E X n with

q

nV ar X O n ie X n probability larger than jX E X j

n

Claim Any edge in E belongs to O n induced cycles

Pro of Let e E Each cycle containing e as an edge contains only more

i i

vertices and there is a constantnumb er of p ossible cycles on eachsetofvertices

has n edges and Theorem With high probability the random graph Gn

minimum perfect deletion set of size n

Pro of Let us lo ok at a minimum p erfect deletion set of the random graph Gn

Wehave to remove an edge from every induced cycle Since there are n induced

cycles and each edge can participate in at most O n induced cycles wehaveto



n

n edges remove



n

Bibliography

A Agrawal P Klein and R Ravi Ordering problems approximated single

pro cessor scheduling and interval graph completion In Proc ICALP pages

Springer LNCS

A Agrawal P Klein and R Ravi Cutting down on ll using nested dissec

tion provably go o d elimination orderings In A George J R Gilb ert and

J W H Liu editors Graph Theory and Sparse Matrix Computation pages

Springer

N Alon J Sp encer and P Erdos The Probabilistic MethodWiley New York

A BenDor Private communication

Amir BenDor and Zohar Yakhini Clustering gene expression patterns In Proc

RECOMB

H L Bo dlaender M R Fellows and T J Warnow Twostrikes against p erfect

phylogeny InWKuich editor Proc th ICALP pages Berlin

Springer Lecture Notes in Computer Science Vol

HL Bo dlaender and B de Fluiter On intervalizing kcolored graphs for DNA

physical mapping Discrete Applied Math

B Bollobas Random Graphs Academic Press Chapter XI I

Andreas Brandstadt Van Bang Le and JeremyP Spinrad Graph Classes A

Survey Siam Monographs on Discrete Mathematics and Applications

L Cai Fixedparameter tractability of graph mo dication problems for heredi

tary prop erties Information Processing Letters

H Cherno A measure of the asymptotic eciency for tests of a hyp othesis

based on the sum of observations Annals of Mathematical Statistics

K Cirino S Muthukrishnan NS Narayanaswamy and H Ramesh Graph

editing to bipartite interval graphs exact and asymptotic b ounds Technical

rep ort Bell Lab oratories Innovations LucentTechnologies

M R Fellows M T Hallet and H T Wareham DNA physical mapping Three

ways dicult In Proc First European Symp on Algorithms ESA pages

Springer LNCS

M R Garey and D S Johnson Computers and Intractability A Guide to the

Theory of NPCompletenessWHFreeman and Co San Francisco

M R Garey D S Johnson and L Sto ckmeyer Theor Comput Sci

P W Goldb erg M C Golumbic H Kaplan and R Shamir Four strikes

against physical mapping of DNA Journal of Computational Biology

Algorithmic Graph Theory and Perfect Graphs Academic M C Golumbic

Press New York

M C Golumbic H Kaplan and R Shamir On the complexity of DNA physical

mapping Advances in Applied Mathematics

M C Golumbic H Kaplan and R Shamir Graph sandwich problems Journal

of Algorithms

M C Golumbic and R Shamir Complexity and algorithms for reasoning ab out

time A graphtheoretic approach J ACM

S Louis Hakimi Edward F Schmeichel and Neal E Young Orienting graphs to

optimize reachability Information Processing Letters September

P L Hammer T Ibaraki and U N Peled Threshold numb ers and threshold

completions In P Hansen editor Studies on Graphs and Discrete Programming

pages NorthHolland

P L Hammer and B Simeone The splittance of a graph Combinatorica

J Hastad Some optimal inapproximability results In Proc th STOC pages

Full version available as ECCC Rep ort numb er TR

H Kaplan and R Shamir Bounded degree interval sandwich problems Algo

rithmica

H Kaplan R Shamir and R E Tarjan Tractability of parameterized comple

tion problems on chordal strongly chordal and prop er interval graphs SIAM

Journal on Computing

H Kaplan and M Szegedi Private communication

Richard M Karp Reducibility among combinatorial problems In R E Miller

and J W Thatcher editors Complexity of Computer Computations pages

New York Plenum Press

T Kashiwabara and T Fujisawa An NPcomplete problem on interval graphs

In IEEE International Symposium on Circuits and Systems th pages

Institute of Electrical and Electronics Engineers Piscataway NJ

JM Lewis and M Yannakakis The no de deletion problem for hereditary prop

erties is NPcomplete J of Computer and System Sciences

L Lovasz A characterization of p erfect graphs J Combin Theory pages

C Lund and M Yannakakis The approximation of maximum subgraph prob

lems In A Lingas R Karlsson and S Carlsson editors Proceedings of Inter

national ConferenceonAutomata Languages and Programming ICALP

pages Berlin Germany July Springer LNCS

NVR Mahadev and UN Peled Threshold Graphs and RelatedTopics Elsevier

NorthHolland Annals of Discrete Mathematics Vol

F R McMorris T J Warnow and T Wimer Triangulating vertex colored

graphs SIAM J Discrete Math

A Natanzon R Shamir and R Sharan A p olynomial approximation algorithm

for the minimum llin problem In Proceedings of the th Annual ACM Sympo

sium on Theory of Computing STOC pages New York May

ACM Press

A Natanzon R Shamir and R Sharan Complexity classication of some edge

mo dication problems In Proc th Int Workshop WG GraphTheoretic

Concepts in Computer Science SpringerVerlag

J D Rose A graphtheoretic study of the numerical solution of sparse p ositive

denite systems of linear equations In R C Reed editor Graph Theory and

Computing pages Academic Press NY

R Sharan Private communication

PLTchebyshef Des valeurs moyennes Journal de Mathematiques pures et

appliquees pages

L Trevisan GB Sorkin M Sudan and DP Williamson Gadgets approxi

mation and linear programming In Proc IEEE Symposium on Foundations of

Computer ScienceFOCS pages

M Yannakakis No de and edge deletion NPcomplete problems In Proc th

STOC pages

M Yannakakis Computing the minimum llin is NPcomplete SIAM J Alg

Disc Meth