TREE-LIKE STRUCTURE IN GRAPHS AND EMBEDABILITY TO TREES

A dissertation submitted to Kent State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

by

Muad Mustafa Abu-Ata

May 2014

Dissertation written by

Muad Mustafa Abu-Ata

B.S., Yarmouk University, 2000

M.Sc., Yarmouk University, 2003

Ph.D., Kent State University, 2014

Approved by

Dr. Feodor F. Dragan, Chair, Doctoral Dissertation Committee

Dr. Ruoming Jin, Dr. Ye Zhao, Dr. Artem Zvavitch, Members, Doctoral Dissertation Committee

Accepted by

Dr. Javed Khan, Chair, Department of

Dr. James L. Blank, Dean, College of Arts and Sciences

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
Acknowledgements
Dedication
1 Introduction
    1.1 Research contribution
    1.2 Publication notes
    1.3 Preliminaries and Notations
        1.3.1 Tree-decomposition
    1.4 Related work
        1.4.1 Low distortion embedding
        1.4.2 Embedding into a metric of a (weighted) tree
        1.4.3 Tree spanners
        1.4.4 Sparse spanners
        1.4.5 Collective tree spanners
        1.4.6 Spanners with bounded tree-width
2 Metric tree-like structures in real-life networks: an empirical study
    2.1 Introduction
    2.2 Datasets
    2.3 Layering Partition, its Cluster-Diameter and Cluster-Radius
    2.4 Hyperbolicity
    2.5 Tree-Distortion
    2.6 Tree-Breadth, Tree-Length and Tree-Stretch
    2.7 Use of Metric Tree-Likeness
        2.7.1 Approximate distance queries
        2.7.2 Approximating optimal routes
        2.7.3 Approximating diameter and radius
    2.8 Conclusion
3 Collective Additive Tree Spanners and the Tree-Breadth of a Graph with Consequences
    3.1 Introduction
    3.2 Collective Additive Tree Spanners and the Tree-Breadth of a Graph
    3.3 Hierarchical decomposition of a graph with bounded tree-breadth
    3.4 Construction of collective additive tree spanners
    3.5 Additive spanners for graphs admitting (multiplicative) tree t-spanners
4 Collective Additive Tree Spanners of Graphs with Bounded k-Tree-Breadth, k ≥ 2
    4.1 Introduction
    4.2 Balanced separators for graphs with bounded k-tree-breadth
    4.3 Decomposition of a graph with bounded k-tree-breadth
    4.4 Construction of a hierarchical tree
    4.5 Construction of collective additive tree spanners
    4.6 Additive Spanners for Graphs Admitting (Multiplicative) t-Spanners of Bounded Tree-width
        4.6.1 k-Tree-breadth of a graph admitting a t-spanner of bounded tree-width
        4.6.2 Consequences
5 Embedding of Weighted Graphs into Trees: Theoretical Grounds and Empirical Analysis on Real Datasets
    5.1 Layering partition for weighted graphs
    5.2 Properties of layering partition for weighted graphs
    5.3 Construction of tree embedding
    5.4 Experiment
        5.4.1 Datasets
        5.4.2 Layering partition results
        5.4.3 Non-contractive embedding results
        5.4.4 Edge subdivision (h ≤ w)
        5.4.5 Contractive embedding: weighting clusters with their own diameters
        5.4.6 Embedding with recursive partitioning of clusters
6 Conclusion and Future Work
BIBLIOGRAPHY

LIST OF FIGURES

1 A graph and its tree-decomposition of width 3, of length 3, and of breadth 2.

2 Layering partition and associated constructs.

3 Illustration to the proof of Proposition 3.

4 Embedding into trees H, H_ℓ and H′_ℓ.

5 Illustration to the proof of Proposition 9.

6 Distortion distribution for embedding of a graph dataset into its canonic tree H.

7 Four tree-likeness measurements scaled.

8 Tree-likeness measurements: pairwise comparison.

9 A graph G with a disk-separator D_r(v, G) and the corresponding graphs G_1^+, ..., G_4^+ obtained from G. c_1, ..., c_4 are meta vertices representing the disk D_r(v, G) in the corresponding graphs.

10 a) A graph G and its balanced disk-separator D_1(13, G). b) A hierarchical tree H(G) of G. We have G = G(↓Y^0), Y^0 = D_1(13, G). Meta vertices are shown circled, disk centers are shown in bold. c) The graph G(↓Y^1) with its balanced disk-separator D_1(23, G(↓Y^1)) = Y^1. G(↓Y^1) is a minor of G(↓Y^0). d) The graph G(↓Y^2), a minor of G(↓Y^1) and of G(↓Y^0). Y^2 = V(G(↓Y^2)) is a leaf of H(G).

11 Illustration to the proof of Lemma 4: “unfolding” meta vertices.

12 Illustration to the proof of Lemma 7.

13 A graph G with a balanced D_r^3-separator and the corresponding graphs G_1^+, ..., G_4^+ obtained from G. Each G_i^+ has three meta vertices representing the three disks.

14 Illustration to the proof of Lemma 14. A tree-decomposition for G is obtained from a tree-decomposition of H.

15 A layering partition of a weighted graph G.

16 Illustration of proof of Lemma 17.

17 Cluster-width versus average distortion, maximum distortion and number of dummy vertices for the Celegans dataset.

18 Cluster-width versus average distortion, maximum distortion and number of dummy vertices for the CornellKing dataset.

LIST OF TABLES

1 Known results on approximate embedding problems for multiplicative distortion; λ is used to denote the optimal distortion and n to denote the number of points in the input metric. The table contains only the results that hold for the multiplicative definition of the distortion; there is a rich body of work that applies to other definitions of distortion, notably the additive or average distortion, see [17] for an overview.

2 Graph datasets and their parameters: number of vertices, number of edges, diameter, radius.

3 Layering partitions of the datasets and their parameters. ∆_s(G) is the largest diameter of a cluster in LP(G, s), where s is a randomly selected start vertex. For all datasets, the average diameter of a cluster is between 0 and 1. For most datasets, more than 95% of clusters are cliques.

4 Frequency of diameters of clusters in layering partition LP(G, s) (three datasets).

5 δ-hyperbolicity of the graph datasets.

6 Relative frequency of δ-hyperbolicity of quadruplets in our graph datasets that have less than 10K vertices.

7 Distortion results of embedding datasets into a canonic tree H.

8 Distortion results of non-contractive embedding of datasets into trees H_ℓ and H′_ℓ.

9 Lower and upper bounds on the tree-breadth of our graph datasets.

10 Estimation of diameters and radii.

11 Summary of tree-likeness measurements.

12 Real datasets parameters: n: the number of vertices, m: the number of edges, the largest edge weight, the smallest edge weight and the diameter of the graph.

13 Layering partitions of the datasets and their parameters. h is the cluster-width of LP(s, h) and is set equal to the longest edge weight. s is a randomly selected start vertex.

14 Distortion results for non-contractive embedding of the datasets into tree H. Cluster-width is equal to the largest edge weight (h = w).

15 Distortion results for non-contractive embedding of the datasets into tree H. Cluster-width is less than or equal to the largest edge weight (h ≤ w).

16 Distortion results for embedding of the datasets into tree H′. Edges inside each cluster C are weighted equal to diam(C)/2.

17 Percentage of vertex pairs with distortion up to a given value by embedding datasets into tree H′ with own diameter weighting.

18 Distortion results for embedding with P-centers partitioning for datasets into tree H′. P-centers partitioning yields a negligible improvement in distortion for the other datasets of Table 12.

Acknowledgements

I would like to express my deepest gratitude and thank my research advisor, Dr. Feodor F. Dragan, for mentoring me during my PhD study and research. I have learned a lot from him. Without his persistent help, patience and guidance, this dissertation would not have materialized. I cannot thank him enough for his sincere and overwhelming help and support. Also, I would like to thank my dissertation committee, Dr. Ruoming Jin, Dr. Ye Zhao and Dr. Artem Zvavitch, for their participation, comments and feedback.

Finally, I would like to thank the faculty and staff of the Department of Computer Science at Kent State University for their help and support.

This dissertation is dedicated to the memory of my mother, Hajar Ibdah. Her endless love, care and support have sustained me throughout my life. Her passion, strength and faith are the greatest lessons in my life.

CHAPTER 1

Introduction

The problem of embedding a graph metric into a “nice” and “simpler” metric space with low distortion has been a subject of extensive research, motivated by applications in various domains and by its intrinsic mathematical interest. “Nice” metric spaces are those with well-studied structural properties that allow the design of efficient approximation algorithms, such as Euclidean or ℓ1 space, lines, weighted trees, and distributions over them. A very incomplete list of applications includes approximation algorithms for graph and network problems, such as sparsest cut [14, 126], minimum bandwidth [34, 89], low-diameter decompositions [126], buy-at-bulk network design [16], distance and routing labeling schemes [77, 79, 102, 164], and optimal group Steiner trees [48, 93], as well as online algorithms for metrical task systems and file migration problems [24, 26]. These applications, together with the intrinsic mathematical interest of the problem, have made the study of low-distortion embeddings of graphs a significant field in its own right.

However, obtaining approximation algorithms for minimum distortion embeddings into certain host spaces (e.g., R^d, d ≥ 1) has been a notoriously hard problem (see [17, 128] and papers cited therein). Therefore, the host metrics of choice, also favored from the algorithmic point of view, are simple graph metrics.

As mentioned earlier, tree metrics are a very natural class of simple graph metrics, since many algorithmic problems become tractable on them. Ideally, we would like the distances in the tree metric to be no smaller than those in the original metric, and we would like to bound the distortion, i.e., the maximum increase. Formally, a multiplicative embedding of a graph G = (V, E) into a weighted tree (possibly with Steiner vertices) T = (V ∪ S, F) is an embedding such that d_G(u, v) ≤ d_T(u, v) ≤ λ · d_G(u, v) for all u, v ∈ V. The parameter λ is called the tree-distortion. Analogously, an additive embedding of a graph G = (V, E) into a weighted tree (possibly with Steiner vertices) T = (V ∪ S, F) is an embedding such that d_G(u, v) ≤ d_T(u, v) ≤ d_G(u, v) + r for all u, v ∈ V.

The study of tree metrics can be traced back to the beginning of the 20th century,

when it was first realized that weighted trees can in some cases serve as an (approximate)

model for the description of evolving systems. More recently, as indicated in [153], it

was observed that certain Internet originated metrics display tree-like properties. It is

well known [151] that tree metrics have a simple structure: d is a tree metric if and

only if all submetrics of d of size 4 are such. Moreover, the underlying tree is unique,

easily reconstructible, and has rigid local structure corresponding to the local structure

of d. But what about the structure of approximately tree metrics? We have only partial

answers for this question, and yet what we already know seems to indicate that a rich

theory might well be hiding there.
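The size-4 characterization above can be tested directly. The sketch below assumes the standard four-point condition as the test for a 4-point submetric: of the three pairwise sums d(x, y) + d(z, w), d(x, z) + d(y, w) and d(x, w) + d(y, z), the two largest must be equal. The function names and the two small example metrics are illustrative only and do not come from the dissertation.

```python
from itertools import combinations


def four_point_defect(d, points):
    """Largest gap between the two biggest pairwise sums over all quadruples.

    d is a symmetric distance matrix (or nested mapping) indexed by points.
    A defect of 0 means every 4-point submetric passes the four-point test.
    """
    worst = 0
    for x, y, z, w in combinations(points, 4):
        sums = sorted([d[x][y] + d[z][w],
                       d[x][z] + d[y][w],
                       d[x][w] + d[y][z]])
        worst = max(worst, sums[2] - sums[1])
    return worst


def is_tree_metric(d, points):
    """True if the metric d passes the four-point test on every quadruple."""
    return four_point_defect(d, points) == 0


# Shortest-path metric of a path on 4 vertices (a tree).
path4 = [[abs(i - j) for j in range(4)] for i in range(4)]
# Shortest-path metric of a cycle on 4 vertices (not a tree).
cycle4 = [[min(abs(i - j), 4 - abs(i - j)) for j in range(4)] for i in range(4)]

print(is_tree_metric(path4, range(4)), four_point_defect(cycle4, range(4)))
```

The brute-force loop inspects all quadruples, so it is only practical for small point sets. A nonzero defect gives a rough measure of how far a metric is from a tree metric.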

In distributed systems and communication networks, an important requirement is that a host network (graph) S must be a subgraph of the original network G (each link present in S must be present in G as well). This leads to the notion of spanners. If we require the host graph to be not an arbitrary tree but a spanning tree of the original graph, we obtain the well-known notion of a tree t-spanner. For t ≥ 1, a (multiplicative) tree t-spanner of a graph G = (V, E) is a spanning tree T = (V, E′ ⊆ E) such that the distance between every pair of vertices in T is at most t times their distance in G, i.e., d_T(u, v) ≤ t · d_G(u, v) for all u, v ∈ V [44]. The parameter t is called the stretch (or stretch factor) of T. For r ≥ 0, an additive tree r-spanner of G is a spanning tree T = (V, E′ ⊆ E) such that d_T(u, v) ≤ d_G(u, v) + r for all u, v ∈ V [146]. The parameter r is called the surplus of T. If we approximate the graph by a tree spanner, we can solve a given problem on the tree and interpret the solution on the original graph. The tree t-spanner problem asks, given a graph G and a positive number t, whether G admits a tree t-spanner. Note that the problem of finding a tree t-spanner of G minimizing t is also known in the literature as the Minimum Max-Stretch spanning Tree problem (see, e.g., [86] and literature cited therein).
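To make the two quality measures concrete, the following sketch computes the stretch t and the surplus r of a given spanning tree by comparing breadth-first-search distances in the graph and in the tree. The adjacency-dictionary representation, the function names and the 6-cycle example are assumptions made for illustration; the sketch evaluates a given tree and does not search for an optimal one.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as {v: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def spanner_quality(graph_adj, tree_adj):
    """Return (t, r): the multiplicative stretch and additive surplus of the tree."""
    stretch, surplus = 1.0, 0
    for u in graph_adj:
        dg = bfs_distances(graph_adj, u)
        dt = bfs_distances(tree_adj, u)
        for v in graph_adj:
            if v != u:
                stretch = max(stretch, dt[v] / dg[v])
                surplus = max(surplus, dt[v] - dg[v])
    return stretch, surplus


# A cycle on 6 vertices and the spanning path obtained by deleting edge {5, 0}.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
path6 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

print(spanner_quality(cycle6, path6))
```

On this example the endpoints of the deleted edge are adjacent in the cycle but five hops apart in the path, so the path is a tree 5-spanner and an additive tree 4-spanner of the cycle.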

There are many applications of tree spanners in various areas. Tree spanners are useful

in designing approximation algorithms for combinatorial and algorithmic problems that

are concerned with distances in a finite metric space induced by a graph.

Tree spanners find applications also in network design and, in particular, in the context of distributed systems. One such application is the arrow distributed directory protocol introduced in [64]. This protocol supports the location of mobile objects in a distributed network. It is implemented over a spanning tree T that spans the network, and, as shown in [142], the worst case overhead ratio of the protocol is proportional to the stretch of T. Therefore, a good candidate for the backbone of the arrow protocol is a spanning tree with low stretch (see also [105]). Another application of tree spanners is in message routing in communication networks. In order to maintain succinct routing tables, efficient routing schemes can use only the edges of a tree spanner. A very efficient routing scheme is available for trees [157]. We refer to the survey paper of Peleg [141] for an overview on spanners and their applications.

Unfortunately, not many graph families admit good tree spanners. This motivates the study of sparse spanners, i.e., spanners with a small number of edges. There are many applications of spanners in various areas, especially in distributed systems and communication networks. In [144], close relationships were established between the quality of spanners (in terms of stretch factor and the number of spanner edges) and the time and communication complexities of any synchronizer for the network based on this spanner. Sparse spanners are very useful in message routing in communication networks; in order to maintain succinct routing tables, efficient routing schemes can use only the edges of a sparse spanner [145]. The Sparsest t-Spanner problem asks, for a given graph G and a number t, to find a t-spanner of G with the smallest number of edges. We refer to the survey paper of Peleg [141] for an overview on spanners.

It is not difficult to show that there are metrics (e.g., cycles [101, 147]) which cannot be embedded into tree metrics with o(n) distortion. Inspired by ideas from the works of Alon et al. [11], Bartal [24, 25], and Fakcharoenphol et al. [87], and in order to extend those ideas to the design of compact and efficient routing and distance labeling schemes in networks, a new notion of collective tree spanners was introduced in [79] (independently, Gupta et al. [102] introduced a similar concept, called tree covers there). This notion is slightly weaker than that of a tree spanner and slightly stronger than that of a sparse spanner. We say that a graph G = (V, E) admits a system of µ collective additive tree r-spanners if there is a system T(G) of at most µ spanning trees of G such that for any two vertices x, y of G a spanning tree T ∈ T(G) exists such that d_T(x, y) ≤ d_G(x, y) + r (a multiplicative variant of this notion can be defined analogously). Clearly, if G admits a system of µ collective additive tree r-spanners, then G admits an additive r-spanner with at most µ × (n − 1) edges (take the union of all those trees), and if µ = 1, then G admits an additive tree r-spanner.
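The definition can be checked by brute force on a small instance. The sketch below computes, for a given system of spanning trees, the smallest r for which the system is a system of collective additive tree r-spanners: for every vertex pair it takes the best tree and then the worst pair. The helper names and the 6-cycle example with two spanning paths are hypothetical and chosen only for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as {v: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def collective_surplus(graph_adj, trees):
    """Smallest r such that every pair is served by some tree with surplus <= r."""
    worst = 0
    for u in graph_adj:
        dg = bfs_distances(graph_adj, u)
        per_tree = [bfs_distances(t, u) for t in trees]
        for v in graph_adj:
            best = min(dt[v] for dt in per_tree) - dg[v]
            worst = max(worst, best)
    return worst


def cycle_without_edge(n, removed=None):
    """Cycle on vertices 0..n-1, optionally with one edge deleted (a spanning path)."""
    adj = {i: [] for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        if removed is None or {i, j} != set(removed):
            adj[i].append(j)
            adj[j].append(i)
    return adj


def edge_set(adj):
    """Set of undirected edges of a graph given as an adjacency dictionary."""
    return {frozenset((u, v)) for u in adj for v in adj[u]}


cycle6 = cycle_without_edge(6)
tree_a = cycle_without_edge(6, removed=(5, 0))
tree_b = cycle_without_edge(6, removed=(2, 3))
union_edges = edge_set(tree_a) | edge_set(tree_b)

print(collective_surplus(cycle6, [tree_a]),
      collective_surplus(cycle6, [tree_a, tree_b]),
      len(union_edges))
```

Either path alone has surplus 4 on the 6-cycle, while the two paths together preserve every distance exactly, because no shortest path of the cycle needs both deleted edges. The union of the two trees has 6 edges, within the bound µ × (n − 1) = 10.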

Recently, in [75], spanners of bounded tree-width were introduced, motivated by the

fact that many algorithmic problems are tractable on graphs of bounded tree-width, and

a spanner H of G with small tree-width can be used to obtain an approximate solution to a problem on G. In particular, efficient and compact distance and routing labeling

schemes are available for bounded tree-width graphs (see, e.g., [77,102] and papers cited

therein), and they can be used to compute approximate distances and route along paths

that are close to shortest in G. The k-Tree-width t-spanner problem asks, for a given

graph G, an integer k and a positive number t ≥ 1, whether G admits a t-spanner of

tree-width at most k. Every connected graph with n vertices and at most n−1+m edges

is of tree-width at most m + 1 and hence this problem is a generalization of the Tree t-Spanner and the Sparsest t-Spanner problems. Furthermore, t-spanners of bounded

tree-width have much more structure to exploit algorithmically than sparse t-spanners

(which have a small number of edges but may lack other nice structural properties).

1.1 Research contribution

In this dissertation we study “tree-likeness” and the problems described earlier: embedding graph metrics into tree metrics, tree spanners, collective tree spanners, and sparse spanners. In Chapter 2, we study tree-like structure in real-world graph datasets from a metric point of view. We empirically investigate the problem of embedding (unweighted) graphs into trees using recent state-of-the-art graph embedding techniques. Furthermore, we present strong evidence, based on solid theoretical foundations, that a number of real-life networks, taken from different domains such as Internet measurements, biological datasets, web graphs, and social and collaboration networks, exhibit tree-like structures from a metric point of view. Specifically, we investigate a few graph parameters, namely the tree-distortion and the tree-stretch, the tree-length and the tree-breadth, Gromov’s hyperbolicity, and the cluster-diameter and the cluster-radius in a layering partition of a graph, which capture and quantify this phenomenon of being metrically close to a tree. By bringing all those parameters together, we not only provide efficient means for detecting such metric tree-like structures in large-scale networks but also show how such structures can be used, for example, to encode approximate distance and almost-shortest-path information efficiently and compactly, and to estimate the diameters and radii of those networks quickly and accurately. Estimating the diameter and the radius of a graph, or the distances between arbitrary vertices, is a fundamental primitive in many data and graph mining algorithms.

Chapters 3 and 4 concern the problem of collective tree spanners and sparse spanners. Specifically, we study collective additive tree spanners for families of graphs enjoying special Robertson-Seymour tree-decompositions, and demonstrate interesting consequences of the obtained results. We demonstrate in Chapter 3 that there is a polynomial time algorithm that, given an n-vertex graph G admitting a multiplicative tree t-spanner, constructs a system of at most log_2 n collective additive tree O(t log n)-spanners of G. That is, with a slight increase in the number of trees and in the stretch, one can “turn” a multiplicative tree spanner into a small set of collective additive tree spanners.

In Chapter 4, we extend the result from Chapter 3 by showing that if a graph G admits a multiplicative t-spanner with tree-width k − 1, then G admits a Robertson-Seymour’s tree-decomposition each bag of which can be covered with at most k disks of G of radius at most ⌈t/2⌉ each. This is used to demonstrate that, for every fixed k, there is a polynomial time algorithm that, given an n-vertex graph G admitting a multiplicative t-spanner with tree-width k − 1, constructs a system of at most k(1 + log_2 n) collective additive tree O(t log n)-spanners of G.

In Chapter 5, we investigate the problem of embedding a weighted graph metric into

a tree metric. We develop an approach with proven theoretical bounds for this problem.

Furthermore, we apply and empirically test our approach on real-world graph datasets.

1.2 Publication notes

The results of Chapter 2 are to be submitted for publication to a relevant conference. The results of Chapters 3 and 4 have been accepted for publication and will appear in the Journal of Theoretical Computer Science (TCS); they have already been partially published in [73] at the 39th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2013). The results of Chapter 5 are in preparation for submission for publication.

1.3 Preliminaries and Notations

A metric space is an ordered pair (M, d) where M is a set and d is a measure of distance between elements of M, i.e., d is a function d : M × M → R such that for any x, y, z ∈ M, the following three conditions hold:

1. d(x, y) = 0 if and only if x = y.

2. d(x, y) = d(y, x) (symmetry).

3. d(x, y) ≤ d(x, z) + d(z, y) (triangle inequality).
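For a finite point set, the three conditions can be verified exhaustively. The sketch below is a minimal checker under the assumption that d is given as a Python function of two points; the names and the two example functions are illustrative. Note that non-negativity does not need a separate test, since 0 = d(x, x) ≤ d(x, y) + d(y, x) = 2 · d(x, y) follows from the three conditions.

```python
from itertools import product


def is_metric(points, d, tol=1e-12):
    """Check the three metric axioms for d on a finite collection of points."""
    points = list(points)
    for x, y in product(points, repeat=2):
        # Condition 1: d(x, y) = 0 exactly when x = y.
        if (d(x, y) == 0) != (x == y):
            return False
        # Condition 2: symmetry.
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in product(points, repeat=3):
        # Condition 3: triangle inequality.
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False
    return True


def absolute_difference(x, y):
    return abs(x - y)


def squared_difference(x, y):
    return (x - y) ** 2


sample = [0, 1, 3]
print(is_metric(sample, absolute_difference), is_metric(sample, squared_difference))
```

The squared difference fails only the triangle inequality on this sample: the value for the pair (0, 3) is 9, while the values for (0, 1) and (1, 3) sum to 5.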

For simplicity, we may refer to a metric space (M, d) by only M.

A metric space (M, d) is isometrically embeddable into a host metric space (M′, d′) if there exists a map φ : M → M′ such that d′(φ(p), φ(q)) = d(p, q) for all p, q ∈ M.

In this case, we say M is a subspace of M ′. A low-distortion embedding between two metric spaces (M, d) and (M ′, d′) is a (non-contractive) mapping φ such that for any pair

of points p, q in the original metric space, their distance d(p, q) before the mapping is

the same as the distance d′(φ(p), φ(q)) after the mapping, up to a (small) multiplicative

factor λ. Low-distortion embeddings have been a subject of extensive mathematical

studies, and found numerous applications in computer science (see [106,107,125]).

Formally, let φ : M → M′ be a mapping from a metric space (M, d) into another metric space (M′, d′). The mapping φ has contraction c_φ and expansion e_φ if for every pair of points p, q in M,

d(p, q) ≤ c_φ · d′(φ(p), φ(q))

and

e_φ · d(p, q) ≥ d′(φ(p), φ(q)),

respectively. We say that φ is non-contracting if c_φ is at most 1. A non-contracting mapping φ has distortion λ if e_φ is at most λ. Also, we say that φ : M → M′ is an embedding with (multiplicative) distortion λ ≥ 1 if d(x, y) ≤ d′(φ(x), φ(y)) ≤ λ · d(x, y) for all x, y in M.

Analogously, we can define an embedding with additive distortion: φ : M → M′ is an embedding with additive distortion λ ≥ 0 if d(x, y) ≤ d′(φ(x), φ(y)) ≤ d(x, y) + λ for all x, y in M.
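For finite metrics, the smallest valid contraction and expansion of a given mapping are simply the extreme ratios over all pairs of points. The sketch below computes them under the assumption that φ is injective, so that no image distance is zero; the function names and the example (the 4-cycle mapped onto the integer points 0..3 of the line) are illustrative.

```python
from itertools import combinations


def contraction_expansion(points, d, d_host, phi):
    """Smallest contraction c and expansion e of phi over all pairs of points.

    Assumes phi is injective, so that every image distance is positive.
    """
    c = e = 0.0
    for p, q in combinations(points, 2):
        original = d(p, q)
        image = d_host(phi(p), phi(q))
        c = max(c, original / image)   # d(p, q) <= c * d'(phi(p), phi(q))
        e = max(e, image / original)   # e * d(p, q) >= d'(phi(p), phi(q))
    return c, e


def cycle4_distance(p, q):
    """Shortest-path distance on a cycle with vertices 0, 1, 2, 3."""
    return min(abs(p - q), 4 - abs(p - q))


def line_distance(a, b):
    return abs(a - b)


def place_on_line(p):
    """Hypothetical embedding: vertex i of the cycle goes to point i of the line."""
    return p


print(contraction_expansion(range(4), cycle4_distance, line_distance, place_on_line))
```

Here the contraction is 1, so the mapping is non-contracting, and the expansion is 3, attained by the adjacent vertices 0 and 3, which end up three units apart. The mapping is therefore an embedding with multiplicative distortion 3 and additive distortion 2.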

Throughout the dissertation, we will often omit the word multiplicative when we refer

to embedding with multiplicative distortion.

Given an undirected graph G with vertex set V(G) and edge set E(G), the graph metric of G, denoted by M(G), is the metric induced by the shortest path distances of G, i.e., M(G) = (V(G), d_G), where d_G(u, v) is the shortest path distance between u and v for every pair of vertices u, v ∈ V(G).

All graphs occurring in this dissertation are connected, finite, undirected, loopless and

without multiple edges. Also, the graphs in all chapters are unweighted except for those

in Chapter 5. For a graph G = (V,E), we use n and |V | interchangeably to denote the

number of vertices in G. Also, we use m and |E| to denote the number of edges. A clique

is a set of pairwise adjacent vertices of G. By G[S] we denote a subgraph of G induced

by vertices of S ⊆ V . Let also G \ S be the graph G[V \ S] (which is not necessarily

connected). A set S ⊆ V is called a separator of a connected graph G if the graph

G[V \ S] has more than one connected component, and S is called a balanced separator

of G if each connected component of G[V \ S] has at most |V |/2 vertices. A set C ⊆ V

is called a balanced clique-separator of G if C is both a clique and a balanced separator

of G. For a vertex v of G, the sets N_G(v) = {w ∈ V | vw ∈ E} and N_G[v] = N_G(v) ∪ {v}

are called the open neighborhood and the closed neighborhood of v, respectively.

In a graph G, the length of a path from a vertex v to a vertex u is the number of edges in the path. The distance d_G(u, v) between vertices u and v is the length of a shortest path connecting u and v in G. The disk (or ball) of G of radius r centered at a vertex v is the set of all vertices at distance at most r from v: D_r(v, G) = B_r(v, G) = {w ∈ V | d_G(v, w) ≤ r}. We omit the graph name G, as in D_r(v) or B_r(v), if the context involves only one graph. A disk D_r(v, G) is called a balanced disk-separator of G if the set D_r(v, G) is a balanced separator of G.
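A disk and the balanced-separator test can be sketched in a few lines for unweighted graphs. The code below computes D_r(v, G) by breadth-first search and then checks that every connected component of G[V \ S] has at most |V|/2 vertices. The adjacency-dictionary representation, the function names and the 7-vertex path are assumptions for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as {v: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def disk(adj, center, r):
    """D_r(center, G): all vertices at distance at most r from center."""
    return {w for w, dw in bfs_distances(adj, center).items() if dw <= r}


def components_after_removal(adj, removed):
    """Connected components of the graph induced by the vertices not in removed."""
    seen = set(removed)
    components = []
    for start in adj:
        if start in seen:
            continue
        component, stack = {start}, [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    stack.append(v)
        components.append(component)
    return components


def is_balanced_separator(adj, S):
    """True if each component of G[V minus S] has at most |V|/2 vertices."""
    half = len(adj) / 2
    return all(len(c) <= half for c in components_after_removal(adj, S))


# Path on 7 vertices: 0 - 1 - 2 - 3 - 4 - 5 - 6.
path7 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}

print(disk(path7, 3, 1), is_balanced_separator(path7, disk(path7, 3, 1)))
```

On the path, the disk D_1(3) = {2, 3, 4} leaves two components of two vertices each and is a balanced disk-separator, whereas D_0(0) = {0} leaves a single component with six vertices and is not.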

The diameter diam(G) of a graph G = (V, E) is the largest distance between a pair of vertices in G, i.e., diam(G) = max_{u,v∈V} d_G(u, v). The eccentricity of a vertex v, denoted by ecc(v), is the largest distance from v to any other vertex, i.e., ecc(v) = max_{u∈V} d_G(v, u). The radius rad(G) of a graph G = (V, E) is the minimum eccentricity of a vertex in G, i.e., rad(G) = min_{v∈V} max_{u∈V} d_G(v, u). The center C(G) = {c ∈ V : ecc(c) = rad(G)} of a graph G = (V, E) is the set of vertices with minimum eccentricity. The diameter in G of a set S ⊆ V is max_{x,y∈S} d_G(x, y) and its radius in G is min_{x∈V} max_{y∈S} d_G(x, y) (in some papers they are called the weak diameter and the weak radius to indicate that the distances are measured in G, not in G[S]).
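These quantities follow directly from an all-pairs distance table. The sketch below computes them by running a breadth-first search from every vertex, which is adequate for small unweighted graphs only; the function names and the star example are illustrative assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as {v: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def all_pairs(adj):
    """dist[u][v] = d_G(u, v) for a connected graph."""
    return {u: bfs_distances(adj, u) for u in adj}


def eccentricities(dist):
    return {v: max(dist[v].values()) for v in dist}


def diameter(dist):
    return max(eccentricities(dist).values())


def radius(dist):
    return min(eccentricities(dist).values())


def center(dist):
    ecc = eccentricities(dist)
    rad = min(ecc.values())
    return {v for v, e in ecc.items() if e == rad}


def set_diameter(dist, S):
    """Diameter in G of S: distances are measured in G, not in G[S]."""
    return max(dist[x][y] for x in S for y in S)


def set_radius(dist, S):
    """Radius in G of S: the best center may lie outside S."""
    return min(max(dist[x][y] for y in S) for x in dist)


# Star with hub 0 and leaves 1, 2, 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
dist = all_pairs(star)
leaves = {1, 2, 3}

print(diameter(dist), radius(dist), center(dist),
      set_diameter(dist, leaves), set_radius(dist, leaves))
```

The set of leaves shows why the distances are measured in G: the leaves induce a graph with no edges, yet their diameter in G is 2 and their radius in G is 1, attained at the hub, which is not in the set.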

An approximation algorithm is an algorithm that runs in polynomial time and produces a solution that is within a guaranteed factor of the optimum solution for some optimization problem. A constant approximation algorithm produces a solution within a guaranteed constant factor c of the optimum solution (called a c-approximation). A Polynomial Time Approximation Scheme (PTAS) is an approximation algorithm that produces a solution that is within a factor of 1 + ϵ of the optimum solution and runs in polynomial time for every fixed ϵ > 0.

1.3.1 Tree-decomposition

There are a few graph parameters in the literature that measure the metric tree-likeness of a graph and are related to the tree t-spanner problem. They are all based on the notion of tree-decomposition introduced by Robertson and Seymour in their work on graph minors [150].

A tree-decomposition of a graph G = (V, E) is a pair ({X_i | i ∈ I}, T = (I, F)) where {X_i | i ∈ I} is a collection of subsets of V, called bags, and T is a tree. The nodes of T are the bags {X_i | i ∈ I}, which satisfy the following three conditions (see Figure 1):

1. ∪_{i∈I} X_i = V;

2. for each edge uv ∈ E, there is a bag X_i such that u, v ∈ X_i;

3. for all i, j, k ∈ I, if j is on the path from i to k in T, then X_i ∩ X_k ⊆ X_j. Equivalently, this condition could be stated as follows: for all vertices v ∈ V, the set of bags {i ∈ I | v ∈ X_i} induces a connected subtree T_v of T.

For simplicity we denote a tree-decomposition ({X_i | i ∈ I}, T = (I, F)) of a graph G by T(G).
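The three conditions translate into a short validity check. The sketch below assumes that the bags are given as a dictionary from node identifiers to vertex sets and that the supplied tree edges really form a tree on those identifiers (this is not verified). Condition 3 is tested in its equivalent form: because T is a tree, the nodes whose bags contain v induce a forest, which is connected exactly when it has one edge fewer than nodes. All names and the two example decompositions of a 4-cycle are illustrative.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check conditions 1-3 for bags {node: set of vertices} on a tree of nodes.

    Assumes tree_edges forms a tree on the keys of bags (not verified here).
    """
    # Condition 1: the bags cover all vertices (and nothing else).
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: both endpoints of every edge share some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: the nodes whose bags contain v induce a connected subtree.
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        links = sum(1 for i, j in tree_edges if i in nodes and j in nodes)
        if links != len(nodes) - 1:
            return False
    return True


# The 4-cycle 0 - 1 - 2 - 3 - 0.
cycle_vertices = [0, 1, 2, 3]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Two triangles glued along the diagonal {0, 2}: a valid decomposition.
good_bags = {"a": {0, 1, 2}, "b": {0, 2, 3}}
good_tree = [("a", "b")]

# One bag per edge arranged on a path: vertex 0 occurs only in the two end bags.
bad_bags = {1: {0, 1}, 2: {1, 2}, 3: {2, 3}, 4: {3, 0}}
bad_tree = [(1, 2), (2, 3), (3, 4)]

print(is_tree_decomposition(cycle_vertices, cycle_edges, good_bags, good_tree),
      is_tree_decomposition(cycle_vertices, cycle_edges, bad_bags, bad_tree))
```

The second example satisfies conditions 1 and 2 but violates condition 3, which illustrates why a cycle cannot be decomposed into bags of size two.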

Tree-decompositions were used to define several graph parameters that measure how close a given graph is to some known graph class (e.g., to trees or to chordal graphs) on which many algorithmic problems can be solved efficiently. The width of a tree-decomposition T(G) = ({X_i | i ∈ I}, T = (I, F)) is max_{i∈I} |X_i| − 1. The tree-width of a graph G, denoted by tw(G), is the minimum width over all tree-decompositions T(G) of G [150]. The trees are exactly the graphs with tree-width 1. The problem of determining whether a given graph has tree-width at most k, where k is part of the input, is NP-complete [13]. However, when k is a fixed constant, the problem has a linear time solution that also finds a tree-decomposition of width k for the given graph [35]. It is worth noting that the running time of the algorithm of [35] is exponential in k.

Figure 1: A graph and its tree-decomposition of width 3, of length 3, and of breadth 2. (a) A graph G. (b) A tree-decomposition of G.

The length of a tree-decomposition T(G) of a graph G is λ := max_{i∈I} max_{u,v∈X_i} d_G(u, v) (i.e., each bag X_i has diameter at most λ in G). The tree-length of G, denoted by tl(G), is the minimum length over all tree-decompositions of G [71]. The chordal graphs are exactly the graphs with tree-length 1. Note that these two graph parameters are not related to each other. For instance, a clique on n vertices has tree-length 1 and tree-width n − 1, whereas a cycle on 3n vertices has tree-width 2 and tree-length n. The breadth of a tree-decomposition T(G) of a graph G is the minimum integer r such that for every i ∈ I there is a vertex v_i ∈ V with X_i ⊆ D_r(v_i, G) (i.e., each bag X_i can be covered by a disk D_r(v_i, G) := {u ∈ V(G) : d_G(u, v_i) ≤ r} of radius at most r in G). Note that vertex v_i does not need to belong to X_i. The tree-breadth of G, denoted by tb(G), is the minimum breadth over all tree-decompositions of G [76]. It turns out that tree-breadth is related to the tree t-spanner problem [76]. Unfortunately, while graphs with tree-length 1 (as they are exactly the chordal graphs) can be recognized in linear time, the problem of determining whether a given graph has tree-length at most λ is NP-complete for every fixed λ > 1 (see [127]). Judging from this result, it is conceivable that the problem of determining whether a given graph has tree-breadth at most ρ is NP-complete, too. We say that a family of graphs 𝒢 is of bounded tree-breadth (of bounded tree-width, of bounded tree-length) if there is a constant c such that for each graph G from 𝒢, tb(G) ≤ c (resp., tw(G) ≤ c, tl(G) ≤ c).
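The width, length and breadth of a given decomposition are straightforward to evaluate once all distances in G are known. The sketch below does this for an unweighted graph; it measures one particular decomposition and does not compute tw(G), tl(G) or tb(G), which are minima over all decompositions. The names and the two examples (a single-bag decomposition of the clique on 4 vertices and a four-bag decomposition of the cycle on 6 vertices) are illustrative assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as {v: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def decomposition_parameters(adj, bags):
    """Return (width, length, breadth) of the decomposition with the given bags."""
    dist = {u: bfs_distances(adj, u) for u in adj}
    # Width: largest bag size minus one.
    width = max(len(bag) for bag in bags.values()) - 1
    # Length: largest distance in G between two vertices of a common bag.
    length = max(dist[u][v] for bag in bags.values() for u in bag for v in bag)
    # Breadth: each bag is covered by the best disk; the center may lie outside the bag.
    breadth = max(min(max(dist[c][u] for u in bag) for c in adj)
                  for bag in bags.values())
    return width, length, breadth


# Clique on 4 vertices with a single bag holding every vertex.
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
k4_bags = {"all": {0, 1, 2, 3}}

# Cycle on 6 vertices: a central bag {0, 2, 4} with three bags attached to it.
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
c6_bags = {"mid": {0, 2, 4}, "a": {0, 1, 2}, "b": {2, 3, 4}, "c": {4, 5, 0}}

print(decomposition_parameters(k4, k4_bags), decomposition_parameters(c6, c6_bags))
```

The results match the examples in the text: the clique decomposition has width n − 1 = 3 and length 1, and the decomposition of the cycle on 3n = 6 vertices has width 2 and length n = 2.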

1.4 Related work

1.4.1 Low distortion embedding

The work of Bourgain [37] presents the first embeddings with guarantees. It was shown that any finite metric on n nodes can be embedded into ℓ2 with logarithmic distortion, with the number of dimensions exponential in n. Linial et al. [126] modified Bourgain’s result to apply to ℓ1 metrics and to use O(log^2 n) dimensions. In [124], Linial et al. used Bourgain’s result to discover properties of the distance metric between protein sequences. They observed that many interesting biological properties of proteins can be (re-)discovered by analyzing the embedding of the metric into ℓ2. Aumann and Rabani [14] and Linial et al. [126] also gave several other applications, including a proof of a logarithmic bound on the max-flow min-cut gap for multicommodity flow problems. They also gave a lower bound on the distortion of any embedding of general graphs into ℓ1. For more details, we point the reader to the recent survey by Indyk and Matoušek [107].

Obtaining approximation algorithms for minimum distortion embeddings into certain host spaces has been a notoriously hard problem. In many cases of interest, such as embedding into R^d (d ≥ 1), the problem is known to be hard to approximate within polynomial factors (see [17, 128] and papers cited therein).

Table 1 shows known results on approximate embedding problems for multiplicative

distortion.

1.4.2 Embedding into a metric of a (weighted) tree.

The strongest results were obtained, so far, for the additive distortion. Research on

the algorithmic aspects of finding a tree metric of least additive distortion has culminated

in the paper [9] (see also [56]), where a 6-approximation algorithm was established (in

the notation of [9], it is a 3-approximation algorithm, however, in our more restrictive

definition, requiring that the metric is dominated by the approximating one, it is a

6-approximation), together with a (rather close) hardness result. Relaxing the local

condition on d by allowing its size-4 submetrics to be δ-close to a tree metric, one gets

precisely Gromov’s δ-hyperbolic geometry. For study of algorithmic and other aspects of

such geometries, see, e.g., [52,53,119].

| From | Into | Distortion | Source | Comments |
|---|---|---|---|---|
| general metrics | L2 | λ | [126] | uses SDP |
| general metrics | ultrametrics | λ | [10] | |
| general metrics | line | O(∆^{3/4} λ^{11/4}) | [17] | ∆ is the spread of the metric |
| general metrics | trees, line | (λ log n)^{O(log^{1/2} ∆)} | [21] | |
| general metrics | R^d | Ω(n^{1/(22d−10)} λ) | [17, 128] | hard to n^{1/(22d−10)}-approximate, for every d ≥ 1 |
| R^3 | R^3 | > (3 − ϵ)λ | [137] | hard to 3-approximate, embedding is a bijection |
| line | line | λ | [111] | λ is constant, embedding is a bijection |
| line | line | > n^{Ω(1)} | [103] | λ = n^{Ω(1)}, embedding is a bijection |
| ultrametrics | R^d | λ^{O(d)} | [18] | |
| weighted trees | line | λ^{O(1)} | [17] | |
| weighted trees | line | Ω(n^{1/12} λ) | – | hard to O(n^{1/12})-approximate even for ∆ = n^{O(1)} |
| weighted trees | Lp | O(λ) | [120] | |
| unweighted graphs | trees | 6λ | [19, 21, 54] | improved from 100λ [21] to 27λ [19] to 6λ [54] |
| unweighted graphs | bounded-degree trees | λ | [111] | λ is constant, embedding is a bijection |
| unweighted graphs | spanning trees | O(λ log n) | [21, 76, 86] | |
| unweighted graphs | spanning trees | NP-complete | [44] | |
| planar graphs | spanning trees | NP-complete | [90] | |
| apex-minor-free graphs | spanning trees | λ | [75] | λ is constant; planar and bounded genus graphs are included |
| outerplanar graphs | spanning trees | λ | [139] | |
| unweighted graphs | line | O(λ^2) | [20] | implies √n-approximation |
| unweighted graphs | line | > aλ | [20] | hard to a-approximate for some a > 1 |
| unweighted graphs | line | λ | [20] | λ is constant |
| unweighted trees | line | O(λ^{3/2} √(log λ)) | [20] | |

Table 1: Known results on approximate embedding problems for multiplicative distortion; λ is used to denote the optimal distortion and n to denote the number of points in the input metric. The table contains only the results that hold for the multiplicative definition of the distortion; there is a rich body of work that applies to other definitions of distortion, notably the additive or average distortion, see [17] for an overview.

The situation with the multiplicative distortion is less satisfactory. The best result for embedding general metrics into tree metrics is obtained in [21]: the approximation factor is exponential in √(log ∆ / log log n), where ∆ is the spread of the metric. Judging from the parallel results of [17] for embedding into line metrics, it is conceivable that any constant factor approximation for optimal embedding of general metrics into tree

metrics is NP-hard. For some small constant γ, the hardness result of [9] implies that it

is NP-hard to approximate the multiplicative distortion better than γ even for metrics

that come from unit-weighted graphs. For a special interesting case of shortest path

metrics of unit-weighted graphs, [21] gets a large (around 100) constant approximation

factor which was improved in [19] to a factor of 27 and later improved to a factor of 6

in [54] by using a decomposition method (layering partition) of the graph. Also, in [54], Chepoi, Dragan et al. present the first algorithm for embedding into anything more complicated than trees: they achieve a constant-factor approximation for embedding into outerplanar graphs (K2,3-minor-free graphs).

1.4.3 Tree spanners

Substantial work has been done on the tree t-spanner problem on unweighted graphs. Cai and Corneil [44] have shown that, for a given graph G, the problem of deciding whether G has a tree t-spanner is NP-complete for any fixed t ≥ 4 and is linear time solvable for t = 1, 2 (the status of the case t = 3 is open for general graphs)². The NP-completeness result was further strengthened in [40] and [41], where Brandstädt et al. showed that the problem remains NP-complete even for the class of chordal graphs (i.e., for graphs where each induced cycle has length 3) and every fixed t ≥ 4, and for the class of chordal bipartite graphs (i.e., for bipartite graphs where each induced cycle has length 4) and every fixed t ≥ 5.

² When G is an unweighted graph, t can be assumed to be an integer.

The tree t-spanner problem on planar graphs was studied in [75,90]. In [90], Fekete and Kremer proved that the tree t-spanner problem on planar graphs is NP-complete (when t is part of the input) and polynomial time solvable for t = 3. For fixed t ≥ 4, the complexity of the tree t-spanner problem on arbitrary planar graphs was left as an open problem in [90]. This open problem was recently resolved in [75] by Dragan et al., where it was shown that the tree t-spanner problem is linear time solvable for

every fixed constant t on the class of apex-minor-free graphs which includes all planar

graphs and all graphs of bounded genus. Note also that a number of particular graph

classes (like interval graphs, permutation graphs, asteroidal-triple-free graphs, strongly

chordal graphs, dually chordal graphs, and others) admit additive tree r-spanners for small values of r (we refer the reader to [39–41,44,90,118,122,141,142,146] and papers cited therein).

The first O(log n)-approximation algorithm for the minimum value of t for the tree

t-spanner problem was developed by Emek and Peleg in [86] (where n is the number

of vertices in a graph). Recently, another logarithmic approximation algorithm for the

problem was proposed in [76] (we elaborate more on this in Chapter 3). Emek and

Peleg also established in [86] that unless P = NP, the problem cannot be approximated

additively by any o(n) term. Hardness of approximation is established also in [122],

where it was shown that approximating the minimum value of t for the tree t-spanner problem within a factor better than 2 is NP-hard (see also [142] for an earlier result).
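To make the quantity being approximated concrete, the sketch below computes the smallest t for which a given spanning tree T is a tree t-spanner of an unweighted graph G, assuming the usual definition dT(x, y) ≤ t · dG(x, y) for all vertex pairs. It relies on the standard observation that, in unweighted graphs, it suffices to inspect the endpoints of the edges of G. The function names and the example graph are illustrative assumptions, and the sketch only evaluates one tree; it does not search for an optimal one.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph {vertex: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def tree_stretch(graph_adj, tree_adj):
    """Smallest t such that the spanning tree is a tree t-spanner of the graph.

    Every shortest path of G is a sequence of edges, so the worst ratio
    d_T(x, y) / d_G(x, y) is attained on a pair x, y adjacent in G.
    """
    t = 1
    for u in graph_adj:
        dist_in_tree = bfs_distances(tree_adj, u)
        for v in graph_adj[u]:
            t = max(t, dist_in_tree[v])
    return t


# 5-cycle 0-1-2-3-4-0 and the spanning path obtained by dropping edge {4, 0}
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
stretch = tree_stretch(cycle5, path5)
```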

1.4.4 Sparse spanners

Sparse t-spanners were introduced by Peleg, Schäffer and Ullman in [143, 144] and have been studied extensively since then. It was shown by Peleg and Schäffer in [143] that the problem of deciding whether a graph G has a t-spanner with at most m edges is NP-complete. Later, Kortsarz [116] showed that for every t ≥ 2 there is a constant c < 1 such that it is NP-hard to approximate the sparsest t-spanner within the ratio c · log n, where n is the number of vertices in the graph. On the other hand, the problem admits an O(log n)-ratio approximation for t = 2 [116, 117] and an O(n^{2/(t+1)})-ratio approximation for t > 2 [84]. For some other inapproximability and approximability results for the Sparsest t-Spanner problem on general graphs, we refer the reader to [32,33,66,67,82,84,85,158] and papers cited therein. It is interesting to note also that any (even weighted) n-vertex graph admits an O(2k − 1)-spanner with at most O(n^{1+1/k}) edges for any k ≥ 1, and such a spanner can be constructed in polynomial time [12,28,158].

On planar graphs the Sparsest t-Spanner problem was studied as well. Brandes and Handke have shown that the decision version of the problem remains NP-complete on planar graphs for every fixed t ≥ 5 (the case 2 ≤ t ≤ 4 is open) [38]. Duckworth,

Wormald, and Zito [80] have shown that the problem of finding a sparsest 2-spanner of a 4-connected planar triangulation admits a polynomial time approximation scheme

(PTAS). Dragan et al. [74] proved that the Sparsest t-Spanner problem admits a PTAS for graph classes of bounded local tree-width (and therefore for planar and bounded genus graphs).

Sparse additive spanners were considered in [27, 68, 83, 123, 162]. It is known that every n-vertex graph admits an additive 2-spanner with at most Θ(n^{3/2}) edges [68,83], an additive 6-spanner with at most O(n^{4/3}) edges [27], and an additive O(n^{(1−1/k)/2})-spanner with at most O(n^{1+1/k}) edges for any k ≥ 1 [27]. All those spanners can be constructed in polynomial time. We refer the reader to the paper [162] for a good summary of the state of the art on the sparsest additive spanner problem in general graphs.

1.4.5 Collective tree spanners

The problem of finding “small” systems of collective additive tree r-spanners for small values of r was examined on special classes of graphs in [60,77–79,164]. For example, in [60,79], sharp results were obtained for unweighted chordal graphs and c-chordal graphs (i.e., the graphs where each induced cycle has length at most c): every c-chordal graph admits a system of at most log₂ n collective additive tree (2⌊c/2⌋)-spanners, constructible in polynomial time; no system of a constant number of collective additive tree r-spanners can exist for chordal graphs (i.e., when c = 3) and r ≤ 3, and no system of a constant number of collective additive tree r-spanners can exist for outerplanar graphs for any constant r.

Only papers [77,102,164] have investigated collective (multiplicative or additive) tree spanners in weighted graphs. It was shown that any weighted n-vertex planar graph admits a system of O(√n) collective multiplicative tree 1-spanners (equivalently, additive tree 0-spanners) [77,102] and a system of at most 2 log_{3/2} n collective multiplicative tree 3-spanners [102]. Furthermore, any weighted graph with genus at most g admits a system of O(√(gn)) collective additive tree 0-spanners [77,102], any weighted graph with tree-width at most k − 1 admits a system of at most k log₂ n collective additive tree 0-spanners [77, 102], any weighted graph G with clique-width at most k admits a system of at most k log_{3/2} n collective additive tree (2w)-spanners [77], and any weighted c-chordal graph G admits a system of log₂ n collective additive tree (2⌊c/2⌋w)-spanners [77] (where w denotes the maximum edge weight in G).

Collective tree spanners of Unit Disk Graphs (UDGs) (which often model wireless ad hoc networks) were investigated in [164]. It was shown that every n-vertex UDG G admits a system 𝒯(G) of at most 2 log_{3/2} n + 2 spanning trees of G such that, for any two vertices x and y of G, there exists a tree T in 𝒯(G) with dT (x, y) ≤ 3 · dG(x, y) + 12. That is, the distances in any UDG can be approximately represented by the distances in at most 2 log_{3/2} n + 2 of its spanning trees. Based on this result, a new compact and low-delay routing labeling scheme was proposed for Unit Disk Graphs.

1.4.6 Spanners with bounded tree-width.

The k-Tree-width t-spanner problem was considered in [75] and [91]. It was shown that the problem is linear time solvable for all fixed constants t and k on the class of apex-minor-free graphs [75], which includes all planar graphs and all graphs of bounded genus, and on graphs with bounded degree [91].

CHAPTER 2

Metric tree-like structures in real-life networks: an empirical study

2.1 Introduction

Large networks are everywhere. Can we understand their structure and exploit it? For example, understanding key structural properties of large-scale data networks is crucial for analyzing and optimizing their performance, as well as improving their reliability and security [129]. In prior empirical and theoretical studies researchers have mainly focused on features like the small world phenomenon, power law degree distribution, navigability, high clustering coefficients, etc. (see [22,23,36,57,88,113,114,121,160]). Those nice features were observed in many real-life complex networks and graphs arising in Internet applications, in biological and social sciences, in chemistry and physics. Although those features are interesting and important, as it is noted in [129], the impact of intrinsic geometrical and topological features of large-scale data networks on performance, reliability and security is of much greater importance.

Recently, a few papers explored a geometric characteristic of real-life networks that had previously received little study, namely the hyperbolicity (sometimes also called the global curvature) of the network (see, e.g., [50, 62, 110, 129, 154]). It was shown that a number of data networks, including Internet application networks, web networks, collaboration networks, social networks, and others, have small hyperbolicity. It was suggested in [129] that the property, observed in real-life networks, that traffic between nodes tends to go through a relatively small core of the network, as if the shortest path between them is curved inwards, may be due to the global curvature of the network. Furthermore, the paper [110] proposes that “hyperbolicity in conjunction with other local characteristics of networks, such as the degree distribution and clustering coefficients, provide a more complete unifying picture of networks, and helps classify in a parsimonious way what is otherwise a bewildering and complex array of features and characteristics specific to each natural and man-made network.”

The hyperbolicity of a graph/network can be viewed as a measure of how close a graph is to a tree metrically; the smaller the hyperbolicity of a graph is the closer it is metrically to a tree. Recent empirical results of [50, 62, 110, 129, 154] on hyperbolicity suggest that many real-life complex networks and graphs may possess tree-like structures from a metric point of view.

In this chapter, we substantiate this claim through analysis of a collection of real data networks. We investigate a few recently introduced graph parameters, namely, the tree-distortion and the tree-stretch of a graph, the tree-length and the tree-breadth of a graph, Gromov’s hyperbolicity of a graph, and the cluster-diameter and the cluster-radius in a layering partition of a graph. All these parameters try to capture and quantify this phenomenon of being metrically close to a tree and can be used to measure the metric tree-likeness of a real-life network. Recent advances in theory (see appropriate sections for details) allow us to calculate or accurately estimate those parameters for sufficiently large networks. By examining topologies of numerous publicly available networks, we demonstrate the existence of metric tree-like structures in a wide range of large-scale networks, from communication networks to various forms of social and biological networks.

Throughout this chapter we discuss these parameters and recently established relationships between them for unweighted and undirected graphs. It turns out that all these parameters are at most constant or logarithmic factors apart from each other. Hence, a constant bound on one of them translates into a constant or almost constant bound on another. We say that a graph has a tree-like structure from a metric point of view (equivalently, is metrically tree-like) if any one of those parameters is a small constant.

Recently, paper [8] pointed out that “although large informatics graphs such as social and information networks are often thought of as having hierarchical or tree-like structure, this assumption is rarely tested, and it has proven difficult to exploit this idea in practice;

... it is not clear whether such structure can be exploited for improved graph mining and machine learning ....”

In this chapter, by bringing all those parameters together, we not only provide efficient means for detecting such metric tree-like structures in large-scale networks but also show how such structures can be used, for example, to efficiently and compactly encode approximate distance and almost shortest path information and to estimate diameters and radii of those networks quickly and accurately. Estimating distances between arbitrary vertices of a graph accurately and quickly is a fundamental primitive in many data and graph mining algorithms.

Graphs that are metrically tree-like have many algorithmic advantages. They allow efficient approximate solutions for a number of optimization problems. For example, they admit a PTAS for the Traveling Salesman Problem [119], have an efficient approximate solution for the problem of covering and packing by balls [55], admit additive sparse spanners [53, 70] and collective additive tree-spanners [73], enjoy efficient and compact approximate distance [53, 94] and routing [53, 69] labeling schemes, have efficient algorithms for fast and accurate estimations of diameters and radii [52], etc. We elaborate more on these results in appropriate sections.

This chapter is structured as follows. In Section 2.2, we describe our graph datasets.

The next four sections are devoted to analysis of corresponding parameters measuring metric tree-likeness of our graph datasets: layering partition and its cluster-diameter and cluster-radius in Section 2.3; hyperbolicity in Section 2.4; tree-distortion in Section 2.5; tree-breadth, tree-length and tree-stretch in Section 2.6. In each section we first give theoretical background on the parameter(s) and then present our experimental results.

Additionally, an overview of implications of those results is provided. In Section 2.7, we further discuss algorithmic advantages for a graph to be metrically tree-like. Finally, in

Section 2.8, we give some concluding remarks.

2.2 Datasets

Our datasets come from different domains like Internet measurements, biological datasets, web graphs, social and collaboration networks. Table 2 shows basic statistics of our graph datasets. Each graph represents the largest connected component of the original graph, as some datasets consist of one large connected component and many very small ones.

| Graph G = (V,E) | n = \|V\| | m = \|E\| | diameter diam(G) | radius rad(G) |
|---|---|---|---|---|
| PPI [108] | 1458 | 1948 | 19 | 11 |
| Yeast [43] | 2224 | 6609 | 11 | 6 |
| DutchElite [63] | 3621 | 4311 | 22 | 12 |
| EPA [1] | 4253 | 8953 | 10 | 6 |
| EVA [133] | 4475 | 4664 | 18 | 10 |
| California [112] | 5925 | 15770 | 13 | 7 |
| Erdös [29] | 6927 | 11850 | 4 | 2 |
| Routeview [4] | 10515 | 21455 | 10 | 5 |
| Homo release 3.2.99 [155] | 16711 | 115406 | 10 | 5 |
| AS Caida 20071105 [47] | 26475 | 53381 | 17 | 9 |
| Dimes 3/2010 [152] | 26424 | 90267 | 8 | 4 |
| Aqualab 12/2007-09/2008 [49] | 31845 | 143383 | 9 | 5 |
| AS Caida 20120601 [45] | 41203 | 121309 | 10 | 5 |
| itdk0304 [46] | 190914 | 607610 | 26 | 14 |
| DBLB-coauth [165] | 317080 | 1049866 | 23 | 12 |
| Amazon [165] | 334863 | 925872 | 47 | 24 |

Table 2: Graph datasets and their parameters: number of vertices, number of edges, diameter, radius.
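The preprocessing and the last two columns of Table 2 can be illustrated on a toy input. The sketch below is not the code used for the experiments; it assumes an undirected, unweighted graph stored as an adjacency dictionary, extracts the largest connected component, and obtains the diameter and radius from BFS eccentricities in O(nm) total time. All names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def largest_component(adj):
    """Adjacency dictionary of the largest connected component."""
    seen, best = set(), set()
    for v in adj:
        if v not in seen:
            component = set(bfs_distances(adj, v))
            seen |= component
            if len(component) > len(best):
                best = component
    return {v: [w for w in adj[v] if w in best] for v in best}


def diameter_and_radius(adj):
    """Maximum and minimum eccentricity of a connected graph."""
    eccentricities = [max(bfs_distances(adj, v).values()) for v in adj]
    return max(eccentricities), min(eccentricities)


# Path 1-2-3-4 plus a separate edge 5-6
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3], 5: [6], 6: [5]}
core = largest_component(graph)
diam, rad = diameter_and_radius(core)
```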

Biological Networks

PPI [108]: It is a protein-protein interaction network in the yeast Saccharomyces cerevisiae. Each node represents a protein with an edge representing an interaction between two proteins. Self loops have been removed from the original dataset. The dataset has been analyzed and described in [108].

Yeast [43]: It is a protein-protein interaction network in budding yeast. Each node represents a protein with an edge representing an interaction between two proteins. Self loops have been removed from the original dataset. The dataset has been analyzed and described in [43].

Homo [155]: It is a dataset of protein and genetic interactions in Homo sapiens (Human). Each node represents a protein or a gene. An edge represents an interaction between two proteins/genes. Parallel edges, representing different resources for an interaction, have been removed. The dataset is obtained from BioGRID, a freely accessible database/repository of physical and genetic interactions available at http://www.thebiogrid.org. The dataset has been analyzed and described in [155].

Social and Collaboration Networks

DutchElite [63]: This is data on the administrative elite in the Netherlands, April 2006. The data were collected and analyzed by De Volkskrant and Wouter de Nooy. It is a 2-mode network representing persons’ membership in the administrative and organizational bodies in the Netherlands in 2006. A node represents either a person or an organizational body. An edge exists between two nodes if the person node belongs to the organization node.

EVA [133]: It is a network of interconnection between corporations where an edge exists between two companies (vertices) if one of them is the owner of the other company.

Erdös [29]: It is a collaboration network of the mathematician Paul Erdös. Each vertex represents an author with an edge representing a paper co-authorship between two authors.

DBLB-coauth [165]: It is a co-authorship network of the DBLP computer science bibliography. Vertices of the network represent authors with edges connecting two authors if they published at least one paper together.

Web Graphs

EPA [1]: It is a dataset representing pages linking to www.epa.gov, obtained from Jon Kleinberg’s web page, http://www.cs.cornell.edu/courses/cs685/2002fa/. The pages were constructed by expanding a 200-page response set to a search engine query, as in the hub/authority algorithm. This data was collected some time ago, so a number of the links may no longer exist. The vertices of this graph dataset represent web pages with edges representing links. The graph was originally directed. We ignored the direction of edges to obtain an undirected graph version of the dataset.

California [112]: This graph dataset was also constructed by expanding a 200-page response set to a search engine query ‘California’, as in the hub/authority algorithm. The dataset was obtained from Jon Kleinberg’s page, http://www.cs.cornell.edu/courses/cs685/2002fa/. The vertices of this graph dataset represent web pages with edges representing links between them. The graph was originally directed. We ignored the direction of edges to obtain an undirected graph version of the dataset.

Internet Measurements Networks

Routeview [4]: It is an Autonomous System (AS) graph obtained by University of Oregon

Route-views project using looking glass data and routing registry. A vertex in the dataset

represents an AS with an edge linking two vertices if there is at least one physical link

between them.

AS Caida [45,47]: These are datasets of the Internet Autonomous Systems (AS) relationships derived from BGP table snapshots taken at 24-hour intervals over a 5-day period by CAIDA. The AS relationships available are customer-provider (and provider-customer, in the opposite direction), peer-to-peer, and sibling-to-sibling.

Dimes 3/2010 [152]: It is an AS relationship graph of the Internet obtained from Dimes. The Dimes project performs traceroutes and pings from volunteer agents (about 1000 agent computers) to infer AS relationships. A weekly AS snapshot is available. The dataset Dimes 3/2010 represents a snapshot aggregated over the month of March, 2010. It provides the set of AS level nodes and edges that were found in that month and were seen at least twice.

Aqualab [49]: Peer-to-peer clients are used to collect traceroute paths which are used to infer AS interconnections. Probes were made between December 2007 and September 2008 from approximately 992,000 P2P users in 3,700 ASes.

Itdk [46]: This is a dataset of the Internet router-level graph, where each vertex represents a router with an edge between two vertices if there is a link between the corresponding routers. The dataset snapshot is computed from ITDK0304 skitter and iffinder measurements. The dataset is provided by CAIDA for April 2003 (see http://www.caida.org/data/active/internet-topology-data-kit).

Information network

Amazon [165]: It is an Amazon product co-purchasing network. The vertices of the network represent products purchased from the Amazon website and the edges link “commonly/frequently” co-purchased products.

2.3 Layering Partition, its Cluster-Diameter and Cluster-Radius

Layering partition is a graph decomposition procedure that has been introduced in [39, 51] and has been used in [39, 51, 54] and [21] for embedding graph metrics into trees. It provides a central tool in our investigation.

A layering of a graph G = (V,E) with respect to a start vertex s is the decomposition of V into the layers (spheres) L^i = {u ∈ V : dG(s, u) = i}, i = 0, 1, . . . , r. A layering partition LP(G, s) = {L^i_1, . . . , L^i_{p_i} : i = 0, 1, . . . , r} of G is a partition of each layer L^i into clusters L^i_1, . . . , L^i_{p_i} such that two vertices u, v ∈ L^i belong to the same cluster L^i_j if and only if they can be connected by a path outside the ball B_{i−1}(s) of radius i − 1 centered at s. See Figure 2 for an illustration. A layering partition of a graph can be constructed in O(n + m) time (see [51]).
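The definition above can be turned into a short procedure. The following is a sketch under stated assumptions, not the linear-time algorithm of [51]: the graph is connected, unweighted, and stored as an adjacency dictionary, and the layers are processed from the farthest one back to s while a union-find structure maintains the connected components of the subgraph induced by the layers seen so far. The near-linear union-find is a substitution made for brevity, and all names are hypothetical.

```python
from collections import deque, defaultdict


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph {vertex: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def layering_partition(adj, s):
    """Return (dist, clusters); clusters[i] lists the clusters (sets) of layer L^i."""
    dist = bfs_distances(adj, s)
    r = max(dist.values())
    layers = [[] for _ in range(r + 1)]
    for v, d in dist.items():
        layers[d].append(v)

    parent = {}  # union-find over the vertices of layers i, i+1, ..., r

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    clusters = [None] * (r + 1)
    for i in range(r, -1, -1):
        for v in layers[i]:
            parent[v] = v
        for v in layers[i]:
            for w in adj[v]:
                if w in parent:  # w lies outside the ball B_{i-1}(s)
                    parent[find(v)] = find(w)
        # Vertices of L^i in the same component outside B_{i-1}(s) form a cluster.
        groups = defaultdict(set)
        for v in layers[i]:
            groups[find(v)].add(v)
        clusters[i] = list(groups.values())
    return dist, clusters


# A tree (every cluster is a single vertex) and a 6-cycle (one cluster per layer)
tree = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
cycle6 = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
_, tree_clusters = layering_partition(tree, 0)
_, cycle_clusters = layering_partition(cycle6, 0)
```

On the tree, vertices 1 and 2 fall into different clusters because every path between them passes through s; on the cycle, vertices 1 and 5 share a cluster because the path 1-2-3-4-5 avoids s.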

Figure 2: Layering partition and associated constructs. (a) Layering of graph G with respect to s. (b) Clusters of the layering partition LP(G, s). (c) Layering tree Γ(G, s). (d) Canonic tree H obtained from the layering partition.

A layering tree Γ(G, s) of a graph G with respect to a layering partition LP(G, s) is the graph whose nodes are the clusters of LP(G, s) and two nodes C = L^i_j and C′ = L^{i′}_{j′} are adjacent in Γ(G, s) if and only if there exist a vertex u ∈ C and a vertex v ∈ C′ such that uv ∈ E. It was shown in [39] that the graph Γ(G, s) is always a tree and, given a start vertex s, can be constructed in O(n + m) time [51]. Note that, for a fixed start vertex s ∈ V , the layering partition LP(G, s) of G and its tree Γ(G, s) are unique.
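Given the clusters, the adjacency rule for Γ(G, s) is a single pass over the edges of G. The sketch below assumes the clusters of LP(G, s) are already available as a flat list of vertex sets (here written out by hand for a 6-cycle with s = 0); the function name and the index-based node labels are illustrative choices.

```python
def layering_tree(adj, clusters):
    """Edges of Γ(G, s) as pairs of cluster indices.

    Two clusters are adjacent if some edge of G has one endpoint in each.
    """
    owner = {v: k for k, cluster in enumerate(clusters) for v in cluster}
    edges = set()
    for u in adj:
        for v in adj[u]:
            a, b = owner[u], owner[v]
            if a != b:
                edges.add((min(a, b), max(a, b)))
    return edges


# 6-cycle 0-1-2-3-4-5-0 with start vertex s = 0; its layering partition has
# one cluster per layer: {0}, {1, 5}, {2, 4}, {3}.
cycle6 = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
clusters = [{0}, {1, 5}, {2, 4}, {3}]
gamma_edges = layering_tree(cycle6, clusters)
```

For this input Γ(G, s) is a path on four nodes, so it has one edge fewer than it has nodes, as expected of a tree.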

The cluster-diameter ∆s(G) of layering partition LP(G, s) with respect to vertex s is the largest diameter of a cluster in LP(G, s), i.e., ∆s(G) = max_{C ∈ LP(G,s)} max_{u,v ∈ C} dG(u, v). The cluster-diameter ∆(G) of a graph G is the minimum cluster-diameter over all layering partitions of G, i.e., ∆(G) = min_{s ∈ V} ∆s(G).

The cluster-radius Rs(G) of layering partition LP(G, s) with respect to a vertex s is the smallest number r such that for any cluster C ∈ LP(G, s), there is a vertex v ∈ V with C ⊆ Br(v). The cluster-radius R(G) of a graph G is the minimum cluster-radius over all layering partitions of G, i.e., R(G) = min_{s ∈ V} Rs(G).

Clearly, in view of tree Γ(G, s) of G, the smaller the parameters ∆s(G) and Rs(G) of G are, the closer the graph G is to a tree metrically.
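Both parameters follow directly from their definitions once all pairwise distances in G are known. The sketch below assumes a connected unweighted graph as an adjacency dictionary and the clusters of LP(G, s) as a list of vertex sets (again written out by hand for a 6-cycle); it runs one BFS per vertex, in line with an O(nm) bound, and its names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph {vertex: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def cluster_diameter_and_radius(adj, clusters):
    """Return (Delta_s, R_s) for the given clusters, with distances measured in G."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    # Largest distance in G between two vertices of the same cluster.
    delta_s = max(dist[u][v] for c in clusters for u in c for v in c)
    # Smallest r such that every cluster fits in a ball B_r(x) for some vertex x of G.
    radius_s = max(min(max(dist[x][u] for u in c) for x in adj) for c in clusters)
    return delta_s, radius_s


# 6-cycle with s = 0: clusters {0}, {1, 5}, {2, 4}, {3}
cycle6 = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
clusters = [{0}, {1, 5}, {2, 4}, {3}]
delta_s, radius_s = cluster_diameter_and_radius(cycle6, clusters)
```

Here the cluster {1, 5} has diameter 2 in G but is covered by the ball of radius 1 around vertex 0, which does not belong to the cluster.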

Finding the cluster-diameter ∆s(G) and cluster-radius Rs(G) for a given layering partition LP(G, s) of a graph G requires O(nm) time¹, although the construction of the layering partition LP(G, s) itself, for a given vertex s, takes only O(n + m) time. Since the diameter of any set is at least its radius and at most twice its radius, we have the following inequality:

Rs(G) ≤ ∆s(G) ≤ 2Rs(G).

¹ The parameters ∆(G) and R(G) can also be computed in total O(nm) time for any graph G.

In Table 3, we show empirical results on layering partitions obtained for datasets described in Section 2.2. For each graph dataset G = (V,E), we randomly selected a start vertex s and built layering partition LP(G, s) of G with respect to s. For each dataset, Table 3 shows the cluster-diameter ∆s(G), the number of clusters in layering partition LP(G, s) and the average diameter of clusters in LP(G, s). It turns out that all graph datasets have small average diameter of clusters. Most clusters have diameter 0 or 1, i.e., they are essentially cliques (=complete subgraphs) of G. For most datasets, more than 95% of clusters are cliques.

| Graph G = (V,E) | n = \|V\| | diameter diam(G) | # of clusters in LP(G, s) | cluster-diameter ∆s(G) | average diameter of clusters in LP(G, s) | % of clusters having diameter 0 or 1 (i.e., cliques) |
|---|---|---|---|---|---|---|
| PPI | 1458 | 19 | 1017 | 8 | 0.118977384 | 97.05014749% |
| Yeast | 2224 | 11 | 1838 | 6 | 0.119575699 | 96.33558341% |
| DutchElite | 3621 | 22 | 2934 | 10 | 0.070211316 | 98.02317655% |
| EPA | 4253 | 10 | 2523 | 6 | 0.06698375 | 98.5731272% |
| EVA | 4475 | 18 | 4266 | 9 | 0.031879981 | 99.2030005% |
| California | 5925 | 13 | 2939 | 8 | 0.092208234 | 97.141885% |
| Erdös | 6927 | 4 | 6288 | 4 | 0.001113232 | 99.9681934% |
| Routeview | 10515 | 10 | 6702 | 6 | 0.063264697 | 98.4482244% |
| Homo release 3.2.99 | 16711 | 10 | 6817 | 5 | 0.03432595 | 99.2518703% |
| AS Caida 20071105 | 26475 | 17 | 17067 | 6 | 0.056424679 | 98.5527626% |
| Dimes 3/2010 | 26424 | 8 | 16065 | 4 | 0.056582633 | 98.5434174% |
| Aqualab 12/2007-09/2008 | 31845 | 9 | 16287 | 6 | 0.05826733 | 98.5816909% |
| AS Caida 20120601 | 41203 | 10 | 26562 | 6 | 0.055568105 | 98.5731496% |
| itdk0304 | 190914 | 26 | 89856 | 11 | 0.270377048 | 91.3851051% |
| DBLB-coauth | 317080 | 23 | 99828 | 11 | 0.45350002 | 92.97091% |
| Amazon | 334863 | 47 | 72278 | 21 | 0.489056144 | 86.049697% |

Table 3: Layering partitions of the datasets and their parameters. ∆s(G) is the largest diameter of a cluster in LP(G, s), where s is a randomly selected start vertex. For all datasets, the average diameter of a cluster is between 0 and 1. For most datasets, more than 95% of clusters are cliques.

To get a better picture of the overall distribution of diameters of clusters, in Table 4, we show the frequencies of diameters of clusters for three sample datasets: PPI, Yeast, and AS Caida 20071105. It is interesting to note that, in all datasets, the clusters with large diameters induce a connected subtree in the tree Γ(G, s). For example, in PPI, the cluster with diameter 8 is adjacent in Γ(G, s) to all clusters with diameters 6 and 5. This may indicate that all those clusters are part of the well connected network core.

(a) PPI

| diameter of a cluster | frequency | relative frequency |
|---|---|---|
| 0 | 966 | 0.9499 |
| 1 | 21 | 0.0206 |
| 2 | 14 | 0.0138 |
| 3 | 5 | 0.0049 |
| 4 | 5 | 0.0049 |
| 5 | 1 | 0.0001 |
| 6 | 4 | 0.0039 |
| 7 | 0 | 0 |
| 8 | 1 | 0.0001 |

(b) Yeast

| diameter of a cluster | frequency | relative frequency |
|---|---|---|
| 0 | 981 | 0.946 |
| 1 | 18 | 0.0174 |
| 2 | 23 | 0.0223 |
| 3 | 6 | 0.0058 |
| 4 | 5 | 0.0048 |
| 5 | 2 | 0.0019 |
| 6 | 2 | 0.0019 |

(c) AS Caida 20071105

| diameter of a cluster | frequency | relative frequency |
|---|---|---|
| 0 | 16459 | 0.9644 |
| 1 | 361 | 0.0216 |
| 2 | 174 | 0.0102 |
| 3 | 46 | 0.0027 |
| 4 | 21 | 0.0012 |
| 5 | 4 | 0.0002 |
| 6 | 2 | 0.0001 |

Table 4: Frequency of diameters of clusters in layering partition LP(G, s) (three datasets).

Most of the graph parameters discussed in this paper could be related to a special tree H introduced in [54] and produced from a layering partition of a graph G.

Canonic tree H: A tree H = (V,F) of a graph G = (V,E), called a canonic tree of G, is constructed from a layering partition LP(G, s) of G by identifying for each cluster C = L^i_j ∈ LP(G, s) an arbitrary vertex x_C ∈ L^{i−1} which has a neighbor in C = L^i_j and by making x_C adjacent in H with all vertices v ∈ C (see Figure 2d for an illustration). Vertex x_C is called the support vertex for cluster C = L^i_j. It was shown in [54] that tree H for a graph G can be constructed in O(n + m) total time.

The following statement from [54] relates the cluster-diameter of a layering partition of G with embedability of graph G into the tree H.

Proposition 1 ([54]). For every graph G = (V,E) and any vertex s of G,

∀x, y ∈ V, dH (x, y) − 2 ≤ dG(x, y) ≤ dH (x, y) + ∆s(G).
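The construction of H and the two-sided bound of Proposition 1 can be checked on a small example. The sketch below rests on several assumptions: G is connected and unweighted, the edges of H are treated as having unit length, the clusters are computed by a simple layer-by-layer component search rather than the linear-time procedure, and the support vertex of each cluster is whichever eligible vertex is encountered first. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs(adj, source):
    """Hop distances from `source` in an unweighted graph {vertex: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def layering_clusters(adj, s):
    """Clusters of LP(G, s) as (layer index, vertex set); simple, not linear-time."""
    dist = bfs(adj, s)
    clusters = []
    for i in range(max(dist.values()) + 1):
        # Subgraph induced by the vertices outside the ball B_{i-1}(s).
        outside = {v: [w for w in adj[v] if dist[w] >= i] for v in adj if dist[v] >= i}
        todo = {v for v in adj if dist[v] == i}
        while todo:
            component = set(bfs(outside, next(iter(todo))))
            cluster = component & todo
            clusters.append((i, cluster))
            todo -= cluster
    return dist, clusters


def canonic_tree(adj, s):
    """Canonic tree H: join each cluster to one support vertex in the previous layer."""
    dist, clusters = layering_clusters(adj, s)
    tree = {v: [] for v in adj}
    for i, cluster in clusters:
        if i == 0:
            continue  # the cluster {s} has no support vertex
        support = next(w for v in cluster for w in adj[v] if dist[w] == i - 1)
        for v in cluster:
            tree[support].append(v)
            tree[v].append(support)
    return tree, clusters


def satisfies_proposition_1(adj, s):
    """Check d_H - 2 <= d_G <= d_H + Delta_s(G) for all pairs of vertices."""
    tree, clusters = canonic_tree(adj, s)
    d_g = {v: bfs(adj, v) for v in adj}
    d_h = {v: bfs(tree, v) for v in adj}
    delta_s = max(d_g[u][v] for _, c in clusters for u in c for v in c)
    return all(
        d_h[x][y] - 2 <= d_g[x][y] <= d_h[x][y] + delta_s
        for x, y in combinations(adj, 2)
    )


cycle6 = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
tree_h, _ = canonic_tree(cycle6, 0)
holds = satisfies_proposition_1(cycle6, 0)
```

The choice of support vertex is arbitrary in the definition, so different runs may produce different trees H for the same graph; the bound is stated for every such choice.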

The above proposition shows that the distortion of embedding of a graph G into tree H is additively bounded by ∆s(G), the largest diameter of a cluster in a layering partition of G. This result confirms that the smaller the cluster-diameter ∆s(G) (cluster-radius Rs(G)) of G is, the closer the graph G is to a tree metric. Note that trees have cluster-diameter and cluster-radius equal to 0. Results similar to Proposition 1 were used in [39] to embed a chordal graph to a tree with an additive distortion at most 2, in [51] to embed a k-chordal graph to a tree with an additive distortion at most k/2 + 2, and in [54] to obtain a 6-approximation algorithm for the problem of optimal non-contractive embedding of an unweighted graph metric into a weighted tree metric. For every chordal graph G (a graph whose largest induced cycles have length 3), ∆s(G) ≤ 3 and Rs(G) ≤ 2 hold [39]. For every k-chordal graph G (a graph whose largest induced cycles have length k), ∆s(G) ≤ k/2 + 2 holds [51]. For every graph G embeddable non-contractively into a (weighted) tree with multiplicative distortion α, ∆s(G) ≤ 3α holds [54]. See Section 2.5 for more on this topic.

2.4 Hyperbolicity

δ-Hyperbolic metric spaces were defined by M. Gromov [99] in 1987 via a simple 4-point condition: for any four points u, v, w, x, the two larger of the distance sums d(u, v) + d(w, x), d(u, w) + d(v, x), d(u, x) + d(v, w) differ by at most 2δ. They play an important role in geometric group theory and the geometry of negatively curved spaces, and have recently become of interest in several domains of computer science, including algorithms and networking. For example, (a) it has been shown empirically in [154] (see also [6]) that the Internet topology embeds with better accuracy into a hyperbolic space than into a Euclidean space of comparable dimension, (b) every connected finite graph has an embedding in the hyperbolic plane so that the greedy routing based on the virtual coordinates obtained from this embedding is guaranteed to work (see [115]). A connected graph G = (V,E) equipped with the standard graph metric dG is δ-hyperbolic if the metric space (V, dG) is δ-hyperbolic.

More formally, let G be a graph and u, v, w and x be four arbitrary vertices of G. Denote by S1, S2, S3 the three distance sums dG(u, v) + dG(w, x), dG(u, w) + dG(v, x) and dG(u, x) + dG(v, w), sorted in non-decreasing order S1 ≤ S2 ≤ S3. Define the hyperbolicity of a quadruplet u, v, w, x as δ(u, v, w, x) = (S3 − S2)/2. Then the hyperbolicity δ(G) of a graph G is the maximum hyperbolicity over all possible quadruplets of G, i.e.,

δ(G) = max{δ(u, v, w, x) : u, v, w, x ∈ V}.

δ-Hyperbolicity measures the local deviation of a metric from a tree metric; a metric is

a tree metric if and only if it has hyperbolicity 0. Note that chordal graphs, mentioned in

Section 2.3, have hyperbolicity at most 1 [42], while k-chordal graphs have hyperbolicity

at most k/4 [163].

In Table 5, we show the hyperbolicities of most of our graph datasets. The computation of hyperbolicity is a costly operation. We omitted only three very large graph datasets, for which the computation would take too long. The best known algorithm to calculate hyperbolicity has time complexity O(n^3.69), where n is the number of vertices in the graph; it was proposed in [92] and involves matrix multiplications. This algorithm still has a long running time on large graphs and is hard to implement. The authors of [92] also propose a 2-approximation algorithm for calculating hyperbolicity that runs in O(n^2.69) time and a 2 log2 n-approximation algorithm that runs in O(n^2) time. In our computations, we used the naive algorithm, which calculates the exact hyperbolicity of a given graph in O(n^4) time by calculating the hyperbolicities of its quadruplets. It is easy to show that the hyperbolicity of a graph is realized on one of its biconnected components. Thus, for very large graphs, we needed to check hyperbolicities only for quadruplets coming from the same biconnected component. Additionally, we used an algorithm by Cohen et al. from [58], which has O(n^4) time complexity but performs well in practice as it prunes the search space of quadruplets.
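The naive O(n^4) computation mentioned above can be sketched as follows. The sketch assumes a connected unweighted graph stored as a dictionary from each vertex to the set of its neighbors; the function names are hypothetical, and no pruning (as in [58]) or decomposition into biconnected components is attempted. Only quadruplets of distinct vertices are enumerated, since a quadruplet with a repeated vertex has hyperbolicity 0 by the triangle inequality.

```python
from collections import deque
from itertools import combinations

def all_pairs_dist(adj):
    """All-pairs hop distances via one BFS per vertex."""
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[s] = d
    return dist

def hyperbolicity(adj):
    """Exact four-point hyperbolicity by checking every quadruplet."""
    d = all_pairs_dist(adj)
    best = 0.0
    for u, v, w, x in combinations(list(adj), 4):
        s1, s2, s3 = sorted((d[u][v] + d[w][x],
                             d[u][w] + d[v][x],
                             d[u][x] + d[v][w]))
        best = max(best, (s3 - s2) / 2)   # delta(u, v, w, x) = (S3 - S2) / 2
    return best
```

On a 4-cycle the three sums are 2, 2 and 4, so the function returns 1; on any tree it returns 0.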

Graph G = (V,E)            n = |V|    m = |E|    δ(G)
PPI                           1458       1948     3.5
Yeast                         2224       6609     2.5
DutchElite                    3621       4311     4
EPA                           4253       8953     2.5
EVA                           4475       4664     1
California                    5925      15770     3
Erdös                         6927      11850     2
Routeview                    10515      21455     2.5
Homo release 3.2.99          16711     115406     2
AS Caida 20071105            26475      53381     2.5
Dimes 3/2010                 26424      90267     2
Aqualab 12/2007-09/2008      31845     143383     2
AS Caida 20120601            41203     121309     2

Table 5: δ-hyperbolicity of the graph datasets.

It turns out that most of the quadruplets in our datasets have small δ values (see Table 6). For example, more than 96% of vertex quadruplets in the EVA and Erdös datasets have δ values equal to 0. For the remaining graph datasets in Table 6, more than 96% of the quadruplets have δ ≤ 1, indicating that all of those graphs are metrically very close to trees.

δ        PPI        Yeast        DutchElite    EPA        EVA      California    Erdös
0        0.4831     0.487015     0.54122195    0.5778     0.9973   0.49057007    0.96694
0.5      0.3634     0.450362     0             0.3655     0.0007   0.41052969    0.03278
1        0.1336     0.060844     0.42201697    0.0552     0.0020   0.09527387    0.00028
1.5      0.0179     0.001762     0             0.0015     –        0.00344690    6.80E-08
2        0.0019     0.000017     0.03642388    2.09E-05   –        0.00017945    3.64E-11
2.5      3.55E-05   2.4641E-09   0             1.37E-10   –        0.00000001    –
3        1.65E-06   –            0.00033717    –          –        1.88E-11      –
3.5      3.79E-09   –            0             –          –        –             –
4        –          –            0.00000004    –          –        –             –
% ≤ 1    98.01      99.8221      96.323891     99.84      100      99.637364     99.99999

Table 6: Relative frequency of δ-hyperbolicity of quadruplets in our graph datasets that have less than 10K vertices.

In the remaining part of this section, we discuss the theoretical relations between parameters δ(G) and ∆s(G) of a graph. In [52], the following inequality was proven.

Proposition 2 ([52]). For every n-vertex graph G and any vertex s of G,

∆s(G) ≤ 4 + 12δ(G) + 8δ(G) log2 n.

Here we complement that inequality by showing that the hyperbolicity of a graph is at most ∆s(G).

Proposition 3. For every n-vertex graph G and any vertex s of G,

δ(G) ≤ ∆s(G).

Proof. Let LP(G, s) be a layering partition of G and Γ(G, s) be the corresponding layering tree (consult Figure 2). From the construction of LP(G, s) and Γ(G, s), every cluster

C of LP(G, s) separates in G any two vertices belonging to nodes (clusters) of different

subtrees of the forest obtained from Γ(G, s) by removing node C. Note that every vertex

of G belongs to exactly one node (cluster) of the layering tree Γ(G, s).

Consider an arbitrary quadruplet x, y, z, w of vertices of G. Let X,Y,Z,W be the four nodes in Γ(G, s) (i.e., four clusters in LP(G, s)) containing vertices x, y, z, w, respectively.

In the tree Γ(G, s), consider a median node M of nodes X, Y, Z, W, i.e., a node M whose removal from Γ(G, s) leaves no connected subtree with more than two nodes from {X, Y, Z, W}. As a consequence, any connected component of graph G[V \ M] (the

graph obtained from G by removing vertices of M) cannot have more than 2 vertices out

of {x, y, z, w}. Thus, M separates at least 4 pairs out of the 6 possible pairs formed by

vertices x, y, z, w. Assume, without loss of generality, that M separates in G vertices x

and y from vertices z and w. See Figure 3 for an illustration.

Let µa be the distance from a ∈ {x, y, z, w} to its closest vertex in M. Let a, b be

a pair of vertices from {x, y, z, w}. If the vertices a, b belong to different components

of G[V \ M], then M separates a from b and therefore µa + µb ≤ dG(a, b). Since M

separates in G vertices x and y from vertices z and w, we get dG(x, z) + dG(y, w) ≥ µx + µy + µz + µw and dG(x, w) + dG(y, z) ≥ µx + µy + µz + µw. On the other hand, all three sums dG(x, z) + dG(y, w), dG(x, w) + dG(y, z) and dG(x, y) + dG(z, w) are less than or equal to µx + µy + µz + µw + 2∆s(G), since, by the triangle inequality, dG(a, b) ≤ µa + µb + ∆s(G) for every a, b ∈ {x, y, z, w}. Now, since the two larger distance sums are between µ and µ + 2∆s(G), where µ := µx + µy + µz + µw, we conclude that the difference between the two larger distance sums is at most 2∆s(G). Thus, necessarily δ(G) ≤ ∆s(G).

Figure 3: Illustration to the proof of Proposition 3. (a) M is a median node for X, Y, Z, W in Γ(G, s). (b) M separates in G vertices x and y from vertices z and w.

Combining Proposition 2 with Proposition 1, one obtains also the following interesting result relating the hyperbolicity of a graph G with additive distortion of embedding of

G to its canonic tree H.

Proposition 4 ([52]). For any graph G = (V,E) and its canonic tree H = (V,F ) the following is true:

∀u, v ∈ V, dH (u, v) − 2 ≤ dG(u, v) ≤ dH (u, v) + O(δ(G) log n).

Since a canonic tree H is constructible in linear time for a graph G, by Proposition 4, the distances in n-vertex δ-hyperbolic graphs can efficiently be approximated within an additive error of O(δ log n) by a tree metric, and this approximation is sharp (see [96,99] and [52,94]).

Graphs and general geodesic spaces with small hyperbolicities have many other algorithmic advantages. They allow efficient approximate solutions for a number of optimization problems. For example, Krauthgamer and Lee [119] presented a PTAS for the Traveling Salesman Problem when the set of cities lies in a hyperbolic metric space. Chepoi and Estellon [55] established a relationship between the minimum number of balls of radius r + 2δ covering a finite subset S of a δ-hyperbolic geodesic space and the size of the maximum r-packing of S and showed how to compute such coverings and packings in polynomial time. Chepoi et al. gave in [52] efficient algorithms for fast and accurate estimations of diameters and radii of δ-hyperbolic geodesic spaces and graphs. Additionally, Chepoi et al. showed in [53] that every n-vertex δ-hyperbolic graph has an additive O(δ log n)-spanner with at most O(δn) edges and enjoys an O(δ log n)-additive routing labeling scheme with O(δ log^2 n) bit labels and O(log δ) time routing protocol.

We elaborate more on these results in Section 2.7.

2.5 Tree-Distortion

The problem of approximating a given graph metric by a “simpler” metric is well

motivated from several different perspectives. A particularly simple metric of choice, also

favored from the algorithmic point of view, is a tree metric, i.e., a metric arising from

shortest path distance on a tree containing the given points. In recent years, a number of authors considered problems of minimum distortion embeddings of graphs into trees (see [9, 19, 21, 54]), most popular among them being a non-contractive embedding with minimum multiplicative distortion.

Let G = (V,E) be a graph. The (multiplicative) tree-distortion td(G) of G is the smallest integer α such that G admits a tree (possibly weighted and with Steiner points) with

∀u, v ∈ V, dG(u, v) ≤ dT (u, v) ≤ α dG(u, v).

The problem of finding, for a given graph G, a tree T = (V ∪ S, F ) satisfying dG(u, v) ≤ dT (u, v) ≤ td(G)dG(u, v), for all u, v ∈ V , is known as the problem of minimum distortion non-contractive embedding of graphs into trees. In a non-contractive embedding, the distance in the tree must always be larger than or equal to the distance in the graph, i.e., the tree distances “dominate” the graph distances.

It is known that this problem is NP-hard, and even more, the hardness result of [9] implies that it is NP-hard to approximate td(G) better than γ, for some small constant

γ. The best known 6-approximation algorithm using layering partition technique was recently given in [54]. It improves the previously known 100-approximation algorithm from [21] and 27-approximation algorithm from [19]. Below we will provide a short description of the method of [54].

The following proposition establishes a relationship between the tree-distortion and the cluster-diameter of a graph.

Proposition 5 ([54]). For every graph G and any vertex s, ∆s(G)/3 ≤ td(G) ≤ 2∆s(G) + 2.

Proposition 5 shows that the cluster-diameter ∆s(G) of a layering partition of a graph G linearly bounds the tree-distortion td(G) of G.

Combining Proposition 5 and Proposition 1, the following result is obtained.

Proposition 6 ([54]). For any graph G = (V,E) and its canonic tree H = (V,F ) the

following is true:

∀u, v ∈ V, dH (u, v) − 2 ≤ dG(u, v) ≤ dH (u, v) + 3 td(G).

Surprisingly, a multiplicative distortion turned into an additive distortion. Furthermore, while a tree T = (V ∪ S, F) satisfying dG(u, v) ≤ dT(u, v) ≤ td(G) dG(u, v), for all u, v ∈ V, is NP-hard to find, a canonic tree H of G can be constructed in O(m) time (where m = |E|).

By assigning proper weights to edges of a canonic tree H or adding at most n = |V |

new Steiner points to H, the authors of [54] achieve a good non-contractive embedding of

a graph G into a tree. Recall that a canonic tree H = (V, F) of G = (V, E) is constructed in the following way: identify for each cluster C = L^i_j ∈ LP(G, s) of a layering partition LP(G, s) of G an arbitrary vertex xC ∈ Li−1 which has a neighbor in C = L^i_j and make xC adjacent in H with all vertices v ∈ C (see Figure 4a). Note that H is an unweighted

tree, without any Steiner points, and resembles a BFS-tree of G. Two other trees for G

are constructed as follows.

Tree Hℓ : Tree Hℓ = (V, F, ℓ) is obtained from H by assigning uniformly the weight

ℓ = max{dG(u, v): uv is an edge of H} to all edges of H. So, Hℓ is a uniformly weighted tree without Steiner points. It turns out that G embeds in tree Hℓ non-contractively.

Note that, although the topology of the tree Hℓ can be determined in O(m) time (Hℓ is

isomorphic to H), computation of the weight ℓ requires O(nm) time. Thus, the tree Hℓ

is constructible in O(nm) total time. See Figure 4a for an illustration.
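The O(nm) computation of the weight ℓ can be sketched as one BFS in G per vertex, followed by a scan of the tree edges. The sketch below assumes that G and H are given as dictionaries from vertices to neighbor sets over the same vertex set; the function names are hypothetical. The helper is_non_contractive checks dG(u, v) ≤ ℓ · dH(u, v) by brute force; the inequality holds for the weight ℓ defined above because every edge xy of H satisfies dG(x, y) ≤ ℓ, so the triangle inequality applied along the tree path gives the bound.

```python
from collections import deque

def bfs_dist(adj, s):
    """Hop distances from s in an unweighted graph {vertex: set(neighbors)}."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def uniform_weight(adj, tree):
    """l = max graph distance between the two ends of an edge of the tree."""
    ell = 0
    for u in tree:
        dG = bfs_dist(adj, u)            # one BFS in G per vertex: O(nm) overall
        for v in tree[u]:
            ell = max(ell, dG[v])
    return ell

def is_non_contractive(adj, tree, ell):
    """Check dG(u, v) <= ell * dH(u, v) for all pairs (brute force)."""
    for u in adj:
        dG, dH = bfs_dist(adj, u), bfs_dist(tree, u)
        if any(dG[v] > ell * dH[v] for v in adj):
            return False
    return True
```

For the 6-cycle 0, 1, ..., 5 and the tree with edges 0-1, 0-5, 1-2, 1-4, 2-3 (one possible canonic tree for s = 0), the edge 1-4 joins vertices at graph distance 3, so ℓ = 3.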

Tree H′ℓ: Tree H′ℓ = (V ∪ S, F′, ℓ) is obtained from H by first introducing one Steiner point pC for each cluster C := L^i_j and adding an edge between each vertex of C and pC and an edge between pC and the support vertex xC for C, and then by assigning uniformly the weight ℓ = (1/2) max{∆s(G), max{dG(u, v) : uv is an edge of H}} to all edges of the obtained tree. So, H′ℓ is a uniformly weighted tree with at most O(n) Steiner points. Again, G embeds into tree H′ℓ non-contractively and H′ℓ can be obtained in O(nm) total time. See Figure 4b for an illustration.

Figure 4: Embedding into trees H, Hℓ and H′ℓ. (a) Topology of trees H and Hℓ. (b) Topology of tree H′ℓ. Squares denote Steiner points.

Constructed trees have the following distance properties (for comparison reasons, we

include also the results for H mentioned earlier).

Proposition 7 ([54]). Let G = (V,E) be a graph, s be an arbitrary vertex, α = td(G), ∆s = ∆s(G), and H, Hℓ, H′ℓ be trees as described above. Then, for any two vertices x and y of G, the following are true:

dH(x, y) − 2 ≤ dG(x, y) ≤ dH(x, y) + ∆s,

dH(x, y) − 2 ≤ dG(x, y) ≤ dH(x, y) + 3α,

dG(x, y) ≤ dHℓ(x, y) ≤ (∆s + 1)(dG(x, y) + 2),

dG(x, y) ≤ dHℓ(x, y) ≤ max{3α − 1, 2α + 1}(dG(x, y) + 2),

dG(x, y) ≤ dH′ℓ(x, y) ≤ (∆s + 1)(dG(x, y) + 1),

dG(x, y) ≤ dH′ℓ(x, y) ≤ 3α(dG(x, y) + 1).

As pointed out in [54], tree H′ℓ provides a 6-approximate solution to the problem of minimum distortion non-contractive embedding of a graph into a tree.

In our empirical study, we analyze embeddings of our graph datasets into each of

these three trees and measure how closely these graph datasets resemble a tree from this

perspective. We compute the following measures:

- maximum distortion right := max{dT(u, v)/dG(u, v) : u, v ∈ V, dT(u, v) > dG(u, v) > 0};

- maximum distortion left := max{dG(u, v)/dT(u, v) : u, v ∈ V, dG(u, v) > dT(u, v) > 0};

- average distortion right := avg{dT(u, v)/dG(u, v) : u, v ∈ V, dT(u, v) > dG(u, v) > 0};

- average distortion left := avg{dG(u, v)/dT(u, v) : u, v ∈ V, dG(u, v) > dT(u, v) > 0};

- average relative distortion := avg{|dT(u, v) − dG(u, v)|/dG(u, v) : u, v ∈ V};

- distance-weighted average distortion := (1/Σu,v∈V dG(u, v)) · Σu,v∈V (dG(u, v) · dT(u, v)/dG(u, v)) = (Σu,v∈V dT(u, v)) / (Σu,v∈V dG(u, v)).
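The measures listed above can be computed from two distance tables, as in the following sketch. It assumes that dG and dT are nested dictionaries holding the graph and tree distances, and that averages are taken over unordered pairs of distinct vertices (so dG(u, v) > 0); the function name and the keys of the returned dictionary are hypothetical. Measures over an empty set of pairs are reported as None.

```python
from itertools import combinations

def distortion_measures(vertices, dG, dT):
    """Distortion statistics of a tree metric dT with respect to a graph metric dG."""
    right, left, relative = [], [], []
    sum_G = sum_T = 0
    for u, v in combinations(vertices, 2):
        g, t = dG[u][v], dT[u][v]
        sum_G += g
        sum_T += t
        relative.append(abs(t - g) / g)
        if t > g:                      # right pair: the tree stretches the distance
            right.append(t / g)
        elif g > t:                    # left pair: the tree shrinks the distance
            left.append(g / t)

    def avg(values):
        return sum(values) / len(values) if values else None

    return {
        "max_right": max(right, default=None),
        "max_left": max(left, default=None),
        "avg_right": avg(right),
        "avg_left": avg(left),
        "avg_relative": avg(relative),
        "distance_weighted_avg": sum_T / sum_G,
    }
```

For example, for the 4-cycle 0, 1, 2, 3 and its spanning path 0-1-2-3, the only distorted pair is (0, 3), with graph distance 1 and tree distance 3, which gives maximum distortion right 3, no left pairs, average relative distortion 1/3 and distance-weighted average distortion 10/8 = 1.25.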

We call a pair of distinct vertices u, v of G = (V,E) a right pair with respect to tree H = (V, F) if dG(u, v) < dH(u, v). If dH(u, v) < dG(u, v), then they are called a left pair. Note that G has no left pairs with respect to trees Hℓ and H′ℓ. Hence, in the case of trees Hℓ and H′ℓ, we talk only about maximum distortion, average distortion, average relative distortion and distance-weighted average distortion. Distance-weighted average distortion is used in the literature when distortion of distant pairs of vertices is more important than that of close pairs, as it gives larger weight values to distortion of distant pairs (see [109]).

Clearly, any tree graph would have maximum distortion, average relative distortion and distance-weighted average distortion equal to 1, 0 and 1, respectively.

Graph                  average     max         % of left   average     max         % of right  % of pairs   average      distance-weighted
                       distortion  distortion  pairs       distortion  distortion  pairs       dT = dG      relative     average
                       left        left        (round.)    right       right       (round.)    (round.)     distortion   distortion
PPI                    1.50159     7           70.5        1.34140     3           9.1         20.4         0.24669      0.790311
Yeast                  1.48714     5           56.3        1.38989     3           12.2        31.5         0.219268     0.850311
DutchElite             1.54045     7           73.0        1.41254     3           3.9         23.1         0.252341     0.760714
EPA                    1.50416     5           44.66       1.38107     3           10.47       44.87        0.178557     0.878082
EVA                    1.29905     6           32.31       1.27780     3           14.77       52.92        0.110271     0.951626
California             1.52477     5           61.82       1.37071     3           7.92        30.25        0.227176     0.810647
Erdös                  1.35242     3           2.75        1.41097     3           8.91        88.34        0.0437277    1.02241
Routeview              1.40636     4           24.39       1.41413     3           33.34       42.28        0.205375     1.03343
Homo release 3.2.99    1.533       4           2.83        1.67827     3           25.16       72.01        0.180092     1.13402
AS Caida 20071105      1.48085     4           21.43       1.35730     3           35.42       43.15        0.192302     1.02943
Dimes 3/2010           1.53666     3           5.74        1.37247     3           44.42       49.84        0.184767     1.12555
Aqualab                1.42269     4           31.71       1.41923     3           35.75       32.54        0.241815     1.03194
AS Caida 20120601      1.34538     4           22.42       1.40429     3           20.43       57.15        0.138869     1.0068
itdk0304               1.60077     8           94.85       1.26367     3           0.55        4.60         0.331656     0.673012
DBLB-coauth            1.77416     9           95.82       1.24977     3           0.59        3.59         0.383101     0.615328
Amazon                 2.48301     19          99.17       1.20027     3           0.20        0.63         0.536656     0.536656

Table 7: Distortion results of embedding datasets into a canonic tree H.

Tables 7 and 8 show the results of embedding our graph datasets into tree H and into trees Hℓ and H′ℓ, respectively. It turns out that most of the datasets embed into tree H with average distortion (right or left, right being usually better) between 1 and 1.5. Also, many pairs of vertices enjoy exact embedding to tree H; they preserve their original graph distances (for example, around 88% of the pairs in the Erdös dataset, 72% of pairs in Homo release 3.2.99 and 57% in AS Caida 20120601 preserve their original graph distances). Comparing the results of non-contractive embeddings to trees Hℓ and H′ℓ, we observe that maximum distortions are slightly improved in H′ℓ over distortions in Hℓ, but average distortions are very much comparable. Furthermore, distance-weighted average distortions are better in Hℓ than in H′ℓ for half of the datasets. This confirms Gupta's claim in [101] that the Steiner points do not really help.

                           tree Hℓ                                          tree H′ℓ
Graph                      average     max         average     distance-    average     max         average     distance-
                           distortion  distortion  relative    weighted     distortion  distortion  relative    weighted
                                                   distortion  average                              distortion  average
                                                               distortion                                       distortion
PPI                        5.70566     21          4.70566     5.53218      5.29652     16          4.29652     5.2027
Yeast                      4.37781     15          3.37781     4.25155      3.79318     12          2.79318     3.74159
DutchElite                 5.45299     21          4.45299     5.325        6.53269     20          5.53269     6.4574
EPA                        4.50619     15          3.50619     4.39041      4.06901     12          3.06901     3.99447
EVA                        5.83084     18          4.83084     5.70976      7.77752     18          6.77752     7.65544
California                 4.15785     15          3.15785     4.05324      4.98668     16          3.98668     4.92935
Erdös                      3.08843     9           2.08843     3.06724      3.06705     8           2.06705     3.05622
Routeview                  4.28302     12          3.28302     4.13371      4.80363     12          3.80363     4.66503
Homo release 3.2.99        4.64504     12          3.64504     4.53609      3.96703     10          2.96703     3.94713
AS Caida 20071105          4.24314     12          3.24314     4.11772      4.76795     12          3.76795     4.65617
Dimes 3/2010               3.43833     9           2.43833     3.37664      3.35917     8           2.35917     3.32159
Aqualab 12/2007-09/2008    4.23183     12          3.23183     4.12775      4.54116     12          3.54116     4.4587
AS Caida 20120601          4.10547     12          3.10547     4.0272       4.53051     12          3.53051     4.4896
itdk0304                   5.370078    24          4.37008     5.3841       5.710122    22          4.71012     5.82908
DBLB-coauth                5.57869     27          4.57869     5.53795      5.12724     22          4.12724     5.14932
Amazon                     8.81911     57          7.81911     8.78382      7.87004     42          6.87004     7.95201

Table 8: Distortion results of non-contractive embedding of datasets into trees Hℓ and H′ℓ.

As tree H′ℓ provides a 6-approximate solution to the problem of minimum distortion non-contractive embedding of a graph into a tree, dividing by 6 the maximum distortion values in Table 8 for tree H′ℓ, we obtain a lower bound on td(G) for each graph dataset G. For example, td(G) is at least 4/3 for Erdös and Dimes 3/2010, at least 5/3 for Homo release 3.2.99, at least 2 for Yeast, EPA, Routeview, AS Caida 20071105, Aqualab 12/2007-09/2008 and AS Caida 20120601, at least 8/3 for PPI and California, at least 10/3 for DutchElite, at least 3 for EVA, at least 11/3 for itdk0304 and DBLB-coauth, and at least 7 for Amazon.

2.6 Tree-Breadth, Tree-Length and Tree-Stretch

There are two other graph parameters measuring metric tree-likeness of a graph, the tree-length tl(G) and the tree-breadth tb(G), that are based on the notion of tree-decomposition introduced by Robertson and Seymour in their work on graph minors [150]. Analysis of a few real-life networks (like Aqualab, AS Caida, Dimes) performed in [62] shows that although those networks have small hyperbolicities, they all have sufficiently large tree-width due to well connected cores. As we demonstrate below, the tree-length of those graph datasets is relatively small.

Evidently, for any graph G, 1 ≤ tb(G) ≤ tl(G) ≤ 2tb(G) holds. Hence, if one parameter is bounded by a constant for a graph G then the other parameter is bounded for G as well. Clearly, in view of tree-decomposition T(G) of a graph G, the smaller the parameters tl(G) and tb(G) of G are, the closer the graph G is to a tree metrically.

Unfortunately, while graphs with tree-length 1 (which are exactly the chordal graphs) can be recognized in linear time, the problem of determining whether a given graph has tree-length at most λ is NP-complete for every fixed λ > 1 (see [127]). Judging from this result, it is conceivable that the problem of determining whether a given graph has tree-breadth at most ρ is NP-complete, too.

The following proposition from [71] establishes a relationship between the tree-length and the cluster-diameter of a layering partition of a graph.

Proposition 8 ([71]). For every graph G and any vertex s, ∆s(G)/3 ≤ tl(G) ≤ ∆s(G) + 1.

Thus, the cluster-diameter ∆s(G) of a layering partition provides easily computable bounds for the hard to compute parameter tl(G).

One can prove similar inequalities relating the tree-breadth and the cluster-radius of a layering partition of a graph.

Proposition 9. For every graph G and any vertex s,

∆s(G)/6 ≤ Rs(G)/3 ≤ tb(G) ≤ Rs(G) + 1 ≤ ∆s(G) + 1.

Furthermore, a tree-decomposition of G with breadth at most 3tb(G) can be constructed in O(n + m) time.

Proof. The proof is similar to the proof from [71] of Proposition 8. First we show Rs(G)/3 ≤ tb(G). Let T(G) be a tree-decomposition of G with minimum breadth tb(G). Let X1X2 be an edge of T(G) and T1, T2 be the subtrees of T(G) obtained after removing the edge X1X2. It is known [65] that the set I = X1 ∩ X2 separates in G vertices belonging to bags of T1 but not to I from vertices belonging to bags of T2 but not to I. Assume that T(G) is rooted at a bag containing vertex s, the source of layering partition LP(G, s). Let C be a cluster from layer Li (i.e., C = L^j_i for some j = 1, ..., pi). Let Z be the nearest common ancestor of all bags of T(G) containing vertices of C. Let z be a vertex such that Z ⊆ B_{tb(G)}(z, G).

Figure 5: Illustration to the proof of Proposition 9.

Consider an arbitrary vertex x ∈ C. Necessarily, there is a vertex y ∈ C and two bags X and Y of T(G) containing vertices x and y, respectively, such that Z = NCA_{T(G)}(X, Y) (i.e., Z is the nearest common ancestor of X and Y in T(G)). Let P be a shortest path of G from s to x. By the separator property above, P intersects Z. See Figure 5 for an illustration. Let a be a vertex of P ∩ Z closest to s in G. Since both x and y belong to C, there exists a path Q from x to y in G using only intermediate vertices w with dG(s, w) ≥ i. Let b ∈ Q ∩ Z (i.e., Q intersects Z at vertex b). We have dG(s, x) = i = dG(s, a) + dG(a, x) and i ≤ dG(s, b) ≤ dG(s, a) + dG(a, z) + dG(z, b) ≤ dG(s, a) + 2tb(G).

Hence, dG(a, x) = i − dG(s, a) ≤ 2tb(G) and therefore

dG(x, z) ≤ dG(x, a) + dG(a, z) ≤ 2tb(G) + tb(G) = 3tb(G).

Thus, any vertex x of C is at distance at most 3tb(G) from z in G, implying Rs(G)/3 ≤ tb(G).

Note that, for the neighbor x′ of x on P, dG(x′, z) ≤ 3tb(G) − 1 must hold, i.e., B_{3tb(G)}(z, G) contains not only all vertices of C = L^j_i but also all neighbors of vertices of C lying in layer Li−1. This fact will be useful in the second part of this proof.

Now we show that tb(G) ≤ Rs(G) + 1. Consider tree Γ(G, s) of a layering partition

LP(G, s) and assume Γ(G, s) is rooted at node {s}. Let p(C) be the parent of node C in

Γ(G, s). Clearly, Γ(G, s) already satisfies conditions 1 and 3 of tree-decompositions and only violates condition 2, as the edges joining vertices in different (neighboring) layers are not yet covered by bags (which are the clusters in this case). We can obtain a tree-decomposition Γ′ from Γ(G, s) as follows. Γ′ will have the same structure as Γ(G, s); only the nodes of Γ(G, s) will slightly expand to cover additional edges of G and form the bags of Γ′. To each node C of Γ(G, s) (assume C ⊆ Li) we add all vertices from its parent p(C) (p(C) ⊆ Li−1) which are adjacent to vertices of C in G. This expansion of C results in a bag C+ of Γ′ which, by construction, now contains also each edge uv of G with u ∈ C ⊆ Li and v ∈ p(C) ⊆ Li−1. Thus, Γ′ satisfies conditions 1 and 2 of tree-decompositions. Also, if C ⊆ Br(z) for some vertex z and integer r, then C+ ⊆ Br+1(z) must hold. Furthermore, each vertex v of G that was in a node C now belongs to bag C+ and possibly to bags formed from children of C in Γ(G, s) (and to no other bags). Hence, all bags containing v form a star in Γ′. All these indicate that Γ′ is a tree-decomposition of G with breadth at most Rs(G) + 1, i.e., tb(G) ≤ Rs(G) + 1.

Furthermore, as we indicated in the first part of this proof, for any cluster C there is a vertex z in G such that C+ ⊆ B_{3tb(G)}(z, G). The latter implies that the tree-decomposition Γ′ obtained from Γ(G, s) has breadth at most 3tb(G). Finally, since Γ′ is constructible in linear time and Rs(G) ≤ ∆s(G) ≤ 2Rs(G) holds for every graph G, the proposition follows.
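The expansion of clusters into the bags C+ of Γ′ used in the proof can be sketched as follows. The sketch assumes a connected unweighted graph given as a dictionary from vertices to neighbor sets; the function names are hypothetical, the clusters are found with a simple union-find rather than a linear-time procedure, and the breadth is evaluated by brute force (one BFS per vertex), so it is suitable only for small graphs.

```python
from collections import deque

def bfs_dist(adj, s):
    """Hop distances from s in an unweighted graph {vertex: set(neighbors)}."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def clusters_by_layer(adj, s):
    """Clusters of the layering partition LP(G, s), keyed by layer index."""
    dist = bfs_dist(adj, s)
    r = max(dist.values())
    layers = [[v for v in dist if dist[v] == i] for i in range(r + 1)]
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    clusters = {}
    for i in range(r, -1, -1):
        for v in layers[i]:
            parent[v] = v
        for v in layers[i]:
            for w in adj[v]:
                if w in parent:          # w is at distance >= i from s
                    parent[find(v)] = find(w)
        groups = {}
        for v in layers[i]:
            groups.setdefault(find(v), []).append(v)
        clusters[i] = list(groups.values())
    return dist, clusters

def expanded_bags(adj, s):
    """Bags C+ : each cluster plus the neighbors of its vertices in the previous layer."""
    dist, clusters = clusters_by_layer(adj, s)
    bags = []
    for i, layer_clusters in clusters.items():
        for C in layer_clusters:
            bag = set(C)
            for v in C:
                bag.update(w for w in adj[v] if dist[w] == i - 1)
            bags.append(bag)
    return bags

def breadth(adj, bags):
    """Smallest r such that every bag lies in a ball of radius r of G."""
    all_dist = {z: bfs_dist(adj, z) for z in adj}
    return max(min(max(all_dist[z][v] for v in bag) for z in adj)
               for bag in bags)
```

On the 6-cycle with s = 0, the bags are {0}, {0, 1, 5}, {1, 2, 4, 5} and {2, 3, 4}, and the breadth is 2, which equals Rs(G) + 1 since every cluster of this partition has radius at most 1.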

Hence, the cluster-radius Rs(G) of a layering partition provides easily computable bounds for the tree-breadth tb(G) of a graph. In Table 9, we show the corresponding lower and upper bounds on the tree-breadth for some of our datasets. The lower bound is obtained by dividing Rs(G) by 3 and rounding up (as tb(G) is an integer); the upper bound is obtained by calculating the breadth of the tree-decomposition Γ′.

Graph G = (V,E)            Rs(G)    lower bound    upper bound
                                    on tb(G)       on tb(G)
PPI                        4        2              5
Yeast                      4        2              4
DutchElite                 6        2              6
EPA                        4        2              4
EVA                        5        2              5
California                 4        2              4
Erdös                      2        1              2
Routeview                  3        1              4
Homo release 3.2.99        3        1              3
AS Caida 20071105          3        1              3
Dimes 3/2010               2        1              2
Aqualab 12/2007-09/2008    3        1              3
AS Caida 20120601          3        1              3
itdk0304                   6        2              6
DBLB-coauth                7        3              7
Amazon                     12       4              12

Table 9: Lower and upper bounds on the tree-breadth of our graph datasets.

Reformulating Proposition 1, we obtain the following result.

Proposition 10. For any graph G = (V,E) and its canonic tree H = (V, F), the following is true:

∀u, v ∈ V, dH (u, v) − 2 ≤ dG(u, v) ≤ dH (u, v) + 3 tl(G) ≤ dH (u, v) + 6 tb(G).

Graphs with small tree-length or small tree-breadth have many other nice properties. Every n-vertex graph with tree-length tl(G) = λ has an additive 2λ-spanner with O(λn + n log n) edges and an additive 4λ-spanner with O(λn) edges, both constructible in polynomial time [70]. Every n-vertex graph G with tb(G) = ρ has a system of at most log2 n collective additive tree (2ρ log2 n)-spanners constructible in polynomial time [73]. Those graphs also enjoy a 6λ-additive routing labeling scheme with O(λ log^2 n) bit labels and O(log λ) time routing protocol [69], and a (2ρ log2 n)-additive routing labeling scheme with O(log^3 n) bit labels and O(1) time routing protocol with O(log n) message initiation time (by combining results of [73] and [78]). See Section 2.7 for some details.

Here we elaborate a little bit more on a connection established in [76] between the tree-

breadth and the tree-stretch of a graph (and the corresponding tree t-spanner problem).

The tree-stretch ts(G) of a graph G = (V,E) is the smallest number t such that G admits a spanning tree T = (V, E′) with dT(u, v) ≤ t · dG(u, v) for every u, v ∈ V. The tree T is called a tree t-spanner of G, and the problem of finding such a tree T for G is known as the tree t-spanner problem. Note that as T is a spanning tree of G, necessarily dG(u, v) ≤ dT(u, v) and E′ ⊆ E. The latter makes the tree-stretch parameter different from the tree-distortion, where new edges (not from the graph) can be used to build a tree. It is known that the tree t-spanner problem is NP-hard [44]. The best known approximation algorithms have approximation ratio of O(log n) [76,86].
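For a given spanning tree T, the smallest t for which T is a tree t-spanner is easy to evaluate. The sketch below relies on the standard observation that, in an unweighted graph, it suffices to check the edges of G: if dT(u, v) ≤ t for every edge uv of G, then dT(u, v) ≤ t · dG(u, v) for all pairs, by following a shortest path of G edge by edge. The function names are hypothetical, and the graph and the tree are assumed to be dictionaries from vertices to neighbor sets over the same vertex set. Finding a tree that minimizes this value is the NP-hard problem discussed above; the sketch only evaluates one candidate tree.

```python
from collections import deque

def bfs_dist(adj, s):
    """Hop distances from s in an unweighted graph {vertex: set(neighbors)}."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def stretch_of_tree(adj, tree):
    """Smallest t such that the spanning tree is a tree t-spanner of the graph."""
    t = 1
    for u in adj:
        dT = bfs_dist(tree, u)           # distances from u inside the tree
        for v in adj[u]:                 # only edges uv of G need to be checked
            t = max(t, dT[v])
    return t
```

For the 4-cycle 0, 1, 2, 3 and its spanning path 0-1-2-3, the edge between 0 and 3 is stretched to a tree path of length 3, so the function returns 3.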

The following two results were obtained in [76].

Proposition 11 ([76]). For every graph G, tb(G) ≤ ⌈ts(G)/2⌉ and tl(G) ≤ ts(G).

Proposition 12 ([76]). For every n-vertex graph G, ts(G) ≤ 2tb(G) log2 n. Furthermore, a spanning tree T of G with dT(u, v) ≤ 2tb(G) log2 n · dG(u, v), for every u, v ∈ V, can be constructed in polynomial time.

Proposition 12 is obtained by showing that every n-vertex graph G with tb(G) =

ρ admits a tree (2ρ log2 n)-spanner constructible in polynomial time. Together with

Proposition 11, this provides a log2 n-approximate solution for the tree t-spanner problem

in general unweighted graphs.

We conclude this section with two other inequalities establishing relations between

the tree-stretch and the tree-distortion and hyperbolicity of a graph.

Proposition 13 ([72]). For every graph G, tl(G) ≤ td(G) ≤ ts(G) ≤ 2td(G) log2 n.

Proposition 14 ([72]). For every δ-hyperbolic graph G, ts(G) ≤ O(δ log2 n).

Proposition 13 says that if a graph G is non-contractively embeddable into a tree with distortion td(G) then it is embeddable into a spanning tree with stretch at most

2td(G) log2 n. Furthermore, a spanning tree with stretch at most 2td(G) log2 n can be

constructed in polynomial time. Proposition 14 says that every δ-hyperbolic graph G

admits a tree O(δ log2 n)-spanner. Furthermore, such a spanning tree for a δ-hyperbolic

graph can be constructed in polynomial time.

2.7 Use of Metric Tree-Likeness

As we have mentioned earlier, metric tree-likeness of a graph is useful in a number of ways. Among other advantages, it allows one to design compact and efficient approximate distance labeling and routing labeling schemes, and to estimate the diameter and the radius of a graph quickly and accurately. In this section, we elaborate more on these applications. In general, low distortion embedability of a graph G into a tree T allows one to solve approximately many distance related problems on G by first solving them on the tree T and then interpreting that solution on G.

2.7.1 Approximate distance queries

Commonly, when one makes a query concerning a pair of vertices in a graph (adjacency, distance, shortest route, etc.), one needs to make a global access to the structure storing that information. A compromise to this approach is to store enough information locally in a label associated with a vertex such that the query can be answered using only the information in the labels of the two vertices in question and nothing else. The motivation for such localized data structures is surveyed and widely discussed in [95,138].

Here, we are mainly interested in the distance and routing labeling schemes introduced

by Peleg (see, e.g., [138]). Distance labeling schemes are schemes that label the vertices of a graph with short labels in such a way that the distance between any two vertices u and v can be determined or estimated efficiently by merely inspecting the labels of u and v, without using any other information. Routing labeling schemes are schemes that label the vertices of a graph with short labels in such a way that given the label of a source vertex and the label of a destination, it is possible to compute efficiently the port number of the edge from the source that heads in the direction of the destination.

It is known that n-vertex trees enjoy a distance labeling scheme where each vertex is assigned an O(log^2 n)-bit label such that given labels of two vertices the distance between them can be inferred in constant time [140]. For our datasets, we can use their canonic trees to compactly and distributively encode their approximate distance information. Given a graph dataset G, we first compute in linear time its canonic tree H. Then, we preprocess H in O(n log n) time (see [140]) to assign each vertex v ∈ V an O(log^2 n)-bit distance label. Given two vertices u, v ∈ V, we can compute in O(1) time the distance dH(u, v) from their labels and output this distance as a good estimate for the distance between u and v in G.
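The query pipeline above can be illustrated with a much simpler stand-in for the labeling scheme of [140]: root the tree once, store a parent pointer and a depth for each vertex, and answer a query by walking both vertices up to their nearest common ancestor. This is not the O(log^2 n)-bit, constant-time scheme of [140] (a query here takes time proportional to the depth of the tree and uses global arrays rather than labels); it is only a sketch of how dH(u, v) serves as the estimate for dG(u, v). The function names are hypothetical, and the tree is assumed to be a dictionary from vertices to neighbor sets.

```python
from collections import deque

def root_tree(tree, s):
    """Parent pointers and depths of a tree {vertex: set(neighbors)} rooted at s."""
    parent, depth = {s: None}, {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in tree[u]:
            if v not in depth:
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)
    return parent, depth

def tree_distance(parent, depth, u, v):
    """Distance between u and v in the rooted tree, via their nearest common ancestor."""
    d = 0
    while depth[u] > depth[v]:           # lift the deeper endpoint
        u, d = parent[u], d + 1
    while depth[v] > depth[u]:
        v, d = parent[v], d + 1
    while u != v:                        # lift both until they meet
        u, v, d = parent[u], parent[v], d + 2
    return d
```

With the tree on vertices 0, ..., 5 and edges 0-1, 0-5, 1-2, 1-4, 2-3 (a canonic tree of the 6-cycle for s = 0), the estimate for the pair (4, 5) is 3 while the true distance in the cycle is 1, which is within the additive bounds of Proposition 1.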

In Figure 6, we demonstrate how accurately canonic trees represent pairwise distances in our datasets. For a given number ϵ ≥ 1, we show how many vertex pairs had a distortion less than ϵ, i.e., pairs u, v ∈ V with $\max\{ d_H(u,v)/d_G(u,v),\ d_G(u,v)/d_H(u,v) \} < \epsilon$. We can see that H approximates distances for most vertex pairs with a high level of accuracy. Exact graph distances were preserved in H for at least 40% of pairs in 8 datasets (EPA, EVA, Erdös, Routeview, Homo, AS Caida 20071105, Dimes 3/2010 and AS Caida 20120601). At least 50% of pairs in 6 datasets have distance distortion in H less than 1.2. At least 60% of pairs in 6 datasets have distance distortion less than 1.3. At least 70% of pairs in 10 datasets have distance distortion less than 1.5. At least 80% of pairs in 14 datasets have distance distortion less than 2. At least 90% of pairs in 14 datasets have distance distortion less than 2.2. For the DBLB-coauth dataset, 80% (90%) of pairs embed into H with distortion no more than 2.2 (2.4, respectively; not shown in the table). For the

Graph G = (V,E)            Distortion
                           = 1     < 1.2   < 1.3   < 1.5   < 2     < 2.2
PPI                        20.41   37.68   47.90   65.93   90.68   96.37
Yeast                      31.51   38.45   53.22   72.30   91.03   98.55
DutchElite                 23.13   27.99   42.97   64.60   88.71   95.44
EPA                        44.87   50.83   65.50   76.52   91.82   98.68
EVA                        52.92   73.37   82.68   92.83   99.12   99.88
California                 30.25   40.21   51.89   64.53   88.97   98.06
Erdös                      88.34   88.34   89.84   96.99   99.55   99.98
Routeview                  42.28   44.75   58.17   81.94   96.40   99.85
Homo release 3.2.99        72.01   72.13   73.48   79.08   90.79   99.97
AS Caida 20071105          43.15   46.60   62.39   84.54   95.68   99.90
Dimes 3/2010               49.84   50.06   56.77   89.30   97.05   99.99
Aqualab 12/2007-09/2008    32.54   33.23   44.61   76.46   95.93   99.98
AS Caida 20120601          57.15   59.57   71.82   89.58   98.65   99.98
itdk0304                    4.60   15.18   23.67   42.54   81.98   93.55
DBLB-coauth                 3.59   12.08   17.60   30.64   67.92   83.10
Amazon                      0.63    2.67    4.57   10.16   33.10   46.53

(a) Percentage of vertex pairs whose distance was distorted only up-to a given value.

[Plot: accumulative frequency (vertical axis, 0 to 1) versus distortion (horizontal axis, 1 to 2.2), one curve per dataset: PPI, Yeast, DutchElite, EPA, EVA, California, Erdös, Routeview, Homo release 3.2.99, AS_Caida_20071105, Dimes, Aqualab, AS_Caida_20120601, itdk0304, DBLB-coauth, Amazon.]

(b) Accumulative frequency chart.

Figure 6: Distortion distribution for embedding of a graph dataset into its canonic tree H.

Amazon dataset, 80% (90%) of pairs embed into H with distortion no more than 3.2

(3.8, respectively; not shown on table).
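The per-pair distortion used in Figure 6 can be computed as in the following Python sketch. It assumes that both the graph G and the tree H are given as unweighted, connected adjacency dictionaries on the same vertex set, and it runs a BFS from every vertex, so it is only meant for small illustrative inputs. The names `bfs_distances` and `distortion_profile` are hypothetical and not taken from the experiments reported here.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph {vertex: [neighbours]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def distortion_profile(graph, tree, thresholds=(1.2, 1.3, 1.5, 2, 2.2)):
    """Fraction of pairs with distortion exactly 1, and below each threshold.

    The distortion of a pair u, v is max{d_H/d_G, d_G/d_H}, kept as an exact
    fraction so that the "= 1" column does not depend on rounding.
    """
    d_g = {v: bfs_distances(graph, v) for v in graph}
    d_h = {v: bfs_distances(tree, v) for v in graph}
    pairs = list(combinations(graph, 2))
    distortions = [
        Fraction(max(d_g[u][v], d_h[u][v]), min(d_g[u][v], d_h[u][v]))
        for u, v in pairs
    ]
    exact = Fraction(sum(d == 1 for d in distortions), len(pairs))
    below = {t: Fraction(sum(d < t for d in distortions), len(pairs))
             for t in thresholds}
    return exact, below
```

Multiplying the returned fractions by 100 gives percentages in the format of the table in Figure 6(a).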

Hence, using embeddings of our datasets into their canonic trees, we obtain a compact and efficient approximate distance labeling scheme for them. Each vertex of a graph dataset G gets an $O(\log^2 n)$-bit label from the canonic tree, and the distance between any two vertices of G can be computed with a good level of accuracy in constant time from their labels only.

2.7.2 Approximating optimal routes

First we formally define approximate routing labeling schemes. A family ℜ of graphs is said to have an l(n)-bit (s, r)-approximate routing labeling scheme if there exist a function L, labeling the vertices of each n-vertex graph in ℜ with distinct labels of up to l(n) bits, and an efficient algorithm/function f, called the routing decision or routing protocol, that, given the label of a current vertex v and the label of the destination vertex (the header of the packet), decides in time polynomial in the length of the given labels and using only those two labels, whether this packet has already reached its destination, and if not, to which neighbor of v to forward the packet. Furthermore, the routing path from any source x to any destination y produced by this scheme in a graph G from ℜ must have length at most $s \cdot d_G(x, y) + r$. For simplicity, (1, r)-approximate labeling schemes (distance or routing) are called r-additive labeling schemes, and (s, 0)-approximate labeling schemes are called s-multiplicative labeling schemes.

A very good routing labeling scheme exists for trees [157]. An n-vertex tree can be preprocessed in O(n log n) time so that each vertex is assigned an O(log n)-bit routing label. Given the label of a source vertex and the label of a destination, it is possible to compute in constant time the port number of the edge from the source that lies on the (shortest) path to the destination.

Unfortunately, a canonic tree H of a graph G is not suitable for approximately routing in G; H may have artificial edges (not coming from G) and therefore a path of H from

a source to a destination may not be available for routing in G. To reduce the problem

of routing in G to routing in a tree T , the tree T needs to be a spanning tree of G.

Hence, a spanning tree T of G with minimum stretch (i.e., a tree t-spanner of G with

t = ts(G)) would be a perfect choice. Unfortunately, finding a tree t-spanner of a graph

with minimum t is an NP-hard problem.

For our graph datasets, one can exploit the facts that they have small tree-breadth/tree-

length and/or small hyperbolicity.

If the tree-breadth of an n-vertex graph G is ρ, then, by a result from [76], G admits a tree $(2\rho \log_2 n)$-spanner constructible in polynomial time. Hence, G enjoys a $(2\rho \log_2 n)$-multiplicative routing labeling scheme with O(log n)-bit labels and an O(1)-time routing protocol (routing is essentially done in that tree spanner). Another result for graphs with tb(G) = ρ, useful for designing routing labeling schemes, is presented in [73]. It states that every n-vertex graph G with tb(G) = ρ has a system of at most $\log_2 n$ collective additive tree $(2\rho \log_2 n)$-spanners, i.e., a system $\mathcal{T}$ of at most $\log_2 n$ spanning trees of G such that for any two vertices u, v of G there is a tree T in $\mathcal{T}$ with $d_T(u, v) \le d_G(u, v) + 2\rho \log_2 n$. Furthermore, such a system $\mathcal{T}$ for G can be constructed in polynomial time [73]. By combining this with a result from [78], we obtain that every n-vertex graph G with tb(G) = ρ enjoys a $(2\rho \log_2 n)$-additive routing labeling scheme with $O(\log^3 n)$-bit labels and an O(1)-time routing protocol with O(log n) message initiation time. The approach of [78] is to assign to each vertex of G a label with $O(\log^3 n)$ bits (distance and routing labels coming from $\log_2 n$ spanning trees) and then, using the label of source vertex v and the label of destination vertex u, identify in O(log n) time the best spanning tree in $\mathcal{T}$ to route from v to u.

If the tree-length of an n-vertex graph G is λ, then, by a result from [69], G enjoys a 6λ-additive routing labeling scheme with $O(\lambda \log^2 n)$-bit labels and an O(log λ)-time routing protocol.

If the hyperbolicity of an n-vertex graph G is δ, then, by a result from [53], G enjoys an O(δ log n)-additive routing labeling scheme with $O(\delta \log^2 n)$-bit labels and an O(log δ)-time routing protocol. Note that for any graph G, the hyperbolicity of G is at most its tree-length [52].

Thus, for our graph datasets, there exists a very compact labeling scheme (at most $O(\log^2 n)$ or $O(\log^3 n)$ bits per vertex) that encodes logarithmic length routes between any pair of vertices, i.e., routes of length at most $d_G(u, v) + \min\{O(\delta \log n),\ 6\lambda,\ 2\rho \log_2 n\} \le \mathrm{diam}(G) + O(\log n) \le O(\log n)$ for each vertex pair u, v of G. The latter implies very good navigability of our graph datasets. Recall that, for our graph datasets, diam(G) ≤ O(log n) holds.

2.7.3 Approximating diameter and radius

Recall that the eccentricity of a vertex v of a graph G, denoted by ecc(v), is the maximum distance from v to any other vertex of G, i.e., $\mathrm{ecc}(v) := \max_{u \in V} d_G(v, u)$. The diameter diam(G) of G is the largest eccentricity of a vertex in G, i.e., $\mathrm{diam}(G) := \max_{v \in V} \mathrm{ecc}(v) = \max_{v,u \in V} d_G(u, v)$. The radius rad(G) of G is the smallest eccentricity of a vertex in G, i.e., $\mathrm{rad}(G) := \min_{v \in V} \mathrm{ecc}(v)$. A vertex c of G with ecc(c) = rad(G) (i.e., a smallest eccentricity vertex) is called a central vertex of G. The center C(G) of G is the set of all central vertices of G. Let also $F(v) := \{u \in V : d_G(v, u) = \mathrm{ecc}(v)\}$ be the set of vertices of G furthest from v.

Graph G = (V,E)            diameter   radius   # of BFS scans   estimated radius,
                           diam(G)    rad(G)   needed to get    or ecc(·) of a
                                               diam(G)          middle vertex
PPI                        19         11       3                12
Yeast                      11         6        3                6
DutchElite                 22         12       4                13
EPA                        10         6        2                7
EVA                        18         10       2                10
California                 13         7        2                8
Erdös                      4          2        2                3
Routeview                  10         5        2                5
Homo release 3.2.99        10         5        2                6
AS Caida 20071105          17         9        2                9
Dimes 3/2010               8          4        2                5
Aqualab 12/2007-09/2008    9          5        2                5
AS Caida 20120601          10         5        2                5
itdk0304                   26         14       2                15
DBLB-coauth                23         12       2                14
Amazon                     47         24       2                26

Table 10: Estimation of diameters and radii.

In general (even unweighted) graphs, it is still an open problem whether the diameter

and/or the radius of a graph G can be computed faster than the time needed to compute

the entire distance matrix of G (which requires O(nm) time for a general unweighted

graph). On the other hand, it is known that both the diameter and the radius of a

tree T can be calculated in linear time. That can be done by using two Breadth-First-Search (BFS) scans as follows. Pick an arbitrary vertex u of T. Run a BFS starting from u to find v ∈ F(u). Run a second BFS starting from v to find w ∈ F(v). Then $d_T(v, w) = \mathrm{diam}(T)$, i.e., v, w is a diametral pair of T, and $\mathrm{rad}(T) = \lfloor (d_T(v, w) + 1)/2 \rfloor$.

To find the center of T it suffices to take one or two adjacent middle vertices of the

(v, w)-path of T .
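The two-scan procedure for trees can be sketched in Python as follows. The adjacency-dictionary representation and the function names `bfs` and `tree_diameter_radius_center` are assumptions made for this illustration.

```python
from collections import deque


def bfs(adj, source):
    """Return (distances, BFS parents) from `source` in {vertex: [neighbours]}."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent


def tree_diameter_radius_center(tree):
    """Two BFS scans on a non-empty tree: (diameter, radius, set of central vertices)."""
    u = next(iter(tree))                     # arbitrary start vertex
    dist_u, _ = bfs(tree, u)
    v = max(dist_u, key=dist_u.get)          # v is furthest from u
    dist_v, parent = bfs(tree, v)
    w = max(dist_v, key=dist_v.get)          # w is furthest from v
    diam = dist_v[w]
    path = [w]                               # the (w, v)-path, read off the BFS parents
    while path[-1] != v:
        path.append(parent[path[-1]])
    radius = (diam + 1) // 2
    # One middle vertex if diam is even, two adjacent ones if it is odd.
    center = {path[diam // 2], path[(diam + 1) // 2]}
    return diam, radius, center
```

For example, on the path with vertices 0, 1, 2, 3, 4 the function returns diameter 4, radius 2 and center {2}.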

Interestingly, in [52], Chepoi et al. established that this approach of 2 BFS-scans can

be adapted to provide fast (in linear time) and accurate approximations of the diameter,

radius, and center of any finite set S of δ-hyperbolic geodesic spaces and graphs. In particular, for a δ-hyperbolic graph G, it was shown that if v ∈ F (u) and w ∈ F (v), then

dG(v, w) ≥ diam(G) − 2δ and rad(G) ≤ ⌊(dG(v, w) + 1)/2⌋ + 3δ. Furthermore, the center

C(G) of G is contained in the ball of radius 5δ + 1 centered at a middle vertex c of any

shortest path connecting v and w in G.

Since our graph datasets have small hyperbolicities, according to [52], few (2, 3, 4,

...) BFS-scans, each next starting at a vertex last visited by the previous scan, should

provide a pair of vertices x and y such that dG(x, y) is close to the diameter diam(G) of

G. Surprisingly (see Table 10), few BFS-scans were sufficient to get exact diameters of all of our datasets: for thirteen datasets, two BFS-scans (just like for trees) were sufficient to find the exact diameter of a graph. Two datasets needed three BFS-scans to find the diameter, and only one dataset required four BFS-scans to get the diameter. We also computed the eccentricity of a middle vertex of a longest shortest path produced by these few BFS-scans and reported this eccentricity as an estimation for the graph radius. It turned out that the eccentricity of that middle vertex was equal to the exact radius for six datasets, was only one apart from the exact radius for eight datasets, and only for two datasets was two units apart from the exact radius.
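A minimal Python sketch of such repeated BFS scans on a general unweighted connected graph is given below. The stopping rule (stop when a scan no longer improves the largest distance found, or after `max_scans` scans) and the names `bfs` and `sweep_estimates` are assumptions made for this illustration and need not match the procedure used to produce Table 10. The first returned value is always a lower bound on diam(G), and the second, being the eccentricity of an actual vertex, is always an upper bound on rad(G).

```python
from collections import deque


def bfs(adj, source):
    """Return (distances, BFS parents, last visited vertex) from `source`."""
    dist, parent, last = {source: 0}, {source: None}, source
    queue = deque([source])
    while queue:
        u = queue.popleft()
        last = u                      # BFS dequeues in non-decreasing distance
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent, last


def sweep_estimates(adj, start, max_scans=4):
    """Return (diameter estimate, eccentricity of a middle vertex)."""
    best, best_path, current = -1, None, start
    for _ in range(max_scans):
        dist, parent, last = bfs(adj, current)
        if dist[last] <= best:        # no improvement: stop scanning
            break
        best = dist[last]
        path = [last]                 # shortest path from `last` back to `current`
        while path[-1] != current:
            path.append(parent[path[-1]])
        best_path = path
        current = last                # next scan starts at the last visited vertex
    middle = best_path[len(best_path) // 2]
    ecc_middle = max(bfs(adj, middle)[0].values())
    return best, ecc_middle
```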

2.8 Conclusion

Based on solid theoretical foundations, we presented strong evidence that a number of real-life networks, taken from different domains like Internet measurements, biological datasets, web graphs, social and collaboration networks, exhibit metric tree-like structures. We investigated a few graph parameters, namely, the tree-distortion and the tree-stretch, the tree-length and the tree-breadth, Gromov's hyperbolicity, and the cluster-diameter and the cluster-radius in a layering partition of a graph, which capture and quantify this phenomenon of being metrically close to a tree. Recent advances in theory allowed us to calculate or accurately estimate these parameters for sufficiently large networks. All these parameters are at most constant or (poly)logarithmic factors apart from each other. Specifically, the graph parameters td(G), tl(G), tb(G), $\Delta_s(G)$ and $R_s(G)$ are within small constant factors of each other. Parameters ts(G) and δ(G) are within a factor of at most O(log n) of td(G), tl(G), tb(G), $\Delta_s(G)$ and $R_s(G)$. Tree-stretch ts(G) is within a factor of at most $O(\log^2 n)$ of hyperbolicity δ(G). One can summarize those relationships with the following chains of inequalities:

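$\delta(G) \le \Delta_s(G) \le O(\delta(G) \log n)$; $R_s(G) \le \Delta_s(G) \le 2R_s(G)$; $\mathrm{tb}(G) \le \mathrm{tl}(G) \le 2\,\mathrm{tb}(G)$;

$\delta(G) \le \mathrm{tl}(G) \le \mathrm{td}(G) \le \mathrm{ts}(G) \le 2\,\mathrm{tb}(G) \log_2 n \le O(\delta(G) \log^2 n)$;

$\mathrm{tl}(G) - 1 \le \Delta_s(G) \le 3\,\mathrm{tl}(G) \le 3\,\mathrm{td}(G) \le 3(2\Delta_s(G) + 2)$;

$\mathrm{tb}(G) - 1 \le R_s(G) \le 3\,\mathrm{tb}(G) \le 3\lceil \mathrm{ts}(G)/2 \rceil$.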

If one of these parameters or its average version has a small value for a large-scale network, we say that the network has a metric tree-like structure. Among these parameters, the theoretically smallest ones are δ(G), $R_s(G)$ and tb(G) (tb(G) being at most $R_s(G) + 1$). Our experiments showed that the average versions of $\Delta_s(G)$ and of td(G) also have very small values for the investigated graph datasets.

In Table 11, we provide a summary of metric tree-likeness measurements calculated for

our datasets. Figure 7 shows four important metric tree-likeness measurements (scaled)

in comparison. Figure 8 gives pairwise dependencies between those measurements (one

as a function of another).

Graph G = (V,E)            diameter  radius  cluster-   average diameter  δ(G)  Tree H       H_ℓ         H′_ℓ        cluster-
                           diam(G)   rad(G)  diameter   of clusters in          average      average     average     radius
                                             Δs(G)      LP(G, s)                distortion*  distortion  distortion  Rs(G)
                                                                                (round.)
PPI                        19        11      8          0.118977384       3.5   1.38471      5.70566     5.29652     4
Yeast                      11        6       6          0.119575699       2.5   1.32182      4.37781     3.79318     4
DutchElite                 22        12      10         0.070211316       4     1.41056      5.45299     6.53269     6
EPA                        10        6       6          0.06698375        2.5   1.26507      4.50619     4.06901     4
EVA                        18        10      9          0.031879981       1     1.13766      5.83084     7.77752     5
California                 13        7       8          0.092208234       3     1.35380      4.15785     4.98668     4
Erdös                      4         2       4          0.001113232       2     1.04630      3.08843     3.06705     2
Routeview                  10        5       6          0.063264697       2.5   1.23716      4.28302     4.80363     3
Homo release 3.2.99        10        5       5          0.03432595        2     1.18574      4.64504     3.96703     3
AS Caida 20071105          17        9       6          0.056424679       2.5   1.22959      4.24314     4.76795     3
Dimes 3/2010               8         4       4          0.056582633       2     1.19626      3.43833     3.35917     2
Aqualab 12/2007-09/2008    9         5       6          0.05826733        2     1.28390      4.23183     4.54116     3
AS Caida 20120601          10        5       6          0.055568105       2     1.16005      4.10547     4.53051     3
itdk0304                   26        14      11         0.270377048       –     1.57126      5.370078    5.710122    6
DBLB-coauth                23        12      11         0.45350002        –     1.74327      5.57869     5.12724     7
Amazon                     47        24      21         0.489056144       –     2.47109      8.81911     7.87004     12

* = (avg. distortion right × #right pairs + avg. distortion left × #left pairs + #undistorted pairs) / $\binom{n}{2}$

Table 11: Summary of tree-likeness measurements.

From the experimental results we observe that in almost all cases the measurements seem to be monotonic with respect to each other: the smaller one measurement is for a given dataset, the smaller the other measurements are. There are also a few exceptions. For example, the EVA dataset has a relatively large cluster-diameter, $\Delta_s(G) = 9$, but small hyperbolicity, δ(G) = 1. On the other hand, the Erdös dataset has $\Delta_s(G) = 4$ while its hyperbolicity δ(G) is equal to 2 (see Figure 8a). Yet the Erdös dataset has better embeddability (smaller average distortions) to trees $H$, $H_\ell$ and $H'_\ell$ than EVA, suggesting that the (average) cluster-diameter may have a greater impact on the embeddability into trees $H$, $H_\ell$ and $H'_\ell$. Comparing the measurements of Erdös vs. Homo release 3.2.99, we observe that both have the same hyperbolicity 2, but Erdös has better embeddability (average distortion) to trees $H$, $H_\ell$, $H'_\ell$. This could be explained by the smaller $\Delta_s(G)$ and average diameter of clusters in the Erdös dataset. Comparing the measurements of PPI vs. California (the same holds for AS Caida 20071105 vs. AS Caida 20120601), both have the same $\Delta_s(G)$ and $R_s(G)$ values but California (AS Caida 20120601) has smaller hyperbolicity and average diameter of clusters. We also observe that the datasets Routeview and AS Caida 20071105 have the same values of $\Delta_s(G)$, $R_s(G)$ and δ(G) but AS Caida 20071105 has a relatively smaller average diameter of clusters. This could explain why AS Caida 20071105 has relatively better embeddability to $H$, $H_\ell$ and $H'_\ell$ than Routeview. We can see that the difference in average diameters of clusters was relatively small, resulting in a small difference in embeddability.

From these observations, one can suggest that for classification of our datasets all these tree-likeness measurements are important; they collectively capture and explain metric tree-likeness of the datasets. We suggest that metric tree-likeness measurements in conjunction with other local characteristics of networks, such as the degree distribution and clustering coefficients, provide a more complete unifying picture of networks.

[Chart: for each dataset, tree H average distortion ×10, δ(G), average diameter of clusters ×10, and $\Delta_s(G)$; vertical axis from 0 to 25.]

Figure 7: Four tree-likeness measurements scaled.

[Plots: four panels.]

(a) hyperbolicity δ(G) vs. cluster-diameter $\Delta_s(G)$.

(b) hyperbolicity δ(G) vs. avg. distortion of H.

(c) cluster-diameter $\Delta_s(G)$ vs. avg. distortion of H.

(d) avg. diameter of clusters vs. avg. distortion of H.

Figure 8: Tree-likeness measurements: pairwise comparison.

CHAPTER 3

Collective Additive Tree Spanners and the

Tree-Breadth of a Graph with Consequences

3.1 Introduction

The work in this chapter was inspired by a few recent results from [70,76,84,86]. Elkin and Peleg in [84], among other results, described a polynomial time algorithm that, given an n-vertex graph G admitting a tree t-spanner, constructs a t-spanner of G with O(n log n) edges. Emek and Peleg in [86] presented the first O(log n)-approximation algorithm for the minimum value of t for the tree t-spanner problem. They described a polynomial time algorithm that, given an n-vertex graph G admitting a tree t-spanner, constructs a tree O(t log n)-spanner of G. Later, a simpler and faster O(log n)-approximation algorithm for the problem was given by Dragan and Köhler [76]. Their result uses a new necessary condition for a graph to have a tree t-spanner: if a graph G has a tree t-spanner, then G admits a Robertson-Seymour tree-decomposition with bags of radius at most ⌈t/2⌉ in G. In other words, if a graph G admits a tree t-spanner, then its tree-breadth is at most ⌈t/2⌉ and its tree-length is at most t. Furthermore, any graph G with tree-breadth tb(G) ≤ ρ admits a tree $(2\rho \lfloor \log_2 n \rfloor)$-spanner that can be constructed in polynomial time. Thus, these two results gave a new $\log_2 n$-approximation algorithm for the tree t-spanner problem on general (unweighted) graphs (see [76] for details). The algorithm of [76] is conceptually simpler than the previous O(log n)-approximation algorithm proposed for the problem by Emek and Peleg [86].

Dourisboure et al. in [70] considered the construction of additive spanners with few

edges for n-vertex graphs having a tree-decomposition into bags of diameter at most

λ, i.e., the tree-length λ graphs. For such graphs, they construct additive 2λ-spanners with O(λn + n log n) edges, and additive 4λ-spanners with O(λn) edges. Combining these results with the results of [76], we obtain the following interesting fact (in a sense, turning a multiplicative stretch into an additive surplus without much increase in the number of edges).

Theorem 1. (combining [70] and [76]) If a graph G admits a (multiplicative) tree t-

spanner, then it has an additive 2t-spanner with O(tn + n log n) edges and an additive

4t-spanner with O(tn) edges, both constructible in polynomial time.

This fact raises a few intriguing questions. Does a polynomial time algorithm exist

that, given an n-vertex graph G admitting a (multiplicative) tree t-spanner, constructs

an additive O(t)-spanner of G with O(n) or O(n log n) edges (where the number of

edges in the spanner is independent of t)? Is a result similar to the one presented by

Elkin and Peleg in [84] possible? Namely, does a polynomial time algorithm exist that,

given an n-vertex graph G admitting a (multiplicative) tree t-spanner, constructs an

additive (t − 1)-spanner¹ of G with O(n log n) edges? If we allow the use of more trees (like in collective tree spanners), does a polynomial time algorithm exist that, given an n-vertex graph G admitting a (multiplicative) tree t-spanner, constructs a system of Õ(1) collective additive tree Õ(t)-spanners of G (where Õ is similar to Big-O notation up to a poly-logarithmic factor)? Note that the interesting question of whether a multiplicative tree spanner can be turned into an additive tree spanner with a slight increase in the stretch is already (negatively) settled in [86]: if there exist some δ = o(n) and ϵ > 0 and a polynomial time algorithm that for any graph admitting a tree t-spanner constructs a tree ((6/5 − ϵ)t + δ)-spanner, then P=NP.

¹ Note that any additive (t − 1)-spanner is a multiplicative t-spanner (see Proposition 16).

We give some partial answers to these questions. Moreover, we investigate the more general question of whether a graph with bounded tree-breadth admits a small system of collective additive tree spanners. We show that any n-vertex graph G has a system of at most $\log_2 n$ collective additive tree $(2\rho \log_2 n)$-spanners, where ρ ≤ tb(G). This also settles an open question from [70] whether a graph with tree-length λ admits a small system of collective additive tree Õ(λ)-spanners.

As a consequence, we obtain that there is a polynomial time algorithm that, given an n-vertex graph G admitting a (multiplicative) tree t-spanner, constructs:

- a system of at most log2 n collective additive tree O(t log n)-spanners of G (compare

with [76, 86] where a multiplicative tree O(t log n)-spanner was constructed for G

in polynomial time; thus, we “have turned” a multiplicative tree O(t log n)-spanner

into at most log2 n collective additive tree O(t log n)-spanners);

- an additive O(t log n)-spanner of G with at most n log2 n edges (compare with

Theorem 1).

It is well known that the t-spanners can equivalently be defined as follows.

Proposition 15 ([44]). Let G be a connected graph and t be a positive number. A spanning subgraph H of G is a t-spanner of G if and only if for every edge xy of G, $d_H(x, y) \le t$ holds.
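Proposition 15 means that the stretch of a spanning subgraph can be measured on the edges of G alone. The following Python sketch, on small unweighted graphs given as adjacency dictionaries, computes the stretch both ways so that the two values can be compared. The function names are hypothetical and introduced only for this illustration.

```python
from collections import deque
from fractions import Fraction


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph {vertex: [neighbours]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def stretch_over_edges(graph, spanner):
    """max d_H(x, y) over the edges xy of G (the quantity in Proposition 15)."""
    worst = 0
    for x in graph:
        d_h = bfs_distances(spanner, x)
        for y in graph[x]:
            worst = max(worst, d_h[y])
    return worst


def stretch_over_all_pairs(graph, spanner):
    """max d_H(x, y) / d_G(x, y) over all pairs of distinct vertices."""
    worst = Fraction(1)
    for x in graph:
        d_g = bfs_distances(graph, x)
        d_h = bfs_distances(spanner, x)
        for y in graph:
            if y != x:
                worst = max(worst, Fraction(d_h[y], d_g[y]))
    return worst
```

For a cycle on four vertices and the spanning path obtained by deleting one edge, both functions return 3.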

This proposition implies that the stretch of a spanning subgraph of a graph G is

always obtained on a pair of vertices that form an edge in G. Consequently, throughout

this dissertation, t can be considered as an integer which is greater than 1 (the case t = 1

is trivial since H must be G itself).

The following relation between additive and multiplicative spanners is also known.

Proposition 16 ([146]). Every additive r-spanner of G is a (multiplicative) (r + 1)-

spanner of G. The converse is generally not true.
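The first claim of Proposition 16 follows in one line from the definitions. The derivation below is a sketch for unweighted graphs, where $d_G(x, y) \ge 1$ for any two distinct vertices x and y, and for r ≥ 0.

```latex
% H is an additive r-spanner of G: d_H(x,y) <= d_G(x,y) + r for all x, y.
% For x != y in an unweighted graph, d_G(x,y) >= 1, hence r <= r * d_G(x,y).
\[
  d_H(x,y) \;\le\; d_G(x,y) + r
           \;\le\; d_G(x,y) + r\, d_G(x,y)
           \;=\; (r+1)\, d_G(x,y).
\]
```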

3.2 Collective Additive Tree Spanners and the Tree-Breadth of

a Graph

In this section, we show that every n-vertex graph G has a system of at most $\log_2 n$ collective additive tree $(2\rho \log_2 n)$-spanners, where ρ ≤ tb(G). We also discuss consequences of this result. Our method is a generalization of techniques used in [79] and [76].

We will assume that n ≥ 4 since any connected graph with at most 3 vertices has an

additive tree 1-spanner.

Note that we do not assume here that a tree-decomposition T(G) of breadth ρ is given for G as part of the input. Our method does not need to know T(G); our algorithm works directly on G. For a given graph G and an integer ρ, even checking whether G has a tree-decomposition of breadth ρ could be a hard problem. For example, while graphs with tree-length 1 (as they are exactly the chordal graphs) can be recognized in linear time, the problem of determining whether a given graph has tree-length at most λ is NP-complete for every fixed λ > 1 (see [127]).

We will need the following results proven in [76].

Lemma 1 ([76]). Every graph G has a balanced disk-separator Dr(v, G) centered at some

vertex v, where r ≤ tb(G).

Lemma 2 ([76]). For an arbitrary graph G with n vertices and m edges, a balanced disk-separator Dr(v, G) with minimum r can be found in O(nm) time.
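As an illustration of what Lemma 2 computes, the following Python sketch finds a balanced disk-separator of minimum radius by brute force. It is not the O(nm) algorithm of [76]: it simply tries every center and every radius, and is only meant for small examples. It assumes that the disk $D_r(v, G)$ consists of all vertices at distance at most r from v and that the separator is balanced when every connected component of $G[V \setminus D_r(v, G)]$ has at most n/2 vertices; these definitions are not restated in this section and are taken here as assumptions. All function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` in a connected graph {vertex: [neighbours]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def components_outside(adj, removed):
    """Connected components (as vertex sets) of the graph minus `removed`."""
    seen, comps = set(removed), []
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        seen.add(s)
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps


def min_balanced_disk_separator(adj):
    """Return (r, v, disk) with minimum r such that D_r(v, G) is balanced."""
    n, best = len(adj), None
    for v in adj:
        dist = bfs_distances(adj, v)
        # Only radii strictly smaller than the best found so far are of interest.
        limit = max(dist.values()) if best is None else best[0] - 1
        for r in range(limit + 1):
            disk = {u for u in adj if dist[u] <= r}
            if all(2 * len(c) <= n for c in components_outside(adj, disk)):
                best = (r, v, disk)
                break
    return best
```

On a path with seven vertices the sketch returns the middle vertex with radius 0, and on a cycle with six vertices it returns a disk of radius 1.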

3.3 Hierarchical decomposition of a graph with bounded tree-

breadth

In this section, following [76], we show how to decompose a graph with bounded

tree-breadth and build a hierarchical decomposition tree for it. This hierarchical decom-

position tree is used later for construction of collective additive tree spanners for such a

graph.

Let G = (V,E) be an arbitrary connected n-vertex m-edge graph with a disk-separator $D_r(v, G)$. Also, let $G_1, \dots, G_q$ be the connected components of $G[V \setminus D_r(v, G)]$. Denote by $S_i := \{x \in V(G_i) \mid d_G(x, D_r(v, G)) = 1\}$ the neighborhood of $D_r(v, G)$ with respect to $G_i$. Let also $G_i^+$ be the graph obtained from component $G_i$ by adding a vertex $c_i$ (representative of $D_r(v, G)$) and making it adjacent to all vertices of $S_i$, i.e., for a vertex $x \in V(G_i)$, $c_i x \in E(G_i^+)$ if and only if there is a vertex $x_D \in D_r(v, G)$ with $x x_D \in E(G)$. See Figure 9 for an illustration. In what follows, we will call vertex $c_i$ a meta vertex representing disk $D_r(v, G)$ in graph $G_i^+$. Given a graph G and its disk-separator $D_r(v, G)$, the graphs $G_1^+, \dots, G_q^+$ can be constructed in total time O(m). Furthermore, the total number of edges in the graphs $G_1^+, \dots, G_q^+$ does not exceed the number of edges in G, and the total number of vertices (including q meta vertices) in those graphs does not exceed the number of vertices in $G[V \setminus D_r(v, G)]$ plus q.

Figure 9: A graph G with a disk-separator $D_r(v, G)$ and the corresponding graphs $G_1^+, \dots, G_4^+$ obtained from G. $c_1, \dots, c_4$ are meta vertices representing the disk $D_r(v, G)$ in the corresponding graphs.
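The construction of the graphs $G_1^+, \dots, G_q^+$ from G and a disk can be sketched in Python as follows. The function name `split_by_disk` and the naming of the meta vertex $c_i$ as the tuple `("meta", i)` are assumptions made for this illustration; the sketch presumes that no original vertex already carries such a name.

```python
from collections import deque


def split_by_disk(adj, disk):
    """Return the graphs G_i^+ as adjacency dictionaries {vertex: set of neighbours}.

    `adj` is the graph G as {vertex: iterable of neighbours}; `disk` is the
    vertex set of the disk-separator D_r(v, G).
    """
    graphs, seen = [], set(disk)
    for s in adj:
        if s in seen:
            continue
        # Collect the connected component G_i of G minus the disk containing s.
        comp, queue = {s}, deque([s])
        seen.add(s)
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        meta = ("meta", len(graphs))          # hypothetical name for c_i
        g_plus = {x: {w for w in adj[x] if w in comp} for x in comp}
        # S_i: vertices of G_i having a neighbour inside the disk.
        s_i = {x for x in comp if any(w in disk for w in adj[x])}
        for x in s_i:
            g_plus[x].add(meta)
        g_plus[meta] = set(s_i)
        graphs.append(g_plus)
    return graphs
```

Each component is explored once and each edge of G is inspected a constant number of times, which is in line with the O(m) bound stated in the text.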

Denote by G/e the graph obtained from G by contracting its edge e. Recall that edge e contraction is an operation which removes e from G while simultaneously merging together the two vertices e previously connected. If a contraction results in multiple edges, we delete duplicates of an edge to stay within the class of simple graphs. The operation may be performed on a set of edges by contracting each edge (in any order).

The following lemma guarantees that the tree-breadths of the graphs $G_i^+$, i = 1, . . . , q, are no larger than the tree-breadth of G.

Lemma 3 ([76]). For any graph G and its edge e, tb(G) ≤ ρ implies tb(G/e) ≤ ρ.

Consequently, for any graph G with tb(G) ≤ ρ, $\mathrm{tb}(G_i^+) \le \rho$ holds for each i = 1, . . . , q.

Clearly, one can get $G_i^+$ from G by repeatedly contracting (in any order) edges of G that are not incident to vertices of $G_i$. In other words, $G_i^+$ is a minor of G. Recall that a graph G′ is a minor of G if G′ can be obtained from G by contracting some edges, deleting some edges, and deleting some isolated vertices. The order in which a sequence of such contractions and deletions is performed on G does not affect the resulting graph G′.

Let G = (V,E) be a connected n-vertex, m-edge graph and assume that tb(G) ≤ ρ.

Lemma 1 and Lemma 2 guarantee that G has a balanced disk-separator Dr(v, G) with

r ≤ ρ, which can be found in O(nm) time by an algorithm that works directly on graph

G and does not require construction of a tree-decomposition of G of breadth ≤ ρ. Using

these and Lemma 3, we can build a (rooted) hierarchical tree H(G) for G as follows. If G is a connected graph with at most 5 vertices, then H(G) is a one-node tree with root node (V(G), G). Otherwise, find a balanced disk-separator $D_r(v, G)$ in G with minimum r (see Lemma 2) and construct the corresponding graphs $G_1^+, G_2^+, \dots, G_q^+$. For each graph $G_i^+$ (i = 1, . . . , q) (by Lemma 3, $\mathrm{tb}(G_i^+) \le \rho$), construct a hierarchical tree $H(G_i^+)$ recursively and build H(G) by taking the pair $(D_r(v, G), G)$ to be the root and connecting the root of each tree $H(G_i^+)$ as a child of $(D_r(v, G), G)$.

The depth of this tree H(G) (that is, the length of a longest path from the root to any node) is the smallest integer k such that

$\frac{n}{2^k} + \frac{1}{2^{k-1}} + \cdots + \frac{1}{2} + 1 \le 5,$

that is, the depth is at most $\log_2 n - 1$.
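One way to read the left-hand side of this bound is as the solution of a simple recurrence. The sketch below assumes (this is not restated in this section) that a balanced disk-separator leaves components with at most half of the vertices, so that each child graph, once its meta vertex is counted, has at most $n_j/2 + 1$ vertices.

```latex
% n_j: an upper bound on the number of vertices of a graph at depth j of H(G)
\[
  n_0 = n, \qquad n_{j+1} \le \frac{n_j}{2} + 1
  \;\Longrightarrow\;
  n_k \le \frac{n}{2^k} + \sum_{j=0}^{k-1} \frac{1}{2^j}
      = \frac{n}{2^k} + 2 - \frac{1}{2^{k-1}} .
\]
```

Under this reading, the recursion stops once a graph has at most 5 vertices, which is guaranteed as soon as $n/2^k \le 3$.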

It is also easy to see that, given a graph G with n vertices and m edges, a hierarchical tree H(G) can be constructed in $O(nm \log^2 n)$ total time. There are at most O(log n) levels in H(G), and one needs to do at most O(nm log n) operations per level since the total number of edges in the graphs of each level is at most m and the total number of vertices in those graphs cannot exceed O(n log n).

For an internal (i.e., non-leaf) node Y of H(G), since it is associated with a pair $(D_{r'}(v', G'), G')$, where r′ ≤ ρ, G′ is a minor of G and v′ is the center of disk $D_{r'}(v', G')$ of G′, it will be convenient in what follows to denote G′ by G(↓Y), v′ by c(Y), r′ by r(Y), and $D_{r'}(v', G')$ by Y itself. Thus, $(D_{r'}(v', G'), G') = (D_{r(Y)}(c(Y), G(\downarrow Y)), G(\downarrow Y)) = (Y, G(\downarrow Y))$ in these notations, and we identify node Y of H(G) with the set $Y = D_{r(Y)}(c(Y), G(\downarrow Y))$ and associate with this node also the graph G(↓Y). See Figure 10 for an illustration. Each leaf Y of H(G), since it corresponds to a pair (V(G′), G′), we identify with the set Y = V(G′) and use, for convenience, the notation G(↓Y) for G′.

If now $(Y^0, Y^1, \dots, Y^h)$ is the path of H(G) connecting the root $Y^0$ of H(G) with a node $Y^h$, then the vertex set of the graph $G(\downarrow Y^h)$ consists of some (original) vertices of G plus at most h meta vertices representing the disks $D_{r(Y^i)}(c(Y^i), G(\downarrow Y^i)) = Y^i$, i = 0, 1, . . . , h − 1. Note also that each (original) vertex of G belongs to exactly one node of H(G).

3.4 Construction of collective additive tree spanners

Unfortunately, the class of graphs of bounded tree-breadth is not hereditary, i.e., induced subgraphs of a graph with tree-breadth ρ are not necessarily of tree-breadth at most ρ (for example, a cycle of length ℓ with one extra vertex adjacent to each vertex of the cycle has tree-breadth 1, but the cycle itself has tree-breadth ℓ/3). Thus, the method

Figure 10: a) A graph G and its balanced disk-separator $D_1(13, G)$. b) A hierarchical tree H(G) of G. We have $G = G(\downarrow Y^0)$, $Y^0 = D_1(13, G)$. Meta vertices are shown circled, disk centers are shown in bold. c) The graph $G(\downarrow Y^1)$ with its balanced disk-separator $D_1(23, G(\downarrow Y^1)) = Y^1$. $G(\downarrow Y^1)$ is a minor of $G(\downarrow Y^0)$. d) The graph $G(\downarrow Y^2)$, a minor of $G(\downarrow Y^1)$ and of $G(\downarrow Y^0)$. $Y^2 = V(G(\downarrow Y^2))$ is a leaf of H(G).

presented in [79], for constructing collective additive tree spanners for hereditary classes of graphs admitting balanced disk-separators, cannot be applied directly to the graphs of bounded tree-breadth. Nevertheless, we will show that, with the help of Lemma 3, the notion of a hierarchical tree from the previous section and a careful analysis of distance changes (see Lemma 4), it is possible to generalize the method of [79] and construct in polynomial time for every n-vertex graph G a system of at most $\log_2 n$ collective additive tree $(2\rho \log_2 n)$-spanners, where ρ ≤ tb(G). The unavoidable presence of meta vertices in the graphs resulting from a hierarchical decomposition of the original graph G complicates the construction and the analysis. Recall that, in [79], it was shown that if every induced subgraph of a graph G enjoys a balanced disk-separator with radius at most r, then G admits a system of at most $\log_2 n$ collective additive tree 2r-spanners.

Let G = (V,E) be a connected n-vertex, m-edge graph and assume that tb(G) ≤ ρ.

Let H(G) be a hierarchical tree of G. Consider an arbitrary internal node $Y^h$ of H(G), and let $(Y^0, Y^1, \dots, Y^h)$ be the path of H(G) connecting the root $Y^0$ of H(G) with $Y^h$.

Let $\widehat{G}(\downarrow Y^j)$ be the graph obtained from $G(\downarrow Y^j)$ by removing all its meta vertices (note that $\widehat{G}(\downarrow Y^j)$ may be disconnected).

Lemma 4. For any vertex z from Y h ∩ V (G), there exists an index i ∈ {0, 1, . . . , h} such that c(Y i) is not a meta vertex and vertices z and c(Y i) are connected in the graph b i i G(↓ Y ) by a path of length at most ρ(h+1). In particular, dG(z, c(Y )) ≤ ρ(h+1) holds.

Proof. Set $G_h := G(\downarrow Y^h)$, $c := c(Y^h)$, and let $SP^{G_h}_{c,z}$ be a shortest path of $G_h$ connecting vertices c and z. We know that this path has at most $r(Y^h) \le \rho$ edges. If $SP^{G_h}_{c,z}$ does not contain any meta vertices, then this path is a path of $\widehat{G}(\downarrow Y^h)$ and of G and therefore $d_G(c, z) \le \rho$ holds.

Assume now that $SP^{G_h}_{c,z}$ does contain meta vertices and let $\mu'$ be the meta vertex of $SP^{G_h}_{c,z}$ closest to z. See Figure 11 for an illustration. Let $SP^{G_h}_{c,z} = (c, \dots, a', \mu', b', \dots, z)$. By construction of $\mathcal{H}(G)$, meta vertex $\mu'$ was created at some earlier recursive step to represent disk $Y^{i'}$ of graph $G_{i'} := G(\downarrow Y^{i'})$ for some $i' \in \{0, \dots, h-1\}$. Hence, there is a path $P^{G_{i'}}_{c',z} = (c', \dots, b', \dots, z)$ of length at most $2\rho$ in $G_{i'}$ with $c' := c(Y^{i'})$. Again, if $P^{G_{i'}}_{c',z}$ does not contain any meta vertices, then this path is a path of $\widehat{G}(\downarrow Y^{i'})$ and of G and therefore $d_G(c', z) \le 2\rho$ holds. If $P^{G_{i'}}_{c',z}$ does contain meta vertices, then again, “unfolding” a meta vertex $\mu''$ of $P^{G_{i'}}_{c',z}$ closest to z, we obtain a path $P^{G_{i''}}_{c'',z}$ of length at most $3\rho$ in $G_{i''} := G(\downarrow Y^{i''})$ with $c'' := c(Y^{i''})$ for some $i'' \in \{0, \dots, i'-1\}$.

By continuing to “unfold” in this way the meta vertices closest to z, after at most h steps we arrive at the situation where, for some index $i^* \in \{0, 1, \dots, h\}$, a path of length at most $\rho(h+1)$ connects vertices z and $c(Y^{i^*})$ in the graph $\widehat{G}(\downarrow Y^{i^*})$.

Figure 11: Illustration to the proof of Lemma 4: “unfolding” meta vertices.

Consider two arbitrary vertices x and y of G, and let S(x) and S(y) be the nodes of $\mathcal{H}(G)$ containing x and y, respectively. Let also $NCA_{\mathcal{H}(G)}(S(x),S(y))$ be the nearest common ancestor of nodes S(x) and S(y) in $\mathcal{H}(G)$ and $(Y^0, Y^1, \dots, Y^h)$ be the path of $\mathcal{H}(G)$ connecting the root $Y^0$ of $\mathcal{H}(G)$ with $NCA_{\mathcal{H}(G)}(S(x),S(y)) = Y^h$ (in other words, $Y^0, Y^1, \dots, Y^h$ are the common ancestors of S(x) and S(y)). Clearly, $Y^0 \cup Y^1 \cup \dots \cup Y^h$ separates vertices x and y in G.

Lemma 5. Any path $P^G_{x,y}$ connecting vertices x and y in G contains a vertex from $Y^0 \cup Y^1 \cup \dots \cup Y^h$.

Let $SP^G_{x,y}$ be a shortest path of G connecting vertices x and y, and let $Y^i$ be the node of the path $(Y^0, Y^1, \dots, Y^h)$ with the smallest index such that $SP^G_{x,y} \cap Y^i \ne \emptyset$ in G. The following lemma holds.

Lemma 6. For each $j = 0, \dots, i$, we have $d_G(x, y) = d_{G'}(x, y)$, where $G' := \widehat{G}(\downarrow Y^j)$.

Proof. It is enough to show that the path $SP^G_{x,y}$ consists only of vertices of $G'$. Assume, by way of contradiction, that there is a vertex z of $SP^G_{x,y}$ that does not belong to $G'$. Let $SP^G_{x,z}$ be a subpath of $SP^G_{x,y}$ between x and z. Clearly, the node S(z) of $\mathcal{H}(G)$, containing vertex z, is not a descendant of $Y^i$. Therefore, the nearest common ancestor of S(x) and S(z) in $\mathcal{H}(G)$ is a node $Y^j$ from $\{Y^0, Y^1, \dots, Y^h\}$ with $j < i$. But then, by Lemma 5, the path $SP^G_{x,z}$ (and hence the path $SP^G_{x,y}$) must have a vertex in $Y^0 \cup Y^1 \cup \dots \cup Y^j$, contradicting the choice of $Y^i$, $i > j$.

Let now $B^i_1, \dots, B^i_{p_i}$ be the nodes at depth i of the tree $\mathcal{H}(G)$. For each node $B^i_j$ that is not a leaf of $\mathcal{H}(G)$, consider its (central) vertex $c^i_j := c(B^i_j)$. If $c^i_j$ is an original vertex of G (not a meta vertex created during the construction of $\mathcal{H}(G)$), then define a connected graph $G^i_j$ obtained from $G(\downarrow B^i_j)$ by removing all its meta vertices. If removal of those meta vertices produces several connected components, choose as $G^i_j$ the component that contains the vertex $c^i_j$. Denote by $T^i_j$ a BFS-tree of graph $G^i_j$ rooted at vertex $c^i_j$ of $B^i_j$. If $B^i_j$ is a leaf of $\mathcal{H}(G)$, then $B^i_j$ has at most 5 vertices. In this case, remove all meta vertices from $G(\downarrow B^i_j)$ and for each connected component of the resulting graph construct an additive tree spanner with optimal surplus $\le 3$. Note that the diameter of a tree with 5 vertices is at most 4. Denote the resulting subtree (forest) by $T^i_j$.

The trees $T^i_j$ ($i = 0, 1, \dots, depth(\mathcal{H}(G))$, $j = 1, 2, \dots, p_i$) obtained this way are called local subtrees of G. Clearly, the construction of these local subtrees can be incorporated into the procedure of constructing the hierarchical tree $\mathcal{H}(G)$ of G and will not increase the overall $O(nm \log^2 n)$ run-time (see Section 3.3).

Figure 12: Illustration to the proof of Lemma 7.
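The local subtree of an internal node can be sketched in a few lines. This is only an illustration under stated assumptions: the graph $G(\downarrow B^i_j)$ is assumed to be given as an adjacency dictionary `adj`, its meta vertices as a set `meta`, and the function name `local_subtree` is hypothetical rather than taken from the dissertation.

```python
from collections import deque


def local_subtree(adj, meta, center):
    """BFS-tree (as child -> parent pointers) of the connected component of
    the graph minus its meta vertices that contains `center`."""
    if center in meta:
        raise ValueError("center must be an original vertex, not a meta vertex")
    parent = {center: None}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            # Skip meta vertices and vertices already placed in the tree.
            if w in meta or w in parent:
                continue
            parent[w] = u
            queue.append(w)
    return parent


# Toy input (hypothetical): path a-b-c, meta vertex "m" joined to c and d.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "m"],
    "m": ["c", "d"],
    "d": ["m"],
}
tree = local_subtree(adj, meta={"m"}, center="b")
```

In the toy input, vertex `d` is reachable from `b` only through the meta vertex, so it falls into a different component and is left out of the tree, which matches the rule of keeping only the component that contains the center.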

Lemma 7. For any two vertices $x, y \in V(G)$, there exists a local subtree T such that $d_T(x, y) \le d_G(x, y) + 2\rho \log_2 n - 1$.

Proof. We know, by Lemma 6, that a shortest path $SP^G_{x,y}$, intersecting $Y^i$ and not intersecting any $Y^l$ ($l < i$), lies entirely in $G' := \widehat{G}(\downarrow Y^i)$. Thus, $d_G(x, y) = d_{G'}(x, y)$. If $Y^i$ is a leaf of $\mathcal{H}(G)$, then for a local subtree $T'$ (it could be a forest) of G constructed for $G'$, the following holds:

$$d_{T'}(x, y) \le d_{G'}(x, y) + 3 = d_G(x, y) + 3 \le d_G(x, y) + 2\rho \log_2 n - 1$$

(since $n \ge 4$ and $\rho \ge 1$). Assume now that $Y^i$ is an internal node of $\mathcal{H}(G)$. We have $i \le \log_2 n - 2$, since the depth of $\mathcal{H}(G)$ is at most $\log_2 n - 1$. Let $z \in Y^i$ be a vertex on the shortest path $SP^G_{x,y}$. By Lemma 4, there exists an index $j \in \{0, 1, \dots, i\}$ such that the vertices z and $c(Y^j)$ can be connected in the graph $\widehat{G}(\downarrow Y^j)$ by a path of length at most $\rho(i + 1)$. See Figure 12 for an illustration. Set $G'' := \widehat{G}(\downarrow Y^j)$ and $c := c(Y^j)$. By Lemma 6, $d_G(x, y) = d_{G'}(x, y) = d_{G''}(x, y)$. Let $T''$ be the local tree constructed for graph $G'' = \widehat{G}(\downarrow Y^j)$, i.e., a BFS-tree of a connected component of the graph $G''$ rooted at vertex $c = c(Y^j)$.

We have $d_{T''}(x, c) = d_{G''}(x, c)$ and $d_{T''}(y, c) = d_{G''}(y, c)$. By the triangle inequality,

$$d_{T''}(x, c) = d_{G''}(x, c) \le d_{G''}(x, z) + d_{G''}(z, c)$$

and

$$d_{T''}(y, c) = d_{G''}(y, c) \le d_{G''}(y, z) + d_{G''}(z, c).$$

That is,

$$d_{T''}(x, y) \le d_{T''}(x, c) + d_{T''}(y, c) \le d_{G''}(x, z) + d_{G''}(y, z) + 2d_{G''}(z, c) = d_{G''}(x, y) + 2d_{G''}(z, c).$$

Now, using Lemma 6 and inequality $d_{G''}(z, c) \le \rho(i + 1) \le \rho(\log_2 n - 1)$, we get

$$d_{T''}(x, y) \le d_{G''}(x, y) + 2d_{G''}(z, c) \le d_G(x, y) + 2\rho(\log_2 n - 1).$$

This lemma implies two important results. Let G be a graph with n vertices and m edges having $tb(G) \le \rho$. Also, let $\mathcal{H}(G)$ be its hierarchical tree and $\mathcal{LT}(G)$ be the family of all its local subtrees (defined above). Consider a graph H obtained by taking the union of all local subtrees of G (by putting all of them together), i.e.,

$$H := \bigcup \{T^i_j \mid T^i_j \in \mathcal{LT}(G)\} = \left(V, \bigcup \{E(T^i_j) \mid T^i_j \in \mathcal{LT}(G)\}\right).$$

Clearly, H is a spanning subgraph of G, constructible in $O(nm \log^2 n)$ total time, and, for any two vertices x and y of G, $d_H(x, y) \le d_G(x, y) + 2\rho \log_2 n - 1$ holds. Also, since for every level i ($i = 0, 1, \dots, depth(\mathcal{H}(G))$) of hierarchical tree $\mathcal{H}(G)$, the corresponding local subtrees $T^i_1, \dots, T^i_{p_i}$ are pairwise vertex-disjoint, their union has at most $n - 1$ edges. Therefore, H cannot have more than $(n - 1) \log_2 n$ edges in total. Thus, we have proven the following result.

Theorem 2. Every graph G with n vertices and $tb(G) \le \rho$ admits an additive $(2\rho \log_2 n)$-spanner with at most $n \log_2 n$ edges. Furthermore, such a sparse additive spanner of G can be constructed in polynomial time.

Instead of taking the union of all local subtrees of G, one can fix i ($i \in \{0, 1, \dots, depth(\mathcal{H}(G))\}$) and consider separately the union of only local subtrees $T^i_1, \dots, T^i_{p_i}$, corresponding to the level i of the hierarchical tree $\mathcal{H}(G)$, and then extend in linear O(m) time that forest to a spanning tree $T^i$ of G (using, for example, a variant of Kruskal’s Spanning Tree algorithm for unweighted graphs). We call this tree $T^i$ the spanning tree of G corresponding to the level i of the hierarchical tree $\mathcal{H}(G)$. In this way we can obtain at most $\log_2 n$ spanning trees for G, one for each level i of $\mathcal{H}(G)$. Denote the collection of those spanning trees by $\mathcal{T}(G)$. Thus, we obtain the following theorem.

Theorem 3. Every graph G with n vertices and $tb(G) \le \rho$ admits a system $\mathcal{T}(G)$ of at most $\log_2 n$ collective additive tree $(2\rho \log_2 n)$-spanners. Furthermore, such a system of collective additive tree spanners of G can be constructed in polynomial time.
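The step of extending the forest of one level to a spanning tree of G can be sketched as follows. This is a minimal illustration, assuming G is connected and given as an edge list; it uses a union-find structure in the spirit of Kruskal's algorithm, which runs in near-linear rather than strictly linear time, and the function name `extend_forest_to_spanning_tree` is an assumption made for this example.

```python
def extend_forest_to_spanning_tree(vertices, graph_edges, forest_edges):
    """Keep every forest edge, then add graph edges that join two different
    components until a spanning tree of the (connected) graph is obtained."""
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
        return True

    tree = []
    for u, v in forest_edges:
        if not union(u, v):
            raise ValueError("forest_edges contain a cycle")
        tree.append((u, v))
    for u, v in graph_edges:
        if union(u, v):
            tree.append((u, v))
    return tree


# Toy input (hypothetical): a 6-cycle with one chord and a two-edge forest.
vertices = range(6)
graph_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]
forest_edges = [(0, 1), (3, 4)]
spanning_tree = extend_forest_to_spanning_tree(vertices, graph_edges, forest_edges)
```

Because the forest edges are inserted first, every local subtree of the chosen level survives unchanged inside the resulting spanning tree, which is what the distance guarantee of Lemma 7 relies on.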

3.5 Additive spanners for graphs admitting (multiplicative) tree t-spanners

Now we give two implications of the above results for the class of tree t-spanner admissible graphs. In [76], the following important (“bridging”) lemma was proven.

Lemma 8 ([76]). If a graph G admits a tree t-spanner, then its tree-breadth is at most

⌈t/2⌉.

Note that the tree-breadth bounded by ⌈t/2⌉ provides only a necessary condition for a graph to have a multiplicative tree t-spanner. There are (chordal) graphs which have tree-breadth 1 but any multiplicative tree t-spanner of them has t = Ω(log n) [76].

Furthermore, a cycle on 3n vertices has tree-breadth n but admits a system of 2 collective additive tree 0-spanners.
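The cycle example can be checked by brute force on small instances. The sketch below assumes one particular choice of the two trees, namely the two paths obtained by deleting one edge of the cycle and the (roughly) opposite edge; this choice and all function names are assumptions made for illustration, not a construction quoted from [76].

```python
from collections import deque


def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def adjacency(n, edges):
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def collective_surplus(n, graph_edges, trees):
    """Max over vertex pairs of (best tree distance - graph distance)."""
    g = adjacency(n, graph_edges)
    dg = {s: bfs_distances(g, s) for s in range(n)}
    tree_dists = []
    for t in trees:
        a = adjacency(n, t)
        tree_dists.append({s: bfs_distances(a, s) for s in range(n)})
    worst = 0
    for x in range(n):
        for y in range(x + 1, n):
            best = min(dt[x][y] for dt in tree_dists) - dg[x][y]
            worst = max(worst, best)
    return worst


def two_path_system(n):
    """Cycle on n >= 3 vertices and two spanning paths missing opposite edges."""
    edges = [(i, (i + 1) % n) for i in range(n)]
    half = n // 2
    first = [e for e in edges if e != (0, 1)]
    second = [e for e in edges if e != (half, (half + 1) % n)]
    return edges, [first, second]


edges9, trees9 = two_path_system(9)
surplus_two_trees = collective_surplus(9, edges9, trees9)
surplus_one_tree = collective_surplus(9, edges9, trees9[:1])
```

With both paths available, every pair of vertices keeps its exact cycle distance in at least one of them, whereas a single path already stretches the endpoints of the deleted edge.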

Combining Lemma 8 with Theorem 2 and Theorem 3, we deduce the following results.

Theorem 4. Let G be a graph with n vertices and m edges having a (multiplicative) tree t-spanner. Then G admits an additive $(2\lceil t/2 \rceil \log_2 n)$-spanner with at most $n \log_2 n$ edges constructible in $O(nm \log^2 n)$ time.

Theorem 5. Let G be a graph with n vertices and m edges having a (multiplicative) tree t-spanner. Then G admits a system $\mathcal{T}(G)$ of at most $\log_2 n$ collective additive tree $(2\lceil t/2 \rceil \log_2 n)$-spanners constructible in $O(nm \log^2 n)$ time.

CHAPTER 4

Collective Additive Tree Spanners of Graphs with Bounded k-Tree-Breadth, k ≥ 2

4.1 Introduction

In this chapter we generalize the method of Chapter 3. We define a new parameter, related to the problem of k-Tree-width t-spanner, that combines both the tree-width and the tree-breadth of a graph. The k-breadth of a tree-decomposition $\mathcal{T}(G) = (\{X_i \mid i \in I\}, T = (I,F))$ of a graph G is the minimum integer r such that for each bag $X_i$, $i \in I$, there is a set of at most k vertices $C_i = \{v^i_j \mid v^i_j \in V(G), j = 1, \dots, k\}$ such that for each $u \in X_i$, we have $d_G(u, C_i) \le r$ (i.e., each bag $X_i$ can be covered with at most k disks of G of radius at most r each; $X_i \subseteq D_r(v^i_1,G) \cup \dots \cup D_r(v^i_k,G)$). The k-tree-breadth of a graph G, denoted by $tb_k(G)$, is the minimum of the k-breadth, over all tree-decompositions of G. We say that a family of graphs $\mathcal{G}$ is of bounded k-tree-breadth, if there is a constant c such that for each graph G from $\mathcal{G}$, $tb_k(G) \le c$. Clearly, for every graph G, $tb(G) = tb_1(G)$, and $tw(G) \le k - 1$ if and only if $tb_k(G) = 0$ (consider each vertex in the bags of a tree-decomposition of width at most $k - 1$ as a disk center of radius 0). Thus, the notions of tree-width and tree-breadth are particular cases of the k-tree-breadth.
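The definition of k-breadth can be made concrete with a brute-force check. The sketch below assumes a connected graph given as an adjacency dictionary and a tree-decomposition given simply as a list of bags (the tree itself is not needed to evaluate the breadth); it tries every set of k candidate centers, so it is only meant for tiny examples, and the function names are illustrative.

```python
from collections import deque
from itertools import combinations


def all_pairs_distances(adj):
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in d:
                    d[w] = d[u] + 1
                    queue.append(w)
        dist[s] = d
    return dist


def bag_k_radius(bag, dist, k):
    """Smallest r such that `bag` is covered by at most k disks of radius r.
    Using exactly min(k, n) centers loses nothing: extra centers never hurt."""
    vertices = list(dist)
    best = None
    for centers in combinations(vertices, min(k, len(vertices))):
        r = max(min(dist[c][u] for c in centers) for u in bag)
        if best is None or r < best:
            best = r
    return best


def k_breadth(bags, adj, k):
    dist = all_pairs_distances(adj)
    return max(bag_k_radius(bag, dist, k) for bag in bags)


# Toy input (hypothetical): the path 0-1-2-3-4 with two decompositions.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
one_bag = [set(range(5))]
edge_bags = [{i, i + 1} for i in range(4)]
```

On the path, the edge bags have width 1, so their 2-breadth is 0 while their 1-breadth is 1, in line with the remark that $tw(G) \le k - 1$ corresponds to $tb_k(G) = 0$.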

In this chapter, we show that any n-vertex graph G with $tb_k(G) \le \rho$ has a system of at most $k(1 + \log_2 n)$ collective additive tree $(2\rho(1 + \log_2 n))$-spanners constructible in polynomial time for every fixed k. We will assume that n > k, since any graph with n vertices has a system of n − 1 collective additive tree 0-spanners (consider n − 1 BFS-trees rooted at different vertices). Also, in Section 4.6, we extend a result from [76] and show that if a graph G admits a (multiplicative) t-spanner H with tw(H) = k − 1 then its k-tree-breadth is at most ⌈t/2⌉. As a consequence, we obtain that, for every fixed k, there is a polynomial time algorithm that, given an n-vertex graph G admitting a (multiplicative) t-spanner with tree-width at most k − 1, constructs:

- a system of at most $k(1 + \log_2 n)$ collective additive tree O(t log n)-spanners of G;

- an additive O(t log n)-spanner of G with at most O(kn log n) edges.

4.2 Balanced separators for graphs with bounded k-tree-breadth

We will need the following balanced clique-separator result for chordal graphs. Recall that a graph is chordal if each of its induced cycles has length three.

Theorem 6 ([97]). Every chordal graph G with n vertices and m edges contains a maximal clique C such that if the vertices in C are deleted from G, every connected component in the graph induced by the remaining vertices is of size at most n/2. Such a balanced clique-separator C of a connected chordal graph can be found in O(m) time.

We say that a graph G = (V,E) with $|V| \ge k$ has a balanced $D^k_r$-separator if there exists a collection of k disks $D_r(v_1,G), D_r(v_2,G), \dots, D_r(v_k,G)$ in G, centered at (different) vertices $v_1, v_2, \dots, v_k$ and each of radius r, such that the union of those disks $D^k_r := \bigcup_{i=1}^{k} D_r(v_i,G)$ forms a balanced separator of G, i.e., each connected component of $G[V \setminus D^k_r]$ has at most $|V|/2$ vertices. The following result generalizes Lemma 1.

Lemma 9. Every graph G with at least k vertices and $tb_k(G) \le \rho$ has a balanced $D^k_\rho$-separator.

Proof. The proof of this lemma follows from acyclic hypergraph theory. First we review

some necessary definitions and an important result characterizing acyclic hypergraphs.

Recall that a hypergraph H is a pair H = (V, E) where V is a set of vertices and E is a set of non-empty subsets of V called hyperedges. For these and other hypergraph notions see [31].

Let H = (V, E) be a hypergraph with the vertex set V and the hyperedge set E. For

every vertex v ∈ V , let E(v) = {e ∈ E |v ∈ e}. The 2–section graph 2SEC(H) of a

hypergraph H has V as its vertex set and two distinct vertices are adjacent in 2SEC(H)

if and only if they are contained in a common hyperedge of H. A hypergraph H is

called conformal if every clique of 2SEC(H) is contained in a hyperedge e ∈ E, and a

hypergraph H is called acyclic if there is a tree T with node set E such that for all vertices

v ∈ V , E(v) induces a subtree Tv of T . It is a well-known fact (see, e.g., [15,30,31]) that

a hypergraph H is acyclic if and only if H is conformal and 2SEC(H) of H is a chordal

graph.
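For readers who prefer a concrete view of the 2-section graph used in this proof, here is a minimal sketch. It assumes the hypergraph is given as a list of vertex sets; the function name `two_section` is chosen for this example only.

```python
from itertools import combinations


def two_section(hyperedges):
    """Adjacency sets of 2SEC(H): two distinct vertices are adjacent
    exactly when some hyperedge contains both of them."""
    adj = {}
    for e in hyperedges:
        for v in e:
            adj.setdefault(v, set())
        for u, v in combinations(e, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


# Toy hypergraph (hypothetical) with hyperedges {1, 2, 3} and {3, 4}.
section = two_section([{1, 2, 3}, {3, 4}])
```

When the hyperedges are the bags of a tree-decomposition, this turns every bag into a clique, which is how the chordal supergraph $G^*$ arises later in the argument.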

Let now G = (V,E) be a graph with $tb_k(G) = \rho$ and $\mathcal{T}(G) = (\{X_i \mid i \in I\}, T = (I,F))$ be its tree-decomposition of k-breadth ρ. Evidently, the third condition of tree-decompositions can be restated as follows: the hypergraph $H = (V(G), \{X_i \mid i \in I\})$ is an acyclic hypergraph. Since each edge of G is contained in at least one bag of $\mathcal{T}(G)$, the 2-section graph $G^* := 2SEC(H)$ of H is a chordal supergraph of the graph G (each edge of G is an edge of $G^*$, but $G^*$ may have some extra edges between non-adjacent vertices of G contained in a common bag of $\mathcal{T}(G)$). By Theorem 6, the chordal graph $G^*$ contains a balanced clique-separator $C \subseteq V(G)$. By conformality of H, C must be contained in a bag of $\mathcal{T}(G)$. From the definition of k-breadth, there must exist k vertices $v_1, v_2, \dots, v_k$ such that $C \subseteq D^k_\rho$, where $D^k_\rho = D_\rho(v_1,G) \cup \dots \cup D_\rho(v_k,G)$. As the removal of the vertices of C from $G^*$ leaves no connected component in $G^*[V \setminus C]$ with more than $|V|/2$ vertices and since $G^*$ is a supergraph of G, clearly, the removal of the vertices of $D^k_\rho$ from G leaves no connected component in $G[V \setminus D^k_\rho]$ with more than $|V|/2$ vertices.

Again, as in Chapter 3, we do not assume that a tree-decomposition T (G) of k-breadth

ρ is given for G as part of the input. Our method does not need to know T (G). For a given graph G, integers k ≥ 1 and ρ ≥ 0, even checking whether G has a tree-decomposition of k-breadth ρ is a hard problem (as tbk(G) = 0 if and only if tw(G) ≤ k − 1) (see

Subsection 1.3.1).

Let G be an arbitrary connected n-vertex m-edge graph. In [76], an algorithm was described which, given G and its arbitrary fixed vertex v, finds in O(m) time a balanced disk separator $D_r(v, G)$ of G centered at v and with minimum r. We can use this algorithm as a subroutine to find for G in $O(n^k m)$ time a balanced $D^k_r$-separator with minimum r. Given arbitrary k vertices $v_1, v_2, \dots, v_k$ of G, we can add a new dummy vertex x to G and make it adjacent to only $v_1, v_2, \dots, v_k$ in G. Denote the resulting graph by G + x. Then, a balanced disk separator $D_{r+1}(x, G + x)$ of G + x with minimum r + 1 gives a balanced separator of G of the form $D_r(v_1,G) \cup \dots \cup D_r(v_k,G)$ (for particular disk centers $v_1, v_2, \dots, v_k$) with minimum r. Iterating over all choices of k vertices of G, we can find a balanced $D^k_r$-separator of G with the smallest (absolute minimum) radius r. Thus, we have the following result.

Proposition 17. Let k be a positive integer (assumed to be small). For an arbitrary graph G with $n \ge k$ vertices and m edges, a balanced $D^k_r$-separator with the smallest radius r can be found in $O(n^k m)$ time.
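The search behind Proposition 17 can be sketched by brute force. The code below is an illustration under assumptions: the graph is connected and given as an adjacency dictionary, a multi-source BFS from the k centers plays the role of the BFS from the dummy vertex x (with all distances shifted by one), and the radius is found by a naive scan rather than by the O(m) subroutine of [76], so the running time is worse than the stated bound. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def multi_source_distances(adj, sources):
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def largest_component_size(adj, removed):
    seen = set(removed)
    best = 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        size = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def min_radius_for_centers(adj, centers):
    """Smallest r for which the union of the disks D_r(c), c in centers,
    is a balanced separator of the connected graph."""
    dist = multi_source_distances(adj, centers)
    n = len(adj)
    for r in range(max(dist.values()) + 1):
        union = {v for v, d in dist.items() if d <= r}
        if 2 * largest_component_size(adj, union) <= n:
            return r, union


def best_balanced_separator(adj, k):
    best = None
    for centers in combinations(adj, k):
        r, union = min_radius_for_centers(adj, centers)
        if best is None or r < best[0]:
            best = (r, centers, union)
    return best


# Toy input (hypothetical): a cycle on 8 vertices.
cycle8 = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
```

On the 8-cycle a single disk needs radius 2 to leave only components of size at most 4, while two disks of radius 0 centered at opposite vertices already suffice, which shows how a larger k can reduce the radius.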

4.3 Decomposition of a graph with bounded k-tree-breadth

Let G = (V,E) be an arbitrary connected graph with n vertices and m edges and with a balanced $D^k_r$-separator, where $D^k_r = \bigcup_{j=1}^{k} D_r(v_j,G)$. Note that some disks in $\{D_r(v_1,G), \dots, D_r(v_k,G)\}$ may overlap. In what follows, we will partition $D^k_r = \bigcup_{j=1}^{k} D_r(v_j,G)$ into k sets $D_1, \dots, D_k$ such that no two of them intersect and each $D_j$, $j = 1, \dots, k$, contains the vertex $v_j$ and induces a connected subgraph of $G[D_r(v_j,G)]$. Create a graph G + s by adding a new dummy vertex s to G and making it adjacent to only $v_1, v_2, \dots, v_k$ in G. Let T be a BFS-tree of G + s started at vertex s and $T'$ be a subtree of T formed by vertices $\{v \in V(G + s) \mid d_T(s, v) \le r + 1\}$ and rooted at s. Let also $T(v_1), \dots, T(v_k)$ be the subtrees of $T' \setminus \{s\}$ rooted at $v_1, \dots, v_k$, respectively. Clearly, each $T(v_j)$, $j = 1, \dots, k$, is a subtree (not necessarily spanning) of $G[D_r(v_j,G)]$ and $D^k_r = \bigcup_{j=1}^{k} V(T(v_j))$. Set now $D_j := V(T(v_j))$, $j = 1, \dots, k$.
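The partition into $D_1, \dots, D_k$ can be sketched with a multi-source BFS, which visits vertices in the same order as a BFS of G + s started at the dummy vertex s with depths shifted by one. The adjacency-dictionary input, the assumption that the centers are distinct, and the name `partition_disks` are choices made for this illustration.

```python
from collections import deque


def partition_disks(adj, centers, r):
    """Split the union of the disks D_r(c), c in centers, into disjoint sets,
    one per center: each vertex goes to the center whose BFS subtree reaches
    it first, which corresponds to the subtrees T(v_j) of the text."""
    owner = {c: c for c in centers}
    depth = {c: 0 for c in centers}
    queue = deque(centers)
    while queue:
        u = queue.popleft()
        if depth[u] == r:
            continue  # do not grow beyond radius r
        for w in adj[u]:
            if w not in owner:
                owner[w] = owner[u]
                depth[w] = depth[u] + 1
                queue.append(w)
    parts = {c: set() for c in centers}
    for v, c in owner.items():
        parts[c].add(v)
    return parts


# Toy input (hypothetical): path 0-1-...-6 with centers 1 and 4, radius 2.
path7 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
parts = partition_disks(path7, centers=[1, 4], r=2)
```

In the toy input the two disks overlap on vertices 2 and 3; the partition assigns 2 to center 1 and 3 to center 4, so the parts are disjoint and each part stays connected around its own center.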

Let $G_1, G_2, \dots, G_q$ be the connected components of $G[V \setminus D^k_r]$. Denote by $S^j_i = \{v \in V(G_i) \mid d_G(v, D_j) = 1\}$, $i = 1, \dots, q$, $j = 1, \dots, k$, the neighborhood of $D_j$ in $G_i$. Also, let $G^+_i$ be the graph obtained from component $G_i$ by adding one meta vertex $c^j_i$ for each disk $D_r(v_j,G)$ (a representative of $D_r(v_j,G)$), $j = 1, \dots, k$, and making it adjacent to all vertices of $S^j_i$, i.e., for a vertex $x \in V(G_i)$, $c^j_i x \in E(G^+_i)$ if and only if there is a vertex $x_D \in D_j \subseteq D_r(v_j,G)$ with $x x_D \in E(G)$. If $S^j_i$ is empty for some j, then vertex $c^j_i$ is not added to $G^+_i$. Also, add an edge between any two representatives $c^j_i$ and $c^l_i$ if vertices $v_j$ and $v_l$ are connected by a path in $G[V \setminus V(G_i)]$. See Figure 13 for an illustration.

Given an n-vertex m-edge graph G and its balanced $D^k_r$-separator, the graphs $G^+_1, \dots, G^+_q$ can be constructed in total time O(kqm). Furthermore, the total number of edges in graphs $G^+_1, \dots, G^+_q$ does not exceed $m + qk^2$, and the total number of vertices in those graphs does not exceed the number of vertices in $G[V \setminus D^k_r]$ plus qk.
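The construction of the graphs $G^+_1, \dots, G^+_q$ can be sketched as below. This is a straightforward, unoptimized illustration and not the O(kqm) procedure itself: it assumes the graph is an adjacency dictionary, the partition $D_1, \dots, D_k$ is a dictionary `parts` mapping each center to its part, and meta vertices are labeled by tuples `("meta", center)`, a labeling invented for this example.

```python
from collections import deque


def components(adj, allowed):
    """Connected components of the subgraph induced by `allowed`."""
    seen, result = set(), []
    for s in adj:
        if s not in allowed or s in seen:
            continue
        seen.add(s)
        comp, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in allowed and w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        result.append(comp)
    return result


def component_graphs(adj, parts):
    separator = set().union(*parts.values())
    graphs = []
    for comp in components(adj, set(adj) - separator):
        g = {v: {w for w in adj[v] if w in comp} for v in comp}
        # Pieces of G[V \ V(G_i)] decide which representatives get joined.
        rest = components(adj, set(adj) - comp)
        piece_of = {c: idx for idx, piece in enumerate(rest)
                    for c in parts if c in piece}
        present = []
        for c, d_j in parts.items():
            s_ij = {v for v in comp if any(w in d_j for w in adj[v])}
            if s_ij:  # the meta vertex is added only if S_i^j is non-empty
                g[("meta", c)] = set(s_ij)
                for v in s_ij:
                    g[v].add(("meta", c))
                present.append(c)
        for a in present:
            for b in present:
                if a != b and piece_of[a] == piece_of[b]:
                    g[("meta", a)].add(("meta", b))
        graphs.append(g)
    return graphs


# Toy input (hypothetical): 8-cycle separated by the radius-0 disks {0}, {4}.
cycle8 = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
graphs = component_graphs(cycle8, {0: {0}, 4: {4}})
```

For the 8-cycle each of the two components is a path on three vertices, and after adding the two mutually adjacent meta vertices it becomes a cycle on five vertices.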

Figure 13: A graph G with a balanced $D^3_r$-separator and the corresponding graphs $G^+_1, \dots, G^+_4$ obtained from G. Each $G^+_i$ has three meta vertices representing the three disks.

Note that $G^+_i$ is a minor of G and can be obtained from G by a sequence of edge contractions in the following way. First contract all edges (in any order) that are incident to $V(G_{i'})$, for all $i' = 1, \dots, q$, $i' \ne i$. Then, for each $j = 1, \dots, k$, contract (all edges of) connected subgraph $G[D_j]$ of G to get meta vertex $c^j_i$ representing the disk $D_r(v_j,G)$ in $G^+_i$.

Let again G/e be the graph obtained from G by contracting edge e. We have the following analog of Lemma 3.

Lemma 10. For any graph G and its edge e, $tb_k(G) \le \rho$ implies $tb_k(G/e) \le \rho$. Consequently, for any graph G with $tb_k(G) \le \rho$, $tb_k(G^+_i) \le \rho$ holds for $i = 1, \dots, q$.

Proof. Our proof is similar to the proof from [76] of Lemma 3. We provide it here for the sake of completeness. Let $\mathcal{T}(G) = (\{X_i \mid i \in I\}, T = (I,F))$ be a tree-decomposition of G with k-breadth ρ. Let e = xy be an arbitrary edge of G. We can obtain a tree-decomposition $\mathcal{T}(G/e)$ of the graph G/e by replacing in each bag $X_i$, $i \in I$, vertices x and y with a new vertex $x'$ representing them (if some bag A contained both x and y, only one copy of $x'$ is kept). Evidently, the first and the second conditions of tree-decompositions are fulfilled for $\mathcal{T}(G/e)$. Furthermore, the topology (the tree T = (I,F)) of the tree-decomposition did not change. Still, for any vertex $v \ne x'$ of G/e, the bags of $\mathcal{T}(G/e)$ containing v form a subtree in $\mathcal{T}(G/e)$. Since vertices x and y were adjacent in G, there was a bag A of $\mathcal{T}(G)$ containing both those vertices. Hence, a subtree of $\mathcal{T}(G/e)$ formed by bags of $\mathcal{T}(G/e)$ containing vertex $x'$ is nothing else but the union of two subtrees (one for x and one for y) of $\mathcal{T}(G)$ sharing at least one common bag A. Also, contracting an edge can only reduce the distances in a graph. Hence, still, for each bag B of $\mathcal{T}(G/e)$, there must exist corresponding vertices $v_1, \dots, v_k$ in G/e with $B \subseteq D_\rho(v_1, G/e) \cup \dots \cup D_\rho(v_k, G/e)$. Thus, $tb_k(G/e) \le \rho$. Since $G^+_i$ can be obtained from G by a sequence of edge contractions, we also have $tb_k(G^+_i) \le \rho$.
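The bag update used in this proof is simple enough to state as code. The sketch assumes the decomposition is represented only by its list of bags (the tree T is untouched by a contraction), and the function name and the label of the merged vertex are illustrative.

```python
def contract_edge_in_bags(bags, x, y, merged):
    """Replace x and y by `merged` in every bag that contains either of them;
    a bag holding both keeps a single copy because bags are sets."""
    new_bags = []
    for bag in bags:
        if x in bag or y in bag:
            new_bags.append((bag - {x, y}) | {merged})
        else:
            new_bags.append(set(bag))
    return new_bags


# Toy input (hypothetical): bags of the path 1-2-3-4, contracting edge 2-3.
bags = [{1, 2}, {2, 3}, {3, 4}]
contracted = contract_edge_in_bags(bags, 2, 3, "x'")
```

The bags containing the merged vertex are exactly those that contained x or y before, so they form the union of two subtrees that share the bag holding both endpoints.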

4.4 Construction of a hierarchical tree

Here we show how a hierarchical tree for a graph with bounded k-tree-breadth is built.

Let G = (V,E) be a connected n-vertex, m-edge graph with $tb_k(G) \le \rho$ and $n \ge k$. Lemma 9 guarantees that G has a balanced $D^k_r$-separator with $r \le \rho$. Proposition 17 says that such a balanced $D^k_r$-separator of G can be found in $O(n^k m)$ time by an algorithm that works directly on the graph G and does not require construction of a tree-decomposition of G with k-breadth ≤ ρ. Using these and Lemma 10, we can build a rooted hierarchical tree $\mathcal{H}(G)$ for G, which is constructed as follows. If G is a connected graph with at most 2k + 1 vertices, then $\mathcal{H}(G)$ is a one-node tree with root node (V(G), G). It is known [104] that any connected graph with p ≥ 2 vertices has a dominating set of size ⌊p/2⌋, i.e., all vertices of it can be covered by ⌊p/2⌋ disks of radius one. Hence, in our case, G with at most 2k + 1 vertices can be covered by k disks of radius one each, i.e., there are k vertices $v_1, \dots, v_k$ such that $V(G) = D_r(v_1,G) \cup \dots \cup D_r(v_k,G)$ for r = 1 ≤ ρ. If G is a connected graph with more than 2k + 1 vertices, find a balanced $D^k_r$-separator of minimum radius r in $O(n^k m)$ time and construct the corresponding graphs $G^+_1, \dots, G^+_q$. For each graph $G^+_i$, $i \in \{1, \dots, q\}$ (by Lemma 10, $tb_k(G^+_i) \le \rho$), construct a hierarchical tree $\mathcal{H}(G^+_i)$ recursively and build $\mathcal{H}(G)$ by taking the pair $(D^k_r, G)$ to be the root and connecting the root of each tree $\mathcal{H}(G^+_i)$ as a child of $(D^k_r, G)$.

The depth of this tree $\mathcal{H}(G)$ is the smallest integer p such that

$$\frac{n}{2^p} + k\left(\frac{1}{2^{p-1}} + \dots + \frac{1}{2} + 1\right) \le 2k + 1,$$

that is, the depth is at most $\log_2 n$. It is also not hard to see that, given a graph G with n vertices and m edges, a hierarchical tree $\mathcal{H}(G)$ can be constructed in $O((kn)^{k+2} \log^{k+1} n)$ total time. There are at most O(log n) levels in $\mathcal{H}(G)$, and one needs to do at most $O((n + kn \log n)^k (m + k^2 n \log n)) \le O((kn)^{k+2} \log^k n)$ operations per level since the total number of edges in the graphs of each level is at most $O(m + k^2 n \log n)$ and the total number of vertices in those graphs cannot exceed $O(n + kn \log n)$.
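The depth condition can be simplified by summing the geometric series. The following worked step is a sketch added for illustration; it only rewrites the inequality stated above and introduces no new assumptions beyond $n > 2k + 1$.

```latex
\[
\frac{n}{2^p} + k\sum_{t=0}^{p-1}\frac{1}{2^t}
  = \frac{n}{2^p} + k\left(2 - \frac{1}{2^{p-1}}\right)
  = 2k + \frac{n - 2k}{2^p},
\]
\[
2k + \frac{n - 2k}{2^p} \le 2k + 1
  \quad\Longleftrightarrow\quad
  2^p \ge n - 2k .
\]
```

So the condition holds as soon as $2^p$ reaches $n - 2k$, which is the source of the logarithmic bound on the depth.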

For nodes of $\mathcal{H}(G)$, we use the same notation as in Chapter 3. For a node Y of $\mathcal{H}(G)$, since it is associated with a pair $(D^k_{r'}, G')$, where $r' \le \rho$, $G'$ is a minor of G and $D^k_{r'} = D_{r'}(v'_1, G') \cup \dots \cup D_{r'}(v'_k, G')$, it is convenient to denote $G'$ by $G(\downarrow Y)$, $\{v'_1, \dots, v'_k\}$ by $c(Y) = \{c_1(Y), \dots, c_k(Y)\}$, $r'$ by $r(Y)$, and $D^k_{r'}$ by Y itself. Thus, $(D^k_{r'}, G') = (\bigcup_{l=1}^{k} D_{r(Y)}(c_l(Y), G(\downarrow Y)), G(\downarrow Y)) = (Y, G(\downarrow Y))$ in these notations, and we identify node Y of $\mathcal{H}(G)$ with the set $\bigcup_{l=1}^{k} D_{r(Y)}(c_l(Y), G(\downarrow Y))$ and associate with this node also the graph $G(\downarrow Y)$. If now $(Y^0, Y^1, \dots, Y^h)$ is the path of $\mathcal{H}(G)$ connecting the root $Y^0$ of $\mathcal{H}(G)$ with a node $Y^h$, then the vertex set of the graph $G(\downarrow Y^h)$ consists of some (original) vertices of G plus at most kh meta vertices representing the disks $D_{r(Y^i)}(c_1(Y^i), G(\downarrow Y^i)), \dots, D_{r(Y^i)}(c_k(Y^i), G(\downarrow Y^i))$ of $Y^i$, $i = 0, 1, \dots, h - 1$. Note also that each (original) vertex of G belongs to exactly one node of $\mathcal{H}(G)$.

4.5 Construction of collective additive tree spanners

Let G = (V,E) be a connected n-vertex, m-edge graph and assume that $tb_k(G) \le \rho$ and $n \ge k$. Let $\mathcal{H}(G)$ be a hierarchical tree of G. Consider an arbitrary node $Y^h$ of $\mathcal{H}(G)$, and let $(Y^0, Y^1, \dots, Y^h)$ be the path of $\mathcal{H}(G)$ connecting the root $Y^0$ of $\mathcal{H}(G)$ with $Y^h$. Let $\widehat{G}(\downarrow Y^j)$ be the graph obtained from $G(\downarrow Y^j)$ by removing all its meta vertices (note that $\widehat{G}(\downarrow Y^j)$ may be disconnected and that all meta vertices of $G(\downarrow Y^j)$ come from previous levels of $\mathcal{H}(G)$). We have the following analog of Lemma 4.

Lemma 11. For any vertex z from $Y^h \cap V(G)$, there exists an index $i \in \{0, 1, \dots, h\}$ such that the vertices z and $c_l(Y^i)$, for some $l \in \{1, \dots, k\}$, can be connected in the graph $\widehat{G}(\downarrow Y^i)$ by a path of length at most $\rho(h + 1)$. In particular, $d_G(z, c_l(Y^i)) \le \rho(h + 1)$ holds.

Proof. The proof is similar to the proof of Lemma 4 of Chapter 3. Set $G_h := G(\downarrow Y^h)$ and $c := c_l(Y^h)$, where $z \in D_l \subseteq D_{r(Y^h)}(c_l(Y^h), G_h)$ (for the definition of set $D_l$ see the first paragraph of Section 4.3). Let $SP^{G_h}_{c,z}$ be a shortest path of $G_h$ connecting vertices c and z. We know that this path has at most $r(Y^h) \le \rho$ edges. If $SP^{G_h}_{c,z}$ does not contain any meta vertices, then this path is a path of $\widehat{G}(\downarrow Y^h)$ and of G and therefore $d_G(c, z) \le \rho$ holds.

Assume now that $SP^{G_h}_{c,z}$ does contain meta vertices and let $\mu'$ be the meta vertex of $SP^{G_h}_{c,z}$ closest to z (see Figure 11 of Chapter 3). Let $SP^{G_h}_{c,z} = (c, \dots, a', \mu', b', \dots, z)$. By construction of $\mathcal{H}(G)$, meta vertex $\mu'$ was created at some earlier recursive step to represent one disk of $Y^{i'}$ of graph $G_{i'} := G(\downarrow Y^{i'})$ for some $i' \in \{0, \dots, h - 1\}$. Hence, there is a path $P^{G_{i'}}_{c',z} = (c', \dots, b', \dots, z)$ of length at most $2\rho$ in $G_{i'}$ with $c' := c_{l'}(Y^{i'})$ for some $l' \in \{1, \dots, k\}$. Again, if $P^{G_{i'}}_{c',z}$ does not contain any meta vertices, then this path is a path of $\widehat{G}(\downarrow Y^{i'})$ and of G and therefore $d_G(c', z) \le 2\rho$ holds. If $P^{G_{i'}}_{c',z}$ does contain meta vertices then again, “unfolding” a meta vertex $\mu''$ of $P^{G_{i'}}_{c',z}$ closest to z, we obtain a path $P^{G_{i''}}_{c'',z}$ of length at most $3\rho$ in $G_{i''} := G(\downarrow Y^{i''})$ with $c'' := c_{l''}(Y^{i''})$ for some $i'' \in \{0, \dots, i' - 1\}$ and $l'' \in \{1, \dots, k\}$.

We continue “unfolding” this way meta vertices closest to z. Eventually, after at most h steps, we will arrive at the situation when, for some index $i^* \in \{0, 1, \dots, h\}$, a path of length at most $\rho(h + 1)$ will connect vertices z and $c_{l^*}(Y^{i^*})$, for some $l^* \in \{1, \dots, k\}$, in the graph $\widehat{G}(\downarrow Y^{i^*})$.

Let $B^i_1, \dots, B^i_{p_i}$ be the nodes at depth i of the tree $\mathcal{H}(G)$. Assume $B^i_j = \bigcup_{l=1}^{k} D_r(c^i_j(l), G(\downarrow B^i_j))$, where $r := r(B^i_j)$. Denote the k central vertices of $B^i_j$ by $C^i_j = \{c^i_j(1), c^i_j(2), \dots, c^i_j(k)\}$. For each node $B^i_j$, consider its (central) vertex $c^i_j(l)$ ($l \in \{1, \dots, k\}$). If $c^i_j(l)$ is an original vertex of G (not a meta vertex created during the construction of $\mathcal{H}(G)$), then define a connected graph $G^i_j(l)$ obtained from $G(\downarrow B^i_j)$ by removing all its meta vertices. If removal of those meta vertices produces several connected components, choose as $G^i_j(l)$ the component that contains the vertex $c^i_j(l)$. Denote by $T^i_j(l)$ a BFS-tree of graph $G^i_j(l)$ rooted at vertex $c^i_j(l)$ of $B^i_j$.

The trees $T^i_j(l)$ ($i = 0, 1, \dots, depth(\mathcal{H}(G))$, $j = 1, 2, \dots, p_i$, $l = 1, 2, \dots, k$), obtained this way, are called local subtrees of G. Clearly, the construction of these local subtrees can be incorporated into the procedure of constructing a hierarchical tree $\mathcal{H}(G)$ of G and will not increase the overall $O((kn)^{k+2} \log^{k+1} n)$ run-time (see Section 4.4).

Since Lemma 5 and Lemma 6 hold for G, similarly to the proof of Lemma 7, one can prove its analog for graphs with bounded k-tree-breadth.

Lemma 12. For any two vertices $x, y \in V(G)$, there exists a local subtree T such that $d_T(x, y) \le d_G(x, y) + 2\rho(1 + \log_2 n)$.

This lemma implies the following two results. Let G be a graph with n vertices and m edges having $tb_k(G) \le \rho$. Let also $\mathcal{H}(G)$ be its hierarchical tree and $\mathcal{LT}(G)$ be the family of all its local subtrees (defined above). Consider a graph H obtained by taking the union of all local subtrees of G (by putting all of them together). Clearly, H is a spanning subgraph of G, constructible in polynomial time for every fixed k. We have $d_H(x, y) \le d_G(x, y) + 2\rho(1 + \log_2 n)$ for any two vertices x and y of G. Also, since for every level i ($i = 0, 1, \dots, depth(\mathcal{H}(G))$) of hierarchical tree $\mathcal{H}(G)$, the corresponding local subtrees $T^i_1(l), \dots, T^i_{p_i}(l)$ for each fixed index $l \in \{1, \dots, k\}$ are pairwise vertex-disjoint, their union has at most $n - 1$ edges. Therefore, H cannot have more than $k(n - 1)(1 + \log_2 n)$ edges in total. Thus, we have the following result.

Theorem 7. Every graph G with n vertices and $tb_k(G) \le \rho$ admits an additive $(2\rho(1 + \log_2 n))$-spanner with at most O(kn log n) edges constructible in polynomial time for every fixed k.

For a node $B^i_j$ of $\mathcal{H}(G)$, let $\mathcal{T}^i_j = \{T^i_j(1), \dots, T^i_j(k)\}$ be the set of its local subtrees. Instead of taking the union of all local subtrees of G, one can fix i ($i \in \{0, 1, \dots, depth(\mathcal{H}(G))\}$) and fix $l \in \{1, \dots, k\}$ and consider separately the union of only local subtrees $T^i_1(l), \dots, T^i_{p_i}(l)$, corresponding to the lth subtrees of level i of the hierarchical tree $\mathcal{H}(G)$, and then extend in linear O(m) time that forest to a spanning tree $T^i(l)$ of G (using, for example, a variant of Kruskal’s Spanning Tree algorithm for unweighted graphs). We call this tree $T^i(l)$ the lth spanning tree of G corresponding to the level i of the hierarchical tree $\mathcal{H}(G)$. In this way we can obtain at most $k(1 + \log_2 n)$ spanning trees for G, k trees for each level i of $\mathcal{H}(G)$. Denote the collection of those spanning trees by $\mathcal{T}(G)$. Thus, we deduce the following theorem.

Theorem 8. Every graph G with n vertices and $tb_k(G) \le \rho$ admits a system $\mathcal{T}(G)$ of at most $k(1 + \log_2 n)$ collective additive tree $(2\rho(1 + \log_2 n))$-spanners constructible in polynomial time for every fixed k.

4.6 Additive Spanners for Graphs Admitting (Multiplicative) t-Spanners of Bounded Tree-width

In this section, we show that if a graph G admits a (multiplicative) t-spanner H with

tw(H) = k−1 then its k-tree-breadth is at most ⌈t/2⌉. As a consequence, we obtain that, for every fixed k, there is a polynomial time algorithm that, given an n-vertex graph G

admitting a (multiplicative) t-spanner with tree-width at most k −1, constructs a system of at most k(1 + log2 n) collective additive tree O(t log n)-spanners of G.

4.6.1 k-Tree-breadth of a graph admitting a t-spanner of bounded tree-width

Let H be a graph with tree-width k − 1, and let $\mathcal{T}(H) = (\{X_i \mid i \in I\}, T = (I,F))$ be its tree-decomposition of width k − 1. For an integer $r \ge 0$, denote by $X^{(r)}_i$, $i \in I$, the set $D_r(X_i, H) := \bigcup_{x \in X_i} D_r(x, H)$. Clearly, $X^{(0)}_i = X_i$ for every $i \in I$. The following important lemma holds.

Lemma 13. For every integer $r \ge 0$, $\mathcal{T}^{(r)}(H) := (\{X^{(r)}_i \mid i \in I\}, T = (I,F))$ is a tree-decomposition of H with k-breadth ≤ r.

Proof. It is enough to show that the third condition of tree-decompositions (see Subsection 1.3.1) is fulfilled for $\mathcal{T}^{(r)}(H)$. That is, for all $i, j, k \in I$, if j is on the path from i to k in T, then $X^{(r)}_i \cap X^{(r)}_k \subseteq X^{(r)}_j$. We know that $X_i \cap X_k \subseteq X_j$ holds and need to show that for every vertex v of H, $d_H(v, X_i) \le r$ and $d_H(v, X_k) \le r$ imply $d_H(v, X_j) \le r$. Assume, by way of contradiction, that for some integer r > 0 and for some vertex v of H, $d_H(v, X_j) > r$ while $d_H(v, X_i) \le r$ and $d_H(v, X_k) \le r$.

Consider the original tree-decomposition T (H). It is known [65] that if ab (a, b ∈ I)

is an edge of the tree T = (I,F ) of tree-decomposition T (H), and Ta, Tb are the subtrees 94

of T obtained after removing edge ab from T , then S = Xa ∩ Xb separates in H vertices

belonging to bags of Ta but not to S from vertices belonging to bags of Tb but not to S.

We will use this nice separation property.

Let T \ {j} be the forest obtained from T by removing node j, and let T(i) and T(k) be the trees from this forest containing nodes i and k, respectively. Clearly, T(i) and T(k) are disjoint. The above separation property and the inequalities dH(v, Xi) ≤ r < dH(v, Xj) ensure that the vertex v belongs to a node (a bag) of T(i): Xj cannot separate in H vertex v from a vertex xi of Xi with dH(v, Xi) = dH(v, xi), since otherwise dH(v, Xi) > dH(v, Xj) would hold. Similarly, the inequalities dH(v, Xk) ≤ r < dH(v, Xj) and the above separation property guarantee that the vertex v belongs to a node of T(k). But then, the third condition of tree-decompositions says that v must also belong to the bag Xj of T(H). The latter, however, contradicts the assumption that dH(v, Xj) > r ≥ 0.
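The bag expansion used in Lemma 13 can be sketched in a few lines. This is only an illustration under stated assumptions: H is taken to be unweighted (so disks are measured in hops), and the names `expand_bags`, `adj` and `bags` as well as the toy path graph are hypothetical and do not come from the dissertation.

```python
from collections import deque

def expand_bags(adj, bags, r):
    """For each bag X_i return X_i^(r): all vertices of H within hop
    distance r of some vertex of X_i (multi-source BFS truncated at r)."""
    expanded = {}
    for i, bag in bags.items():
        dist = {x: 0 for x in bag}
        queue = deque(bag)
        while queue:
            u = queue.popleft()
            if dist[u] == r:          # do not grow the disk beyond radius r
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        expanded[i] = set(dist)
    return expanded

# Hypothetical toy input: the path a-b-c-d-e with bags {a,b},{b,c},{c,d},{d,e}
# arranged along a path-shaped decomposition tree 0-1-2-3.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
bags = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"d", "e"}}
expanded = expand_bags(adj, bags, 1)
```

On this toy input the expanded bags 0 and 2 intersect in {b, c}, which is contained in the expanded bag 1 lying between them, as the third condition requires.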

Now we can prove the main lemma of this section.

Lemma 14. If a graph G admits a t-spanner with tree-width k − 1, then tbk(G) ≤ ⌈t/2⌉.

Proof. Let $H$ be a $t$-spanner of $G$ with $tw(H) = k-1$ and let $T(H) = (\{X_i \mid i \in I\}, T = (I, F))$ be a tree-decomposition of $H$ of width $k-1$. We claim that $T(G) := T^{(\lceil t/2 \rceil)}(H) := (\{X_i^{(\lceil t/2 \rceil)} \mid i \in I\}, T = (I, F))$ is a tree-decomposition of $G$ with $k$-breadth $\le \lceil t/2 \rceil$. See Figure 14 for an illustration.

By Lemma 13, $T^{(\lceil t/2 \rceil)}(H)$ is a tree-decomposition of $H$ with $k$-breadth $\le \lceil t/2 \rceil$. Hence, the first and the third conditions of tree-decompositions hold for $T(G)$. For every pair $u, v$ of vertices of $G$, $d_G(u, v) \le d_H(u, v)$. Therefore, every disk $D_{\lceil t/2 \rceil}(x, H)$ of $H$ is contained in a disk $D_{\lceil t/2 \rceil}(x, G)$ of $G$. This implies that every bag of $T(G)$ is covered by at most $k$ disks of $G$ of radius at most $\lceil t/2 \rceil$ each, i.e.,

$$X_i^{(\lceil t/2 \rceil)} = D_{\lceil t/2 \rceil}(X_i, H) = \bigcup_{x \in X_i} D_{\lceil t/2 \rceil}(x, H) \subseteq \bigcup_{x \in X_i} D_{\lceil t/2 \rceil}(x, G).$$

We need only to show additionally that each edge $uv$ of $G$ belongs to some bag of $T(G)$. Since $H$ is a $t$-spanner of $G$, $d_H(u, v) \le t$ holds. Let $x$ be a middle vertex of a shortest path connecting $u$ and $v$ in $H$. Then $d_H(x, u) \le \lceil t/2 \rceil$ and $d_H(x, v) \le \lceil t/2 \rceil$, so both $u$ and $v$ belong to the disk $D_{\lceil t/2 \rceil}(x, H)$. Let $X_i$ be a bag of $T(H)$ containing vertex $x$. Then, both $u$ and $v$ are contained in $X_i^{(\lceil t/2 \rceil)}$, a bag of $T(G)$.
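The construction in the proof of Lemma 14 can be checked on a small instance. The sketch below is illustrative only: the 6-cycle G, its spanning path H (a 5-spanner of tree-width 1) and the helper `disk` are assumptions chosen for the example, not data or code from the dissertation.

```python
from collections import deque
from math import ceil

def disk(adj, sources, r):
    """Vertices within hop distance r of any source (multi-source BFS)."""
    dist = {x: 0 for x in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        if dist[u] < r:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return set(dist)

n, t = 6, 5
g_edges = [(i, (i + 1) % n) for i in range(n)]                       # G: the 6-cycle
h_adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}  # H: path 0-1-...-5
h_bags = {i: {i, i + 1} for i in range(n - 1)}                       # width-1 decomposition of H

r = ceil(t / 2)                                                      # radius used in Lemma 14
g_bags = {i: disk(h_adj, bag, r) for i, bag in h_bags.items()}

# Every edge of G, including the edge (5, 0) missing from H, must lie in some bag.
uncovered = [e for e in g_edges if not any(set(e) <= b for b in g_bags.values())]
```

Here the edge (5, 0) of G is absent from H, yet it is covered by the expanded bag grown from {2, 3}, which contains the middle vertex of the H-path between 5 and 0.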

4.6.2 Consequences

Now we give two implications of the above results for the class of graphs admitting

(multiplicative) t-spanners with tree-width k−1. They are direct consequences of Lemma

14, Theorem 7 and Theorem 8.

Theorem 9. Let $G$ be a graph with $n$ vertices and $m$ edges having a (multiplicative) $t$-spanner with tree-width $k-1$. Then $G$ admits an additive $(2\lceil t/2 \rceil (1 + \log_2 n))$-spanner with at most $O(kn \log n)$ edges constructible in polynomial time for every fixed $k$.

Theorem 10. Let $G$ be a graph with $n$ vertices and $m$ edges having a (multiplicative) $t$-spanner with tree-width $k-1$. Then $G$ admits a system $\mathcal{T}(G)$ of at most $k(1 + \log_2 n)$ collective additive tree $(2\lceil t/2 \rceil (1 + \log_2 n))$-spanners constructible in polynomial time for every fixed $k$.
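To get a feel for the magnitudes in Theorem 10, the two bounds can be evaluated numerically. The function name `spanner_bounds` and the sample values of n, t and k below are hypothetical, chosen only to make the arithmetic concrete.

```python
from math import ceil, log2

def spanner_bounds(n, t, k):
    """Evaluate the bounds of Theorem 10 for an n-vertex graph that has a
    t-spanner of tree-width k - 1."""
    levels = 1 + log2(n)
    return {
        "max_trees": k * levels,                         # k(1 + log2 n)
        "additive_stretch": 2 * ceil(t / 2) * levels,    # 2*ceil(t/2)*(1 + log2 n)
    }

bounds = spanner_bounds(n=1024, t=3, k=2)
```

For these sample values the system has at most 22 trees and the additive stretch is at most 44.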

Figure 14: Illustration to the proof of Lemma 14. A tree-decomposition for G is obtained from a tree-decomposition of H. (a) A graph G. (b) A 2-spanner H of G with tree-width 2. (c) Tree-decomposition T(H) of width 2. (d) Tree-decomposition T(G) = T^(1)(H) of 3-tree-breadth equal to 1.

CHAPTER 5

Embedding of Weighted Graphs into Trees: Theoretical Grounds and Empirical Analysis on Real Datasets

In this chapter, we present our work on the problem of embedding weighted graphs into (weighted) trees. One of the applications of this problem is the reconstruction of the evolutionary tree from evolutionary distances between species [81, section 4.3] and [5].

We say that a weighted graph G = (V,E) has a non-contractive embedding with distortion λ into a tree T = (V ∪ S, E′) (a weighted tree, possibly with Steiner vertices S) if T satisfies the following two conditions:

(1) ∀x, y ∈ V, dG(x, y) ≤ dT (x, y) (non-contractibility);

(2) ∀x, y ∈ V, dT (x, y) ≤ λdG(x, y) (bounded expansion).

The problem of the minimum distortion non-contractive embedding of a weighted graph is to find a tree embedding with the minimum distortion λ∗.
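Conditions (1) and (2) can be tested directly once both metrics are available. The following is a minimal sketch, assuming distances are supplied as callables; `embedding_quality` and the toy cycle-into-star instance are hypothetical illustrations, not part of the dissertation.

```python
from itertools import combinations

def embedding_quality(vertices, d_g, d_t):
    """Return (is_non_contractive, distortion), where distortion is the
    smallest lambda satisfying d_T(x, y) <= lambda * d_G(x, y) for all pairs."""
    non_contractive = True
    distortion = 1.0
    for x, y in combinations(vertices, 2):
        g, t = d_g(x, y), d_t(x, y)
        non_contractive = non_contractive and g <= t   # condition (1)
        distortion = max(distortion, t / g)            # condition (2)
    return non_contractive, distortion

# Hypothetical toy instance: the unit-weight 4-cycle 0-1-2-3-0 embedded into a
# star whose Steiner center is joined to every vertex by an edge of weight 1,
# so every pair of cycle vertices is at tree distance 2.
cycle = lambda x, y: min((x - y) % 4, (y - x) % 4)
star = lambda x, y: 2.0
ok, lam = embedding_quality(range(4), cycle, star)
```

In this toy instance the embedding is non-contractive and its distortion is 2, attained on adjacent cycle vertices.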

The approach we use is an extension of the approach of [54] of embedding unweighted graphs into trees. First we present a graph decomposition procedure (layering partition) used for our embedding.


5.1 Layering partition for weighted graphs

Layering partition was introduced in [39] and has been used in [21,54] for embedding graph metrics into trees. We extend the layering partition procedure from unweighted graphs to weighted graphs.

Let $h$ be a positive real number and $G = (V, E)$ be a weighted connected graph with a distinguished vertex $s$, and let $r = \lceil \max_{x \in V} d_G(s, x)/h \rceil$. A layering of a weighted graph $G$ with respect to the special vertex $s$ is the partition of $V$ into the layers (spheres or rings) $L^i = \{v \in V : ih \le d_G(s, v) < (i+1)h\}$, $i = 0, 1, \ldots, r$, of width $h$. A layering partition $\mathcal{LP}(s, h) = \{L^i_1, \ldots, L^i_{p_i}\}$ of $G$ is a partition of each layer $L^i$ into clusters $L^i_1, \ldots, L^i_{p_i}$ such that two vertices $u, v \in L^i$ belong to the same cluster $L^i_j$ if and only if they can be connected via a path outside the ball $B_{(i-1)h}(s)$ of radius $(i-1)h$ centered at $s$. In other words, clusters can be defined as follows: if $X^i_j$ is a connected component of $G \setminus \{L^0, \ldots, L^{i-1}\}$, then cluster $L^i_j$ is equal to $X^i_j \cap L^i$. For illustration see Figure 15.

It was proved in [51] that such a layering partition can be found in linear time for unweighted graphs. We extend the approach of [51] to work for weighted graphs. This is done in two phases. The first phase finds the layers $\{L^0, \ldots, L^r\}$ using Dijkstra's single source shortest path algorithm, starting from the special vertex $s$. The second phase finds the clusters $L^i_1, \ldots, L^i_{p_i}$ for each layer. This is done as follows. Start from the layer $L^r$ farthest from $s$ and find the connected components of the graph induced by $L^r$. These connected components are the clusters of the layer $L^r$. Then, contract each of these connected components into a single node. Then find the connected components in the graph induced by $L^{r-1}$ and the set of contracted nodes. We proceed in the same way down the layers until layer 1. Our layering partition procedure for weighted graphs runs in $O(|E| \log |V|)$ time, where $|E|$ and $|V|$ are the numbers of edges and vertices of a graph $G = (V, E)$, respectively.
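The two-phase procedure can be sketched as follows. This is not the dissertation's implementation: contraction of components is replaced here by a union-find structure (an equivalent bookkeeping device), and the names `dijkstra`, `layering_partition` and the toy graph are assumptions made for illustration.

```python
import heapq
from collections import defaultdict

def dijkstra(adj, s):
    """Single-source shortest path distances; adj[u] = [(v, weight), ...]."""
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def layering_partition(adj, s, h):
    """Phase 1: layers of width h. Phase 2: clusters, processing layers from
    the farthest one inward and merging vertices with union-find."""
    dist = dijkstra(adj, s)
    layer = {v: int(dist[v] // h) for v in adj}
    by_layer = defaultdict(list)
    for v, i in layer.items():
        by_layer[i].append(v)

    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    clusters = {}
    for i in sorted(by_layer, reverse=True):
        for v in by_layer[i]:
            parent[v] = v
        for v in by_layer[i]:
            for u, _ in adj[v]:
                if u in parent:            # u lies in layer i or farther
                    parent[find(u)] = find(v)
        groups = defaultdict(set)
        for v in by_layer[i]:
            groups[find(v)].add(v)
        clusters[i] = list(groups.values())
    return layer, clusters

# Hypothetical toy graph with largest edge weight w = 2 and cluster-width h = 2.
adj = {
    "s": [("a", 1.0), ("b", 1.5)],
    "a": [("s", 1.0), ("c", 2.0), ("e", 2.0)],
    "b": [("s", 1.5), ("d", 1.0)],
    "c": [("a", 2.0), ("d", 0.5)],
    "d": [("b", 1.0), ("c", 0.5)],
    "e": [("a", 2.0)],
}
layer, clusters = layering_partition(adj, "s", 2.0)
```

In this toy graph layer 1 splits into the clusters {c, d} and {e}: c and d are joined by an edge inside the layer, while e reaches them only through layer 0.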

Let $\Gamma(s, h)$ be the graph whose vertex set is the set of all clusters $L^i_j$ of a layering partition $\mathcal{LP}(s, h)$ of a given graph $G$. Two nodes $C = L^i_j$ and $C' = L^{i'}_{j'}$ are adjacent in $\Gamma(s, h)$ if and only if there exist $u \in L^i_j$ and $v \in L^{i'}_{j'}$ such that $d_G(u, v) \le h$. See Figure 15c for illustration. It was proved in [51] that $\Gamma$ has a tree structure; it is called the layering tree. For a weighted graph with non-negative weights, $\Gamma$ is found in $O(|E| \log |V|)$ time using the above procedure of layering partition with Dijkstra's algorithm. In our following discussion, we assume that the layering tree $\Gamma(s, h)$ is rooted at the cluster containing the special vertex $s$. Also, to guarantee that no edge crosses non-consecutive layers in $\mathcal{LP}(s, h)$, we assume that the cluster-width $h$ is greater than or equal to the weight $w$ of the longest edge in the graph (i.e., $h \ge w$).

5.2 Properties of layering partition for weighted graphs

In the following we prove some properties of layering partition related to our problem

of embedding weighted graphs into trees. First we prove a bound on the diameter of

clusters in a layering partition for such graphs.

We use proofs similar to [54] to prove the following two lemmas.

Lemma 15. If a graph $G$ embeds into a tree $T$ with multiplicative distortion $\lambda$, then for any $x, y \in V$, any path $P_G(x, y)$ between $x$ and $y$ in $G$ and any vertex $c \in P_T(x, y)$, $d_T(c, P_G(x, y)) \le \frac{\lambda w}{2}$, where $w$ is the largest edge weight of the graph $G$.

Proof. Removing $c$ from $T$, we consider the subtree $T_y$ of $T \setminus \{c\}$ containing vertex $y$. Since $x \notin T_y$, we can find an edge $ab$ of $P_G(x, y)$ with $a \in T_y$ and $b \notin T_y$. Therefore, the path $P_T(a, b)$ must go via $c$. If $d_T(c, a) > \frac{\lambda w}{2}$ and $d_T(c, b) > \frac{\lambda w}{2}$, then $d_T(a, b) = d_T(a, c) + d_T(b, c) > \lambda w$. This would contradict the assumption that the embedding of $G$ has a distortion of at most $\lambda$, as condition 2 implies that $d_T(a, b) \le \lambda d_G(a, b) \le \lambda w$. By the fact that $d_T(c, P_G(x, y)) \le \min\{d_T(c, a), d_T(c, b)\} \le \frac{\lambda w}{2}$, we conclude our proof.

Figure 15: A layering partition of a weighted graph G. (a) Layering of G with respect to s. (b) Clusters of the layering partition LP(s, h) of G. (c) The layering tree Γ(s, h). (d) The tree H associated with LP(s, h).

Lemma 16. For a given graph G that is embeddable into a tree T with distortion λ, the diameter of any cluster C of a layering partition with width h of G is at most 3λw + 2h.

In other words, ∀x, y ∈ C, dG(x, y) ≤ 3λw + 2h.

Proof. Let $P_G(x, y)$ be a path connecting $x$ and $y$ in $X^i_j$. Let $P_G(s, x)$ and $P_G(s, y)$ be two shortest paths of $G$ connecting $s$ with $x$ and $s$ with $y$, respectively. Let $c$ be the least common ancestor of $x$ and $y$ in $T$ (i.e., $c = P_T(x, y) \cap P_T(s, x) \cap P_T(s, y)$). Let $a$, $b$ and $z$ be the closest vertices of $P_G(s, x)$, $P_G(s, y)$ and $P_G(x, y)$, respectively, to $c$ in the tree $T$, i.e., $d_T(c, a) = d_T(c, P_G(s, x))$, $d_T(c, b) = d_T(c, P_G(s, y))$ and $d_T(c, z) = d_T(c, P_G(x, y))$. By applying Lemma 15 three times, we have: $d_T(c, a) \le \frac{\lambda w}{2}$, $d_T(c, b) \le \frac{\lambda w}{2}$ and $d_T(c, z) \le \frac{\lambda w}{2}$. From the triangle inequality, condition 1 and the previous inequalities, we conclude that

$$d_G(a, z) \le d_G(a, c) + d_G(c, z) \le d_T(a, c) + d_T(c, z) \le \frac{\lambda w}{2} + \frac{\lambda w}{2} \le \lambda w.$$

Also, we claim that $d_G(a, x) \le \lambda w + h$. Since $d_G(s, a) = d_G(s, x) - d_G(a, x)$ and by the triangle inequality, we have

$$d_G(s, z) \le d_G(s, a) + d_G(a, z) = d_G(s, x) - d_G(a, x) + d_G(a, z).$$

From the definition of clusters, we have $d_G(s, x) < (i+1)h$ and $d_G(s, z) \ge ih$. Thus, we have $d_G(a, x) \le (i+1)h - ih + \lambda w = \lambda w + h$. In an analogous way, we can prove that $d_G(b, y) \le \lambda w + h$. Now, by condition 1 of non-contractibility and the triangle inequality we have

$$d_G(a, b) \le d_T(a, b) = d_T(a, c) + d_T(c, b) \le \frac{\lambda w}{2} + \frac{\lambda w}{2} \le \lambda w.$$

Now, summing these inequalities, we conclude our proof:

$$d_G(x, y) \le d_G(x, a) + d_G(a, b) + d_G(b, y) \le 3\lambda w + 2h.$$

Corollary 1. Given the tree embedding of a graph $G$ with the minimum distortion of $\lambda^*$, $\lambda^* \ge \frac{\Delta_s(h) - 2h}{3w}$, where $\Delta_s(h)$ is the maximal diameter of a cluster in the layering partition $\mathcal{LP}(s, h)$ of $G$.
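As a quick numeric illustration of Corollary 1, the bound can be evaluated for one dataset. The values below are the Geom row of Table 13 (largest cluster diameter 536, with h = w = 77); the function name `distortion_lower_bound` is hypothetical, and the result depends on the randomly chosen start vertex s used for that table.

```python
def distortion_lower_bound(delta, h, w):
    """Corollary 1: lambda* >= (Delta_s(h) - 2h) / (3w)."""
    return (delta - 2 * h) / (3 * w)

# Geom: Delta_s(h) = 536, h = w = 77, giving (536 - 154) / 231.
bound = distortion_lower_bound(delta=536, h=77, w=77)
```

For Geom this gives a lower bound of roughly 1.65 on the optimal distortion. Whenever the expression evaluates to less than 1, the bound is uninformative, since every non-contractive embedding has distortion at least 1.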

5.3 Construction of tree embedding

Given a weighted graph $G = (V, E)$ and a layering partition $\mathcal{LP}(s, h)$ of $G$ with cluster-width $h$, our embedding constructs a tree $H = (V \cup S, E')$, where $S$ is a set of Steiner points, such that $H$ closely reproduces the global structure of the layering tree $\Gamma(s, h)$. Let $C = L^i_j \in \mathcal{LP}(s, h)$ be a node (cluster) in $\Gamma(s, h)$ and $P(C) = L^{i-1}_k \in \mathcal{LP}(s, h)$ be its parent in $\Gamma(s, h)$. The construction of $H$ creates for each cluster $C$ a new vertex (Steiner point) $s_C$ and makes it adjacent in $H$ to all vertices $v \in C$. Also, it connects each Steiner point $s_C$ to the Steiner point $s_{P(C)}$ of its parent (for illustration see Figure 15d).

The weighting of edges of the constructed tree $H = (V \cup S, E')$ is done as follows. Edges between Steiner points are weighted uniformly with the cluster-width $h$. Edges between the vertices of a given cluster $C$ and their Steiner point are weighted with $\Delta_s(h)/2 + h$, where $\Delta_s(h)$ is the largest cluster diameter of the layering partition.

Now, we will show that such a weighting of $H$ produces a non-contractive embedding with bounded distortion.

Lemma 17. Given a weighted graph G = (V,E) and a weighted tree H = (V ∪ S, E′) constructed as described above, H provides a non-contractive embedding of G (i.e., ∀x, y ∈

V dG(x, y) ≤ dH (x, y)). Also, ∀x, y ∈ V, dH (x, y) ≤ dG(x, y) + 3λw + 6h.

Proof. First, we prove the non-contractiveness of the tree $H$. Let $C_x$ and $C_y$ be the two clusters in $\Gamma(s, h)$ containing vertices $x$ and $y$, respectively. Let $C$ be the nearest common ancestor of $C_x$ and $C_y$ in $\Gamma(s, h)$. Assume the depths of $C_x$, $C_y$ and $C$ in $\Gamma$ are $i$, $j$ and $k$, respectively. Let $x'$ be the closest vertex in $C$ to $x$ (i.e., $x' \in C$ such that $\forall z \in C$, $d_G(x, x') \le d_G(x, z)$). Let $y'$ be the closest vertex in $C$ to $y$ (i.e., $y' \in C$ such that $\forall z \in C$, $d_G(y, y') \le d_G(y, z)$). For illustration see Figure 16. By our construction, we have the following inequalities:

$$kh \le d_G(s, x') < (k+1)h,$$
$$kh \le d_G(s, y') < (k+1)h,$$
$$ih \le d_G(s, x) < (i+1)h,$$
$$jh \le d_G(s, y) < (j+1)h.$$

Now, let $x'' \in C$ be a vertex on the shortest path from $s$ to $x$. Also, let $y'' \in C$ be a vertex on the shortest path from $s$ to $y$. By our assumption that $h \ge w$, we can guarantee that no edge of the shortest path tree $SPT(s)$ rooted at $s$ crosses non-consecutive layers. Thus, such vertices $x''$ and $y''$ must exist. Since $x'$ is the closest vertex in $C$ to $x$, we have $d_G(x, x') \le d_G(x, x'') = d_G(s, x) - d_G(s, x'') < (i-k)h + h$. In the same way, we have $d_G(y, y') < (j-k)h + h$. By our construction of $H$ and the way weights are assigned to its edges, we have $d_H(x, y) = (i-k)h + (j-k)h + \Delta_s(h) + 2h$. By the triangle inequality, we have:

$$\begin{aligned}
d_G(x, y) &\le d_G(x, x') + d_G(y, y') + d_G(x', y') \\
&< (i-k)h + h + (j-k)h + h + \Delta_s(h) \\
&= (i-k)h + (j-k)h + \Delta_s(h) + 2h \\
&= d_H(x, y),
\end{aligned}$$

thus proving the non-contractiveness of our embedding into $H$.

Second, we prove the upper bound on the distances in the tree $H$. By the triangle inequality, we have $d_G(x, x') \ge d_G(s, x) - d_G(s, x')$. Since $ih \le d_G(s, x)$ and $d_G(s, x') < (k+1)h$, we have $d_G(x, x') > ih - (k+1)h = (i-k)h - h$. In the same way, we have $d_G(y, y') > (j-k)h - h$. Since $d_H(x, y) = (i-k)h + (j-k)h + \Delta_s(h) + 2h$ and by applying the last two inequalities, we have $d_H(x, y) < d_G(x, x') + d_G(y, y') + \Delta_s(h) + 4h$. Furthermore, we have $d_G(x, y) \ge d_G(x, x') + d_G(y, y')$. Applying this, we have $d_H(x, y) < d_G(x, y) + \Delta_s(h) + 4h$. By Lemma 16, we have $\Delta_s(h) \le 3\lambda w + 2h$, thus we can conclude that $d_H(x, y) < d_G(x, y) + 3\lambda w + 6h$.

An outline of our algorithm for embedding weighted graphs into trees is described in

Algorithm 1.

Given a weighted graph G = (V,E) with n vertices and m edges, the construction of the layering partition of G builds a shortest path tree SPT(s) originating from the vertex s using Dijkstra's algorithm in O(m log n) time. The weighting of the edges of H requires finding the largest cluster diameter and thus finding distances in G. We can calculate all pairwise distances in the graph by applying Dijkstra's algorithm n times, yielding O(nm log n) total time. Thus, our algorithm requires O(nm log n) time to construct the tree embedding H of the graph G.

Figure 16: Illustration of proof of Lemma 17.

Now we conclude our work with the following theorem.

Theorem 11. If a weighted graph G = (V,E) with n vertices and m edges admits a non-contractive embedding into a tree with distortion λ, then we construct a non- contractive tree embedding H of G in O(nm log n) time such that: ∀x, y ∈ V, dH (x, y) ≤ dG(x, y) + 3λw + 6h.

It is worth noting that our algorithm for the problem of embedding weighted graphs into trees with multiplicative distortion produces an additive distortion error of $3\lambda w + 6h$. To compare with other results, we recall the best results achieved in [21] for embedding a general metric into a tree metric. The algorithm of [21] produces a multiplicative error of $(\lambda \log n)^{\log^{1/2} \Delta}$, where $\Delta$ is the spread of the metric (i.e., the ratio of the diameter over the minimum distance in the metric). In comparison, our distortion has $w$ and $h$ as additive terms, while $\Delta$ appears in the exponent of the distortion error of [21]. Also, the algorithm of [7] requires $O(n^4)$ running time, while ours requires $O(nm \log n)$ time. The approach of [7] embeds a general metric into a tree with distortion $(1 + \epsilon)^{O(\log n)}$, where $\epsilon$ is a measure quantifying how close a given metric is to a tree, with values in the range [0, 1]. We found that the $\epsilon$ values for our datasets are equal to 1.

Algorithm 1 Approximation Algorithm for Embedding into Tree Metric

Input: A weighted graph G = (V,E), a root vertex s and the cluster-width h
Output: Tree embedding H for G

    Find the layering partition $\mathcal{LP}(s, h) = \{L^i_1, \ldots, L^i_{p_i} : i = 0, 1, \ldots, r\}$ of G
    Set initially H := (V, ∅)
    for i = r down to 0 do
        for each cluster C from $\{L^i_1, \ldots, L^i_{p_i}\}$ do
            Add to H a Steiner point $s_C$
            Add to H edges $\{v s_C : v \in C\}$ with weights $\Delta_s(h)/2 + h$
            for each child cluster Z of C in $L^{i+1}$ do
                Add to H the edge $s_C s_Z$ between Steiner points with weight h
            end for
        end for
    end for
    Return tree H
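The tree construction of Algorithm 1 can be sketched compactly once the clusters and their parents in the layering tree are known. The sketch below assumes those are given as dictionaries; the names `build_tree_embedding` and `tree_distance`, the cluster identifiers and the toy input are hypothetical and only serve to check the distance formula used in the proof of Lemma 17.

```python
def build_tree_embedding(clusters, parent_of, delta, h):
    """clusters: cluster id -> set of vertices; parent_of: cluster id -> parent
    cluster id (None for the root). Returns the weighted edges of H."""
    edges = {}
    for c, members in clusters.items():
        steiner = ("steiner", c)
        for v in members:
            edges[(v, steiner)] = delta / 2 + h          # vertex to its Steiner point
        if parent_of[c] is not None:
            edges[(steiner, ("steiner", parent_of[c]))] = h  # Steiner to parent Steiner
    return edges

def tree_distance(edges, x, y):
    """Distance between x and y along the unique path in the tree."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    stack, seen = [(x, 0.0)], {x}
    while stack:
        u, d = stack.pop()
        if u == y:
            return d
        for v, w in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append((v, d + w))
    raise ValueError("vertices are not connected")

# Hypothetical toy input: a root cluster C0 in layer 0 and a child C1 in layer 1,
# with largest cluster diameter 2.5 and cluster-width h = 2.
clusters = {"C0": {"s", "a", "b"}, "C1": {"c", "d"}}
parent_of = {"C0": None, "C1": "C0"}
edges = build_tree_embedding(clusters, parent_of, delta=2.5, h=2.0)
d_ac = tree_distance(edges, "a", "c")
```

With i = 0, j = 1 and k = 0, the formula (i − k)h + (j − k)h + ∆s(h) + 2h gives 0 + 2 + 2.5 + 4 = 8.5, which matches the distance computed in the sketch.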

5.4 Experiment

In this section, we evaluate our algorithm on real datasets. We test on a variety of real graph datasets including datasets of Internet measurements (MIT-PlanetLab, Cornell-King, HP-PlanetLab and routeview). We used these datasets since empirical studies of Internet measurements [148] indicated that the Internet has a tree-like structure to a certain degree (i.e., a good embeddability into a tree). Therefore, these datasets are useful to verify that our algorithm produces good tree embeddings in practice. We also run our algorithm on other types of datasets (social, biological and information networks) to measure tree-likeness in different domains. Furthermore, three datasets (routeview, yeast and DutchElite) are uniformly weighted (unweighted) graphs. They can be used to see how edge weights affect the results of our algorithm.

5.4.1 Datasets

The datasets are obtained from different domains (Internet measurements, social and collaboration networks, biological and information networks). Some parameters of these datasets are shown in Table 12. Original datasets have been preprocessed to remove violations of the triangle inequality in order to make each dataset a metric space. Also, some of the graph datasets were not connected; therefore, we run our algorithm on the largest connected component of such graphs.

| graph | n | m | largest edge | smallest edge | diameter |
|---|---|---|---|---|---|
| MIT-PlanetLab [156] | 416 | 10277 | 6708.4 | 0.1215 | 8623.81 |
| Cornell-King [161] | 2500 | 60758 | 146777 | 1001 | 284367 |
| HP-PlanetLab [3] | 410 | 76943 | 1352720 | 1215.21 | 3893440 |
| NetScience [132] | 379 | 898 | 4.67763 | 0.0526316 | 69.5212 |
| Geom [61] | 3621 | 9438 | 77 | 1 | 1069 |
| Facebook-like Social Network [136] | 1893 | 13830 | 184 | 1 | 1445 |
| FFN-msg-sum [135] | 897 | 70904 | 1568 | 1 | 6159 |
| FFN-char-sum [135] | 897 | 68772 | 127792 | 1 | 494136 |
| FFN-msg-newman [135] | 897 | 70845 | 52.8877 | 0.008 | 201.271 |
| FFN-char-newman [135] | 897 | 70904 | 6726.23 | 0.016 | 26171 |
| Celegans [159] | 297 | 2087 | 72 | 1 | 344 |
| cond-mat-99-joint [131] | 13861 | 44619 | 37 | 1 | 650 |
| cond-mat-99-newman [131] | 13861 | 44616 | 22.3333 | 0.0588235 | 382.949 |
| US Top 500-Airport Network [59] | 500 | 2872 | 2253990 | 9 | 14714300 |
| US Airport Network [134] | 1572 | 16786 | 2974630 | 1 | 23763600 |
| OpenFlights [134] | 2905 | 15601 | 11 | 1 | 151 |
| cond-mat-2003 [131] | 27519 | 116173 | 35.2 | 0.0416667 | 551.315 |
| cond-mat-2005 [131] | 36458 | 171731 | 46 | 7.00118 | 806.662 |
| hep-th [131] | 5835 | 13811 | 33.999 | 0.0434783 | 614.537 |
| astro-ph [131] | 14845 | 119648 | 16.5 | 0.178571 | 225.317 |
| routeview [4] | 10515 | 21455 | 1 | 1 | 10 |
| yeast [43] | 2224 | 6609 | 1 | 1 | 11 |
| DutchElite [63] | 3621 | 4311 | 1 | 1 | 16 |

Table 12: Real datasets parameters: n: the number of vertices, m: the number of edges, the largest edge weight, the smallest edge weight and the diameter of the graph.

Internet measurement datasets:

MIT-PlanetLab [156]: A dataset of round-trip latency times between 497 PlanetLab [2] nodes/hosts measured using the Ping utility. The dataset was collected on 12/01/2005 at MIT. The latency times have been averaged over 10 pings.

Cornell-King [161]: A dataset of round-trip latency times between 2500 DNS servers measured using the KING technique [100]. The data was collected between 5/5/2004 and 5/13/2004 by Jeremy Stribling at Cornell University. The latency times are the medians of 10 measurements.

HP-PlanetLab [3]: A dataset of the available bandwidth measurement between PlanetLab nodes/servers using the pathChirp tool [149]. The dataset was collected at HP labs.

The above three Internet measurement datasets are originally directed (i.e., two measurements could exist between two nodes in both directions). In such a case, we take the average of the two measurements and replace both edges by one undirected edge. If only one measurement exists between two nodes, we regard that measurement as an undirected edge between the two nodes. To make our dataset a metric space, we remove those edges causing violation of the triangle inequality.
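The symmetrization rule for the directed measurements can be sketched as follows. The function name `symmetrize`, the dictionary layout and the sample measurements are assumptions made for this illustration; the subsequent removal of edges violating the triangle inequality is not shown.

```python
def symmetrize(directed):
    """directed: dict (u, v) -> measurement. Returns a dict keyed by the
    unordered pair frozenset({u, v}): the average when both directions were
    measured, otherwise the single available measurement."""
    undirected = {}
    for (u, v), w in directed.items():
        back = directed.get((v, u))
        undirected[frozenset((u, v))] = w if back is None else (w + back) / 2
    return undirected

# Hypothetical measurements: a<->b measured in both directions, a->c only once.
sym = symmetrize({("a", "b"): 10.0, ("b", "a"): 14.0, ("a", "c"): 7.0})
```

The pair a, b receives the average 12.0 of its two measurements, while a, c keeps its single measurement 7.0.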

Collaboration networks:

NetScience [132]: A co-authorship network of authors in the area of network theory and experiment. The data was compiled by M. E. J. Newman in May 2006 from the bibliographies of two review papers on networks. The weight of each edge is the weight assigned by Newman [130], such that the weight between authors $i$ and $j$ is defined as $w(i, j) = \sum_p \frac{1}{A_p - 1}$, where $p$ is a joint paper of $i$ and $j$ and $A_p$ is the number of authors of $p$.
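Newman's weighting can be computed directly from author lists. The following is a minimal sketch; the function name `newman_weights` and the sample papers are hypothetical, and single-author papers are skipped because they contribute no collaboration edge.

```python
from collections import defaultdict
from itertools import combinations

def newman_weights(papers):
    """papers: iterable of author lists. Each paper p with A_p authors adds
    1 / (A_p - 1) to the weight of every pair of its authors."""
    weights = defaultdict(float)
    for authors in papers:
        authors = sorted(set(authors))
        if len(authors) < 2:
            continue
        for i, j in combinations(authors, 2):
            weights[(i, j)] += 1 / (len(authors) - 1)
    return dict(weights)

# Hypothetical input: one two-author paper and one three-author paper.
w = newman_weights([["ann", "bob"], ["ann", "bob", "cy"]])
```

Here ann and bob share both papers and receive weight 1 + 1/2 = 1.5, while each pair involving cy receives 1/2.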

Geom [61]: A co-authorship network of authors in the area of computational geometry.

An edge exists between two authors if they coauthored at least one joint work. The weight

of an edge between two authors is the number of joint collaborations. The dataset was

compiled in February 2002.

Condensed Matter collaborations 1999 (cond-mat-99-joint) [131]: A co-authorship network between authors posting preprints on Condensed Matter in the arXiv E-Print Archive between January 1, 1995 and December 31, 1999. An edge between two authors is weighted by the number of joint papers on the subject of Condensed Matter.

Condensed Matter collaborations 1999 (cond-mat-99-newman) [131]: The same network as the one above but with different edge weights. Edges are weighted by Newman's weighting method, such that the edge weight between authors $i$ and $j$ is defined as $w(i, j) = \sum_p \frac{1}{A_p - 1}$, where $p$ is a joint paper of $i$ and $j$ and $A_p$ is the number of authors of $p$.

Condensed Matter collaborations 2003 (cond-mat-2003) [131]: An updated co-authorship

network between authors posting preprints on Condensed Matter in the arXiv E-Print

Archive between January 1, 1995 and June 30, 2003. The network is weighted using

Newman’s weighting method described above.

Condensed Matter collaborations 2005 (cond-mat-2005) [131]: An updated co-authorship

network between authors posting preprints on Condensed Matter in the arXiv E-Print

Archive between January 1, 1995 and March 31, 2005. The network is weighted using

Newman’s weighting method described above.

High-energy theory collaborations (hep-th) [131]: A co-authorship network between scientists posting preprints on the High-Energy Theory E-Print Archive between January 1, 1995 and December 31, 1999. The network is weighted using Newman's weighting method described above.

Astrophysics collaborations (astro-ph) [131]: A co-authorship network between scientists posting preprints on the Astrophysics E-Print Archive between January 1, 1995 and December 31, 1999. The network is weighted using Newman's weighting method described above.

Social networks:

Facebook-like Social Network [136]: A dataset of messages between an online community of students at the University of California, Irvine. The dataset includes all students who sent or received at least one message. The dataset is originally directed (sent/received messages). We drop the edge direction and weight each edge by the total number of messages exchanged between two students.

Facebook-like Forum Networks [135]: Datasets of online forum activity between an online community of students at the University of California, Irvine. The datasets include all forum users who posted messages on different topics of the forum. An edge exists between two users if they both posted messages on at least one common topic. Four different networks are obtained from the datasets, depending on the method of weighting edges between users.

FFN-msg-sum: Edges between two nodes (users) are weighted by the total number of messages posted by both users on the same topics (i.e., the weight between users $i$ and $j$ is defined as $w(i, j) = \sum_t (m_{it} + m_{jt})$, where $t$ is a topic that received posts from both users $i$ and $j$, and $m_{it}$ and $m_{jt}$ are the total numbers of messages posted by $i$ and $j$, respectively, on $t$).

FFN-char-sum: Edges between two nodes (users) are weighted by the total number of characters of all messages posted by both users on the same topics (i.e., $w(i, j) = \sum_t (c_{it} + c_{jt})$, where $t$ is a topic that received posts from both users $i$ and $j$, and $c_{it}$ and $c_{jt}$ are the numbers of characters posted by $i$ and $j$, respectively, on $t$).

FFN-msg-newman: Edges between two nodes (users) are weighted by Newman's weighting method proportional to the total number of messages posted by both users on the same topics (i.e., $w(i, j) = \sum_t \frac{m_{it} + m_{jt}}{M_t}$, where $t$ is a topic that received posts from both users $i$ and $j$, $m_{it}$ and $m_{jt}$ are the numbers of messages posted by $i$ and $j$, respectively, on $t$, and $M_t$ is the total number of messages posted on $t$ by all users).

FFN-char-newman: Edges between two nodes (users) are weighted by Newman's weighting method proportional to the total number of characters posted by both users on the same topics (i.e., $w(i, j) = \sum_t \frac{c_{it} + c_{jt}}{C_t}$, where $t$ is a topic that received posts from both users $i$ and $j$, $c_{it}$ and $c_{jt}$ are the numbers of characters posted by $i$ and $j$, respectively, on topic $t$, and $C_t$ is the total number of characters of all messages posted on $t$ by all users).

Biological Datasets:

Celegans [159]: A dataset of the neural network of the Caenorhabditis elegans worm (C. elegans). Each node represents a neuron. An edge exists between two neurons if there is at least one synapse or gap junction between them. The weight of an edge is the number of synapses and gap junctions between two neurons. The dataset is originally directed.

Information Networks:

US Top 500-Airport Network [59]: A network of the 500 busiest commercial airports in the US. An edge exists between two airports if a flight was scheduled between them in the year 2002, with weight equal to the total number of seats available on the scheduled flights in 2002. The data was obtained from Tore Opsahl's website [134].

US Airport Network [134]: A network of the commercial airports in the US. An edge exists between two airports if a flight was scheduled between them in the year 2010, with weight equal to the total number of seats available on the scheduled flights in 2010. The data was downloaded and compiled from the Bureau of Transportation Statistics (BTS) Transtats site by Tore Opsahl [134].

OpenFlights [134]: A network of commercial airports in the US and two other non-US based airports. The weight of an edge is the number of routes between two airports. The data was downloaded and compiled from Openflights.org by Tore Opsahl [134].

For all of the datasets except MIT-PlanetLab and Cornell-King, an original edge weight represents the similarity between the two vertices connected by that edge. In such cases, we change the edge weights as follows: w′ := max w − w + min w, where w′ is the new weight and w is the original edge weight. This guarantees that the more similar two vertices are, the smaller the distance between them.
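The conversion from similarity weights to distance weights is a one-line transformation. In the sketch below the function name `similarity_to_distance` and the three sample edges are hypothetical.

```python
def similarity_to_distance(weights):
    """Apply w' = max w - w + min w to every edge weight, so that the most
    similar pair gets the smallest distance and vice versa."""
    hi, lo = max(weights.values()), min(weights.values())
    return {edge: hi - w + lo for edge, w in weights.items()}

# Hypothetical similarity weights on three edges.
dist = similarity_to_distance({("a", "b"): 1, ("b", "c"): 5, ("c", "a"): 3})
```

The transformation reverses the order of the weights while keeping them within the original range [min w, max w].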

5.4.2 Layering partition results

Recall that the embeddability of weighted graphs into tree metrics is related to the largest cluster diameter ∆s(h) of the layering partition LP(s, h), as shown by Lemma 16 in Section 5.2. Also, the construction of our tree embedding uses the layering partition. Table 13 shows the results of the layering partition obtained for the datasets described in Subsection 5.4.1. For each graph dataset, we randomly select a start vertex s and build the layering partition LP(s, h) with respect to s. Table 13 shows the cluster-diameter ∆s(h), the number of clusters in the layering partition LP(s, h) and the average diameter of clusters in LP(s, h). We find that all graph datasets have a relatively small average cluster diameter compared to their diameters. In every dataset, at least 40% of the clusters have diameter 0 (i.e., they are singleton clusters).

| Graph G = (V,E) | n (number of vertices) | diameter diam(G) | cluster-width h | # of clusters in LP(s, h) | cluster-diameter ∆s(h) | average diameter of clusters in LP(s, h) | % of clusters having diameter 0 |
|---|---|---|---|---|---|---|---|
| MIT-PlanetLab | 416 | 8623.81 | 6708.94 | 2 | 2482.69 | 1241.345 | 50% |
| Cornell-King | 2500 | 284367 | 146777 | 2 | 211549 | 105774.5 | 50% |
| HP-PlanetLab | 410 | 3893440 | 1352720 | 10 | 2694960 | 269496 | 90% |
| NetScience | 379 | 69.5212 | 4.67763 | 120 | 20.1798 | 3.984 | 40% |
| Geom | 3621 | 1069 | 77 | 1540 | 536 | 31.4727 | 69.026% |
| Facebook-like Social Network | 1893 | 1445 | 184 | 701 | 897 | 16.6377 | 94.15% |
| FFN-msg-sum | 897 | 6159 | 1568 | 73 | 4692 | 834.82 | 61.64% |
| FFN-char-sum | 897 | 494136 | 127792 | 59 | 382456 | 56899.03 | 72.88% |
| FFN-msg-newman | 897 | 201.271 | 52.8877 | 70 | 155.801 | 23.82 | 68.57% |
| FFN-char-newman | 897 | 26171 | 6726.23 | 59 | 20107.1 | 2832.39 | 72.88% |
| Celegans | 297 | 344 | 72 | 36 | 282 | 26.61 | 83.33% |
| cond-mat-99-joint | 13861 | 650 | 37 | 3681 | 391 | 20.88 | 58.63% |
| cond-mat-99-newman | 13861 | 382.949 | 22.3333 | 3676 | 230.314 | 12.4 | 58.71% |
| US Top 500-Airport Network | 500 | 14714300 | 2253990 | 189 | 8981270 | 922660.21 | 74.07% |
| US Airport Network | 1572 | 23763600 | 2974630 | 632 | 11865000 | 197311.91 | 96.20% |
| OpenFlights | 2905 | 151 | 11 | 1253 | 54 | 3.06 | 83.08% |
| cond-mat-2003 | 27519 | 551.315 | 35.2 | 6202 | 339.633 | 18.15 | 61.54% |
| cond-mat-2005 | 36458 | 806.662 | 46 | 7880 | 409.525 | 22.78 | 62.07% |
| hep-th | 5835 | 614.537 | 33.999 | 2143 | 329.046 | 15.83 | 68.92% |
| astro-ph | 14845 | 225.317 | 16.5 | 2750 | 138.284 | 7.294 | 65.71% |
| routeview | 10515 | 10 | 1 | 6702 | 6 | 0.0632 | 96.08% |
| yeast | 2224 | 11 | 1 | 1037 | 6 | 0.11956 | 94.56% |
| DutchElite | 3621 | 16 | 1 | 2934 | 10 | 0.07 | 98.02% |

Table 13: Layering partitions of the datasets and their parameters. h is the cluster-width of LP(s, h) and set equal to the longest edge weight. s is a randomly selected start vertex.

5.4.3 Non-contractive embedding results

We embed our datasets into the tree H. Our embedding depends on the cluster-width h of the layering partition, as shown in Lemma 17. Lemma 17 shows that the smaller the value of h, the smaller the distortion of our embedding into H. Also, since we have the requirement that h ≥ w, we set h to the longest edge weight w. Table 14 shows the results of embedding by our algorithm running on our datasets. We report the average distortion ratio, the maximum distortion ratio, the average relative distortion and the distance-weighted average distortion. These results show that some datasets have a “good” small average distortion but a very large maximum distortion. This could be due to a few “anomaly” vertices in the graphs which do not fit well into the tree metric. Table 14 shows that thirteen datasets have an average distortion of less than 3. It is worth noting that these datasets of small average distortion have relatively small values of the longest edge weight and thus of the cluster-width h.

| graph | avg. distortion ratio | max distortion ratio | avg. relative distortion | distance-weighted average distortion | ∆s(h) | h |
|---|---|---|---|---|---|---|
| MIT-PlanetLab | 486.222 | 130869 | 485.222 | 86.2065 | 2482.69 | 6708.4 |
| Cornell-King | 56.0313 | 504.598 | 55.0313 | 37.6178 | 211549 | 146777 |
| HP-PlanetLab | 4.228 | 4444.01 | 3.228 | 4.05282 | 2694960 | 1352720 |
| NetScience | 2.33795 | 561.166 | 1.33795 | 2.04786 | 20.1798 | 4.67763 |
| Geom | 2.46234 | 690 | 1.46234 | 2.29134 | 536 | 77 |
| Facebook-like Social Network | 2.94284 | 1265 | 1.94284 | 2.79683 | 897 | 184 |
| FFN-msg-sum | 3.21513 | 7828 | 2.21513 | 2.99343 | 4692 | 1568 |
| FFN-char-sum | 4.88572 | 638040 | 3.88572 | 3.06892 | 382456 | 127792 |
| FFN-msg-newman | 3.22905 | 32697 | 2.22905 | 2.94213 | 155.801 | 52.8877 |
| FFN-char-newman | 8.38588 | 2097470 | 7.38588 | 2.95605 | 20107.1 | 6726.23 |
| Celegans | 3.29534 | 426 | 2.29534 | 3.00878 | 282 | 72 |
| cond-mat-99-joint | 2.43412 | 461 | 1.43412 | 2.30791 | 391 | 37 |
| cond-mat-99-newman | 2.44168 | 4674.67 | 1.44168 | 2.31843 | 230.314 | 22.3333 |
| US Top 500-Airport Network | 15.2011 | 1498810 | 14.2011 | 2.94326 | 8981270 | 2253990 |
| US Airport Network | 17.2603 | 17814200 | 16.2603 | 2.641 | 11865000 | 2974630 |
| OpenFlights | 2.55111 | 76 | 1.55111 | 2.40262 | 54 | 11 |
| cond-mat-2003 | 2.51556 | 9840.79 | 1.51556 | 2.40361 | 339.633 | 35.2 |
| cond-mat-2005 | 2.46088 | 78.2047 | 1.46088 | 2.35567 | 409.525 | 46 |
| hep-th | 2.30369 | 9132 | 1.30369 | 2.15664 | 329.046 | 33.999 |
| astro-ph | 2.74645 | 9590.9 | 1.74645 | 2.59063 | 138.284 | 16.5 |
| routeview | 2.87213 | 9 | 1.87213 | 2.70627 | 6 | 1 |
| yeast | 2.35655 | 9 | 1.35655 | 2.22493 | 6 | 1 |
| DutchElite | 2.07811 | 13 | 1.07811 | 1.93266 | 10 | 1 |

Table 14: Distortion results for non-contractive embedding of the datasets into tree H. Cluster-width is equal to the largest edge weight (h = w).

5.4.4 Edge subdivision (h ≤ w)

From our analysis of the constructed tree H (see Lemma 17), the distances in H depend on the value of the cluster-width h: the smaller h is, the smaller the bound on the distances in H. Recall that the construction of H requires that no edge of the shortest path tree SPT(s) rooted at s crosses non-consecutive layers in the layering partition LP(s, h) (see Lemma 17). Setting h smaller than w therefore requires subdividing those SPT(s) edges that are longer than h, which introduces new Steiner vertices into the graph. We call these Steiner vertices “dummy” vertices to distinguish them from the other Steiner vertices used in the construction of H. Table 15 shows the results of embedding our datasets with cluster-width values smaller than or equal to the longest edge weight of each dataset. These results are better than those of Table 14, which were obtained with the cluster-width set equal to the longest edge weight. Generally, smaller values of h yield smaller distortion (better embedding), but at the expense of increasing the graph size by adding dummy vertices. This is shown in Figure 17 and Figure 18. These figures show that smaller values of h result in smaller distortion only up to some point; decreasing h further does not improve the distortion. This could be because adding more dummy vertices increases the diameter of the clusters that contain them.
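The subdivision step can be illustrated with a minimal sketch. It assumes that an edge longer than h is cut into equal pieces of length at most h; the text does not state how the pieces are sized, so the equal split, the function name subdivide_edges and the dummy-vertex labels are hypothetical choices made for illustration. In the construction above only the SPT(s) edges would be passed in.

```python
import math

def subdivide_edges(edges, h):
    """Split every edge longer than h into equal pieces of length <= h.

    edges: list of (u, v, weight) tuples; h: target cluster-width (> 0).
    Returns (new_edges, dummy_count). Dummy vertices get the label
    ("dummy", u, v, i), a naming scheme assumed only for this sketch.
    """
    new_edges = []
    dummy_count = 0
    for u, v, w in edges:
        pieces = max(1, math.ceil(w / h))  # number of segments needed
        if pieces == 1:
            new_edges.append((u, v, w))    # edge already short enough
            continue
        seg = w / pieces                   # equal split (an assumption)
        prev = u
        for i in range(1, pieces):
            d = ("dummy", u, v, i)
            new_edges.append((prev, d, seg))
            prev = d
            dummy_count += 1
        new_edges.append((prev, v, seg))
    return new_edges, dummy_count

if __name__ == "__main__":
    out, dummies = subdivide_edges([("a", "b", 10.0), ("b", "c", 3.0)], 4.0)
    print(dummies, out)
```

The dummy count grows roughly like the total edge weight divided by h, which is consistent with the trade-off between distortion and graph size described above.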

[Figure 17 (plot omitted): two panels with cluster-width on the horizontal axis, from 72 down to 1. One panel plots the number of dummy nodes and the average error ratio (×1000); the other plots the number of dummy nodes and the maximum error ratio (×10).]

Figure 17: Cluster-width versus average distortion, maximum distortion and number of dummy vertices for the Celegans dataset.

graph                         avg.        max         avg.        distance-   ∆s(h)       h           # of
                              distortion  distortion  relative    weighted                            dummy
                              ratio       ratio       distortion  average                             vertices
                                                                  distortion
MIT-PlanetLab                 7.58017     1767.77     6.58017     2.14556     212.784     1           26752
Cornell-King                  7.09732     60.6909     6.09732     5.05324     57870       1001        9959
HP-PlanetLab                  2.17509     2254.58     1.17509     2.09186     2679790     30000       17323
NetScience                    1.76056     163.652     0.76056     1.63624     8.45536     0.0526316   29322
Geom                          2.1185      553         1.1185      1.9818      533         10          23961
Facebook-like Social Network  2.24927     905.6       1.24927     2.14829     825.6       40          6763
FFN-msg-sum                   1.12194     4861        1.15249     2.01465     4661        100         12474
FFN-char-sum                  3.20813     386634      2.20813     2.09797     376634      5000        19776
FFN-msg-newman                2.19989     20725.1     1.19989     2.02006     155.801     5           8153
FFN-char-newman               5.26393     1282490     4.26393     1.94908     19919.8     300         18313
Celegans                      2.46878     301         1.46878     2.26691     279         11          1485
cond-mat-99-joint             2.11556     387.5       1.11556     2.01216     357.5       15          19698
cond-mat-99-newman            2.23679     4246.83     1.23679     2.1277      229.814     10          15179
US Top 500-Airport Network    8.1244      749096      7.1244      2.0063      6541860     100000      10123
US Airport Network            12.1137     12185500    11.1137     2.11725     11785500    200000      21371
OpenFlights                   1.90162     53.6667     0.901621    1.80349     47.6667     2           12403
cond-mat-2003                 2.51556     9840.79     1.51556     2.40361     339.633     35.2        0
cond-mat-2005                 2.46088     78.2047     1.46088     2.35567     409.525     46          0
hep-th                        2.16475     8488.05     1.16475     2.03154     329.046     20          3750
astro-ph                      2.42821     8298.2      1.42821     2.29359     128.182     10          8973

Table 15: Distortion results for non-contractive embedding of the datasets into tree H. Cluster-width is less than or equal to the largest edge weight (h ≤ w).

[Figure 18 (plot omitted): two panels with cluster-width on the horizontal axis. One panel plots the number of dummy nodes and the average error ratio (×100); the other plots the number of dummy nodes and the maximum error ratio (×10).]

Figure 18: Cluster-width versus average distortion, maximum distortion and number of dummy vertices for the Cornell-King dataset.

5.4.5 Contractive embedding: weighting clusters with their own diameters

In the non-contractive embedding into the tree H, the weight of the edges inside each cluster is proportional to the largest cluster diameter ∆s(h). This can result in large distortion for vertices with small graph distances. If an embedding with a smaller average distortion is preferred, we can drop the non-contraction requirement and weight the edges inside each cluster using that cluster's own diameter (i.e., edges inside cluster C are weighted by diam(C)/2). This weighting may contract the tree distances with respect to the original graph distances. Table 16 shows the results of embedding into the tree H′ with this weighting. In Table 16, we compute the average and the maximum distortions as follows, where the sums and the maximum range over all pairs u, v of distinct vertices, n = |V|, and d_H denotes the distance in the constructed tree:

- average distortion := $\frac{1}{\binom{n}{2}} \Big( \sum_{u,v:\, d_H(u,v) < d_G(u,v)} \frac{d_G(u,v)}{d_H(u,v)} + \sum_{u,v:\, d_H(u,v) \ge d_G(u,v)} \frac{d_H(u,v)}{d_G(u,v)} \Big)$

- maximum distortion := $\max_{u,v} \max\Big\{ \frac{d_H(u,v)}{d_G(u,v)},\ \frac{d_G(u,v)}{d_H(u,v)} \Big\}$.

It turns out that seven datasets have an average distortion between 1 and 1.5, while nine have an average distortion between 1.5 and 2. Furthermore, Table 17 shows the percentage of vertex pairs with distortion less than a given value ϵ, i.e., pairs u, v ∈ V with $\max\{ d_H(u,v)/d_G(u,v),\ d_G(u,v)/d_H(u,v) \} < \epsilon$. For thirteen datasets, at least 40% of the pairs have distance distortion in H′ less than 1.3. For sixteen datasets, at least 50% of the pairs have distance distortion less than 1.5.
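As a small illustration of how such per-pair statistics can be computed, the following sketch takes graph and tree distances for a set of vertex pairs and returns the average distortion, the maximum distortion and the percentage of pairs below each threshold. The function name distortion_stats, the dictionary-based input format and the default thresholds are assumptions made for this example; they are not taken from the implementation used for the experiments.

```python
def distortion_stats(d_g, d_h, thresholds=(1.2, 1.3, 1.5, 2.0, 2.5)):
    """Two-sided distortion statistics for a (possibly contractive) embedding.

    d_g, d_h: dicts mapping a vertex pair (u, v) to the graph distance and
    to the tree distance; all distances are assumed to be positive.
    Returns (average, maximum, share), where share[t] is the percentage of
    pairs whose distortion is strictly below t.
    """
    ratios = []
    for pair, dg in d_g.items():
        dh = d_h[pair]
        # per-pair distortion: expansion or contraction, whichever is larger
        ratios.append(max(dh / dg, dg / dh))
    avg = sum(ratios) / len(ratios)
    worst = max(ratios)
    share = {t: 100.0 * sum(r < t for r in ratios) / len(ratios)
             for t in thresholds}
    return avg, worst, share

if __name__ == "__main__":
    d_g = {("a", "b"): 2, ("a", "c"): 4, ("b", "c"): 2}
    d_h = {("a", "b"): 2, ("a", "c"): 2, ("b", "c"): 3}
    print(distortion_stats(d_g, d_h))
```

In the toy input, the pair (a, c) is contracted by a factor of 2 and the pair (b, c) is expanded by a factor of 1.5; both count toward the distortion in the same way.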

5.4.6 Embedding with recursive partitioning of clusters

The distortion error could be large for vertices with small graph distances. Such vertices tend to be within the same cluster of the layering partition LP(s, h) used for the construction of the tree embedding H. But since edges inside each cluster are weighted uniformly using the diameter of the cluster (or the largest cluster diameter), the distortion for these vertices could be large. In order to reduce the distortion error between vertices within the same cluster, we recursively partition each cluster into groups, or partitions. Then, we add a Steiner point to each partition p and connect this new Steiner point to each vertex of the partition with weight equal to Dp/2, where Dp is the diameter of p.

Graph                         avg.        max         avg.        distance-   h           # of
                              distortion  distortion  relative    weighted                dummy
                                                      distortion  average                 vertices
                                                                  distortion
MIT-PlanetLab                 2.1476      778.017     1.14171     1.16353     1           26752
Cornell-King                  2.15488     25.9803     1.15016     1.67531     1001        9959
HP-PlanetLab                  2.07588     2168.83     1.07585     1.98151     30000       17323
NetScience                    1.24350     305.541     0.232137    1.08414     2           377
Geom                          1.38654     353         0.300326    0.943303    30          5527
Facebook-like Social Network  1.46909     731         0.445232    1.24037     130         725
FFN-msg-sum                   1.76963     1569        0.767333    1.61176     800         872
FFN-char-sum                  2.22802     127782      1.22761     1.76627     127792      0
FFN-msg-newman                1.77188     6596.1      0.76893     1.59801     30          801
FFN-char-newman               3.66260     755492      2.66155     1.63005     6053.6      24
Celegans                      1.61404     141.457     0.569447    1.38213     3           6359
cond-mat-99-joint             1.48059     218         0.398237    1.08351     30          4091
cond-mat-99-newman            1.49963     2939.93     0.397248    1.05721     10          15179
US Top 500-Airport Network    5.33292     478798      4.32615     1.34539     2185380     3
US Airport Network            6.30806     5949250     5.29057     1.30221     2974630     0
OpenFlights                   1.36410     46.25       0.294977    0.992853    7           1246
hep-th                        1.51081     4565.36     0.404596    1.034       33.999      0
astro-ph                      1.50542     4605.94     0.475554    1.27308     16.5        0
cond-mat-2003                 1.6371      6633.59     0.580958    1.33231     35.2        0
cond-mat-2005                 1.52376     52.2427     0.473886    1.25279     46          0
routeview                     1.47421     6.5         0.368122    1.01176     1           0
yeast                         1.52098     7           0.414786    1.06257     1           0
DutchElite                    1.71461     11          0.372014    0.71202     1           0

Table 16: Distortion results for embedding of the datasets into tree H′. Edges inside each cluster C are weighted equal to diam(C)/2.

Graph                         distortion                                                  h
                              < 1.2       < 1.3       < 1.5       < 2         < 2.5
MIT-PlanetLab                 51.98       62.57       75.12       85.62       89.48       1
Cornell-King                  16.14       22.23       29.99       47.04       69.31       1001
HP-PlanetLab                  4.24        4.83        5.15        25.31       97.50       30000
NetScience                    60.99       76.81       89.55       97.31       98.99       2
Geom                          38.54       52.87       72.76       93.20       98.20       30
Facebook-like Social Network  37.29       43.773      67.42       87.52       97.37       130
FFN-msg-sum                   8.27        15.36       23.99       81.78       82.38       800
FFN-char-sum                  2.69        8.70        21.27       68.35       80.91       127792
FFN-msg-newman                8.62        12.61       32.43       81.24       82.13       30
FFN-char-newman               5.74        9.81        27.92       81.43       82.24       6053.6
Celegans                      27.07       36.76       56.62       81.79       93.70       3
cond-mat-99-joint             32.84       44.77       62.30       88.45       97.17       30
cond-mat-99-newman            33.96       44.78       59.72       87.54       96.34       10
US Top 500-Airport Network    28.22       40.53       63.29       89.22       97.35       2185380
US Airport Network            26.73       35.29       61.71       89.16       98.50       2974630
OpenFlights                   43.15       56.20       75.58       93.24       98.32       7
hep-th                        33.29       44.60       60.76       87.14       95.47       33.999
astro-ph                      29.75       41.66       59.27       89.89       97.84       16.5
cond-mat-2003                 25.17       34.39       50.01       78.17       92.44       35.2
cond-mat-2005                 31.29       40.33       52.68       86.38       96.76       46
routeview                     35.21       43.58       54.04       82.28       95.04       1
yeast                         27.10       45.52       54.72       78.52       92.67       1
DutchElite                    21.97       30.06       43.17       69.60       86.96       1

Table 17: Percentage of vertex pairs with distortion up to a given value by embedding datasets into tree H′ with own diameter weighting.

Also, we connect the Steiner point of each partition to the Steiner point of the partition from the previous iteration, with edge weight ∆s(h)/2 + h − Dp/2. This procedure is described in Algorithm 2. We treat the partitioning of a cluster as a P-centers problem, where P is the number of partitions, and apply the farthest point heuristic to solve it. The algorithm runs in P iterations. The first iteration randomly chooses a vertex and adds it to the set of centers S. Each subsequent iteration chooses a vertex v with maximum dG(S, v) and adds v to S. This algorithm achieves a factor 2 approximation for the P-centers problem in O(nP) time [98].
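The farthest point heuristic described above can be sketched as follows. This is a minimal illustration rather than the code used in the experiments: the function name farthest_point_centers, the callable dist and the optional first argument (which fixes the otherwise random starting vertex so that runs are repeatable) are assumptions of this sketch.

```python
import random

def farthest_point_centers(dist, vertices, P, first=None):
    """Farthest point heuristic for the P-centers problem.

    dist(u, v): distance between two vertices; vertices: iterable of
    vertices of the cluster; P: number of centers (partitions).
    first: optional starting vertex; chosen at random when omitted.
    Returns (centers, nearest), where nearest maps each vertex to the
    center it is assigned to.
    """
    vertices = list(vertices)
    start = first if first is not None else random.choice(vertices)
    centers = [start]
    nearest = {v: start for v in vertices}
    gap = {v: dist(start, v) for v in vertices}  # distance to the set S
    while len(centers) < min(P, len(vertices)):
        c = max(vertices, key=lambda v: gap[v])  # farthest vertex from S
        centers.append(c)
        for v in vertices:                       # O(n) update per center
            d = dist(c, v)
            if d < gap[v]:
                gap[v] = d
                nearest[v] = c
    return centers, nearest

if __name__ == "__main__":
    pts = [0, 1, 2, 10, 11, 20]
    print(farthest_point_centers(lambda a, b: abs(a - b), pts, 3, first=0))
```

Each new center costs one pass over the vertices, which gives the O(nP) bound on distance evaluations. The vertices assigned to the same center form one partition.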

Algorithm 2 Tree Embedding with Cluster Partitioning

Input: A cluster (or partition) C with Steiner point sc, the number of partitions P, and tree H.

    Partition C into P partitions using the farthest point heuristic

    for each partition p do

        Add to p a Steiner point sp

        Add to H edges {v sp : v ∈ p} with weights diam(p)/2

        Add to H the edge sc sp with weight diam(C) − diam(p)/2

    end for

    Return tree H
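One level of Algorithm 2 can be sketched as below, using the edge weights stated in the algorithm box (diam(p)/2 inside a partition and diam(C) − diam(p)/2 towards the parent Steiner point). The tree representation (a dictionary keyed by unordered vertex pairs), the function name embed_partitions and the Steiner-point labels are hypothetical; the partitions are assumed to be given, for example by the farthest point heuristic. Recursing into each partition would repeat the same call with the partition in place of C.

```python
def embed_partitions(tree, cluster, parts, s_c, dist):
    """Attach one Steiner point per partition of a cluster to the tree.

    tree: dict mapping frozenset({a, b}) to an edge weight, updated in place.
    cluster: list of vertices of C; parts: list of vertex lists forming a
    partition of C; s_c: Steiner point of C; dist(u, v): graph distance.
    Returns the list of newly created Steiner points.
    """
    def diam(vs):
        # largest pairwise distance; 0 for an empty or one-vertex set
        return max((dist(u, v) for u in vs for v in vs), default=0.0)

    diam_c = diam(cluster)
    steiner = []
    for i, p in enumerate(parts):
        s_p = ("steiner", s_c, i)            # label assumed for this sketch
        half = diam(p) / 2
        for v in p:
            tree[frozenset((v, s_p))] = half          # edges v-sp
        tree[frozenset((s_c, s_p))] = diam_c - half   # edge sc-sp
        steiner.append(s_p)
    return steiner

if __name__ == "__main__":
    H = {}
    new_points = embed_partitions(H, [0, 1, 2, 10, 11], [[0, 1, 2], [10, 11]],
                                  "sC", lambda a, b: abs(a - b))
    print(new_points, H)
```

With these weights, two vertices in different partitions of C are at tree distance 2·diam(C), and two vertices in the same partition p are at tree distance diam(p), which is never below their distance in the graph.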

We tested our embedding algorithm with the above partitioning procedure on our graph datasets. It achieved better average distortions for some of our datasets, as shown in Table 18. For example, embedding with P-centers partitioning reduced the average distortion from 5.33 to 3.44 for the US Top 500-Airport Network dataset and from 6.31 to 3.11 for the US Airport Network dataset, compared with embedding without partitioning. The partitioning technique gave negligible improvement for the other datasets, which already have small embedding distortions without partitioning.

Graph                         avg.        max         h           # of        # of
                              distortion  distortion              dummy       clusters and
                                                                  vertices    partitions
FFN-char-sum *                2.2027      127530      127792      248         307
FFN-char-sum                  2.22802     127782      127792      0           59
FFN-char-newman *             3.0713      522556      6053.6      197         219
FFN-char-newman               3.66260     755492      6053.6      24          46
US Top 500-Airport Network *  3.4379      243443      2185380     58          241
US Top 500-Airport Network    5.33292     478798      2185380     3           186
US Airport Network *          3.1059      2005240     2974630     253         885
US Airport Network            6.30806     5949250     2974630     0           632

* Embedding with cluster partitioning using P-centers.

Table 18: Distortion results for embedding with P-centers partitioning for datasets into tree H′. P-centers partitioning gives negligible improvement in distortion for the other datasets of Table 12.

CHAPTER 6

Conclusion and Future Work

In Chapter 2, we discussed geometric properties characterizing “tree-likeness” of a graph from a metric point of view. Specifically, we investigated a few graph parameters, namely, the tree-distortion and the tree-stretch when embedding a graph into a tree (tree spanner), the tree-length and the tree-breadth, Gromov’s hyperbolicity, the cluster-diameter and the cluster-radius in a layering partition of a graph, which capture and quantify this phenomenon of being metrically close to a tree. We provided a detailed and comprehensive survey on the theory related to the graph parameters used and, in particular, on the bounds relating these parameters. Furthermore, we calculated or accurately estimated those parameters on a wide range of real-life networks, taken from different domains like Internet measurements, biological datasets, web graphs, social and collaboration networks. Measuring these parameters allowed us to demonstrate the existence of metric tree-like structures in these networks.

Finally in Chapter 2, we discussed algorithmic advantages for a graph to be metrically tree-like and a few applications of graph approximation with a tree or a tree spanner using the existing embedding techniques. Such applications include solving some problems related to routing and distance approximation in a network, as well as graph diameter and radius estimation.


From the observations in Chapter 2, we suggest that all these tree-likeness measurements are important, as they collectively capture and explain the metric tree-likeness of a given graph. Also, we suggest that metric tree-likeness measurements, in conjunction with other local characteristics of networks such as the degree distribution and clustering coefficients, provide a more complete, unifying picture of networks.

One challenge for future investigation is how to efficiently calculate Gromov’s hyperbolicity for very large graphs. The best known algorithm to calculate hyperbolicity has time complexity of O(n^3.69) [92]. One algorithm that performs well in practice is due to Cohen et al. [58], but it still has O(n^4) time complexity. Propositions 2 and 3 of Chapter 2 established lower and upper bounds on the value of hyperbolicity using the cluster-diameter of a layering partition.

• Can we utilize layering partition of a graph to efficiently calculate hyperbolicity?

• Can we obtain an algorithm that works well in practice for very large graphs, even better than the algorithm of [58]?
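To make the cost concrete, a brute-force baseline is sketched below. It assumes the standard four-point definition of hyperbolicity (for every four vertices, the two largest of the three pairwise distance sums differ by at most 2δ); the function name and the callable dist are illustrative, and this is not the algorithm of [58] or [92].

```python
from itertools import combinations

def hyperbolicity_four_point(vertices, dist):
    """Brute-force hyperbolicity via the four-point condition.

    vertices: iterable of vertices; dist(u, v): shortest-path distance.
    Examines all O(n^4) quadruples, so it is only usable on small graphs.
    """
    delta = 0.0
    for u, v, w, x in combinations(vertices, 4):
        sums = sorted((dist(u, v) + dist(w, x),
                       dist(u, w) + dist(v, x),
                       dist(u, x) + dist(v, w)))
        # half the gap between the two largest distance sums
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta

if __name__ == "__main__":
    # distances on a path (a tree metric) and on a cycle with four vertices
    print(hyperbolicity_four_point(range(5), lambda a, b: abs(a - b)))
    print(hyperbolicity_four_point(
        range(4), lambda a, b: min(abs(a - b), 4 - abs(a - b))))
```

Even with precomputed distances the quadruple loop dominates the running time, which is why bounds obtained from a layering partition are attractive for very large graphs.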

In Chapters 3 and 4, by using Robertson-Seymour’s tree-decomposition of graphs, we described a necessary condition for a graph to have a multiplicative t-spanner of tree-width k (in particular, to have a multiplicative tree t-spanner, when k = 1). As we have mentioned earlier, this necessary condition is far from being sufficient. The following interesting problem remains open.

• Does there exist a clean “if and only if” condition under which a graph admits a multiplicative (or additive) t-spanner of tree-width k (in particular, admits a multiplicative (or additive) tree t-spanner, the k = 1 case)?

That necessary condition was very useful in demonstrating that, for every fixed k, there is a polynomial time algorithm that, given an n-vertex graph G admitting a multiplicative t-spanner with tree-width k, constructs a system of at most (k + 1)(1 + log_2 n) collective additive tree O(t log n)-spanners of G. In particular, we showed that when k = 1, there is a polynomial time algorithm that, given an n-vertex graph G admitting a multiplicative tree t-spanner, constructs a system of at most log_2 n collective additive tree O(t log n)-spanners of G. Can these results be improved?

• Does a polynomial time algorithm exist that, given an n-vertex graph G admitting a multiplicative tree t-spanner, constructs a system of O(1) collective additive tree O(t)-spanners of G?

• Does a polynomial time algorithm exist that, given an n-vertex graph G admitting a multiplicative t-spanner with tree-width k, constructs a system of O(k) collective additive tree O(t)-spanners of G?

As we have mentioned earlier, the interesting particular question of whether a multiplicative tree spanner can be turned in polynomial time into a single additive tree spanner with a slight increase in the stretch has already been settled (negatively) in [86]. Yet, it is interesting to know whether an exponential time procedure that performs such a transformation exists.

We leave two more challenging questions for future investigation.

• Is there any polynomial time algorithm which, given a graph admitting a system of at most µ collective tree t-spanners, constructs a system of at most α(µ, n) collective tree β(t, n)-spanners, where α(µ, n) is O(µ) (or O(µ log n)) and β(t, n) is O(t) (or O(t log n))?

In this approximation question, we assume that one knows that a graph G admits a system of at most µ collective tree t-spanners, but (s)he does not know how to find it in polynomial time and wonders if something weaker can be constructed efficiently. The following question is about approximating the k-tree-width t-spanner problem.

• Is there a polynomial time algorithm that, for every unweighted graph G admitting a t-spanner of tree-width k, constructs a (O(k log n)t)-spanner with tree-width at most k?

In Chapter 5, we investigated the problem of embedding a weighted graph metric into a tree metric. We developed an approach with proven theoretical bounds for this problem. Furthermore, we applied and empirically tested our approach on real-world graph datasets. Generally, we obtained good embedding results with low average distortion for the tested graphs.

BIBLIOGRAPHY

[1] Pages linking to www.epa.gov. Obtained from Jon Kleinberg’s web page. Available at: http://www.cs.cornell.edu/courses/cs685/2002fa/.

[2] Planetlab: An open platform for developing, deploying, and accessing planetary-scale services. https://www.planet-lab.org.

[3] S3: Scalable sensing service. http://networking.hpl.hp.com/s-cube.

[4] University of Oregon route-views project. http://www.routeviews.org/.

[5] Will there ever be a tree of life that systematists can agree on? Science, 125th anniversary issue, 2005. http://www.sciencemag.org/sciext/125th/.

[6] Ittai Abraham, Mahesh Balakrishnan, Fabian Kuhn, Dahlia Malkhi, Venugopalan Ramasubramanian, and Kunal Talwar. Reconstructing approximate tree metrics. In PODC, pages 43–52, 2007.

[7] Ittai Abraham, Mahesh Balakrishnan, Fabian Kuhn, Dahlia Malkhi, Venugopalan Ramasubramanian, and Kunal Talwar. Reconstructing approximate tree metrics. In PODC, pages 43–52, 2007.

[8] Aaron B. Adcock, Blair D. Sullivan, and Michael W. Mahoney. Tree-like structure in large social and information networks. In ICDM, pages 1–10, 2013.

[9] Richa Agarwala, Vineet Bafna, Martin Farach, Mike Paterson, and Mikkel Thorup. On the approximability of numerical taxonomy (fitting distances by tree metrics). SIAM J. Comput., 28(3):1073–1085, 1999.

[10] Noga Alon, Mihai Badoiu, Erik D. Demaine, Martin Farach-Colton, Mohammad Taghi Hajiaghayi, and Anastasios Sidiropoulos. Ordinal embeddings of minimum relaxation: general properties, trees, and ultrametrics. In SODA, pages 650–659. SIAM, 2005.

[11] Noga Alon, Richard M. Karp, David Peleg, and Douglas West. A graph-theoretic game and its application to the k-server problem. SIAM J. Comput., 24:78–100, 1995.

[12] Ingo Althöfer, Gautam Das, David P. Dobkin, Deborah Joseph, and José Soares. On sparse spanners of weighted graphs. Discrete & Computational Geometry, 9:81–100, 1993.

[13] Stefan Arnborg, Derek G. Corneil, and Andrzej Proskurowski. Complexity of finding embeddings in a k-tree. SIAM J. Algebraic Discrete Methods, 8(2):277–284, April 1987.

[14] Yonatan Aumann and Yuval Rabani. An O(log k) approximate min-cut max-flow theorem and approximation algorithm. SIAM J. Comput., 27(1):291–301, February 1998.

[15] Giorgio Ausiello, Alessandro D’Atri, and Marina Moscarini. Chordality properties on graphs and minimal conceptual connections in semantic data models. J. Comput. Syst. Sci., 33(2):179–202, 1986.

[16] Baruch Awerbuch and Yossi Azar. Buy-at-bulk network design. In FOCS, pages 542–547, 1997.

[17] Mihai Badoiu, Julia Chuzhoy, Piotr Indyk, and Anastasios Sidiropoulos. Low-distortion embeddings of general metrics into the line. In STOC, pages 225–233, 2005.

[18] Mihai Badoiu, Julia Chuzhoy, Piotr Indyk, and Anastasios Sidiropoulos. Embedding ultrametrics into low-dimensional spaces. In Symposium on Computational Geometry, pages 187–196, 2006.

[19] Mihai Badoiu, Erik D. Demaine, MohammadTaghi Hajiaghayi, Anastasios Sidiropoulos, and Morteza Zadimoghaddam. Ordinal embedding: Approximation algorithms and dimensionality reduction. In APPROX-RANDOM, pages 21–34, 2008.

[20] Mihai Badoiu, Kedar Dhamdhere, Anupam Gupta, Yuri Rabinovich, Harald Räcke, R. Ravi, and Anastasios Sidiropoulos. Approximation algorithms for low-distortion embeddings into low-dimensional spaces. In SODA, pages 119–128, 2005.

[21] Mihai Badoiu, Piotr Indyk, and Anastasios Sidiropoulos. Approximation algorithms for embedding general metrics into trees. In SODA, pages 512–521, 2007.

[22] A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 1999.

[23] Albert-László Barabási, Réka Albert, and Hawoong Jeong. Scale-free characteristics of random networks: the topology of the world-wide web. Physica A: Statistical Mechanics and its Applications, 281(1-4):69–77, June 2000.

[24] Yair Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In 37th Annual Symposium on Foundations of Computer Science, pages 184–193, 1996.

[25] Yair Bartal. On approximating arbitrary metrics by tree metrics. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pages 161–168, 1998.

[26] Yair Bartal, Avrim Blum, Carl Burch, and Andrew Tomkins. A polylog(n)-competitive algorithm for metrical task systems. In STOC, pages 711–719, 1997.

[27] Surender Baswana, Telikepalli Kavitha, , and Seth Pettie. New constructions of (alpha, beta)-spanners and purely additive spanners. In SODA, pages 672–681, 2005.

[28] Surender Baswana and Sandeep Sen. A simple linear time algorithm for computing a (2k-1)-spanner of O(n^{1+1/k}) size in weighted graphs. In ICALP, pages 384–296, 2003.

[29] Vladimir Batagelj and Andrej Mrvar. Some analyses of Erdos collaboration graph. Social Networks, 22(2):173–186, May 2000. http://vlado.fmf.uni-lj.si/pub/networks/data/Erdos/Erdos02.net.

[30] Catriel Beeri, Ronald Fagin, David Maier, and Mihalis Yannakakis. On the Desirability of Acyclic Database Schemes. Journal of the ACM, 30(3):479–513, 1983.

[31] C. Berge. Hypergraphs: Combinatorics of Finite Sets. North-Holland, 1989.

[32] Piotr Berman, Arnab Bhattacharyya, Konstantin Makarychev, Sofya Raskhodnikova, and Grigory Yaroslavtsev. Improved approximation for the directed spanner problem. In ICALP (1), pages 1–12, 2011.

[33] Arnab Bhattacharyya, Elena Grigorescu, Kyomin Jung, Sofya Raskhodnikova, and David P. Woodruff. Transitive-closure spanners. SIAM J. Comput., 41(6):1380–1425, 2012.

[34] Avrim Blum, Goran Konjevod, R. Ravi, and Santosh Vempala. Semi-definite relaxations for minimum bandwidth and other vertex-ordering problems. In STOC, pages 100–105, 1998.

[35] Hans L. Bodlaender. A linear-time algorithm for finding tree-decompositions of small treewidth. SIAM J. Comput., 25(6):1305–1317, December 1996.

[36] M. Boguñá, D. Krioukov, and K. C. Claffy. Navigability of complex networks. Nature Physics, 5(1):74–80, 2009.

[37] J. Bourgain. On Lipschitz embedding of finite metric spaces in Hilbert space. Isr. J. of Math., 52(1):46–52, March 1985.

[38] Ulrik Brandes and Dagmar Handke. NP-completeness results for minimum planar spanners. Discrete Mathematics & Theoretical Computer Science, 3(1):1–10, 1998.

[39] Andreas Brandstädt, Victor Chepoi, and Feodor F. Dragan. Distance approximating trees for chordal and dually chordal graphs. J. Algorithms, 30(1):166–184, 1999.

[40] Andreas Brandstädt, Feodor F. Dragan, Hoàng-Oanh Le, and Van Bang Le. Tree spanners on chordal graphs: complexity and algorithms. Theor. Comput. Sci., 310(1-3):329–354, 2004.

[41] Andreas Brandstädt, Feodor F. Dragan, Hoàng-Oanh Le, Van Bang Le, and Ryuhei Uehara. Tree spanners for bipartite graphs and probe interval graphs. Algorithmica, 47(1):27–51, 2007.

[42] G. Brinkmann, J. Koolen, and V. Moulton. On the hyperbolicity of chordal graphs. Annals of Combinatorics, 5(1):61–69, 2001.

[43] Dongbo Bu, Yi Zhao, Lun Cai, Hong Xue, Xiaopeng Zhu, Hongchao Lu, Jingfen Zhang, Shiwei Sun, Lunjiang Ling, Nan Zhang, Guojie Li, and Runsheng Chen. Topological structure analysis of the protein-protein interaction network in budding yeast. Nucleic Acids Research, 31(9):2443–2450, May 2003. Dataset available at: http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/Yeast.htm.

[44] Leizhen Cai and Derek G. Corneil. Tree spanners. SIAM J. Discrete Math., 8(3):359–387, 1995.

[45] CAIDA. The CAIDA AS relationships dataset, 1 June 2012 - 5 June 2012. http://www.caida.org/data/active/as-relationships.

[46] CAIDA. The internet topology data kit #0304, April 2003. http://www.caida.org/data/active/internet-topology-data-kit.

[47] CAIDA. The CAIDA AS relationships dataset, 5 November 2007. http://www.caida.org/data/active/as-relationships.

[48] Moses Charikar, Chandra Chekuri, Ashish Goel, and Sudipto Guha. Rounding via trees: Deterministic approximation algorithms for group Steiner trees and k-median. In STOC, pages 114–123, 1998.

[49] Kai Chen, David R. Choffnes, Rahul Potharaju, Yan Chen, Fabian E. Bustamante, Dan Pei, and Yao Zhao. Where the sidewalk ends: extending the internet AS graph using traceroutes from P2P users. In Proceedings of the 5th international conference on Emerging networking experiments and technologies, CoNEXT ’09, pages 217–228, New York, NY, USA, 2009. ACM. http://www.aqualab.cs.northwestern.edu/projects.

[50] Wei Chen, Wenjie Fang, Guangda Hu, and Michael W. Mahoney. On the hyperbolicity of small-world and tree-like random graphs. In ISAAC, volume 7676 of Lecture Notes in Computer Science, pages 278–288. Springer, 2012.

[51] Victor Chepoi and Feodor F. Dragan. A note on distance approximating trees in graphs. Eur. J. Comb., 21(6):761–766, 2000.

[52] Victor Chepoi, Feodor F. Dragan, Bertrand Estellon, Michel Habib, and Yann Vaxès. Diameters, centers, and approximating trees of delta-hyperbolic geodesic spaces and graphs. In Symposium on Computational Geometry, pages 59–68, 2008.

[53] Victor Chepoi, Feodor F. Dragan, Bertrand Estellon, Michel Habib, Yann Vaxès, and Yang Xiang. Additive spanners and distance and routing labeling schemes for hyperbolic graphs. Algorithmica, 62(3-4):713–732, 2012.

[54] Victor Chepoi, Feodor F. Dragan, Ilan Newman, Yuri Rabinovich, and Yann Vaxès. Constant approximation algorithms for embedding graph metrics into trees and outerplanar graphs. Discrete & Computational Geometry, 47(1):187–214, 2012.

[55] Victor Chepoi and Bertrand Estellon. Packing and covering delta-hyperbolic spaces by balls. In APPROX-RANDOM, pages 59–73, 2007.

[56] Victor Chepoi and Bernard Fichet. l∞-approximation via subdominants. J. Math. Psychol., 44(4):600–616, 2000.

[57] Fan R. K. Chung and Linyuan Lu. The average distance in a random graph with given expected degrees. Internet Mathematics, 1(1):91–113, 2003.

[58] Nathann Cohen, David Coudert, and Aurélien Lancin. Exact and approximate algorithms for computing the hyperbolicity of large-scale graphs. Rapport de recherche RR-8074, INRIA, September 2012.

[59] V. Colizza, R. Pastor-Satorras, and A. Vespignani. Reaction–diffusion processes and metapopulation models in heterogeneous networks. Nature Physics, 3:276–282, January 2007.

[60] Derek G. Corneil, Feodor F. Dragan, Ekkehard Köhler, and Chenyu Yan. Collective tree 1-spanners for interval graphs. In WG, pages 151–162, 2005.

[61] Pajek datasets. Geom: Collaboration network in computational geometry. http://vlado.fmf.uni-lj.si/pub/networks/data/collab/geom.htm.

[62] Fabien de Montgolfier, Mauricio Soto, and Laurent Viennot. Treewidth and hyperbolicity of the internet. In NCA, pages 25–32. IEEE Computer Society, 2011.

[63] W. de Nooy. The network data on the administrative elite in the Netherlands in April-June 2006. http://vlado.fmf.uni-lj.si/pub/networks/data/2mode/DutchElite.htm.

[64] Michael J. Demmer and Maurice Herlihy. The arrow distributed directory protocol. In Shay Kutten, editor, DISC, volume 1499 of Lecture Notes in Computer Science, pages 119–133. Springer, 1998.

[65] Reinhard Diestel. Graph Theory, 4th Edition, volume 173 of Graduate texts in mathematics. Springer, 2012.

[66] Michael Dinitz, Guy Kortsarz, and Ran Raz. Label cover instances with large girth and the hardness of approximating basic k-spanner. CoRR, abs/1203.0224, 2012.

[67] Michael Dinitz and Robert Krauthgamer. Directed spanners via flow-based linear programs. In STOC, pages 323–332, 2011.

[68] Dorit Dor, Shay Halperin, and Uri Zwick. All-pairs almost shortest paths. SIAM J. Comput., 29(5):1740–1759, 2000.

[69] Yon Dourisboure. Compact routing schemes for generalised chordal graphs. J. Graph Algorithms Appl., 9(2):277–297, 2005.

[70] Yon Dourisboure, Feodor F. Dragan, Cyril Gavoille, and Chenyu Yan. Spanners for bounded tree-length graphs. Theor. Comput. Sci., 383(1):34–44, 2007.

[71] Yon Dourisboure and Cyril Gavoille. Tree-decompositions with bags of small diameter. Discrete Mathematics, 307(16):2008–2029, 2007.

[72] Feodor F. Dragan. Tree-like structures in graphs: a metric point of view. In WG, 2013.

[73] Feodor F. Dragan and Muad Abu-Ata. Collective additive tree spanners of bounded tree-breadth graphs with generalizations and consequences. In SOFSEM, pages 194–206, 2013.

[74] Feodor F. Dragan, Fedor V. Fomin, and Petr A. Golovach. Approximation of minimum weight spanners for sparse graphs. Theor. Comput. Sci., 412(8-10):846–852, 2011.

[75] Feodor F. Dragan, Fedor V. Fomin, and Petr A. Golovach. Spanners in sparse graphs. J. Comput. Syst. Sci., 77(6):1108–1119, 2011.

[76] Feodor F. Dragan and Ekkehard Köhler. An approximation algorithm for the tree t-spanner problem on unweighted graphs via generalized chordal graphs. In APPROX-RANDOM, pages 171–183, 2011.

[77] Feodor F. Dragan and Chenyu Yan. Collective tree spanners in graphs with bounded parameters. Algorithmica, 57(1):22–43, 2010.

[78] Feodor F. Dragan, Chenyu Yan, and Derek G. Corneil. Collective tree spanners and routing in AT-free related graphs. J. Graph Algorithms Appl., 10(2):97–122, 2006.

[79] Feodor F. Dragan, Chenyu Yan, and Irina Lomonosov. Collective tree spanners of graphs. SIAM J. Discrete Math., 20(1):241–260, 2006.

[80] William Duckworth, Nicholas C. Wormald, and Michele Zito. A PTAS for the sparsest 2-spanner of 4-connected planar triangulations. J. Discrete Algorithms, 1(1):67–76, 2003.

[81] Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.

[82] Michael Elkin and David Peleg. Strong inapproximability of the basic k-spanner problem. In ICALP, pages 636–647, 2000.

[83] Michael Elkin and David Peleg. (1+epsilon, beta)-spanner constructions for general graphs. In Proceedings of the thirty-third annual ACM symposium on Theory of computing, STOC ’01, pages 173–182, New York, NY, USA, 2001. ACM.

[84] Michael Elkin and David Peleg. Approximating k-spanner problems for kge2. Theor. Comput. Sci., 337(1-3):249–277, 2005.

[85] Michael Elkin and David Peleg. The hardness of approximating spanner problems. Theory Comput. Syst., 41(4):691–729, 2007.

[86] Yuval Emek and David Peleg. Approximating minimum max-stretch spanning trees on unweighted graphs. SIAM J. Comput., 38(5):1761–1781, 2008.

[87] Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. A tight bound on approximating arbitrary metrics by tree metrics. J. Comput. Syst. Sci., 69(3):485–497, 2004.

[88] Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. On power-law relationships of the internet topology. In SIGCOMM, pages 251–262, 1999.

[89] Uriel Feige. Approximating the bandwidth via volume respecting embeddings. J. Comput. Syst. Sci., 60(3):510–539, 2000.

[90] Sándor P. Fekete and Jana Kremer. Tree spanners in planar graphs. Discrete Applied Mathematics, 108(1-2):85–103, 2001.

[91] Fedor V. Fomin, Petr A. Golovach, and Erik Jan van Leeuwen. Spanners of bounded degree graphs. Inf. Process. Lett., 111(3):142–144, 2011.

[92] Hervé Fournier, Anas Ismail, and Antoine Vigneron. Computing the Gromov hyperbolicity of a discrete metric space. CoRR, abs/1210.3323, 2012.

[93] Naveen Garg, Goran Konjevod, and R. Ravi. A polylogarithmic approximation algorithm for the group Steiner tree problem. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’98, pages 253–259, Philadelphia, PA, USA, 1998. Society for Industrial and Applied Mathematics.

[94] Cyril Gavoille and Olivier Ly. Distance labeling in hyperbolic graphs. In ISAAC, pages 1071–1079, 2005.

[95] Cyril Gavoille and David Peleg. Compact and localized distributed data structures. Distributed Computing, 16(2-3):111–120, 2003.

[96] E. Ghys and P. de la Harpe eds. Les groupes hyperboliques d’après M. Gromov. Progress in Mathematics, 83, 1990.

[97] J. R. Gilbert, D. J. Rose, and A. Edenbrandt. A separator theorem for chordal graphs. SIAM Journal on Algebraic and Discrete Methods, 5(3):306–313, 1984.

[98] T. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985.

[99] M. Gromov. Hyperbolic groups: Essays in group theory. MSRI Publ., 8:75–263, 1987.

[100] P. Krishna Gummadi, Stefan Saroiu, and Steven D. Gribble. King: estimating latency between arbitrary internet end hosts. Computer Communication Review, 32(3):11, 2002.

[101] Anupam Gupta. Steiner points in tree metrics don’t (really) help. In SODA, pages 220–227, 2001.

[102] Anupam Gupta, Amit Kumar, and Rajeev Rastogi. Traveling with a pez dispenser (or, routing issues in MPLS). SIAM J. Comput., 34(2):453–474, 2004.

[103] Alexander Hall and Christos H. Papadimitriou. Approximating the distortion. In APPROX-RANDOM, pages 111–122, 2005.

[104] Teresa W. Haynes, Stephen Hedetniemi, and Peter Slater. Fundamentals of Domination in Graphs (Pure and Applied Mathematics (Marcel Dekker)). CRC, 1998.

[105] Maurice Herlihy, Fabian Kuhn, Srikanta Tirthapura, and Roger Wattenhofer. Dynamic analysis of the arrow distributed protocol. Theory Comput. Syst., 39(6):875–901, 2006.

[106] Piotr Indyk. Algorithmic applications of low-distortion geometric embeddings. In FOCS, pages 10–33, 2001.

[107] Piotr Indyk and Jiri Matousek. Low-distortion embeddings of finite metric spaces. In Handbook of Discrete and Computational Geometry, pages 177–196. CRC Press, 2004.

[108] H. Jeong, S. P. Mason, A.-L. Barabási, and Z. N. Oltvai. Lethality and centrality in protein networks. Nature, 411(6833):41–42, 2001. Available at: http://www3.nd.edu/~networks/resources.htm.

[109] Mong-Jen Kao, Der-Tsai Lee, and Dorothea Wagner. Approximating metrics by tree metrics of small distance-weighted average stretch. CoRR, abs/1301.3252, 2013.

[110] W. S. Kennedy, O. Narayan, and I. Saniee. On the Hyperbolicity of Large-Scale Networks. ArXiv e-prints, June 2013.

[111] Claire Kenyon, Yuval Rabani, and Alistair Sinclair. Low distortion maps between point sets. In STOC, pages 272–280, 2004.

[112] Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604–632, September 1999. http://www.cs.cornell.edu/courses/cs685/2002fa/.

[113] Jon M. Kleinberg. The small-world phenomenon: an algorithmic perspective. In STOC, pages 163–170, 2000.

[114] Jon M. Kleinberg. Small-world phenomena and the dynamics of information. In NIPS, pages 431–438, 2001.

[115] Robert Kleinberg. Geographic routing using hyperbolic space. In INFOCOM, pages 1902–1909, 2007.

[116] Guy Kortsarz. On the hardness of approximating spanners. Algorithmica, 30(3):432–450, 2001.

[117] Guy Kortsarz and David Peleg. Generating sparse 2-spanners. J. Algorithms, 17(2):222–236, 1994.

[118] D. Kratsch, H. Le, H. Müller, E. Prisner, and D. Wagner. Additive tree spanners. SIAM Journal on Discrete Mathematics, 17(2):332–340, 2003.

[119] Robert Krauthgamer and James R. Lee. Algorithms on negatively curved spaces. In FOCS, pages 119–132, 2006.

[120] James R. Lee, Assaf Naor, and . Trees and Markov convexity. In SODA, pages 1028–1037, 2006.

[121] Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29–123, 2009.

[122] Christian Liebchen and Gregor Wünsch. The zoo of tree spanner problems. Discrete Applied Mathematics, 156(5):569–587, 2008.

[123] A.L. Liestman and T. Shermer. Additive graph spanners. Networks, 23(4):343–364, 1993.

[124] Michal Linial, Nathan Linial, Naftali Tishby, and Golan Yona. Global self organization of all known protein sequences reveals inherent biological signatures, 1997.

[125] Nathan Linial. Finite metric spaces – combinatorics, geometry and algorithms. In Proceedings of the International Congress of Mathematicians III, pages 573–586, 2002.

[126] Nathan Linial, Eran London, and Yuri Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15(2):215–245, 1995.

[127] Daniel Lokshtanov. On the complexity of computing treelength. Discrete Applied Mathematics, 158(7):820–827, 2010.

[128] Jirí Matousek and Anastasios Sidiropoulos. Inapproximability for metric embeddings into R^d. In FOCS, pages 405–413, 2008.

[129] Onuttom Narayan and Iraj Saniee. Large-scale curvature of networks. Physical Review E, 84(6):066108, 2011.

[130] M. E. J. Newman. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Physical Review E, 64(1):016132+, June 2001.

[131] M. E. J. Newman. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences, 98(2):404–409, January 2001.

[132] M. E. J. Newman. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(3):036104+, September 2006. Dataset available at: http://www-personal.umich.edu/~mejn/netdata/.

[133] K. Norlen, G. Lucas, M. Gebbie, and J. Chuang. EVA: Extraction, Visualization and Analysis of the Telecommunications and Media Ownership Network. Proceedings of International Telecommunications Society 14th Biennial Conference (ITS2002), Seoul Korea, August 2002. Dataset available at: http://vlado.fmf.uni-lj.si/pub/networks/data/econ/Eva/Eva.htm.

[134] Tore Opsahl. Why anchorage is not (that) important: Binary ties and sample selection. Available at: http://wp.me/poFcY-Vw.

[135] Tore Opsahl. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Social Networks, 35(2):159–167, 2013. Dataset available at: http://toreopsahl.com/datasets/#online_forum_network.

[136] Tore Opsahl and Pietro Panzarasa. Clustering in weighted networks. Social Networks, 31(2):155–163, 2009. Dataset available at: http://toreopsahl.com/datasets/#online_social_network.

[137] Christos H. Papadimitriou and Shmuel Safra. The complexity of low-distortion embeddings between point sets. In SODA, pages 112–118, 2005.

[138] D. Peleg. Distributed Computing: A Locality-Sensitive Approach. SIAM Monographs on Discrete Math. Appl. SIAM, Philadelphia, 2000.

[139] D. Peleg and D. Tendler. Low stretch spanning trees for planar graphs. Technical report, Weizmann Science Press of Israel, 2001.

[140] David Peleg. Proximity-preserving labeling schemes and their applications. In WG, pages 30–41, 1999.

[141] David Peleg. Low stretch spanning trees. In MFCS, pages 68–80, 2002.

[142] David Peleg and Eilon Reshef. Low complexity variants of the arrow distributed directory. J. Comput. Syst. Sci., 63(3):474–485, 2001.

[143] David Peleg and Alejandro A. Schäffer. Graph spanners. Journal of Graph Theory, 13(1):99–116, 1989.

[144] David Peleg and Jeffrey D. Ullman. An optimal synchronizer for the hypercube. SIAM J. Comput., 18(4):740–747, 1989.

[145] David Peleg and Eli Upfal. A tradeoff between space and efficiency for routing tables (extended abstract). In STOC, pages 43–52, 1988.

[146] Erich Prisner. Distance approximating spanning trees. In STACS, pages 499–510, 1997.

[147] Yuri Rabinovich and Ran Raz. Lower bounds on the distortion of embedding finite metric spaces in graphs. Discrete & Computational Geometry, 19(1):79–94, 1998.

[148] Venugopalan Ramasubramanian, Dahlia Malkhi, Fabian Kuhn, Mahesh Balakrishnan, Archit Gupta, and Aditya Akella. On the treeness of internet latency and bandwidth. In SIGMETRICS/Performance, pages 61–72, 2009.

[149] Vinay J. Ribeiro, Rudolf H. Riedi, Richard G. Baraniuk, Jiri Navratil, and Les Cottrell. pathChirp: Efficient available bandwidth estimation for network paths. In Ronn Ritke, Tony McGregor, and Jörg Micheel, editors, PAM 2003, 4th Passive and Active Measurement Workshop. NLANR/MNA, UCSD, April 2002.

[150] N. Robertson and P. D. Seymour. Graph minors II: algorithmic aspects of tree-width. Journal of Algorithms, 7:309–322, 1986.

[151] Charles Semple and Mike Steel. Phylogenetics, volume 24 of Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, 2003.

[152] Yuval Shavitt and Eran Shir. DIMES: Let the internet measure itself. CoRR, abs/cs/0506099, 2005. Available at: http://www.netdimes.org.

[153] Yuval Shavitt and Tomer Tankel. On the curvature of the internet and its usage for overlay construction and distance estimation. In INFOCOM, 2004.

[154] Yuval Shavitt and Tomer Tankel. Hyperbolic embedding of internet graph for distance estimation and overlay construction. IEEE/ACM Trans. Netw., 16(1):25–36, 2008.

[155] Chris Stark, Bobby-Joe Breitkreutz, Teresa Reguly, Lorrie Boucher, Ashton Breitkreutz, and Mike Tyers. BioGRID: a general repository for interaction datasets. Nucleic Acids Research, 34(Database-Issue):535–539, 2006. Dataset available at: http://thebiogrid.org/, release 3.2.99.

[156] Jeremy Stribling. Planetlab all-pairs-pings. http://pdos.csail.mit.edu/~strib/pl_app.

[157] Mikkel Thorup and Uri Zwick. Compact routing schemes. In SPAA, pages 1–10, 2001.

[158] Mikkel Thorup and Uri Zwick. Approximate distance oracles. J. ACM, 52(1):1–24, 2005.

[159] D. J. Watts and S. H. Strogatz. Collective dynamics of ’small-world’ networks. Nature, 393(6684):409–10, 1998. Dataset available at: http://toreopsahl.com/datasets/#celegans.

[160] D.J. Watts and S.H. Strogatz. Collective dynamics of ’small-world’ networks. Nature, (393):440–442, 1998.

[161] Bernard Wong, Aleksandrs Slivkins, and Emin Gün Sirer. Meridian: a lightweight network location service without virtual coordinates. In SIGCOMM, pages 85–96, 2005. www.cs.cornell.edu/people/egs/meridian.

[162] David P. Woodruff. Additive spanners in nearly quadratic time. In ICALP (1), pages 463–474, 2010.

[163] Yaokun Wu and Chengpeng Zhang. Hyperbolicity and chordality of a graph. Electr. J. Comb., 18(1), 2011.

[164] Chenyu Yan, Yang Xiang, and Feodor F. Dragan. Compact and low delay routing labeling scheme for unit disk graphs. Comput. Geom., 45(7):305–325, 2012.

[165] Jaewon Yang and Jure Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM, pages 745–754, 2012. Available at: http://snap.stanford.edu/data/com-Amazon.html, http://snap.stanford.edu/data/com-DBLP.html.