Metric Embedding via Shortest Path Decompositions

Ittai Abraham Arnold Filtser VMWare Ben-Gurion University of the Negev Israel Israel [email protected] [email protected] Anupam Gupta Ofer Neiman Carnegie Mellon University Ben-Gurion University of the Negev USA Israel [email protected] [email protected]

ABSTRACT ϕ : V → RD , and a norm ∥ · ∥, the contraction and expansion of the We study the problem of embedding weighted graphs of map ϕ are the smallest τ, ρ ≥ 1, respectively, such that for every min{1/p,1/2} pair x,y ∈ V , k into ℓp spaces. Our main result is an O(k )-distortion embedding. For p = 1, this is a super-exponential improvement over 1 ∥ϕ(x) − ϕ(y)∥ the best previous bound of Lee and Sidiropoulos. Our distortion ≤ ≤ ρ . τ d(x,y) bound is asymptotically tight for any fixed p > 1. Our result is obtained via a novel embedding technique that is The distortion of the map is then τ · ρ In this paper we will investi- based on low depth decompositions of a graph via shortest paths. gate embeddings into ℓp norms; the most prominent of which are The core new idea is that given a geodesic shortest path P, we can the Euclidean norm ℓ2 and the cut norm ℓ1; the former for obvious probabilistically embed all points into 2 dimensions with respect reasons, and the latter because of its close connection to graph to P. For p > 2 our embedding also implies improved distortion partitioning problems, and in particular the Sparsest Cut problem. 1/p on bounded graphs (O((k logn) )). For asymptotically Specifically, the ratio between the Sparsest Cut and the multicom- large p, our results also implies improved distortion on graphs modity flow equals the distortion of the optimal embedding into ℓ1 excluding a minor. (see [16, 27] for more details). We focus on embedding of metrics arising from certain graph CCS CONCEPTS families. Indeed, since general n-point metrics require Ω(log n/p)- • Theory of computation → Graph algorithms analysis; Ran- distortion to embed into ℓp -norms, much attention was given to dom projections and metric embeddings; embeddings of restricted graph families that arise in practice. (Em- bedding an (edge-weighted) graph is short-hand for embedding the KEYWORDS shortest path metric of the graph generated by these edge-weights.) Metric Embeddings, Shortest Path Decomposition, Normed Spaces, Since the class of graphs embeddable with some distortion into some Pathwidth, Treewidth. target normed space is closed under taking minors, it is natural to focus on minor-closed graph families. A long-standing conjecture ACM Reference Format: in this area is that all non-trivial minor-closed families of graphs Ittai Abraham, Arnold Filtser, Anupam Gupta, and Ofer Neiman. 2018. Metric Embedding via Shortest Path Decompositions. In Proceedings of 50th embed into ℓ1 with distortion depending only on the graph family Annual ACM SIGACT Symposium on the Theory of Computing (STOC’18). and not the size n of the graph. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3188745.3188808 While this question remains unresolved in general, there has been some progress on special classes of graphs. The class of out- 1 INTRODUCTION erplanar graphs (which exclude K2,3 and K4 as a minor) embeds Low-distortion metric embeddings are a crucial component in the isometrically into ℓ1; this follows from results of Okamura and Sey- modern algorithmist toolkit. Indeed, they have applications in ap- mour [31]. Following [16], Chakrabarti et al. [12] show that every proximation algorithms [27], online algorithms [7], distributed graph with treewidth-2 (which excludes K4 as a minor) embeds algorithms [21], and for solving linear systems and computing into ℓ1 with distortion 2 (which is tight, as shown by [24]). Lee and graph sparsifiers [33]. Given a (finite) metric space (V ,d), a map Sidiropoulos [26] showed that every graph with pathwidth k can k 3+1 be embedded into ℓ1 with distortion (4k) . See Section 1.3 for Permission to make digital or hard copies of all or part of this work for personal or additional results. classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation We note that ℓ2 is a potentially more natural and useful target on the first page. Copyrights for components of this work owned by others than ACM space than ℓ1 (in particular, finite subsets of ℓ2 embed isometrically must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, into ℓ ). Alas, there are only few (natural) families of metrics that to post on servers or to redistribute to lists, requires prior specific permission and/or a 1 fee. Request permissions from [email protected]. admit constant distortion embedding into Euclidean space, such STOC’18, June 25–29, 2018, Los Angeles, CA, USA as “snowflakes” of a doubling metrics6 [ ], doubling trees [14] and © 2018 Association for Computing Machinery. graphs of bounded bandwidth [8]. All these families have bounded ACM ISBN 978-1-4503-5559-9/18/06...$15.00 https://doi.org/10.1145/3188745.3188808 doubling dimension. (For definitions, see Section 2.)

952 STOC’18, June 25–29, 2018, Los Angeles, CA, USA Ittai Abraham, Arnold Filtser, Anupam Gupta, Ofer Neiman

Table 1: Our and previous results for embedding certain graph families into ℓp . (For H -minor-free graphs, д(H ) is some function of |H |.)

Graph Family Our results. Previous results 1/p k3+1 Pathwidth k O(k ) (4k) into ℓ1 [26] Treewidth k O((k log n)1/p ) O(k1−1/p · log1/p n) [22] O((log(k log n))1−1/p (log1/p n)) [20] Planar O(log1/p n) O(log1/p n) [32] H -minor-free O((д(H ) log n)1/p ) O(|H |1−1/p log1/p n) [4]+[22]

1.1 Our Results Theorem 1.2 (Embeddings for SPD Families). Let G = (V , E) In this paper we develop a new technique for embedding certain be a weighted graph with an SPD of depth k. Then there exists an → ( 1/p ) graph families into ℓp spaces with constant distortion. Our main embedding f : V ℓp with distortion O k . result is an improved embedding for bounded pathwidth graphs (see Section 2 for the definition of pathwidth and other graph families). Since bounded-pathwidth graphs admit SPDs of low depth, we get Theorem 1.1 as a simple corollary of Theorem 1.2. Moreover, we Theorem 1.1 (Pathwidth Theorem). Any graph with pathwidth derive several other results, which are summarized in Table 1; these 1/p k embeds into ℓp with distortion O(k ). results either improve on the state-of-the-art, or provide matching bounds using a new approach. Note that this is a super-exponential improvement over the best Our result of Theorem 1.1 (and thus also Theorem 1.2) is asymp- 3 previous distortion bound of O(k)k , by Lee and Sidiropoulos. Their totically tight for any fixed p > 1. The family exhibiting this fact is approach was based on probabilistic embedding into trees, which the diamond graphs. implies embedding only into ℓ1. Such an approach cannot yield distortion better than O(k), due to known lower bounds for the Theorem 1.3 ([23, 28, 30]). For any fixed p > 1 and every k ≥ 1, diamond graph [16], that has pathwidth k +1. Our embedding holds there exists a graph G = (V , E) with pathwidth-k, such that every min{1/p,1/2} for any ℓp space, and we can overcome the barrier of O(k) (since embedding f : V → ℓp has distortion Ω(k ). ℓ ℓ finite subsets of 2 embed isometrically√ into p , the distortion of Theorem 1.1 is never larger than O( k)). In particular, we obtain The bound in Theorem 1.3 was proven first for p = 2 in [30], generalized to 1 < p ≤ 2 in [23] and for p ≥ 2 by [28] (see also embeddings√ of pathwidth-k graphs into both ℓ2 and ℓ1 with dis- tortion O( k). Moreover, an embedding with this distortion can [18, 19]). The proofs of [23, 30] were done using the diamond graph, be found efficiently via semidefinite-programming; see, e.g.,27 [ ], while [28] used the Laakso graph. For completeness, we provide a even without access to the actual path decomposition (which is NP- proof of the case p ≥ 2 using the diamond graph in Section 8. We hard even to approximate [9]). We remark that graphs of bounded note that for ℓ1 only the trivial Ω(logk) lower bound is known. pathwidth can have arbitrarily large doubling dimension (exhib- ited by star graphs that have pathwidth 1), and thus our result is 1.2 Technical Ideas a noteworthy example of a non-trivial Euclidean embedding with Many known embeddings [1, 10, 22, 32] are based on a collection constant distortion for a family of metrics with unbounded doubling of 1-dimensional embeddings, where we embed each point to its dimension. from a given subset of points. We follow this approach, but Since graphs of treewidth k have pathwidth O(k logn) (see, e.g., differ in two aspects. Firstly, the subset of points we use is notbased [17]), Theorem 1.1 provides an embedding of such graphs into ℓ p on random sampling or probabilistic clustering. Rather, inspired (( )1/p ) with distortion O k logn . This strictly improves the best previ- by the works of [5] and [4], the subset used is a geodesic shortest ously known bound, which follows from a theorem in [22] (who ob- path. The second is that our embedding is not 1-dimensional but 1−1/p 1/p tained distortion O(k log n) ), for any p > 2, and matches it 2-dimensional: this seemingly small change crucially allows us to for 1 ≤ p ≤ 2. While [20] obtained recently a distortion bound with use the structure of the shortest paths to our advantage. 1−1/p 1/p improved dependence onk, their resultO((log(k logn)) (log n)) The SPD induces a collection of shortest paths (each shortest has sub-optimal dependence on n. path lies in some connected component). A natural initial attempt is to embed a v relative to a geodesic path P using two A General Embedding Framework. The embedding of Theorem 1.1 dimensions:1 follows as a special case of a more general theorem: we devise embeddings for any graph family which admits “shortest path de- • The first coordinate ∆1 is the distance to the path d(v, P). compositions” (SPDs) of “low depth”. Every (weighted) path graph • The second coordinate ∆2 is the distance d(v,r) to the end- has an SPD of depth 1. A graph G has an SPD of depth k if after point of the path, called its “root”. removing some shortest path P, every connected component in G \P has an SPD of depth k − 1. (An alternative definition appears in Definition 3.1.) Our main technical result is the following. 1In fact, we use different dimensions for each connected components.

953 Metric Embedding via Shortest Path Decompositions STOC’18, June 25–29, 2018, Los Angeles, CA, USA

P P P v v C2 C2 C2 C1 C1 C1 v u u u (a) r (b) r (c) r

Figure 1: A shortest path P, rooted at r , partitions cluster X into clusters C1 and C2. In cases (a) and (b), the first coordinate (the distance to P) provides sufficient contribution. In case (c) the second coordinate (the distance toroot r ) provides the contribution.

v graphs, combining the results of [4, 22] give ℓp -embeddings with ) 1−1/p 1/p (v, r O(|H | log n) distortion. = d ∆ = d(v, P ) ∆2 1 Following [5, 29], the idea of using geodesic shortest paths to P decompose the graph has been used for many algorithmic tasks: r MPLS routing [15], directed connectivity, distance labels and com- pact routing [34], object location [3], and nearest neighbor search [2]. However, to the best of our knowledge, this is the first time it Figure 2: An illustration of our initial attempt. The first coordinate has been used directly for low-distortion embeddings into normed ∆1 is the distance to the path d(v, P). The second coordinate ∆2 is the distance d(v, r ) to the endpoint of the path, called its “root”. spaces.

Unfortunately, this embedding may have unbounded expansion: If two vertices u,v are separated by some shortest path, in future 2 PRELIMINARIES AND NOTATION iterations v may have a large distance to the root of a path P in its For k ∈ Z, let [k] := {1,...,k}. For p ≥ 1, the ℓp -norm of a component, while u has zero in that coordinate (because it’s not vector x = (x ,..., x ) ∈ Rd is ∥x ∥ := (Íd |x |p )1/p , where in that component), incurring a large stretch. The natural fix is to 1 d p i=1 i ∥x ∥∞ := max |x |. enforce a Lipschitz condition on every coordinate: for v in cluster i i Doubling dimension. The doubling dimension of a metric is a X, we truncate the value v can receive at O(d (v,V \ X)). I.e., a G measure of its local “growth rate”. Formally, a metric space (X,d) vertex close to the boundary of X cannot get a large value. Using has doubling dimension λ if for every x ∈ X and radius r, the ball the fact that the SPD has depth k, each vertex will have only O(k) X B(x,r) can be covered by 2λX balls of radius r . A family is doubling nonzero coordinates, which implies expansion O(k1/p ). 2 if the doubling dimension of all metrics in it is bounded by some To bound the contraction, for each pair u,v we consider the first universal constant. path P in the SPD that lies “close” to {u,v} or separates them to Graphs. We consider connected undirected graphs G = (V , E) different connected components. Then we show that at least oneof with edge weights w : E → R . Let d denote the shortest path the two coordinates should give sufficient contribution. >0 G metric in G; we drop subscripts when there is no ambiguity. For a But what about the effect of truncation on contraction? A careful vertex x ∈ V and a set A ⊆ V , let d (x, A) := min ∈ d(x, a), where recursive argument shows that the contribution to u,v from the G a A d (x, ∅) := ∞. For a subset of vertices A ⊆ V , let G[A] denote the first coordinate (the distance from the path P) is essentially not G induced graph on A, and let d := d be the shortest path metric affected by this truncation. Hence the argument in cases (a) and(b) A G[A] in the induced graph. Let G \ A := G[V \ A] be the graph after of Figure 1 still works. However, the argument using the distance to deleting the vertex set A from G. the root of P, case (c), can be ruined. Solving this issue requires some Special graph families. Given a graph G = (V , E), a decom- new non-trivial ideas. Our solution is to introduce a probabilistic position of G is a tree T with nodes B ,..., B (called bags) where sawtooth function that replaces the simple truncation. The main 1 s each B is a subset of V such that the following properties hold: technical part of the paper is devoted to showing that a collection i • For every edge {u,v} ∈ E, there is a bag B containing both of these functions for all possible distance scales, with appropriate i u and v. random shifts, suffices to control the expected contraction in case • For every vertex v ∈ V , the set of bags containing v form a (c), for all relevant pairs simultaneously. connected subtree of T . 1.3 Other Related Work The width of a tree decomposition is maxi {|Bi | − 1}. The treewidth of G is the minimal width of a tree decomposition of G. There has been work on embedding several other graph families A path decomposition of G is a special kind of tree decomposition into normed spaces: Chekuri et al. [13] extend the Okamura and where the underlying tree is a path. The pathwidth of G is the Seymour bound for outerplanar graphs to k-outerplanar graphs, and minimal width of a path decomposition of G. O(k) showed that these embed into ℓ1 with distortion 2 . Rao [32] (see A graph H is a minor of a graph G if we can obtain H from G by 1/p also [22]) embed planar graphs into ℓp with distortion O(log n). edge deletions/contractions, and vertex deletions. A graph family For graphs with genus д,[25] showed an embedding into Euclidean G is H-minor-free if no graph G ∈ G has H as a minor. space with distortion O(logд + plogn). Finally, for H-minor-free

954 STOC’18, June 25–29, 2018, Los Angeles, CA, USA Ittai Abraham, Arnold Filtser, Anupam Gupta, Ofer Neiman

2.1 The Sawtooth Function let SPDdepth(G) := maxG ∈G {SPDdepth(G)}. In the following we Another component in our embeddings will be the following saw- consider the SPDdepth of some graph families. tooth function. For t ∈ N, we define дt : R+ → R the sawtooth t t+1 function w.r.t. 2 as follows. For x ≥ 0, if qx := ⌊x/2 ⌋ then 3.1 The SPD Depth for Various Graph Families t  t+1 t  One advantage of defining the shortest path decomposition is that дt (x) = 2 − x − qx · 2 + 2 . several well-known graph families have bounded depth SPD. • Pathwidth. Every graph G = (V , E,w) with pathwidth k has t x 2 2 an SPDdepth of k +1. Indeed, let T = ⟨B1,..., Bs ⟩ be a path decomposition ofG, where B1, Bs are the two bags at the end x1 x3 of this path. Choose arbitrary vertices x ∈ B1 and y ∈ Bs , and let P be a shortest path in G from x to y. By the definition of a path decomposition, the path P contains at least one 0 2t+1 2 2t+1 3 2t+1 4 2t+1 5 2t+1 · · · · vertex from every bag Bi . Hence, deleting the vertices of P would reduce the size of each bag by one; consequently each Figure 3: The graph of the “sawtooth” function дt . The points x1 = connected component of G \ P has pathwidth k − 1, and by · t−1 and = · t−1 are mapped to t−1, while = · t−1 is 5 2 x3 15 2 2 x2 10 2 induction SPDdepth k. Finally, a connected component of mapped to 2t . pathwidth 0 is necessarily a singleton, which has SPDdepth 1. The following observation is straightforward. • Treewidth. Since every tree has pathwidth O(logn), we Observation 1. The Sawtooth function дt is 1-Lipschitz, bounded can show that an n-vertex treewidth-k graph has path- by 2t , periodic with period 2t+1. width O(k logn). Hence, treewidth-k graphs have SPDdepth O(k logn). The proof of the following lemma appears in Section 7. • Planar. Using cycle separators [29] as in [15, 34], every planar Lemma 2.1 (Sawtooth Lemma). Let x,y ∈ R+. Let α ∈ [0, 1], graph has SPDdepth O(logn); this follows as each cycle β ∈ [0, 4] be drawn uniformly and independently. The following separator can be constructed as union of two shortest paths. properties hold: • Minor-free. Finally, every H-minor-free graph admits a bal-  t+1  t−1 anced separator consisting of д(H) shortest paths (for some (1) Eα, β дt (βx + α · 2 ) = 2 . function д)[3], and hence has an O(д(H)· logn). (2) If |x − y| ≤ 2t−1, then SPDdepth  t+1 t+1  Eα, β дt (βx + α · 2 ) − дt (βy + α · 2 ) = Ω(|x − y|). Combining these observation with Theorem 1.2, we get the fol- (3) If |x − y| > 2t−1, then lowing set of results:  t+1 t+1  t Eα, β дt (βx + α · 2 ) − дt (βy + α · 2 ) = Ω(2 ). Corollary 3.2. Consider an n-vertex weighted graph G, Theo- 3 SHORTEST PATH DECOMPOSITIONS rem 1.2 implies the following: 1/p Our embeddings will crucially depend on the notion of shortest • If G has pathwidth k, it embeds into ℓp with distortion O(k ). path decompositions. In the introduction we provided a recursive • If G has treewidth k, it embeds into ℓp with distortion 1/ definition for SPD. Here we show an equivalent definition which O((k logn) p ). 1/p will be more suitable for our purposes. • If G is planar, it embeds into ℓp with distortion O(log n). • For every fixed H, if G excludes H as a minor, it embeds into Definition 3.1 (Shortest Path DecompositionSPD ( )). Given a weighted ℓ with distortion O(log1/p n), where the constant in the big-O graph G = (V , E,w), a SPD of depth k is a pair {X, P}, where X p 2 depends on H. is a collection X1,..., Xk of partial partitions of V , and P is a collection of sets of paths P ,..., P , where X = {V }, X = P , 1 k 1 k k As mentioned in Section 1, we get a substantial improvement and the following properties hold: for the pathwidth case. Our result for treewidth improves upon ≤ ≤ ∈ X 1 1 (1) For every 1 i k and every subset X i , there exist that from [22] for p > 2; they got O(k1− /p (logn) /p ) distortion a unique path P ∈ P such that P is a shortest path in 1 X i X compared to our O((k logn) /p ). Our result appears to be closer to G[X]. the truth, since the distortion tends to O(1) as p → ∞. (2) For every 2 ≤ i ≤ k, X consists of all connected components i Finally, our results for planar graphs match the current state-of- of G[X \ P ] over all X ∈ X . X i−1 the-art. Ðk In other words, i=1 Pk is a partition of V into paths, where Our results for minor-free graphs depend on the Robertson- each path PX is a shortest path in the component X it belongs to at Seymour decomposition, and hence are currently better only for the point it is deleted. large values of p. (It remains an open question to improve the For a given graph G let SPDdepth(G) be the minimum k such SPDdepth of H-minor-free graphs to have a poly(|H |) logn depen- that G admits an SPD of depth k. For a given family of graphs G dence, perhaps using the ideas from [4].) In general, we hope that

2 ′ ′ our results will be useful in getting other embedding results, and i.e. for every X ∈ Xi , X ⊆ V , and for every different subsets X, X ∈ Xi , X ∩ X = ∅. will spur further work on understanding shortest path separators.

955 Metric Embedding via Shortest Path Decompositions STOC’18, June 25–29, 2018, Los Angeles, CA, USA

4 THE EMBEDDING ALGORITHM two distance scales closest to 2dG (v,V \X). Recall that the sawtooth Let G = (V , E) be a weighted graph, and let {X, P} = function does not guarantee contribution for u,v due to its period- icity: we may be unlucky and have д (d (v,r)) = д (d (u,r)) even {{X1,,..., Xk } , {P1,..., Pk }} be an SPD of depth k for G. By t G t G scaling, we can assume that the minimum weight of an edge is 1; when dG (v,r) and dG (u,r) are very different. To guarantee a large let M ∈ N be the minimal such that the diameter of G is strictly enough contribution for all relevant pairs simultaneously, we add a bounded by 2M . Pick α ∈ [0, 1] and β ∈ [0, 4] uniformly and inde- random shift α, and apply a random “stretch” β to dG (v,r) before pendently. feeding it to дt . Lemma 2.1 then shows that many of the choices of α β u,v For every i ∈ [k], and X ∈ Xi , we now construct an embedding and give substantially different values for . D Formally, the mapping is as follows. The function f root consists fX : V → R (for some number of dimensions D ∈ N). This map X of M + 1 coordinates, one for each distance scale t ∈ {0, 1,..., M}. fX consists of two parts. The coordinate corresponding to t is denoted by f root. Let r be an First coordinate: Distance to the Path. The first coordinate of the X,t arbitrary endpoint of PX ; we will call r the “root” of PX . Let tv ∈ N embedding implements the distance to the path PX , and is denoted d (v,V \X )− tv ( \ ) ∈ [ tv tv +1) 2 G 2 path be such that 2dG v,V X 2 , 2 . Set λv = 2tv . by f . Let X1,..., Xs ∈ Xi+1 be the connected components of X Note that 0 ≤ λv < 1. For v ∈ X, we define G [X \ PX ] (note that it is also possible that s = 0). We use a separate t+1 path s λv · дt (β · dX (v,r) + α · 2 ) if t = tv + 1, coordinate for each Xj , and hence f : V → R . Moreover, for  X root( ) ( − )· ( · ( ) · t+1) v ∈ X we truncate at 2dG (v,V \X) in order to guarantee Lipschitz- fX,t v = 1 λv дt β dX v,r + α 2 if t = tv ,  ness. In particular, the coordinate corresponding to Xj is set to 0 otherwise.  ( (1)   min {d (v, P ), 2d (v,V \ X)} if v ∈ X , path ( ) = X X G j fX v root( ) ì Xj 0 otherwise. For all nodes v < X, we set fX v = 0. f = f path ⊕ f root See Figure 4 for an illustration. Define the map X X X , and the final embedding is k Ê Ê f = fX , P i=1 ∈X (0, 0) X X i i.e., the concatenation of all the constructed embeddings. Before (0, 0) we start the analysis, let us record some simple observations. (7, 0) X2 (0, 0) 3 7 2 For the map f defined above, the following hold: 3 Observation 2. (6, 0) X1 (0, 4) • The number of coordinates in f does not depend on α, β. (0, 3) • For every X ∈ Xi and v < X, the map fX (v) is the constant (0, 0) 2 vector 0ì. (2, 0) • For every X ∈ Xi and v ∈ X, the map fX is nonzero in at most 3 coordinates. Hence, since Xi is a partial partition of V and the depth of the SPD ( ) Figure 4: The set X ∈ Xi surrounded by a closed curve. The path PX is k, we get that f v is nonzero in at most 3k coordinates for each ∈ X path v ∈ V . partitions X into X1, X2 i+1. The embedding fX consists of two coordinates, and represented in the figure by a horizontal vector next to each vertex, where the first entry is w.r.t. X1 and the second w.r.t. 5 THE ANALYSIS X2. Each point on PX , or not in X maps to 0 in both the coordinates. The main technical lemmas now show that the per-coordinate ex- Each point in X1 maps to min {dX (v, PX ), 2dG (v, V \ X )} in the first pansion is constant, and that for every pair, there exists a coordinate coordinate and to 0 in the second. for which the expected contraction is constant. Lemma 5.1 (Expansion Bound). For any vertices u,v, every co- Second coordinate: Distance to the Root. The second part is de- ordinate j, and every choice of α, β, root noted fX , which is intended to capture the distance from the fj (v) − fj (u) = O(dG (u,v)). root r of the path. Again, to get the Lipschitz-ness, we would like to ( \ ) path Lemma 5.2 (Contraction Bound). For any vertices u,v, there truncate the value at 2dG v,V X as we did for fX . However, a problem with this idea is that the root r can be arbitrarily far exists some coordinate j such that ,   from some pair u v that needs contribution from this coordinate. Eα, β fj (v) − fj (u) = Ω(dG (u,v)). And hence, even if |dG (u,r) −dG (v,r)| ≈ dG (u,v), there may be no contribution after the truncation. So we use the sawtooth function. Given these two lemmas, we can combine them together to show that the entire embedding has small distortion. (Proof of the Specifically, we replace the ideal contribution dG (v,r) by the composition lemma can be found in Section 6.) sawtooth function дt (dG (v,r)), where the scale t for the function t is chosen such that 2 ≈ dG (v,V \ X). To avoid the case that two Lemma 5.3 (Composition Lemma). Let (X,d) be a metric space. nearby points use two different scales (and hence to guarantee Suppose that there are constants ρ,τ and a function f : X → Rs , Lipschitz-ness), we take an appropriate linear combination of the drawn from some probability space such that:

956 STOC’18, June 25–29, 2018, Los Angeles, CA, USA Ittai Abraham, Arnold Filtser, Anupam Gupta, Ofer Neiman

(1) For everyu,v ∈ X and every j ∈ [s], |fj (v)−fj (u)| ≤ ρ·d(v,u). a simple case analysis. Indeed, suppose dX (u, PX ) ≤ 2dG (u,V \ X). Then we get (2) For every u,v ∈ X, there exists j ∈ [s] such that E[ |fj (v) − 1 ∥ path( ) − path( )∥ fj (u)| ] ≥ τ · d(v,u). fX v fX u ∞ ∈ ⊆ [ ] (3) For every v X, there is a subset of indices Iv s of size = min {dX (v, PX ) , 2dG (v,V \ X)} − dX (u, PX ) |I | ≤ k, such that for every j < I , f (v) = 0. In other words, v v j ≤ d (v, P ) − d (u, P ) ≤ d (u,v) = d (u,v). for every v ∈ X, f (v) has support of size at most k. X X X X X G ≥ ( ) Then, for every p 1, there is an embedding of X,d into ℓp with The other case is that dX (u, PX ) > 2dG (u,V \ X), and then distortion O(k1/p ). Moreover, if there is an efficient algorithm for sam- pling such an f , then there is a randomized algorithm that constructs ∥ path( ) − path( )∥ the embedding efficiently (in expectation). fX v fX u ∞ { ( ) ( \ )} − ( \ ) Now Theorem 1.2, our embedding for graphs with low depth = min dX v, PX , 2dG v,V X 2dG u,V X SPDs, immediately follows by applying the Composition Lemma ≤ 2dG (v,V \ X) − 2dG (u,V \ X) ≤ 2dG (u,v). (Lemma 5.3) to Lemma 5.1, Lemma 5.2, and Observation 2. Hence the expansion is bounded by 2.

5.1 Proof of Lemma 5.1 root Expansion of f . Let r be the root of PX . For t ∈ In this section we bound the expansion of any coordinate in our em- X {0, 1,..., M}, let pt (respectively, qt ) be the “weight” of v (respec- bedding. Recall that the embedding of v lying in some component tively, u) on дt —in other words, pt is the constant in (1) such that X consists of two sets of coordinates: its distance from the path, and root t+1 f (v) = pt ·дt (β ·d (v,r)+α ·2 ). Note that pt ∈ {0, λv , 1−λv } its distance from the root. As mentioned in the introduction, since X,t X is chosen deterministically, and is nonzero for at most two indices points outside X are mapped to zero, maintaining Lipschitz-ness t. requires us to truncate the contribution of v of any coordinate to First, observe that for every t, its distance from the boundary. This truncation (either via taking a minimum with d (v,V \ X), or via the sawtooth function), means G f root(v) − f root(u) that our proofs of expansion require more care. Let us now give X,t X,t the details. t+1 = pt · дt (β · dX (v,r) + α · 2 ) Consider any level i, any set X ∈ X , and any pair of vertices u,v. i ∥ ( ) − ( )∥ ( ( )) t+1 It suffices to show that fX v fX u ∞ = O dG u,v . To begin, − qt · дt (β · dX (u,r) + α · 2 ) ∈ we may assume that both u,v X. Indeed, if both u,v < X, then ì ≤ · ( · ( ) · t+1) fX (v) = fX (u) = 0 and we are done. If one of them, say v belongs min {pt ,qt } дt β dX v,r + α 2 X u X f (u) = ì f (v) to while the other < , then X 0 while X is bounded t+1 t t +1 − д (β · d (u,r) + α · 2 ) + |p − q | · 2 by 2 v ≤ 4dG (v,V \ X) ≤ 4dG (u,v) in each coordinate. t X t t Moreover, we may also assume that the shortest u-v path in G t ≤ min {pt ,qt } · β · |dX (v,r) − dX (u,r)| + |pt − qt | · 2 contains only vertices from X. Indeed, suppose their shortest path t ≤ O(d (u,v)) + |pt − qt | · 2 . (2) in G uses vertices from V \ X, then dG (u,V \ X) + dG (v,V \ X) ≤ G dG (u,v). But since both fX (v), fX (u) are bounded in each coor- t The first inequality used that дt is bounded by 2 , and the second dinate by 4 · max {dG (u,V \ X),dG (v,V \ X)}, we have constant inequality that дt is 1-Lipschitz; both follow from Observation 1. expansion. Henceforth, we can assume that dG (u,v) = dX (u,v). We The last inequality follows by the triangle inequality (since we now bound the expansion in each of the two parts of fX separately. assumed that the shortest path from v to u is contained within X). path t Expansion of fX . Let Xv , Xu be the connected components Hence, it suffices to show that |pt − qt | = O(dG (u,v)/2 ). In- in G [X \ PX ] such that v ∈ Xv and u ∈ Xu . Consider the first case deed, for indices t < {tu , tu + 1, tv , tv + 1}, pt = qt = 0, hence Xv , Xu , then PX intersects the shortest path between v and u. In |pt − qt | = 0. Let us consider the other cases. W.l.o.g., assume that particular, dG (v,V \ X) ≥ dG (u,V \ X) and hence tv ≥ tu . path path ∥ f (v) − f (u)∥∞ • tu = tv : In this case, pt − qt = |(1 − λv ) − (1 − λu )| = X X v v λ − λ = p − q . Moreover, this quantity is ≤ min {dX (v, PX ) , 2dG (v,V \ X)} v u tv +1 tv +1 + min {d (u, P ) , 2d (u,V \ X)} t t X X G 2dG (v,V \ X) − 2 2dG (u,V \ X) − 2 λv − λu = − ≤ dX (v, PX ) + dX (u, PX ) ≤ dX (v,u) = dG (v,u) . 2t 2t 2 (d (v,V \ X) − d (u,V \ X)) Otherwise, Xv = Xu and the two vertices G G = t lie in the same component. Now ∥ f path(v) − 2 X 2d (u,v) path ≤ G f (u)∥∞ equals min {dX (v, PX ) , 2dG (v,V \ X)} − t . X 2 min {dX (u, PX ) , 2dG (u,V \ X)} . Assuming (without loss of t generality) that the first term is at least the second, we can drop the Hence, we get that |pt − qt | = O(dG (u,v)/2 ) for both t ∈ absolute value signs. Now the bound on the expansion follows from {tv , tv + 1}.

957 Metric Embedding via Shortest Path Decompositions STOC’18, June 25–29, 2018, Los Angeles, CA, USA

• tu = tv − 1 : It holds that Suppose first that (2) occurs but not (1). Let j be the coordinate path in f created for the connected component of v in X \ PX . Then λv + (1 − λu ) X path path 2d (v,V \ X) − 2tv 2tu +1 − 2d (u,V \ X) (f ) (v) − (f ) (u) ≤ 2 · G + G X j X j 2tv 2tu = min {dX (v, PX ) , 2dG (v,V \ X)} − 0 2dG (v,V \ X) − 2dG (u,V \ X) 2dG (u,v) = ≤ .  ∆ ∆  ∆ 2tu 2tu ≥ min uv , 2 uv = uv . c c c If we define χ := λv + (1 − λu ), we conclude that Next assume that (1) occurs. W.l.o.g., dX (v, PX ) ≤ dX (u, PX ), so − ≤ ( ( , )/ tv +1) ptv +1 qtv +1 = λv χ = O dG u v 2 that dX (v, PX ) ≤ ∆uv /c. Suppose first that dX (u, PX ) ≥ 2∆uv /c. − | − − | ≤ ( ( )/ tv ) path ptv qtv = 1 λv λu χ = O dG u,v 2 Then in the coordinate j in fX created for the connected compo- tu nent of u in X \ PX , we have ptu − qtu = 1 − λu ≤ χ = O(dG (u,v)/2 ) path path (f )j (v) − (f )j (u) • tu < tv − 1 : By the definition of tv and tu , 2dG (v,u) ≥ X X t t +1 t −1 2 (dG (v,V \ X) − dG (u,V \ X)) ≥ 2 v − 2 u ≥ 2 v . ≥ min {dX (u, PX ) , 2dG (u,V \ X)} In particular, for every t ≤ tv + 1, ( )  ( )  |p − q | ≤ 1 ≤ 2dG u,v = O dG u,v . − { ( , ) , ( , \ )} t t 2tv −1 2t min dX v PX 2dG v V X  ∆ ∆  ∆ ∆ 5.2 Bounding the Contraction: Proof of ≥ min 2 uv , 2 uv − uv = uv . c c c c Lemma 5.2 (It does not matter whether v, u are in the same connected com- Recall that we want to prove that for any pair u,v of vertices, the ponent or not.) Thus it remains to consider the case dX (u, PX ) < embedding has a large contribution between them. A natural proof ′ ′ 2∆uv /c. Let r be the root of PX . Let v (resp. u ) be the closest idea is to show that vertices u,v would eventually be separated by vertex on PX to v (resp. u) in G[X]. Then by the triangle inequality the recursive procedure. When they are separated, either one of u,v ′ ′ ′ ′ c − 3 is far from the separating path P, or they both lie close to the path. dX (v ,u ) ≥ dX (v,u) − dX (v,v ) − dX (u,u ) ≥ ∆uv . In the former case, the distance d(v, P) gives a large contribution c to the embedding distance, and in the latter case the distance from In particular, one end of the path (the “root”) gives a large contribution. |dX (v,r) − dX (u,r)| However, there’s a catch: the value of the v’s embedding in ′ ′ ′ ′ ≥ d (v ,r) − d (u ,r) − d (v,v ) − d (u,u ) any single coordinate cannot be more than v’s distance to the X X X X , c − 6 1 boundary, and this causes problems. Indeed, if u v fall very close ≥ ∆uv = ∆uv , (4) to the path P at some step of the algorithm, they must get most c 2 of their contribution at this level, since future levels will not give where we used that PX is a shortest path in G[X] (implying ′ ′ ′ ′ much contribution. How can we do it, without assigning large |dX (v ,r) − dX (u ,r)| = dX (v ,u )). See Figure 5 for illustration. values? This is where we use the sawtooth function: it gives a good contribution between points without assigning any vertex too large u a value in any coordinate. ∆ 2∆ uv Formally, to bound the contraction and prove Lemma 5.2, for P uv ≥ v X c ∆ nodes u,v we need to show that there exists a coordinate j such uv ≥ 0 c 0 that Eα, β [|fj (v) − fj (u)|] = Ω(dG (u,v)). For brevity, define r u v ∆uv := dG (u,v). (3) Fix c = 12. Let i be the minimal index such that there exists Figure 5: P is a shortest path with root r . v (resp. u) is at dis- ∈ X ∈ X X i with u,v X, and at least one of the following holds: ∆uv 2∆uv ′ ′ tance at most c (resp. c ) from v (resp u ), it’s closest ver- (1) min {dX (v, PX ),dX (u, PX )} ≤ ∆uv /c (i.e., we choose a path ′ ′ 3 tex on PX . By triangle inequality dX (v , u ) ≥ (1 − c )∆uv . As ′ ′ ′ close to {v,u}). u , v lay on the same shortest path starting at r , |dX (v , r ) − ′ ′ ′ (2) v and u are in different components of X \ PX . dX (u , r )| = dX (v , u ). Using the triangle inequality again we con- clude |d (v, r ) − d (u, r )| ≥ |d (v′, r ) − d (u′, r )| − 3 ∆ ≥ ( − Note that such an index i indeed exists: if v and u are separated by X X X X c uv 1 6 )∆ . the SPD then condition (2) holds. The only other possibility that v c uv and u are never separated is when at least one of them lies on one of the shortest paths. In such a case, surely condition (1) holds. By the Set x = dX (v,r) and y = dX (u,r). Assume first that ′ ′ ′ minimality of i, for every X ∈ Xi′ such that i < i and u,v ∈ X , dG (v,V \ X) ≥ dG (u,V \ X). In particular, tv ≥ tu . By the def- t +1 necessarily min{dX ′ (v, PX ′ ),dX ′ (u, PX ′ )} > ∆uv /c. Therefore, the inition of tv , 2dG (v,V \ X) ≤ 2 v . Thus ball with radius ∆uv /c around each of v,u is contained in X. In t ∆uv { ( , \ ), ( , \ )} > / 2 v ≥ = Ω(∆uv ) (5) particular, min dG v V X dG u V X ∆uv c. c

958 STOC’18, June 25–29, 2018, Los Angeles, CA, USA Ittai Abraham, Arnold Filtser, Anupam Gupta, Ofer Neiman

Claim 1. Let t ≥ tv , then there is a constant ϕ such that (3) For every v ∈ X, there is a subset of indices Iv ⊆ [s] of size |Iv | ≤ k, such that for every j < Iv , fj (v) = 0. In other words, E  д (β · x + α · 2t+1) − д (β · y + α · 2t+1)  ≥ ∆ /ϕ. α, β t t uv for every v ∈ X, f (v) has support of size at most k. t−1 Proof. If |x − y| ≤ 2 , then using Property 2 of Lemma 2.1 Then, for every p ≥ 1, there is an embedding of (X,d) into ℓp with 1/p  t+1 t+1  distortion O(k ). Moreover, if there is an efficient algorithm for sam- E дt (β · x + α · 2 ) − дt (β · y + α · 2 ) α, β pling such an f , then there is a randomized algorithm that constructs (4) = Ω (|x − y|) = Ω(∆uv ) . the embedding efficiently (in expectation).

Otherwise, using Proof. Fix n = |X |, and set m = 48ρτ · lnn. Let  t+1 t+1  (1) (2) (m) → s Eα, β дt (β · x + α · 2 ) − дt (β · y + α · 2 ) f , f ,..., f : X R be functions chosen i.i.d accord- −1/p Ém (i) (5) ing to the given distribution. Set д = m i=1 f . We argue t  1/p = Ω 2 ≥ Ω(∆uv ) . □ that with high probability, д has distortion O(k ) in ℓp . Fix some pair of vertices v,u ∈ V . Set d(v,u) = ∆. The upper  8c Set S = max 8ϕ, 2 . We consider two cases: bound follows from Property 1 and Property 3 of the lemma: • If for some t ∈ {0, 1,..., M}, |p − q | · 2t > ∆uv , then t t S m  p p Õ Õ −1/p (i) (i) h i ∥д(v) − д(u)∥p = m · f (v) − f (u) E f root(v) − f root(u) j j α, β X,t X,t i=1 j ∈Iv ∪Iu  t+1 t+1  m = E pt · дt (β · x + α · 2 ) − qt · дt (β · y + α · 2 ) Õ Õ 1 α, β ≤ ·(ρ · ∆)p ≤ 2k ·(ρ · ∆)p , m ≥ ·  ( · · t+1) i=1 j ∈I ∪I pt Eα, β дt β x + α 2 v u  t+1  1/p − qt · Eα, β дt (β · y + α · 2 ) thus ∥д(v) − д(u)∥p ≤ O(k · ∆). t Next, for the contraction bound, let j be the index of Property 2 = |pt − qt | · 2 = Ω(∆uv ) . w.r.t v,u. Set F = {f : fj (v) − fj (u) ≥ ∆/2τ } to be the event Where the equality follows by Property 1 of Lemma 2.1. that we draw a function with significant contribution to v,u. Then t ∆uv • The other case is that for every t, |pt − qt | · 2 ≤ S . Note using Property 1 and Property 2, thatptv +ptv +1 = (1−λv )+λv = 1. Let t ∈ {tv , tv +1} be such 1 ∆uv 1 2c ∆uv 1 ∆   that pt ≥ . Using (5), qt ≥ pt − t ≥ − · ≥ . ≤ E fj (v) − fj (u) 2 2 v ·S 2 2∆uv S 4 τ In particular, h i ∆ ∆ h i ≤ Pr F · + Pr [F] · ρ∆ ≤ + Pr [F] · ρ∆ , E root( ) − root( ) 2τ 2τ α, β fX,t v fX,t u ( )  t+1 t+1  [F] ≥ 1 Q i = Eα, β pt · дt (β · x + α · 2 ) − qt · дt (β · y + α · 2 ) which implies that Pr 2ρτ . Let u,v be an indicator random h (i) ∈ F Ím (i) ≥ { } · E ( · · t+1) variable for the event f , and set Qu,v = i=1 Qu,v . By min pt ,qt α, β дt β x + α 2 m linearity of expectation, E[Qu,v ] ≥ = 24 · lnn. By a Chernoff i 2ρτ − ( · · t+1) − − · t bound дt β y + α 2 |pt qt | 2   1 ∆uv ∆uv   1   ≥ · − = Ω(∆uv ) , Pr Qu,v ≤ 12 · lnn ≤ Pr Qu,v ≤ · E Qu,v 4 ϕ S 2 1   where in the first inequality we used Property 1 of Lemma 2.1, ≤ exp(− E Qu,v ) and in the second inequality we used Claim 1. 8 ≤ exp(−3 lnn) = n−3 . Finally, recall that we assumed dG (v,V \ X) ≥ dG (u,V \ X) for the proof above. The other case (dG (v,V \ X) < dG (u,V \ X)) is n By taking a union bound over the 2 pairs, with probability at least completely symmetric. 1 1 − n , for every u,v ∈ V , Qu,v > 12 lnn = Ω(m). (Recall that both ρ,τ are universal constants.) If this event indeed occurs, then the 6 PROOF OF contraction is indeed bounded: THE COMPOSITION LEMMA (5.3) m Õ  p We restate the lemma for convenience: ∥ ( ) − ( )∥p ≥ −1/p · (i)( ) − (i)( ) д v д u p m fj v fj u i=1 Lemma 5.3 (Composition Lemma). Let (X,d) be a metric space. 1 Õ ( ) ( ) p Suppose that there are constants ρ,τ and a function f : X → Rs , ≥ f i (v) − f i (u) m j j drawn from some probability space such that: (i) i:Qu,v =1 (1) For every u,v ∈ X and every j ∈ [s], |f (v) − f (u)| ≤ ρ ·  p  p  j j Qu,v ∆ ∆ d(v,u). ≥ · = Ω . m 2ρτ 2ρτ (2) For every u,v ∈ X, there exists j ∈ [s] such that E[ |fj (v) − 1 , ∥ ( ) − ( )∥ ≥ ( ) fj (u)| ] ≥ τ · d(v,u). In particular, for every u v, д v д u p Ω ∆ . □

959 Metric Embedding via Shortest Path Decompositions STOC’18, June 25–29, 2018, Los Angeles, CA, USA

2t 2t 2t

2t+1 2t+1 2t+1 t t z α = 0 α = 2 − z α = 2 − 2

2t 2t 2t 2t

t+1 t+1 t+1 t+1 t 2 t+1 2 t+1 2 z t+1 2 α = 2 α = 2 − z α = 2 − 2 α = 2

Figure 6: α is going from 0 to 2t+1. z ≤ 2t . In each of the figures the leftmost red point represents α while the rightmost red point represents z + α. Each of the middle figures represent a moment when дt (z + α) − дt (z) changes its derivative.

7 PROOF OF THE SAWTOOTH LEMMA (2.1) For z > 2t , set w = 2t+1 − z. Then using that дt is periodic, We restate Lemma 2.1 for convenience:  t+1 t+1  Eα ∈[0,1] дt (w + α · 2 ) − дt (α · 2 ) Lemma 2.1 (Sawtooth Lemma). Let x,y ∈ R . Let α ∈ [0, 1], +  t+1 t+1  β ∈ [0, 4] be drawn uniformly and independently. The following = Eα ∈[0,1] дt (w + z + α · 2 ) − дt (z + α · 2 ) properties hold:  t+1 t+1 t+1  = Eα ∈[0,1] дt (2 + α · 2 ) − дt (z + α · 2 )  ( · t+1) t−1  t+1 t+1  (1) Eα, β дt βx + α 2 = 2 . = Eα ∈[0,1] дt (z + α · 2 ) − дt (α · 2 ) . (2) If |x − y| ≤ 2t−1, then

E  д (βx + α · 2t+1) − д (βy + α · 2t+1)  = Ω(|x − y|). α, β t t (2t+1−w)w (2t +1−z)z t−1 (∗) = = (3) If |x − y| > 2 , then Hence by the first case, 2t +1 2t +1 . □  t+1 t+1  t Eα, β дt (βx + α · 2 ) − дt (βy + α · 2 ) = Ω(2 ).

Property 1 is straightforward, as by Observation 1 дt is For the proofs of Property 2 and Property 3 periodic with period length 2t+1. Indeed, for every fixed β, assume w.l.o.g x > y. Set z = x − y, and  t+1   t+1  t−1  t+1 t+1  Eα дt (βx + α · 2 ) = Eα дt (α · 2 ) = 2 . The following (∗) = Eα, β дt (βx + α · 2 ) − дt (βy + α · 2 ) . claim will be useful in the proofs of Property 2 and Property 3. As дt is a periodic function, we have that (∗) =   t+1 t+1  Eβ Eα дt (βz + α · 2 ) − дt (α · 2 ) . Claim 2. For z ∈ [0, 2t+1], (2t+1−z)z E  д (z + α · t+1) − д (α · t+1)  = . α ∈[0,1] t 2 t 2 2t +1 Proof of Property2. Using Claim 2, we have  t+1 t+1  Proof. Set (∗) = Eα ∈[0,1] дt (z + α · 2 ) − дt (α · 2 ) . " t+1  #  t+1 2  By substituting the variable of integration, (∗) = 1 · 2 − βz βz 1 1 2 z z 2t+1 (∗) = E = · · β2 − · β3 |4 t+1 β t+1 t+1 4 2 3 0 ∫ 2 | ( ) − ( )| ≤ t 2 2 0 дt z + α дt α dα. First assume that z 2 , then there  2 6    |д (z + α) − д (α)| t+1 1 16 z 2 1 64 2 are 5 “phase changes” in t t from 0 to 2 , at = · · z − · ≥ · 8 − · z = · |x − y| , t t z t t+1 t+1 z t+1 · 2 − z, 2 − 2 , 2 , 2 − z, 2 − 2 . (see Figure 6 for illustration). 4 2 2 3 4 3 4 3 We calculate t−1 t − z z where in the inequality we used that z ≤ 2 . □ ∫ 2 z ∫ 2 ∫ 2 2t+1 · (∗) = zdα + (z − 2α)dα + 2αdα 0 0 0 t − z z ∫ 2 z ∫ 2 ∫ 2 + zdα + (z − 2α)dα + 2αdα Proof of Property3. As дt is periodic func- 0 0 0 tion, Claim 2 implies that for every w ≥ 0 it t z ∫ 2 −z ∫ holds that E  д (w + α · 2t+1) − д (α · 2t+1)  = 2  t+1  α t t = 2 · zdα + 2 · zdα = 2 − z z . (2t +1−(w mod 2t+1))(w mod 2t +1) a ∈ [ , ] a ·z = t+1 0 0 2t+1 . Let 0 4 such that 2

960 STOC’18, June 25–29, 2018, Los Angeles, CA, USA Ittai Abraham, Arnold Filtser, Anupam Gupta, Ofer Neiman

v (such a exists as |x − y| > 2t−1). The claim follows as, (∗) · 2t+1 h t+1 t+1  t+1 i = Eβ ∈[0,4] 2 − (βz mod 2 ) (βz mod 2 ) s t s t

⌊ 4 ⌋−1 (i+1)a Õa 1 ∫  2t+1  ≥ · 2t+1 − (β · mod 2t+1) D0 D1 4 a v u i=0 ia 2t+1 a b ·(β · mod 2t+1)dβ 1 1 a ⌊ 4 ⌋−1 a Õa 1 ∫  2t+1  2t+1 = · 2t+1 − β · · β · dβ 4 a a a2 b2 i=0 0 s 2t+1 t  4  1 a ∫   c = · · · 2t+1 − γ · γdγ 2 d2 a 4 2t+1 0  2 3  4 1 a γ γ t+1 ≥ · · · t+1 − |2 D2 t+1 2 0 2a 4 2 2 3 c1 d1 3 2 1 1 2t+1 2t+1 = · · = . u 2 2t+1 6 12 □ Figure 7: The first 3 diamond graphs. {s, t } is the level 0 di- agonal, {u, v } is the level 1 diagonal, {a1, a2 }, {b1, b2 }, {c1, c2 }, 8 LOWER BOUND: PROOF OF THEOREM 1.3 {d1, d2 } are the level 2 diagonals. E0 = {{s, t }}, E1 = {{s, u }, {t, u }, {t, v }, {s, v }} . We start with the definition of the diamond graphs Dk .

Definition 8.1 (Diamond Graphs). Let D0, D1, D2,... be a se- quence of graphs defined as follows: D is a single edge, and for 0 Define x1 = b − a, y1 = a − d and x2 = b − c, y2 = c − d. We get i ≥ 1, Di is obtained from Di−1 by replacing every edge of Di−1 with a square with two new vertices. See Figure 7 for illustration. p p p−1  p p  ∥b − d∥p + ∥b − 2a + d∥p ≤ 2 ∥a − b∥p + ∥d − a∥p . For each of the new squares created at level i, call the two new p p  p p  ∥b − d∥ + ∥b − 2c + d∥ ≤ 2p−1 ∥b − c∥ + ∥c − d∥ . vertices a diagonal at level-i. p p p p Consider the graph D . For 1 ≤ i ≤ k, denote by Di the set of k By summing up and dividing by 2, all level-i diagonals, and let Ei be the set of pairs of vertices which − were edges in D . It holds that |E | = 4k and D = 4k 1. Moreover, p p i k k ∥b − 2a + d∥ + ∥b − 2c + d∥ D has pathwidth 1, and it can be verified by induction that D p p p 0 k ∥b − d∥p + has pathwidth k + 1. 2 p−2  p p p p  Newman and Rabinovich [√30] proved that every embedding of ≤ 2 ∥a − b∥p + ∥b − c∥p + ∥c − d∥p + ∥d − a∥p . Dk into ℓ2 requires distortion k + 1. Lee and Naor [23] generalized it for 1 < p ≤ 2, by proving that every embedding of Dk into The claim now follows by convexity. □ p ℓp requires distortion 1 + (p − 1)k. We will prove that for p ≥ 2,   1/p k+1 Fix some p ≥ 2, and embedding f : Dk → ℓp . We will assume every embedding of Dk into ℓp requires distortion at least p−2 . 2 w.l.o.g that f is non-contractive, and denote by ρ its expansion, i.e. The following claim will be essential. distortion. Set α = 1 and for i > 0, α = 1 . Note 0 2k(p−2) i 2(k−i+1)(p−2) p−2 Claim 3 (ℓp Quadrilateral Ineqality). For p ≥ 2, and every that for i ≥ 1, αi · 2 = αi+1. Our proof will be based on the four vectors a,b,c,d ∈ ℓp , it holds that following Poincaré-type inequality p p ∥a − c∥p + ∥b − d∥p Claim 4 (Diamond graph ℓp Poincaré-type ineqality). p−2  p p p p  ≤ 2 ∥a − b∥p + ∥b − c∥p + ∥c − d∥p + ∥d − a∥p (6) k Õ Õ p αi · ∥f (x) − f (y)∥p Proof. The proof of the following inequality can be found at i=0 {x,y }∈Di [11, Theorem 11.12], Õ p ≤ α · ∥f (x) − f (y)∥ . (7)   k+1 p p p p−1 p p {x,y }∈E x,y ∈ ℓp , ∥x + y∥p + ∥x − y∥p ≤ 2 ∥x ∥p + ∥y∥p . k ∀

961 Metric Embedding via Shortest Path Decompositions STOC’18, June 25–29, 2018, Los Angeles, CA, USA

′ ′ Proof. For edge {x,y} ∈ Ei−1, denote by {x ,y } ∈ Di the Recall that f is non-contractive and has expansion ρ. Conse- diagonal created by it. We have quently,

Õ p Õ p k ∥f (x) − f (y)∥ + ∥f (x) − f (y)∥ k (8) Õ Õ p p p (k + 1)· 4 = αi · d (x,y) {x,y }∈Di {x,y }∈Ei−1 i=0 {x,y }∈Di Õ  p ′ ′ p  = ∥f (x) − f (y)∥ + f x − f y k p p Õ Õ p {x,y }∈Ei−1 ≤ αi · ∥f (x) − f (y)∥p i=0 {x,y }∈D (6) i p−2 Õ  ′ p ′ p ≤ 2 · f (x) − f x + f x − f (y) (7) Õ p p ≤ α · ∥f (x) − f (y)∥p {x,y }∈Ei−1 k+1 p ! {x,y }∈Ek  ( ) − ′ p ′ − ( ) p Õ p + f y f y p + f y f x p ≤ αk+1 · (ρ · d(x,y)) {x,y }∈E Õ k = p−2 · ∥f (x) − f (y)∥2 . p 2 2 = αk+1 ·(ρ) · |Ek | . {x,y }∈Ei We conclude ≤ ≤ Summing over 1 i k’s, with appropriate scaling, ! 1/p 1/ 4k k + 1 k + 1  p ρ ≥ · = = Ω(k1/p ) . p−2 k |Ek | αk+1 2 Õ Õ p αi · ∥f (x) − f (y)∥p i=1 {x,y }∈Di 9 CONCLUSIONS ! Õ p In this paper we introduced the notion of shortest path decomposi- + ∥f (x) − f (y)∥p tions with low depth. We showed how these can be used to give {x,y }∈E i−1 embeddings into ℓp spaces. Our techniques give optimal embed- k dings of bounded pathwidth graphs into ℓ2, and also new embed- Õ p−2 Õ p ≤ αi · 2 · ∥f (x) − f (y)∥p dings for graphs with bounded treewidth, as well as planar and i=1 {x,y }∈Ei excluded-minor families of graphs. Our embedding into ℓp admits k a tight lower bound for fixed p > 1. We hope that our techniques Õ Õ p = αi+1 · ∥f (x) − f (y)∥p . will be useful for other embedding results. i=1 {x,y }∈Ei Our work raises several open questions. While our embeddings are tight for fixed p > 1, can we improve the bounds for ℓ1 em- Hence bedding of bounded pathwidth graphs? Can we give better results for the SPDdepth of H-minor-free graphs? Our approach gives a k p Õ Õ p O( logn)-distortion embedding of planar graphs into ℓ1, which is αi · ∥ f (x) − f (y)∥p quite different from the previous known results via padded decom- i=1 {x,y }∈Di positions: can these ideas be used to make progress towards the Õ p + α1 · ∥f (x) − f (y)∥p planar graph embedding conjecture? {x,y }∈E0 Õ p ACKNOWLEDGMENTS ≤ αk+1 · ∥f (x) − f (y)∥p . Arnold Filtser is Partially supported by the Lynn and William {x,y }∈Ek Frankel Center for Computer Sciences, ISF grant 1817/17, and by BSF Grant 2015813. Anupam Gupta is supported in part by NSF As E0 = D0 and α0 = α1, the claim follows. □ awards CCF-1536002, CCF-1540541, and CCF-1617790. Ofer Neiman is supported in part by ISF grant 1817/17, and by BSF Grant 2015813. Next, we calculate REFERENCES k [1] Ittai Abraham, Yair Bartal, and Ofer Neiman. 2011. Advances in metric embedding Õ Õ p αi · d (x,y) theory. Advances in Mathematics 228, 6 (2011), 3026 – 3126. https://doi.org/10. i=0 {x,y }∈D 1016/j.aim.2011.08.003 i [2] Ittai Abraham, Shiri Chechik, Robert Krauthgamer, and Udi Wieder. 2015. Ap- p k p proximate Nearest Neighbor Search in Metrics of Planar Graphs. In Approxi-  k  Õ  k−i+1 mation, Randomization, and Combinatorial Optimization. Algorithms and Tech- = α0 · 2 + |Di | · αi · 2 niques, APPROX/RANDOM 2015, August 24-26, 2015, Princeton, NJ, USA. 20–42. i=1 https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2015.20 k [3] Ittai Abraham and Cyril Gavoille. 2006. Object location using path separators. Õ  2 = 22k + 4i−1 · 2k−i+1 = (k + 1)· 4k . (8) In Proceedings of the Twenty-Fifth Annual ACM Symposium on Principles of Dis- tributed Computing, PODC 2006, Denver, CO, USA, July 23-26, 2006. 188–197. i=1 https://doi.org/10.1145/1146381.1146411

962 STOC’18, June 25–29, 2018, Los Angeles, CA, USA Ittai Abraham, Arnold Filtser, Anupam Gupta, Ofer Neiman

[4] Ittai Abraham, Cyril Gavoille, Anupam Gupta, Ofer Neiman, and Kunal Talwar. arXiv:http://www.worldscientific.com/doi/pdf/10.1142/S1793525309000114 2014. Cops, robbers, and threatening skeletons: padded decomposition for minor- [20] Lior Kamma and Robert Krauthgamer. 2016. Metric Decompositions of free graphs. In Symposium on Theory of Computing, STOC 2014, New York, NY, Path-Separable Graphs. Algorithmica (2016), 1–9. https://doi.org/10.1007/ USA, May 31 - June 03, 2014. 79–88. https://doi.org/10.1145/2591796.2591849 s00453-016-0213-0 [5] Thomas Andreae. 1986. On a pursuit game played on graphs for which a minor [21] Maleq Khan, Fabian Kuhn, Dahlia Malkhi, Gopal Pandurangan, and Kunal is excluded. Journal of Combinatorial Theory, Series B 41, 1 (1986), 37 – 47. Talwar. 2012. Efficient distributed approximation algorithms via probabilis- https://doi.org/10.1016/0095-8956(86)90026-2 tic tree embeddings. Distributed Computing 25, 3 (2012), 189–205. https: [6] P. Assouad. 1983. Plongements lipschitziens dans Rn . Bull. Soc. Math. France //doi.org/10.1007/s00446-012-0157-9 111, 4 (1983), 429–448. [22] Robert Krauthgamer, James R. Lee, Manor Mendel, and Assaf Naor. 2005. Mea- [7] Nikhil Bansal, Niv Buchbinder, Aleksander Madry, and Joseph Naor. 2011. A sured descent: a new embedding method for finite metrics. Geometric and Func- Polylogarithmic-Competitive Algorithm for the k-Server Problem. In Foundations tional Analysis 15, 4 (2005), 839–858. https://doi.org/10.1007/s00039-005-0527-6 of Computer Science (FOCS), 2011 IEEE 52nd Annual Symposium on. 267 –276. [23] James R. Lee and Assaf Naor. 2004. Embedding the diamond graph in ℓp and https://doi.org/10.1109/FOCS.2011.63 dimension reduction in ℓ1. Geometric & Functional Analysis GAFA 14, 4 (2004), [8] Yair Bartal, Douglas E. Carroll, Adam Meyerson, and Ofer Neiman. 2013. Band- 745–747. https://doi.org/10.1007/s00039-004-0473-8 width and low dimensional embedding. Theor. Comput. Sci. 500 (2013), 44–56. [24] James R. Lee and Prasad Raghavendra. 2010. Coarse Differentiation and Multi- https://doi.org/10.1016/j.tcs.2013.05.038 flows in Planar Graphs. Discrete & Computational Geometry 43, 2 (2010), 346–362. [9] Hans L. Bodlaender, John R. Gilbert, Hjálmtýr Hafsteinsson, and Ton Kloks. https://doi.org/10.1007/s00454-009-9172-4 1992. Approximating treewidth, pathwidth, and minimum elimination tree height. [25] James R. Lee and Anastasios Sidiropoulos. 2010. Genus and the Geometry of the Springer Berlin Heidelberg, Berlin, Heidelberg, 1–12. https://doi.org/10.1007/ Cut Graph. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on 3-540-55121-2_1 Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010. 193–201. [10] Jean Bourgain. 1985. On Lipschitz embedding of finite metric spaces in Hilbert https://doi.org/10.1137/1.9781611973075.18 space. Israel J. Math. 52, 1-2 (1985), 46–52. [26] James R. Lee and Anastasios Sidiropoulos. 2013. Pathwidth, trees, and ran- [11] N. L. Carothers. 2004. A Short Course on Banach Space Theory. Cambridge dom embeddings. Combinatorica 33, 3 (2013), 349–374. https://doi.org/10.1007/ University Press. https://doi.org/10.1017/CBO9780511614057 s00493-013-2685-8 [12] Amit Chakrabarti, Alexander Jaffe, James R. Lee, and Justin Vincent. 2008. Embed- [27] N. Linial, E. London, and Y. Rabinovich. 1995. The geometry of graphs and some dings of Topological Graphs: Lossy Invariants, Linearization, and 2-Sums. In 49th of its algorithmic applications. Combinatorica 15, 2 (1995), 215–245. Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, October [28] Manor Mendel and Assaf Naor. 2013. Markov convexity and local rigidity of 25-28, 2008, Philadelphia, PA, USA. 761–770. https://doi.org/10.1109/FOCS.2008.79 distorted metrics. Journal of the European Mathematical Society 15, 1 (2013), [13] Chandra Chekuri, Anupam Gupta, Ilan Newman, Yuri Rabinovich, and Alistair 287–337. https://doi.org/10.4171/JEMS/362 Sinclair. 2006. Embedding k-outerplanar graphs into ℓ1. SIAM J. Discrete Math. [29] Gary L. Miller. 1986. Finding Small Simple Cycle Separators for 2-Connected 20, 1 (2006), 119–136. https://doi.org/10.1137/S0895480102417379 Planar Graphs. J. Comput. Syst. Sci. 32, 3 (1986), 265–279. https://doi.org/10.1016/ [14] Anupam Gupta, Robert Krauthgamer, and James R. Lee. 2003. Bounded Geome- 0022-0000(86)90030-9 tries, Fractals, and Low-Distortion Embeddings. In FOCS ’03: Proceedings of the [30] Ilan Newman and Yuri Rabinovich. 2003. A lower bound on the distortion of 44th Annual IEEE Symposium on Foundations of Computer Science. IEEE Computer embedding planar metrics into Euclidean space. Discrete Comput. Geom. 29, 1 Society, Washington, DC, USA, 534. (2003), 77–81. https://doi.org/10.1007/s00454-002-2813-5 [15] Anupam Gupta, Amit Kumar, and Rajeev Rastogi. 2004. Traveling with a Pez [31] Haruko Okamura and P.D. Seymour. 1981. Multicommodity flows in planar dispenser (or, routing issues in MPLS). SIAM J. Comput. 34, 2 (2004), 453–474. graphs. Journal of Combinatorial Theory, Series B 31, 1 (1981), 75 – 81. https: https://doi.org/10.1137/S0097539702409927 //doi.org/10.1016/S0095-8956(81)80012-3 [16] Anupam Gupta, Ilan Newman, Yuri Rabinovich, and Alistair Sinclair. 2004. Cuts, [32] Satish Rao. 1999. Small Distortion and Volume Preserving Embeddings for Planar Trees and ℓ1s-Embeddings of Graphs. Combinatorica 24, 2 (2004), 233–269. and Euclidean Metrics. In Proceedings of the Fifteenth Annual Symposium on https://doi.org/10.1007/s00493-004-0015-x Computational Geometry, Miami Beach, Florida, USA, June 13-16, 1999. 300–306. [17] Anupam Gupta, Kunal Talwar, and David Witmer. 2013. On the Non-Uniform https://doi.org/10.1145/304893.304983 Sparsest Cut Problem on Bounded Treewidth Graphs. In Annual ACM Symposium [33] Yuval Shavitt and Tomer Tankel. 2004. Big-bang simulation for embedding on Theory of Computing. 281–290. https://doi.org/10.1145/2488608.2488644 network distances in Euclidean space. IEEE/ACM Trans. Netw. 12, 6 (2004), 993– [18] Alexander Jaffe, James R. Lee, and Mohammad Moharrami. 2011. On the Opti- 1006. https://doi.org/10.1109/TNET.2004.838597 mality of Gluing over Scales. Discrete & Computational Geometry 46, 2 (01 Sep [34] Mikkel Thorup. 2004. Compact Oracles for Reachability and Approximate 2011), 270–282. https://doi.org/10.1007/s00454-011-9359-3 Distances in Planar Digraphs. J. ACM 51, 6 (Nov. 2004), 993–1024. https: [19] William b. Johnson and Gideon Schechtman. 2009. DIAMOND //doi.org/10.1145/1039488.1039493 GRAPHS AND SUPER-REFLEXIVITY. Journal of Topology and Anal- ysis 01, 02 (2009), 177–189. https://doi.org/10.1142/S1793525309000114

963