Testable Bounded Graph Properties Are Random Order Streamable

Morteza Monemizadeh Rutgers , Amazon

Joint work with: S. Muthukrishnan, Pan Peng, Christian Sohler Sparse Graphs

Given a graph G(V,E) where n=|V| and m = |E| = O(n). Examples d-Bounded Degree Graphs

Given a graph G(V,E) whose maximum degree d is constant, where n=|V| and 0 ≤ m = |E| ≤ nd. (Ɛ,k,d)-Hyperfinite G(V,E)

G is (Ɛ,k,d)-hyperfinite graph if we remove a set of at most Ɛdn edges of G s.t. the remaining graph has connected components of size at most k.

☛ Is a way to quantify the density of a graph G(V,E). ☛ c = maxU {|E(U)|/(|U|-1)} where U is a subset of V. ☛ G can be partitioned into at most c forests. ☛ Planar graphs have arboricity c = 3. Maximum Matching

Given a graph G(V,E), find a set of pairwise non-adjacent of maximum size, i.e., no two edges share a common edge. Example Example Maximum Matching

☛ 30-years-old algorithm due to Micali and Vazirani with running time m√n. ☛ Greedy algorithm returns maximal matching (2- approximation of maximum matching). Maximum Matching

☛ 30-years-old algorithm due to Micali and Vazirani with running time m√n. ☛ Greedy algorithm returns maximal matching (2- approximation of maximum matching). Big Data Models for Graphs

☞ Data Streams: Graph Streams

☞ Property Testing: Testing Graph Properties.

☞ Sublinear Time Approximation Algorithms Streaming Model

2 3 1 7 4 5 8 6

Stream S = Streaming Model

2 3 1 7 4 5 8 6

Stream S = (2,4) , (3,6) , (2,7) , (4,6), ... Graph Streams

☞ Adverserial or Random Order Model

☞ O(c)-approximate the size of matching in c- bounded arboricity graphs using O(clog2 n) space in adversarial model.

☞ O(polylog n)-approximate the size of matching in general graphs using O(polylog n) space in random order model. Graph Streams

☞ Adverserial or Random Order Model

In general, it is not clear which graph problems can be solved with much smaller space in the random order stream than in the adversary order stream. Graph Streams

☞ Semi-Streaming Model: O(n polylog(n)) space

☞ Sparse Graphs: m=O(n)

☞ O(polylog(n)) or even better O(1) space Constant Query Property Testing d-Bounded Graph

Given a graph G(V,E) whose maximum degree d is constant, where n=|V| and 0 ≤ m = |E| ≤ nd. Adjacency List Model

Query access to the adjacency list of G:

For any vertex v and index i one can query the i-th neighbor (if exists) of v in constant time. Property Testing

A property ∏n for d-bounded n-vertex graphs is testable with query complexity q, if for every �, d and n, there exists an algorithm that performs q(n,d, �) queries to the adjacency list of the graph and with probability 2/3

❑ Accepts any n-vertex d-bounded graph G satisfying ∏n,

❑ Rejects any n-vertex d-bounded graph G that is �-far from satisfying ∏n,

❑ If q(d, �) is independent of n, we call ∏n constant query testable. Property Testing

Theorem: Any d-bounded graph property that is constant-query testable in the adjacency list model can be tested in random order streaming model with constant space. Examples

Adversary Order Model:

Testing k-edge connectivity, k-vertex connectivity and cycle-freeness of d-bounded degree graphs needs �(n1-O(ε)) space.

Dynamic graph stream algorithms in o(n) space. Huang and Peng, ICALP 2016. Examples

Random Order Model: k-edge connectivity, k-vertex connectivity and cycle- freeness of d-bounded degree graphs are testable in constant space in the random order stream model, since they are constant-query testable in the adjacency list model.

Property testing in bounded degree graphs. Oded Goldreich and Dana Ron, Algorithmica 2002 Property Testing

Proof (sketch): Every constant query property tester

❑ Samples a constant number of vertices

❑ Explores the k-discs of these vertices?

❑ Makes deterministic decisions based on the explored graph. k-disc The local neighborhood of depth k of a vertex is the subgraph induced by all vertices of at most k.

v k-disc The local neighborhood of depth k of a vertex is the subgraph induced by all vertices of distance at most k.

… … v k-disc The local neighborhood of depth k of a vertex is the subgraph induced by all vertices of distance at most k.

… … v

A k-disc has at most dk+1 vertices and dk+2 edges. Constant-Time Approximation Algorithms Adjacency List Model

Query access to the adjacency list of G:

For any vertex v index i one can query the the i-th neighbor (if exists) of v in constant time. (x,y)-Approximation

We call a value t an (x,y)-approximation for the problem P, if for any instance I, we have

OPT(I) ≤ t ≤ x⋅OPT(I)+y

For a minimization optimization problem P and an instance I, we let OPT(I) denote the value of some optimal solution of I. O(1)-time Approximation Algorithm

Theorem: There exists an algorithm that uses constant space in the random order model, and with probability 2/3, (1,ɛn)-approximates the size of a maximal matching.

Based on Locality Lemma due to Nguyen and Onak, FOCS’08 O(1)-time Approximation Algorithm

Similar result holds for minimum vertex cover, maximum matching, the number of connected components. O(1)-time Approximation Algorithm

Theorem: There exists an algorithm that uses constant space in the random order model, and with probability 2/3, (1±ɛ)-approximates the size of a maximal matching. O(1)-time Approximation Algorithm

Greedy Matching

Stream : e1 e2 e3 e4 … ei …

Current Matching M O(1)-time Approximation Algorithm

Greedy Matching

Stream : e1 e2 e3 e4 … ei …

Is ei in M? Current Matching M O(1)-time Approximation Algorithm

.3 .4 … .7 .4 … v .5 .6 .6 .2 k-Disc Primitive in Data Streams k-disc Primitive

Given a random order stream S of edges of an underlying d-bounded degree graph G(V,E).

Sample the full k-disc of a vertex v (almost) uniformly at random. 2-Pass Streaming Algorithm 2-Pass Algorithm

First Pass:

Sample a set S of (dk+2)! vertices and collect their observed k-discs in S.

In expectation, there exists at least one vertex in S whose full k-disc is observed. 2-Pass Algorithm

Second Pass:

Find the degree of vertices in (partially explored) k-discs of the vertices in S.

Report the k-disc of a vertex in S that is fully explored. 1-Pass Streaming Algorithm Partial Order

Hd,k={Δ1,…,Δx} : The set of all k-disc isomorphism types.

Δi≽Δj : Δj is root-preserving isomorphic to some subgraph of Δi.

Δi Δj Ordering

Order all the k-disc types Δ1,…, Δx such that if Δi≽Δj, then i ≤ j.

… … …

Δi Δj Ordering

Order all the k-disc types Δ1,…, Δx such that if Δi≽Δj, then i ≤ j.

… … …

Δi Δj

G(j): All the indices i, except j itself, such that Δi≽Δj. Frequency Vector F(G,d)

Vi: The set of vertices with k-disc isomorphic to Δi ,

Vi={v ∈ V: disck,G(v) ≅ Δi}

… … …

Δi Δj 10010 10 2324 8273 9744

|Vi| fi=|Vi|/n Marginal Probability

Let S be a random order Stream.

Let v be a vertex with k-disc isomorphic to Δi.

Marginal Probability: The probability λ(Δj|Δi) that the observed k-disc of v in S is disck(v,S) ≅ Δj for any j such that Δi≽Δj.

e’ e ≽ e e’’

Δi Δj Stream S: …, e’,…, e’’, …, e,… 1-Pass Algorithm

Preprocssing: Sample a set T of O(2(dk+2)! /Ɛ2) vertices. 1-Pass Algorithm

Preprocssing: Sample a set T of O(2(dk+2)! /Ɛ2) vertices.

Streaming: For each vertex v ∈ T:

☞ Collect the observed k-disc disck(v,S) from the stream S.

☞ Let Hv be disck(v,S). 1-Pass Algorithm

Postprocessing:

Let H= ∪v ∈ T Hv . 1-Pass Algorithm

Postprocessing:

Let H= ∪v ∈ T Hv . For i =1 to x where x=|F(G,d)| 1-Pass Algorithm

Postprocessing:

Let H= ∪v ∈ T Hv . For i =1 to x where x=|F(G,d)|

☞ Yi=|{v ∈ T: disck,H(v) ≅ Δi}|/|T| 1-Pass Algorithm

Postprocessing:

Let H= ∪v ∈ T Hv . For i =1 to x where x=|F(G,d)|

☞ Yi=|{v ∈ T: disck,H(v) ≅ Δi}|/|T| ☞ -1 Xi=(Yi-∑j∈G(i) Xj· λ(Δi|Δj))· λ (Δi|Δi)

G(i): All the indices j, except i itself, such that Δj≽Δi. 1-Pass Algorithm

Postprocessing:

Let H= ∪v ∈ T Hv . For i =1 to x where x=|F(G,d)|

☞ Yi=|{v ∈ T: disck,H(v) ≅ Δi}|/|T| ☞ -1 Xi=(Yi-∑j∈G(i) Xj· λ(Δi|Δj))· λ (Δi|Δi) Return X1,…, Xx.

G(i): All the indices j, except i itself, such that Δj≽Δi. Open Problems

☞ In general, it is not clear which graph problems can be solved with much smaller space in the random order stream than in the adversary order stream.

☞ What can we say about testing graph properties of unbounded planar (or minor-free) graphs in data streams? Thank You (Almost) Isomorphic Graphs

Benjamini, Shapira, and Schramm, STOC’08 Newman and Sohler, STOC’11

G1 and G2 : (Ɛ,k,d)-hyperfinite graphs.

If |F(G1,d)-F(G2,d)|1 ≤ Ɛdn, then G1 and G2 are Ɛ-close.

G1 and G2 are Ɛ-close: If we insert/delete at most Ɛdn edges from G1, then G1 and G2 becomes isomorphic.