
The Hidden Clique Problem and Graphical Models


Andrea Montanari

Stanford University

July 15, 2013

Andrea Montanari (Stanford University) Lecture 6 July 15, 2013

Outline

1 Finding a clique in a haystack

2 A spectral algorithm

3 Improving over the spectral algorithm

Finding a clique in a haystack

General Problem

G = (V , E) a graph.

S ⊆ V supports a clique (i.e. (i, j) ∈ E for all i, j ∈ S)

Problem: Find S.

Example 1: Zachary’s karate club

A catchier name

Finding a terrorist cell in a

Toy example: 150 nodes, 15 highly connected


Here the data is binary: this can be generalized. . .

Of course not the first 15. . .

Where are the highly connected nodes? ≈ 10²¹ possibilities.

An efficient algorithm


The model [Alon, Krivelevich, Sudakov 1998]

Choose S ⊆ V with |S| = k uniformly at random.

Add an edge (i, j) for each pair s.t. i, j ∈ S.

Add an edge for each other pair (i, j) independently with prob p.

G ∼ G(n, p, k). We will assume p = 1/2.
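The planted model can be sampled directly; a minimal sketch assuming numpy (the helper name `hidden_clique` is illustrative, not from the source):

```python
import numpy as np

def hidden_clique(n, k, p=0.5, seed=None):
    """Sample G ~ G(n, p, k): plant a clique on a uniformly random
    k-subset S, and add every other edge independently with prob p.
    Returns the 0/1 adjacency matrix A and the planted set S."""
    rng = np.random.default_rng(seed)
    S = rng.choice(n, size=k, replace=False)
    # background edges: each pair present independently with probability p
    upper = np.triu(rng.random((n, n)) < p, 1)
    A = (upper | upper.T).astype(int)
    # plant the clique on S
    A[np.ix_(S, S)] = 1
    np.fill_diagonal(A, 0)
    return A, set(int(i) for i in S)
```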

Question

How big does k have to be for us to find the clique?

If you could wait forever

Exhaustive search
Input: Graph G = (V, E), clique size k
Output: Cliques of size k
1: For all S ⊂ V, |S| = k:
2: Check if G_S is a clique;
3: Output all cliques found;

Works if k > k∗, the typical size of the largest clique in G ∼ G(n, 1/2, k) that is not supported on S.

Largest random clique

Largest clique that does not share any vertex with S.

Equivalently G ∼ G(n − k, 1/2, 0) ≈ G(n, 1/2, 0).

Idea: compute expected number of cliques of size k

Largest random clique

Expected number of cliques of size k:

E N(k) = C(n, k) · P[(1, . . . , k) form a clique]
       = C(n, k) · 2^{−k(k−1)/2}
       ≤ (ne/k)^k · 2^{−k(k−1)/2}
       = ((ne/k) · 2^{−(k−1)/2})^k

k∗(n) ≈ 2 log2 n
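The first-moment computation is easy to check numerically; a sketch using Python's standard library (the name `k_star` is illustrative for the largest k with E N(k) ≥ 1):

```python
import math

def expected_cliques(n, k):
    """E[N(k)] = C(n, k) * 2^(-k(k-1)/2): expected number of
    k-cliques in G(n, 1/2)."""
    return math.comb(n, k) * 2.0 ** (-k * (k - 1) / 2)

def k_star(n):
    """Largest k with E[N(k)] >= 1: first-moment heuristic for the
    typical largest clique; of order 2 log2 n."""
    k = 1
    while expected_cliques(n, k + 1) >= 1:
        k += 1
    return k
```

For n = 1000 this threshold lands somewhat below 2 log₂ 1000 ≈ 19.9, reflecting the lower-order terms hidden in the ≈.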

Largest random clique

[Figure: N(k) versus k on a logarithmic scale, for k = 5, . . . , 30]

Can we do it in reasonable time?

Naive Algorithm
Input: Graph G = (V, E), clique size k
Output: Clique of size k
1: Sort vertices by degree;
2: Check if the k vertices with largest degree form a clique;
3: If yes, output them;
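The naive algorithm can be sketched in a few lines, assuming numpy and a 0/1 adjacency matrix (the function name is illustrative):

```python
import numpy as np

def naive_degree(A, k):
    """Return the k highest-degree vertices if they form a clique,
    else None. Expected to succeed when k >> sqrt(n log n)."""
    top = np.argsort(-A.sum(axis=1))[:k]
    sub = A[np.ix_(top, top)]
    # a clique on k vertices has exactly k(k-1) ones off the diagonal
    if sub.sum() == k * (k - 1):
        return set(int(i) for i in top)
    return None
```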

When does naive work?

For i ∉ S:

d_i ∼ Binom(n − 1, 1/2) ≈ Normal(n/2, n/4)

P[ d_i ≥ n/2 + t √n/2 ] ≤ e^{−t²/2} ≤ 1/n²   for t ≥ √(4 log n)

Proposition. With high probability
max_{i∉S} d_i ≤ n/2 + √(n log n),
min_{i∈S} d_i ≥ n/2 + k − 1 − √(n log n).
Works for k ≥ 2 √(n log n).

A spectral algorithm

Idea

W_ij = +1 if (i, j) ∈ E, −1 otherwise.

(u_S)_i = +1 if i ∈ S, 0 otherwise.

Want to find u_S from W.

Idea

W = u_S u_S^T + Z − Z_{S,S}

(Z_ij)_{i<j} i.i.d. with

Z_ij = +1 with probability 1/2, −1 with probability 1/2.

(Z_{S,S})_ij = Z_ij if i, j ∈ S, and = 0 otherwise.

Idea

W = u_S u_S^T + Z − Z_{S,S}

With overwhelming probability

‖u_S u_S^T‖_2 = k,
‖Z‖_2 ≈ 2 √n,
‖Z_{S,S}‖_2 ≈ 2 √k ≪ ‖Z‖_2.
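The middle estimate is easy to check numerically; a sketch assuming numpy (the spectral norm of a symmetric ±1 noise matrix concentrates near 2√n):

```python
import numpy as np

def noise_norm(n, seed=None):
    """Spectral norm of a symmetric matrix with i.i.d. +/-1 entries
    above the diagonal (zero diagonal); should be close to 2*sqrt(n)."""
    rng = np.random.default_rng(seed)
    Z = np.triu(rng.choice([-1.0, 1.0], size=(n, n)), 1)
    Z = Z + Z.T
    return float(np.abs(np.linalg.eigvalsh(Z)).max())
```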

Use matrix perturbation theory

Unperturbed matrix

W_0 = u_S u_S^T,

λ_1(W_0) = k,  λ_2(W_0) = · · · = λ_n(W_0) = 0

The sin-theta theorem

û_S = u_S/√k: principal eigenvector of W_0; v: principal eigenvector of W.

‖v − û_S‖_2 ≤ √2 sin θ(v, û_S) ≤ √2 ‖Z − Z_{S,S}‖_2 / (λ_1(W_0) − λ_2(W_0)) ≤ 3 √n / (k − 3 √n)

Summarizing

Proposition. For k ≥ 100 √n, whp

‖v − û_S‖_2 ≤ 1/10.

Let’s check how it works. . .

[Figure: histogram of the eigenvalues of W (eA$val); one outlier eigenvalue stands out]

n = 2000, k = 100

Let’s check how it works. . .

[Figure: principal eigenvector entries (−eA$vec[, 1]) versus vertex index]

n = 2000, k = 100

Spectral algorithm: First attempt

Naive Spectral Algorithm
Input: Graph G = (V, E), clique size k
Output: Clique of size k
1: Compute the first eigenvector v of the matrix W = W(G);
2: Sort vertices by value of |v_i|;
3: Check if the k vertices with largest value form a clique;
4: If yes, output them;

Where is the problem?

Spectral algorithm

Spectral Algorithm
Input: Graph G = (V, E), clique size k
Output: Clique of size k
1: Compute the first eigenvector v of the matrix W = W(G);
2: Sort vertices by value of |v_i|;
3: Let R ⊆ V be the set of k vertices with largest value;
4: Initialize S ← ∅; for each i ∈ V:
5: If deg_R(i) > 3k/4, let S ← S ∪ {i};
6: Output S;
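Putting the pieces together, a minimal numpy sketch of the Spectral Algorithm with the cleaning step (the function name and sign conventions are assumptions, not from the source):

```python
import numpy as np

def spectral_clique(A, k):
    """Spectral algorithm: take v, the top eigenvector of W
    (+1 on edges, -1 on non-edges, 0 diagonal), let R be the k
    largest |v_i|, then keep every vertex with > 3k/4 neighbours in R."""
    n = A.shape[0]
    W = 2.0 * A - 1.0
    np.fill_diagonal(W, 0.0)
    vals, vecs = np.linalg.eigh(W)   # eigenvalues in ascending order
    v = vecs[:, -1]                  # eigenvector of the largest eigenvalue
    R = np.argsort(-np.abs(v))[:k]
    in_R = np.zeros(n, dtype=bool)
    in_R[R] = True
    # cleaning step: keep i iff deg_R(i) > 3k/4
    return {i for i in range(n) if A[i, in_R].sum() > 3 * k / 4}
```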

Why is this a good trick?

By the perturbation bound, R is roughly right: |R ∩ S| > 0.9 k.

All the vertices in S pass the test.

For i ∉ S, E deg_R(i) = k/2 and deg_R(i) < 3k/4 whp.

Improving over the spectral algorithm

We proved this

Theorem. If k ≥ 100 √n, then the spectral algorithm finds the clique.

Can we make 100 as small as we want?

Not without a new idea. . .

[Figure: histogram of the eigenvalues of W (eA$val); no outlier eigenvalue is visible]
n = 2000, k = 30

Not without a new idea. . .

[Figure: principal eigenvector entries (eA$vec[, 1]) versus vertex index; the clique no longer stands out]

n = 2000, k = 30

Tight analysis

W = u_S u_S^T + Z − Z_{S,S} ≈ u_S u_S^T + Z

Low-rank deformation of a random matrix (e.g. Knowles, Yin 2011).

Proposition. If k > (1 + ε) √n, then ⟨u_S, v⟩ ≥ min(ε, √ε)/2.
Vice versa, if k < (1 − ε) √n, then |⟨u_S, v⟩| ≤ n^{−1/2+δ}.

The test set idea

Enhanced Spectral Algorithm
Input: Graph G = (V, E), clique size k
Output: Clique of size k
1: For each T ⊆ V, |T| = 2, such that T is a clique:
2: Let V_T = {i ∈ V \ T : deg_T(i) = 2} and let G_T be the induced graph;
3: Run Spectral Algorithm(G_T, k − 2);
4: If a clique S of size k − 2 is found, return S ∪ T;
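The test-set wrapper can be sketched as follows, assuming numpy; `spectral` stands for any routine with the interface (A_sub, size) → vertex set or None, e.g. an implementation of the Spectral Algorithm (all names here are illustrative):

```python
import numpy as np
from itertools import combinations

def enhanced_spectral(A, k, spectral, s=2):
    """Test-set idea: for each s-clique T, restrict to the common
    neighbourhood V_T of T and look there for a clique of size k - s.
    `spectral` maps (A_sub, size) to indices into A_sub, or None."""
    n = A.shape[0]
    for T in combinations(range(n), s):
        if sum(A[i, j] for i, j in combinations(T, 2)) < s * (s - 1) // 2:
            continue                       # T is not a clique
        # vertices adjacent to all of T (the zero diagonal excludes T itself)
        mask = np.all(A[list(T)] == 1, axis=0)
        V_T = np.flatnonzero(mask)
        if len(V_T) < k - s:
            continue
        found = spectral(A[np.ix_(V_T, V_T)], k - s)
        if found is not None and len(found) == k - s:
            return set(int(v) for v in V_T[list(found)]) | set(T)
    return None
```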

Why is this a good trick?

|V_T| ≈ n/4. We are now looking for a clique of size k − 2.

If Spectral succeeds for k ≥ c √n, then Enhanced Spectral works for k − 2 ≥ c √(n/4), equivalently for k ≥ c′ √n with

c′ = c/2 + 0.000001

Iterating the same idea

Enhanced Spectral Algorithm(s)
Input: Graph G = (V, E), clique size k
Output: Clique of size k
1: For each T ⊆ V, |T| = s, such that T is a clique:
2: Let V_T = {i ∈ V \ T : deg_T(i) = s} and let G_T be the induced graph;
3: Run Spectral Algorithm(G_T, k − s);
4: If a clique S of size k − s is found, return S ∪ T;

The state of the art for polynomial algorithms

Theorem. Enhanced Spectral(s) finds cliques of size c 2^{−s/2} √n in time O(n^{s+c_0}).

What about really efficient algorithms?

Theorem (Dekel, Gurel-Gurevitch, Peres, 2011). If k ≥ 1.261 √n, then there exists an algorithm with complexity O(n²) that finds the clique with high probability.

Theorem (Deshpande, Montanari, 2013). If k ≥ (1 + ε) √(n/e), then there exists an algorithm with complexity O(n² log n) that finds the clique with high probability.
