The Hidden Clique Problem and Graphical Models

Andrea Montanari, Stanford University
Lecture 6, July 15, 2013

Outline

1. Finding a clique in a haystack
2. A spectral algorithm
3. Improving over the spectral algorithm

Finding a clique in a haystack

General Problem

G = (V, E) a graph. S ⊆ V supports a clique (i.e. (i, j) ∈ E for all i, j ∈ S).
Problem: Find S.

Example 1: Zachary’s karate club

A catchier name

Finding a terrorist cell in a social network

Toy example: 150 nodes, 15 highly connected

[Figure: plot of the toy example; both axes run from 0.0 to 1.0.]
The data here are binary; the setting can be generalized.

Of course not the first 15.

[Figure: plot of the same example; both axes run from 0.0 to 1.0.]
Where are the highly connected nodes? About $10^{21}$ possibilities.

An efficient algorithm

[Figure: plot for the same example; both axes run from 0.0 to 1.0.]

The model [Alon, Krivelevich, Sudakov 1998]

Choose S ⊆ V with |S| = k uniformly at random.
Add an edge (i, j) for each pair such that i, j ∈ S.
Add an edge for each other pair (i, j) independently with probability p.
G ∼ G(n, p, k). We will assume p = 1/2.

Question

How big does k have to be for us to find the clique?

If you could wait forever

Exhaustive search
Input: Graph G = (V, E), clique size k
Output: Clique of size k
  For all S ⊂ V with |S| = k:
    Check if G_S is a clique;
  Output all cliques found;

Works if k > k∗, where k∗ is the typical size of the largest clique in G ∼ G(n, 1/2, k) that is not supported on S.
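The model and the exhaustive search above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names (`sample_hidden_clique`, `is_clique`, `exhaustive_search`) and the adjacency-set representation are assumptions made here, not part of the lecture, and the search is only feasible for very small n.

```python
import itertools
import random


def sample_hidden_clique(n, k, p=0.5, seed=None):
    """Sample G ~ G(n, p, k). Returns (adjacency sets, planted set S).

    Hypothetical helper: vertices are 0..n-1 and adj[i] is the set of
    neighbours of vertex i.
    """
    rng = random.Random(seed)
    planted = set(rng.sample(range(n), k))  # S chosen uniformly at random
    adj = [set() for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        # Pairs inside S are always connected; every other pair w.p. p.
        if (i in planted and j in planted) or rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj, planted


def is_clique(adj, nodes):
    """True if every pair of vertices in `nodes` is joined by an edge."""
    return all(j in adj[i] for i, j in itertools.combinations(nodes, 2))


def exhaustive_search(adj, k):
    """Check all subsets of size k and return those that are cliques."""
    n = len(adj)
    return [set(c) for c in itertools.combinations(range(n), k)
            if is_clique(adj, c)]


if __name__ == "__main__":
    adj, planted = sample_hidden_clique(n=14, k=7, seed=0)
    found = exhaustive_search(adj, 7)
    print("planted set recovered:", planted in found)
    print("number of cliques of size 7:", len(found))
```

The loop visits all subsets of size k, so the running time grows like the binomial coefficient of n and k; that is the "wait forever" part.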
Largest random clique

Largest clique that does not share any vertex with S.
Equivalently G ∼ G(n − k, 1/2, 0) ≈ G(n, 1/2, 0).
Idea: compute the expected number of cliques of size k.

Expected number of cliques of size k:

\[
\mathbb{E}\, N(k) = \binom{n}{k}\, \mathbb{P}\{(1,\dots,k) \text{ form a clique}\}
= \binom{n}{k}\, 2^{-k(k-1)/2}
\le \Big(\frac{ne}{k}\Big)^{k} 2^{-k(k-1)/2}
\le \Big[\frac{ne}{k}\, 2^{-(k-1)/2}\Big]^{k}
\]

\[
k^{*}(n) \approx 2 \log_2 n
\]

[Figure: N(k) versus k for k from 5 to 30; logarithmic vertical axis from 0.01 to 1e+12.]
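The first-moment computation above is easy to evaluate numerically. The sketch below works in base-2 logarithms to avoid overflow; the function names and the choice n = 1000 in the usage example are illustrative assumptions, not values taken from the lecture.

```python
import math


def log2_expected_cliques(n, k):
    """log2 of E N(k) = C(n, k) * 2^{-k(k-1)/2} for G(n, 1/2)."""
    count = math.comb(n, k)
    if count == 0:  # k > n: there are no subsets of size k
        return float("-inf")
    return math.log2(count) - k * (k - 1) / 2


def first_moment_threshold(n):
    """Smallest k for which the expected number of k-cliques is below 1."""
    k = 1
    while log2_expected_cliques(n, k) >= 0:
        k += 1
    return k


if __name__ == "__main__":
    n = 1000  # hypothetical graph size, chosen only for illustration
    for k in range(5, 31, 5):
        print(k, log2_expected_cliques(n, k))
    print("first k with E N(k) < 1:", first_moment_threshold(n))
    print("2 log2 n =", 2 * math.log2(n))
```

Comparing the two printed values gives a feel for how the first-moment threshold relates to the asymptotic estimate 2 log2 n at finite n.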
Can we do it in reasonable time?

Naive Algorithm
Input: Graph G = (V, E), clique size k
Output: Clique of size k
  Sort vertices by degree;
  Check if the k vertices with largest degree form a clique;
  If yes, output them;

When does naive work?

For i ∉ S, d_i ∼ Binom(n − 1, 1/2) ≈ Normal(n/2, n/4), so

\[
\mathbb{P}\Big\{ d_i \ge \frac{n}{2} + \frac{n^{1/2}}{2}\, t \Big\} \le e^{-t^{2}/2} \le \frac{1}{n^{2}}
\quad \text{for } t \ge \sqrt{4 \log n} .
\]

Proposition
With high probability

\[
\max_{i \notin S} d_i \le \frac{n}{2} + \sqrt{n \log n} ,
\qquad
\min_{i \in S} d_i \ge \frac{n}{2} + k - 1 - \sqrt{n \log n} .
\]

Works for $k \ge 2\sqrt{n \log n}$.

A spectral algorithm

Idea

\[
W_{ij} =
\begin{cases}
+1 & \text{if } (i, j) \in E, \\
-1 & \text{otherwise.}
\end{cases}
\]
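The naive degree-based test and the ±1 matrix W can both be sketched with the standard library. All function names here are illustrative assumptions. The sampler fixes p = 1/2 as in the lecture, and setting the diagonal of W to 0 is a choice made for this sketch, since the definition above does not say how the diagonal is treated.

```python
import itertools
import random


def sample_hidden_clique(n, k, seed=None):
    """Sample G ~ G(n, 1/2, k) as adjacency sets (hypothetical helper)."""
    rng = random.Random(seed)
    planted = set(rng.sample(range(n), k))
    adj = [set() for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        if (i in planted and j in planted) or rng.random() < 0.5:
            adj[i].add(j)
            adj[j].add(i)
    return adj, planted


def naive_degree_clique(adj, k):
    """Return the k highest-degree vertices if they form a clique, else None."""
    by_degree = sorted(range(len(adj)), key=lambda v: len(adj[v]), reverse=True)
    top = by_degree[:k]
    if all(j in adj[i] for i, j in itertools.combinations(top, 2)):
        return set(top)
    return None


def signed_matrix(adj):
    """W[i][j] = +1 if (i, j) is an edge, -1 otherwise.

    Assumption of this sketch: the diagonal is set to 0.
    """
    n = len(adj)
    return [[0 if i == j else (1 if j in adj[i] else -1) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Illustrative sizes only; k is taken well above sqrt(n log n).
    adj, planted = sample_hidden_clique(n=400, k=250, seed=0)
    guess = naive_degree_clique(adj, 250)
    print("naive algorithm recovered S:", guess == planted)
    W = signed_matrix(adj)
    print("W is symmetric:", all(W[i][j] == W[j][i]
                                 for i in range(400) for j in range(400)))
```

The degree test costs only a sort and one clique check, which is why it is attractive despite requiring k to be of order sqrt(n log n).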
