Spectral Graph Theory and Graph CNN

https://qdata.github.io/deep2Read

Presenter : Ji Gao

Outline

1. Graph Laplacian: Definitions; Why Laplacian?; Graph Fourier Transform
2. Spectral Neural Network
3. Fast Spectral Filtering
4. Reference


Graph

Graph: A graph is a pair G = (V, E), where V = {1, 2, ..., N} is the set of vertices and E ⊆ V × V is the set of edges.

(Vertex-)Weighted Graph: A weighted graph is a triple G = (V, E, W), where V = {1, 2, ..., N} is the set of vertices, E ⊆ V × V, and W : V → R assigns a weight to each vertex.

Graph

Degree: The degree d(v) of a vertex v is the number of vertices in G that are adjacent to v.

Adjacency Matrix: The adjacency matrix A of the graph G is an n × n matrix with A_ij = 1 if (i, j) ∈ E and A_ij = 0 otherwise.

Graph Laplacian

(Unnormalized) Graph Laplacian: L = diag(d) − A, that is,

L_ij = d_i if i = j, L_ij = −1 if i ≠ j and (i, j) ∈ E, and L_ij = 0 otherwise.
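As a concrete illustration (the 4-vertex path graph is a hypothetical example, not from the slides), here is a minimal NumPy sketch that builds the adjacency matrix and the unnormalized Laplacian L = diag(d) − A:

import numpy as np

# Undirected path graph on 4 vertices: 0-1, 1-2, 2-3 (hypothetical example).
edges = [(0, 1), (1, 2), (2, 3)]
n = 4

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1          # A_ij = 1 iff (i, j) is an edge

d = A.sum(axis=1)                  # degree d(v) = number of neighbours
L = np.diag(d) - A                 # unnormalized graph Laplacian

print(L)
# [[ 1. -1.  0.  0.]
#  [-1.  2. -1.  0.]
#  [ 0. -1.  2. -1.]
#  [ 0.  0. -1.  1.]]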


Why Laplacian?

Laplacian: For a function f, the Laplacian operator is ∆f = ∇ · ∇f.

The Laplacian represents the divergence of the gradient. It is a coordinate-free operator! In physics, if an electric field is described by an electrostatic potential function φ, then ∆φ is proportional to the charge density in the field (Poisson's equation).

Eigenfunctions of the Laplacian Operator

Eigenfunctions of the Laplacian on (0, 1): Suppose f is an eigenfunction of the Laplacian:

∆f + λf = 0, f(0) = f(1) = 0

∆f = ∂²f/∂x² = −λf

The non-trivial solutions are

f_n(x) = C sin(nπx), n ∈ N

The f_n are the basis functions of the Fourier sine series, and together they form an orthonormal basis of the space L²(0, 1).

Theorem: For any space L²(Ω), where Ω is a reasonably smooth domain, there exists an orthonormal family of eigenfunctions of ∆ that forms an orthonormal basis of the space.
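To make this concrete, here is a small numerical sketch I am adding (the finite-difference discretization is my own illustration, not part of the slides): the eigenvectors of the discretized 1-D Laplacian on (0, 1) with Dirichlet boundaries approximate the sine basis, and the eigenvalues approach (nπ)².

import numpy as np

m = 200                                 # interior grid points on (0, 1)
h = 1.0 / (m + 1)
x = np.linspace(h, 1 - h, m)

# Second-difference matrix approximating -d^2/dx^2 with f(0) = f(1) = 0.
D2 = (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2

lam, vecs = np.linalg.eigh(D2)          # eigenvalues sorted ascending

for n_ in range(1, 4):                  # smallest eigenvalues ~ (n*pi)^2
    print(lam[n_ - 1], (n_ * np.pi) ** 2)

v1 = vecs[:, 0] / np.linalg.norm(vecs[:, 0])
s1 = np.sin(np.pi * x)
s1 /= np.linalg.norm(s1)
print(abs(v1 @ s1))                     # close to 1: first eigenvector ~ sin(pi*x)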


Graph Laplacian Revisited

Graph Laplacian: L = D − A

Suppose f is a function from the vertices to R.

f can be represented by a vector (f_1, f_2, ..., f_n) of size n. Therefore,

[Lf]_i = d_i f_i − Σ_j A_ij f_j = Σ_j A_ij (f_i − f_j),

i.e. the Laplacian computes the difference between the value at a vertex and the values at its neighbors.

f^T L f = Σ_{(i,j)∈E} (f_i − f_j)²
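A minimal numerical check of these two identities, reusing the hypothetical path graph from above:

import numpy as np

edges = [(0, 1), (1, 2), (2, 3)]
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

f = np.array([0.5, -1.0, 2.0, 0.0])      # arbitrary signal on the vertices

# [Lf]_i = sum_j A_ij (f_i - f_j)
lhs = L @ f
rhs = np.array([sum(A[i, j] * (f[i] - f[j]) for j in range(n)) for i in range(n)])
print(np.allclose(lhs, rhs))             # True

# f^T L f = sum over edges of (f_i - f_j)^2
print(np.isclose(f @ L @ f, sum((f[i] - f[j]) ** 2 for i, j in edges)))  # True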

Graph Laplacian Revisited

Graph Laplacian: L = D − A, and f^T L f = Σ_{(i,j)∈E} (f_i − f_j)².

Symmetric real matrix → real eigenvalues. Positive semidefinite → non-negative eigenvalues. The first eigenvalue is 0, with eigenvector (1, 1, ..., 1):

0 = λ1 ≤ λ2 ≤ ... ≤ λn
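A minimal sketch verifying these spectral properties on the same hypothetical path graph:

import numpy as np

edges = [(0, 1), (1, 2), (2, 3)]
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

lam, U = np.linalg.eigh(L)               # symmetric matrix -> real spectrum
print(lam)                               # 0 = lambda_1 <= ... <= lambda_n, all >= 0
print(U[:, 0])                           # proportional to (1, 1, ..., 1)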

Graph Fourier Transform

The eigenvectors of the graph Laplacian form an orthonormal basis of the Hilbert space of functions on the vertices; expanding a signal in this basis defines the graph Fourier transform.
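A minimal sketch of the graph Fourier transform, assuming the usual convention x̂ = U^T x with the columns of U the Laplacian eigenvectors (the graph and signal below are hypothetical):

import numpy as np

edges = [(0, 1), (1, 2), (2, 3)]
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A
lam, U = np.linalg.eigh(L)                # columns of U: orthonormal eigenvectors

x = np.array([1.0, 0.0, -1.0, 2.0])       # signal on the vertices
x_hat = U.T @ x                           # graph Fourier transform
print(np.allclose(U @ x_hat, x))          # inverse transform recovers x: True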

Graph Spectral Filtering

Spectral filters can be used to form a convolutional layer on the graph.

Spectral Networks and Deep Locally Connected Networks on Graphs (Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann LeCun)

CNNs are powerful; this paper extends CNNs to general graphs in two ways:

1. Use hierarchical clustering (spatial construction).
2. Use the spectrum of the graph Laplacian to learn convolutional layers (spectral construction).

Efficient: the number of parameters is independent of the input size.

Spatial CNN Using Hierarchical Clustering

Form a multi-scale clustering

The k-th layer has d_k clusters.

The k-th layer has f_k filters.

Convolutional Layer

For j = 1, ..., f_k:

x_{k+1,j} = L_k h( Σ_{i=1}^{f_{k−1}} F_{k,i,j} x_{k,i} ),

where F_{k,i,j} is a d_{k−1} × d_{k−1} sparse matrix and L_k is a pooling operation.

Clusters are pre-defined by hierarchical clustering.
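Below is a minimal sketch of one spatial layer under my own simplifying assumptions (dense random filters instead of learned sparse ones, and mean-pooling over fixed consecutive-vertex clusters); it illustrates the shapes rather than reproducing the paper's implementation:

import numpy as np

rng = np.random.default_rng(0)
d_prev, d_k = 6, 3                     # cluster counts at layers k-1 and k
f_prev, f_k = 2, 4                     # filter counts at layers k-1 and k

x = rng.normal(size=(f_prev, d_prev))  # x_{k,i}: one row per input feature map
F = rng.normal(size=(f_k, f_prev, d_prev, d_prev))  # F_{k,i,j} (dense here, sparse in the paper)

# Pooling L_k: average each pair of consecutive vertices into one cluster.
pool = np.zeros((d_k, d_prev))
for c in range(d_k):
    pool[c, 2 * c:2 * c + 2] = 0.5

h = np.tanh                            # nonlinearity h
x_next = np.stack([pool @ h(sum(F[j, i] @ x[i] for i in range(f_prev)))
                   for j in range(f_k)])
print(x_next.shape)                    # (f_k, d_k) = (4, 3)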

Spectral Construction

Spectral construction: Suppose U is the matrix of eigenvectors of L. Input: x_k, of size n × f_{k−1}. Without spatial subsampling:

x_{k+1,j} = h( U Σ_{i=1}^{f_{k−1}} F_{k,i,j} U^T x_{k,i} )

F_{k,i,j} is a diagonal weight matrix.

Only use top d eigenvectors to reduce cost.
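A minimal sketch of this spectral layer under my own assumptions (a random hypothetical graph, random diagonal weights, no subsampling), keeping only the first d eigenvectors as described:

import numpy as np

rng = np.random.default_rng(0)
n, f_prev, f_k, d = 8, 2, 3, 4         # vertices, in/out feature maps, kept eigenvectors

A = rng.integers(0, 2, size=(n, n))    # random undirected graph (hypothetical)
A = np.triu(A, 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A

lam, U_full = np.linalg.eigh(L)
U = U_full[:, :d]                      # first d (smoothest) eigenvectors, n x d

x = rng.normal(size=(f_prev, n))       # input feature maps x_{k,i}
F = rng.normal(size=(f_k, f_prev, d))  # diagonal spectral weights F_{k,i,j}

h = np.tanh
x_next = np.stack([h(U @ sum(F[j, i] * (U.T @ x[i]) for i in range(f_prev)))
                   for j in range(f_k)])
print(x_next.shape)                    # (f_k, n) = (3, 8)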

Experiment 1: Subsampled MNIST

Subsample MNIST to 400 points. Baseline: nearest neighbor (4.11% error rate).


Experiment 2: Sphere MNIST

Project MNIST onto a sphere, either uniformly or randomly.


Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst)

Improves on the previous spectral CNN. Main contributions:

1. Strictly localized filters
2. Low computational complexity
3. An efficient pooling method
4. Multiple experiments on different data types

Filters

Graph filter:

y = U g(Λ) U^T x,

where the columns of U are the eigenvectors of L and Λ is the diagonal matrix of the eigenvalues of L.

The naive approach is to learn g(Λ) = diag(θ) directly. Limitations: it is not localized, and the number of parameters is O(n).
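A minimal sketch of the naive filter, assuming the convention y = U diag(θ) U^T x with the columns of U the eigenvectors of L (the graph and θ below are random placeholders):

import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
lam, U = np.linalg.eigh(L)

theta = rng.normal(size=n)            # one free parameter per eigenvalue: O(n)
x = rng.normal(size=n)

y = U @ (theta * (U.T @ x))           # filter the signal in the spectral domain
print(y.shape)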

Polynomial Filter

Polynomial filter: g(Λ) = Σ_{k=1}^{K} θ_k Λ^k

Spectral filters represented by K-th order polynomials of the Laplacian are exactly K-localized: they combine each vertex only with vertices at most K steps away. Learning complexity is O(K).

Use Chebyshev polynomials to make evaluation faster: g(Λ) = Σ_{k=1}^{K} θ_k T_k(Λ), where T_k(x) = 2x T_{k−1}(x) − T_{k−2}(x).
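A minimal sketch of the Chebyshev filter under my own assumptions (random graph and coefficients; the Laplacian is rescaled so its spectrum lies in [−1, 1], as in Defferrard et al.), showing that only repeated multiplications by the Laplacian are needed, with no eigendecomposition:

import numpy as np

rng = np.random.default_rng(0)
n, K = 8, 3
A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A

lam_max = np.linalg.eigvalsh(L)[-1]
L_tilde = 2.0 * L / lam_max - np.eye(n)   # rescale the spectrum into [-1, 1]

theta = rng.normal(size=K + 1)            # filter coefficients theta_k
x = rng.normal(size=n)

Tx_prev, Tx = x, L_tilde @ x              # T_0(L~) x = x, T_1(L~) x = L~ x
y = theta[0] * Tx_prev + theta[1] * Tx
for k in range(2, K + 1):
    Tx_prev, Tx = Tx, 2 * L_tilde @ Tx - Tx_prev   # Chebyshev recurrence
    y = y + theta[k] * Tx
print(y.shape)                            # filtered signal, K-hop localized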

Pooling

Experiment 1: MNIST

Experiment 2: 20Newsgroup

Reference

1. Laplacian Operator. Wikipedia.
2. An Introduction to Spectral Graph Theory. Jiang Jiaqi.
3. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst (EPFL, Lausanne, Switzerland).
4. Spectral Networks and Locally Connected Networks on Graphs. Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann LeCun.
5. Graph: Concepts, Tools and Applications. Xiaowen Dong.
