Graph Convolutional Networks
Yunsheng Bai

Overview
1. Improve GCN itself
   a. Convolution
      i. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (NIPS 2016)
      ii. Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)
      iii. Dynamic Filters in Graph Convolutional Networks (2017)
   b. Pooling (Unpooling)
      i. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (NIPS 2016)
2. Apply to new/larger datasets/graphs
   a. Use GCN as an auxiliary module
      i. Structured Sequence Modeling with Graph Convolutional Recurrent Networks (2017)
   b. Use GCN only
      i. Node/Link classification/prediction: http://tkipf.github.io/misc/GCNSlides.pdf
         1. Directed graph
            a. Modeling Relational Data with Graph Convolutional Networks (2017)
      ii. Graph classification, e.g. MNIST (with or without pooling)

Roadmap
1. Define Convolution for Graph
2. Architecture of Graph Convolutional Networks
3. Improvements: Generalizable Graph Convolutional Networks with Deconvolutional Layers
   a. Improvement 1: Dynamic Filters -> Generalizable
   b. Improvement 2: Deconvolution

Define Convolution for Graph

Laplace
http://www.norbertwiener.umd.edu/Research/lectures/2014/MBegue_Prelim.pdf

Graph Laplacian
[Figure: an example labeled graph with its degree matrix, adjacency matrix, and Laplacian matrix]
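A minimal sketch of how the degree, adjacency, and Laplacian matrices relate (L = D - A). The 6-node edge list below is an assumption chosen for illustration; it is not recovered from the slide's figure.

```python
import numpy as np

# Assumed example: an undirected graph on nodes 1..6 (hypothetical edge list).
edges = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]
n = 6

# Adjacency matrix: A[i, j] = 1 if nodes i+1 and j+1 share an edge.
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i - 1, j - 1] = 1
    A[j - 1, i - 1] = 1

# Degree matrix: diagonal matrix of row sums of A.
D = np.diag(A.sum(axis=1))

# (Unnormalized) graph Laplacian.
L = D - A
print(L)
```

Every row of L sums to zero, because each diagonal entry (the degree) is cancelled by the -1 entries for that node's neighbors.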
https://en.wikipedia.org/wiki/Laplacian_matrix

Graph Fourier Transform
L: (Normalized) Graph Laplacian
D: Degree Matrix
W: Adjacency Matrix
U: Eigenvectors of L (orthonormal because L is symmetric PSD)
Λ: Eigenvalues of L
$\hat{x} = U^\top x$: Fourier Transform of x
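A small sketch of the graph Fourier transform using the symbols above. The 4-node path graph and the signal x are assumptions for illustration; the normalized Laplacian is taken as I - D^{-1/2} W D^{-1/2}, as in the cited NIPS 2016 paper.

```python
import numpy as np

# Assumed example: path graph on 4 nodes (0 - 1 - 2 - 3).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Normalized graph Laplacian: L = I - D^{-1/2} W D^{-1/2}.
d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
L = np.eye(4) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

# eigh is meant for symmetric matrices; it returns real eigenvalues in
# ascending order and orthonormal eigenvectors as the columns of U.
lam, U = np.linalg.eigh(L)

x = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical signal, one value per node
x_hat = U.T @ x                      # Fourier transform of x
x_back = U @ x_hat                   # inverse Fourier transform
```

Since U is orthonormal, `x_back` recovers `x` up to floating-point error.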
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (NIPS 2016)

1-D Convolution
Ex. x = [0 1 2 3 4 5 6 7 8]
f = [4 -1 0]
y = [0*4 1*4+0*(-1) 2*4+1*(-1)+0*0 3*4+2*(-1)+1*0 ...]
= [0 4 7 10 ...]
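The example above can be checked with NumPy; this is just a sketch that reuses the same x and f.

```python
import numpy as np

x = np.arange(9)            # [0 1 2 3 4 5 6 7 8]
f = np.array([4, -1, 0])

# Full discrete convolution: y[n] = sum_k f[k] * x[n - k]
y = np.convolve(x, f)

# First four entries are 0, 4, 7, 10, matching the hand computation.
print(y[:4])
```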
I made this based on my EECS 351 lecture notes.

Convolution <--> Multiplication in Fourier Domain
View X and F as vectors
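A quick numerical check of this equivalence, as a sketch: both signals are zero-padded to the full output length so that the circular convolution computed by the FFT matches the ordinary one. The vectors reuse the 1-D example values.

```python
import numpy as np

x = np.arange(9, dtype=float)
f = np.array([4.0, -1.0, 0.0])

# Pad to the length of the full convolution.
n = len(x) + len(f) - 1

X = np.fft.fft(x, n)   # Fourier transform of x (zero-padded to length n)
F = np.fft.fft(f, n)   # Fourier transform of f (zero-padded to length n)

# Pointwise multiplication in the Fourier domain, then transform back.
y_fourier = np.fft.ifft(X * F).real

# Direct convolution for comparison.
y_direct = np.convolve(x, f)
```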
I made this based on my EECS 351 lecture notes.

Spectral Filtering
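Analogous to the convolution on the previous slide.
Filter a signal: $y = g_\theta(L)\,x = U\,g_\theta(\Lambda)\,U^\top x$
"As we cannot express a meaningful translation operator in the vertex domain, the convolution operator on graph G is defined in the Fourier domain"
$$
\begin{bmatrix} e_1 & e_2 & e_3 \end{bmatrix}
\begin{bmatrix} \theta_1 & 0 & 0 \\ 0 & \theta_2 & 0 \\ 0 & 0 & \theta_3 \end{bmatrix}
\begin{bmatrix} e_1^\top \\ e_2^\top \\ e_3^\top \end{bmatrix}
\begin{bmatrix} x(1) \\ x(2) \\ x(3) \end{bmatrix}
=
\begin{bmatrix} y(1) \\ y(2) \\ y(3) \end{bmatrix}
$$
$U = [e_1\ e_2\ e_3]$: Inverse Fourier Transform; $\mathrm{diag}(\theta)$: Non-parametric Filter; $U^\top x$: Fourier Transform of x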
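A minimal sketch of filtering a signal in the graph Fourier domain, y = U diag(θ) Uᵀ x. The 3-node path graph, the signal, and the filter coefficients are assumptions for illustration, and `spectral_filter` is a hypothetical helper name. The unnormalized Laplacian is used here to keep the numbers simple.

```python
import numpy as np

def spectral_filter(U, theta, x):
    """Fourier transform x, scale each frequency by theta, transform back."""
    x_hat = U.T @ x            # Fourier transform of x
    y_hat = theta * x_hat      # non-parametric filter: one coefficient per eigenvalue
    return U @ y_hat           # inverse Fourier transform

# Assumed example: path graph on 3 nodes.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)     # eigenvalues ascending, eigenvectors in columns

x = np.array([1.0, 2.0, 3.0])
theta = np.array([1.0, 0.5, 0.0])   # hypothetical filter coefficients
y = spectral_filter(U, theta, x)
```

With θ = (1, 1, 1) the filter is the identity. With θ = (1, 0, 0) only the zero-frequency component survives, so on this connected graph every node ends up with the mean of x.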
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (NIPS 2016)

Spectral Filtering
$$
\begin{bmatrix} e_1 & e_2 & e_3 \end{bmatrix}
\begin{bmatrix} \theta_1 & 0 & 0 \\ 0 & \theta_2 & 0 \\ 0 & 0 & \theta_3 \end{bmatrix}
\begin{bmatrix} e_1^\top \\ e_2^\top \\ e_3^\top \end{bmatrix}
\begin{bmatrix} x(1) \\ x(2) \\ x(3) \end{bmatrix}
$$
$U = [e_1\ e_2\ e_3]$: Inverse Fourier Transform; $\mathrm{diag}(\theta)$: Non-parametric Filter; $U^\top x$: Fourier Transform of x
Fourier Basis: the eigenvectors $e_1, e_2, e_3$ of L. Because U is orthonormal, $U^\top = U^{-1}$.
The result of the convolution is the original signal:
(1) first Fourier transformed,
(2) then multiplied by a filter,
(3) finally inverse Fourier transformed.

Spectral Filtering
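Convolution: $x *_{\mathcal{G}} g = U\big((U^\top x) \odot (U^\top g)\big)$
Filter a signal: $y = g_\theta(L)\,x = U\,g_\theta(\Lambda)\,U^\top x$, with $g_\theta(\Lambda) = \mathrm{diag}(\theta)$
$$
\begin{bmatrix} e_1 & e_2 & e_3 \end{bmatrix}
\begin{bmatrix} \theta_1 & 0 & 0 \\ 0 & \theta_2 & 0 \\ 0 & 0 & \theta_3 \end{bmatrix}
\begin{bmatrix} e_1^\top \\ e_2^\top \\ e_3^\top \end{bmatrix}
\begin{bmatrix} x(1) \\ x(2) \\ x(3) \end{bmatrix}
=
\begin{bmatrix} y(1) \\ y(2) \\ y(3) \end{bmatrix}
$$
$U = [e_1\ e_2\ e_3]$: Inverse Fourier Transform; $\mathrm{diag}(\theta)$: Non-parametric Filter; $U^\top x$: Fourier Transform of x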
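Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (NIPS 2016)

Better Filters
Convolution: $x *_{\mathcal{G}} g = U\big((U^\top x) \odot (U^\top g)\big)$
Filter a signal: $y = g_\theta(L)\,x = U\,g_\theta(\Lambda)\,U^\top x$, with $g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^k$
$$
\begin{bmatrix} e_1 & e_2 & e_3 \end{bmatrix}
\begin{bmatrix} g_\theta(\lambda_1) & 0 & 0 \\ 0 & g_\theta(\lambda_2) & 0 \\ 0 & 0 & g_\theta(\lambda_3) \end{bmatrix}
\begin{bmatrix} e_1^\top \\ e_2^\top \\ e_3^\top \end{bmatrix}
\begin{bmatrix} x(1) \\ x(2) \\ x(3) \end{bmatrix}
=
\begin{bmatrix} y(1) \\ y(2) \\ y(3) \end{bmatrix}
$$
$U = [e_1\ e_2\ e_3]$: Inverse Fourier Transform; $g_\theta(\Lambda)$: Localized & Polynomial Filter; $U^\top x$: Fourier Transform of x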
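A sketch of why a polynomial filter is convenient: applying a polynomial of the eigenvalues in the Fourier domain gives the same result as applying the same polynomial of L directly to the signal, so no eigenvectors are needed at filtering time. The 5-node path graph, the signal, and the coefficients θ (K = 3) are assumptions for illustration.

```python
import numpy as np

# Assumed example: path graph on 5 nodes.
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)

x = np.array([1.0, -2.0, 0.5, 3.0, 4.0])   # hypothetical signal
theta = np.array([0.5, -0.2, 0.1])         # hypothetical coefficients, K = 3

# Spectral route: y = U g(Lambda) U^T x with g(lambda) = sum_k theta_k * lambda^k
g = sum(t * lam**k for k, t in enumerate(theta))
y_spectral = U @ (g * (U.T @ x))

# Vertex route: y = sum_k theta_k * L^k x (no eigendecomposition required)
y_vertex = sum(t * np.linalg.matrix_power(L, k) @ x for k, t in enumerate(theta))
```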
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (NIPS 2016)

Better Filters: Localized
[Figure: labeled graph with its degree matrix, adjacency matrix, and Laplacian matrix, followed by $L^2$]
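A sketch of the localization property this slide points at: entry (i, j) of L^K is zero whenever nodes i and j are more than K hops apart, so L only mixes 1-step neighbors and L^2 only mixes nodes within 2 steps. The 6-node path graph is an assumption, chosen because its hop distance is simply |i - j|.

```python
import numpy as np

# Assumed example: path graph on 6 nodes.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
L2 = L @ L

# On a path graph the hop distance between nodes i and j is |i - j|.
idx = np.arange(n)
hops = np.abs(np.subtract.outer(idx, idx))

print(np.all(L[hops > 1] == 0))    # L vanishes beyond 1 hop
print(np.all(L2[hops > 2] == 0))   # L^2 vanishes beyond 2 hops
print(L2[0, 2])                    # a 2-hop entry of L^2 is nonzero
```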
Wavelets on Graphs via Spectral Graph Theory (2011)

Better Filters: Localized
Filter: $g_\theta(L) = \sum_{k=0}^{K-1} \theta_k L^k$
Filter a signal: $y = g_\theta(L)\,x$
K=3, with $x = (x(1), \dots, x(6))^\top$:
$$y = \theta_0\,x + \theta_1\,L\,x + \theta_2\,L^2\,x$$
The $L\,x$ term reaches 1-step neighbors; the $L^2 x$ term reaches 2-step neighbors.
Fixed θ for every neighbor :( (Dynamic Filters in Graph Convolutional Networks (2017))

Better Filters, but $O(n^2)$
Convolution:
Filter a signal:
Filter:
[Figure: eigenvector matrix $[e_1\ e_2\ e_3]$ applied to a signal $(s(1), s(2), s(3))$]
Computing eigenvectors: $O(n^3)$ :(
I am actually confused. They used Chebyshev polynomials to approximate the filter, but at the end of the day, the filtered signal is the same as on the previous slide: $O(n^2)$ :(
In fact, the authors of Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017) set K=1, so no Chebyshev at all.

Approximations
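If K=1, filtering becomes: $g_{\theta'} \star x \approx \theta'_0\,x + \theta'_1 (L - I_N)\,x = \theta'_0\,x - \theta'_1 D^{-1/2} A D^{-1/2} x$ (A is the adjacency matrix, written W earlier)
If set (further approximate): $\theta = \theta'_0 = -\theta'_1$
Then, filtering becomes: $g_\theta \star x \approx \theta\,(I_N + D^{-1/2} A D^{-1/2})\,x$
If input is a matrix: $X \in \mathbb{R}^{N \times C}$
Then, filtering becomes: $Z = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X \Theta$, where $\tilde{A} = A + I_N$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$
Filter parameters: $\Theta \in \mathbb{R}^{C \times F}$
Convolved signal matrix: $Z \in \mathbb{R}^{N \times F}$
Filtering complexity: $O(|E| F C)$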
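A minimal dense NumPy sketch of the matrix form of the filter from the cited ICLR 2017 paper, Z = D̃^{-1/2} Ã D̃^{-1/2} X Θ with Ã = A + I. The function name `gcn_filter`, the graph, and the random X and Θ are assumptions for illustration. This dense version costs O(N²) per product; the O(|E|FC) figure relies on storing Ã as a sparse matrix.

```python
import numpy as np

def gcn_filter(A, X, Theta):
    """Apply one first-order graph convolution: Z = D~^{-1/2} A~ D~^{-1/2} X Theta."""
    A_tilde = A + np.eye(A.shape[0])              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    # Symmetric normalization, done with broadcasting instead of diagonal matrices.
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return A_hat @ X @ Theta

# Assumed example: N = 6 nodes on a path, C = 4 input features, F = 2 filters.
N, C, F = 6, 4, 2
A = np.zeros((N, N))
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

rng = np.random.default_rng(0)
X = rng.normal(size=(N, C))        # one row of features per node
Theta = rng.normal(size=(C, F))    # filter parameters

Z = gcn_filter(A, X, Theta)        # convolved signal matrix, shape (N, F)
```

On a graph with no edges, Ã is the identity and the filter reduces to the plain linear map XΘ.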
Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)

Illustration of $D^{-1/2} A D^{-1/2} X \Theta$
[Figure: L * X * [Filter 1, Filter 2] = Z. Rows of X are the word embeddings of Node 1 through Node 6; rows of Z hold each node's output features, e.g. Feature 2 of Node 1, Feature 1 of Node 6, Feature 2 of Node 6.]
Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)

Architecture of Graph Convolutional Networks

Schematic Depiction
Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)

Improvements: Generalizable Graph Convolutional Networks with Deconvolutional Layers

Improvement 1: Dynamic Filters -> Generalizable

Roadmap
1. Define Convolution for Graph
2. Architecture of Graph Convolutional Networks
3. Improvements: Generalizable Graph Convolutional Networks with Deconvolutional Layers
   a. Improvement 1: Dynamic Filters -> Generalizable
      i. Basics
      ii. Ideas
      iii. Ordering
      iv. Example
   b. Improvement 2: Deconvolution

Baseline Filter: Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)
[Figure: a graph with nodes 1 to 6 and its filter matrix. All share the same parameter.]