[ICML2019] MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing

MixHop: Higher-Order Graph Convolutional
Architectures via Sparsified Neighborhood Mixing
Reader:
LeapMind, DL Engineer
Jira JINDALERTUDOMDEE
LeapMind ICML2019 Reading Session

Contents
1. Paper Info
2. Background
3. Limitation of current method
4. Proposed method
5. Experiment
6. Conclusion

LeapMind Inc. © 2019
Paper Info
● “MixHop: Higher-Order Graph Convolutional Architectures via
Sparsified Neighborhood Mixing”
○ ICML 2019
○ Link to paper here
○ Author: Sami Abu-El-Haija, et al.
○ Figures and tables from the paper have been used in these slides to explain the
result of the experiment and show the examples of network architecture.
● Why did I choose this paper?
○ Graphs can represent many kinds of data, e.g. chemical compounds and networks
○ Graph convolution is a way to apply neural networks to graphs, which lets them address a wider range of problems

Background: Graph Neural Network
Input: A graph, a set of given classes, features of all nodes
Output: A graph with labeled nodes

Background: Notation
● Notation used in a graph convolutional network

Background: Graph convolutional layer
● Convolutional layer for graphs proposed by Kipf and Welling [1]
● Each layer considers only the features of a node's immediate neighbors
[1] Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
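The layer in [1] follows the propagation rule H^{(i+1)} = σ(Â H^{(i)} W^{(i)}), with Â = D^{-1/2}(A + I)D^{-1/2}. A minimal, dependency-free sketch of that rule is shown below; the helper names (`matmul`, `normalize_adjacency`, `gcn_layer`) and the toy path graph are illustrative assumptions, not code from the paper or these slides.

```python
import math

def matmul(x, y):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def normalize_adjacency(adj):
    """Return A_hat = D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(adj)
    with_loops = [[adj[r][c] + (1.0 if r == c else 0.0) for c in range(n)]
                  for r in range(n)]
    degree = [sum(row) for row in with_loops]
    return [[with_loops[r][c] / math.sqrt(degree[r] * degree[c]) for c in range(n)]
            for r in range(n)]

def gcn_layer(a_hat, h, w):
    """One graph convolutional layer: ReLU(A_hat H W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

# Toy example (assumption): path graph 0 - 1 - 2, one-hot features, identity weights.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
out = gcn_layer(normalize_adjacency(adj), identity, identity)
print(out[0])  # column 2 of node 0's row is zero: node 2 is two hops away
```

With one-hot features, row r of the output shows which nodes contribute to node r. Node 0 receives nothing from node 2 in a single layer, which is the behaviour the next slide describes as a limitation.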

Limitation of normal graph convolutional network
● Does not consider a node's neighbors at distance greater than one within a single layer
● Cannot represent Gabor-like filters (e.g. edge detection in images)
● Proposal: modify the convolutional layer so that a single layer receives information from farther nodes

Proposed method: Power of adjacency matrix
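● The superscript on \hat{A} is a matrix power j, while H^{(i)} is the input to layer i. The product is evaluated from right to left:
$$
\hat{A}^{j} H^{(i)} =
\begin{cases}
\hat{A}(\hat{A}(\cdots(\hat{A} H^{(i)})\cdots)) \quad (j \text{ multiplications}) & \text{if } j \geq 1 \\
H^{(i)} \quad (\hat{A}^{0} \text{ is the identity matrix}) & \text{if } j = 0
\end{cases}
$$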
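The right-to-left evaluation above can be sketched as a loop of matrix products, so that the power of the adjacency matrix itself is never formed. The function name `apply_adjacency_power` and the toy path graph (with an unnormalized adjacency matrix standing in for Â) are assumptions made for illustration.

```python
def matmul(x, y):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def apply_adjacency_power(a_hat, h, j):
    """Compute (A_hat^j) H with j successive products; j = 0 returns H unchanged."""
    out = h
    for _ in range(j):
        out = matmul(a_hat, out)
    return out

# Toy example (assumption): path graph 0 - 1 - 2, a single feature set on node 0.
a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
h = [[1], [0], [0]]
zero_hop = apply_adjacency_power(a, h, 0)
one_hop = apply_adjacency_power(a, h, 1)
two_hop = apply_adjacency_power(a, h, 2)
```

In this toy graph the feature of node 0 reaches node 1 after one product and node 2 after two, which is how a higher power exposes information from farther nodes.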

Proposed method: MixHop graph convolutional layer
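In the paper, the MixHop layer concatenates the outputs of several adjacency powers column-wise: H^{(i+1)} = ||_{j∈P} σ(Â^j H^{(i)} W_j^{(i)}), where P is the set of powers. The sketch below is one hedged way to write this without external libraries; the function name `mixhop_layer`, the dictionary layout of the weights, and the toy graph (unnormalized adjacency standing in for Â) are assumptions for illustration.

```python
def matmul(x, y):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def mixhop_layer(a_hat, h, weights):
    """Concatenate ReLU(A_hat^j H W_j) over the powers j given as keys of `weights`.

    weights maps each power j to its own weight matrix W_j; the W_j may have
    different numbers of columns.
    """
    blocks = []
    propagated = h       # holds A_hat^p H for the current power p
    current_power = 0
    for j in sorted(weights):
        while current_power < j:
            propagated = matmul(a_hat, propagated)
            current_power += 1
        z = matmul(propagated, weights[j])
        blocks.append([[max(0.0, v) for v in row] for row in z])
    # Column-wise concatenation of the per-power outputs.
    return [sum((block[r] for block in blocks), []) for r in range(len(h))]

# Toy example (assumption): path graph 0 - 1 - 2, one-hot features, powers {0, 1, 2}.
a = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mixed = mixhop_layer(a, eye, {0: eye, 1: eye, 2: eye})
```

Each row of `mixed` holds three blocks side by side: the node's own features, its one-hop aggregate, and its two-hop aggregate, so later layers can weigh and contrast them.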

Proposed method: Output layer

Proposed method: MixHop graph convolutional network
● Stack graph convolutional layers and end with an output layer

Proposed method: Learning network architecture
● Train the model with group regularization over each column of the weight matrices
total loss = cross-entropy loss(input labels, Y_O) + group regularization term
● Remove all columns whose norm is below a threshold
* the authors choose the threshold so that the number of remaining columns is 60
● Restart training, using standard L2 regularization instead of group regularization
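The regularize-then-prune steps above can be sketched as follows. The exact regularization formula is not reproduced in the slide text, so this sketch assumes a group-lasso form: a coefficient `lam` times the sum of the L2 norms of all weight columns. The names `column_norms`, `group_lasso_penalty`, `threshold_for_budget`, and `prune_columns` are hypothetical helpers.

```python
import math

def column_norms(w):
    """L2 norm of every column of a dense matrix given as a list of rows."""
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*w)]

def group_lasso_penalty(weight_matrices, lam):
    """Assumed regularizer: lam times the sum of all column norms."""
    return lam * sum(sum(column_norms(w)) for w in weight_matrices)

def threshold_for_budget(weight_matrices, budget):
    """Pick the threshold so that `budget` columns survive (ties may keep more)."""
    norms = sorted((n for w in weight_matrices for n in column_norms(w)), reverse=True)
    return norms[budget - 1]

def prune_columns(w, threshold):
    """Keep only the columns whose norm is at least the threshold."""
    keep = [k for k, norm in enumerate(column_norms(w)) if norm >= threshold]
    return [[row[k] for k in keep] for row in w]

# Toy example (assumption): one 2 x 3 weight matrix with a strong, a zero and a weak column.
w = [[3.0, 0.0, 0.1],
     [4.0, 0.0, 0.0]]
penalty = group_lasso_penalty([w], lam=0.5)
threshold = threshold_for_budget([w], budget=1)
pruned = prune_columns(w, threshold)
```

Because the penalty acts on whole columns, training tends to push entire columns toward zero, and the surviving column counts per adjacency power define the architecture that is then retrained with L2 regularization.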

Experiment: Model setting
● Number of convolutional layers: 2
● Optimizer: gradient descent
● Stopping: after 2000 steps, or when validation accuracy has not improved for 40 steps
● Learning rate: initially 0.05, decayed by 0.0005 every 40 steps
● Weight decay rate: 0.0005
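The stopping rule in the settings above can be sketched as a small helper. It reads "doesn't improve for 40 steps" as 40 consecutive steps without a new best validation accuracy, which is an assumption; the function name `steps_to_run` is hypothetical.

```python
def steps_to_run(val_accuracies, max_steps=2000, patience=40):
    """Return the step at which training stops, given per-step validation accuracies."""
    best = float("-inf")
    since_best = 0
    for step, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            best, since_best = acc, 0   # new best: reset the patience counter
        else:
            since_best += 1
        if step >= max_steps or since_best >= patience:
            return step
    return len(val_accuracies)

# Toy curves (assumption): one that plateaus immediately, one that keeps improving.
plateau_stop = steps_to_run([0.5] + [0.4] * 100)
budget_stop = steps_to_run([i / 3000 for i in range(3000)])
```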

Experiment 1: Synthetic dataset
● Networks with different homophily are generated based on [2]
homophily = probability that nodes with the same label attach to each other
[2] Karimi, Fariba, et al. "Visibility of minorities in social networks." arXiv preprint arXiv:1702.00150 (2017).
[Figure annotation: adjacency matrix with power > 1]
Image by Sami Abu-El-Haija, et al. from the paper “MixHop: Higher-Order Graph Convolutional Architectures via
Sparsified Neighborhood Mixing”, published as a conference paper in ICML 2019
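In this experiment homophily is a parameter of the graph generator in [2]. As a related illustration only, the sketch below measures homophily on an existing graph as the fraction of edges whose endpoints share a label; this measured quantity and the name `edge_homophily` are assumptions, not the definition used by the generator.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)

# Toy example (assumption): path graph 0 - 1 - 2 - 3 with labels [0, 0, 1, 1].
ratio = edge_homophily([(0, 1), (1, 2), (2, 3)], [0, 0, 1, 1])
```

Two of the three edges join same-label nodes here, so the ratio is 2/3; a value near 0 would indicate that neighbors mostly carry different labels.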

Experiment 2: Real world datasets
● 3 datasets, Citeseer, Cora, and Pubmed, are used in the evaluation
Accuracy Table
* default architecture: all weights have the same dimension
Table by Sami Abu-El-Haija, et al. from the paper “MixHop: Higher-Order Graph Convolutional Architectures via
Sparsified Neighborhood Mixing”, published as a conference paper in ICML 2019

Experiment 2: Real world datasets
Example networks
Image by Sami Abu-El-Haija, et al. from the paper “MixHop: Higher-Order Graph Convolutional Architectures via
Sparsified Neighborhood Mixing”, published as a conference paper in ICML 2019

Conclusion
● A MixHop graph convolutional layer is proposed
○ It uses multiple powers of the adjacency matrix so that a node receives information from farther nodes
● The lower the homophily of the graph, the greater the impact of the MixHop graph convolutional layer
● Learning the network architecture is important because it
○ finds a suitable architecture for the given dataset
○ prunes the weights
