5. § Goal: Learn representations (features) for a set of graph
elements (nodes, edges, etc.)
§ Key intuition: Map the graph elements (e.g., nodes) to a
d-dimensional space while preserving node similarity
§ Use the features for any downstream prediction task
6. Recent work: Map nodes based on their proximity in the
input graph – (nearby nodes are close together)
DeepWalk Model
Perozzi et al., KDD 2014
7. Recent work: Map nodes based on their proximity in the
input graph – (nearby nodes are close together)
How to get nearby nodes?
Perozzi et al., KDD 2014
Grover et al., KDD 2016
8. Recent work: Map nodes based on their proximity in the
input graph – (nearby nodes are close together)
§ A (conditional) walk/path is a finite sequence of adjacent
vertices in the graph
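A minimal sketch of how such walks could be sampled, assuming an unweighted graph stored as an adjacency-list dictionary and uniform choice of the next vertex. The toy graph, the `random_walk` helper, and the walk length are illustrative assumptions, not taken from the slides.

```python
import random

def random_walk(adj, start, length, rng):
    """Return a walk of up to `length` vertices beginning at `start`.

    Each step moves to a uniformly chosen neighbor, so consecutive
    vertices in the result are always adjacent in the graph.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

# Hypothetical toy graph (undirected, adjacency lists).
adj = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4],
    4: [3],
}

rng = random.Random(0)  # fixed seed for repeatability
# Two walks of five vertices from every start vertex.
walks = [random_walk(adj, v, 5, rng) for v in adj for _ in range(2)]
for w in walks:
    print(w)
```

Vertices that co-occur in the same walk are the "nearby nodes" that proximity-based methods try to place close together.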
How to get nearby nodes?
Perozzi et al., KDD 2014
Grover et al., KDD 2016
10. Mikolov et al., ICLR 2013
Perozzi et al., KDD 2014
[Figure label: focus vertex]
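The link between the two cited works is that a walk can be treated like a sentence: each vertex in turn becomes the focus vertex, and the vertices within a fixed window around it become its context. The sketch below only generates the (focus, context) training pairs; the `skipgram_pairs` helper, the window size, and the example walk are assumptions for illustration, and the embedding training step itself is omitted.

```python
def skipgram_pairs(walk, window):
    """List (focus, context) pairs for every position in a walk."""
    pairs = []
    for i, focus in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a vertex is not its own context
                pairs.append((focus, walk[j]))
    return pairs

# Hypothetical walk over four vertices.
walk = ["a", "b", "c", "d"]
pairs = skipgram_pairs(walk, window=1)
print(pairs)
# [('a', 'b'), ('b', 'a'), ('b', 'c'), ('c', 'b'), ('c', 'd'), ('d', 'c')]
```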
11.
12. § No support for inductive/transfer learning
• features are learned for node identities
• features do not generalize beyond the input graph
§ Map nodes based on their proximity only
§ No notion of attributes
§ No notion of structural similarity
13. Communities: cohesive subsets of nodes
Roles: represent structural patterns
- two nodes belong to the same role if they have similar structural patterns
[Figure labels: Ci, Cj, Ck]
Rossi & Ahmed, TKDE 2015
Ahmed et al., AAAI 2017
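To make the community/role distinction concrete, the sketch below groups nodes by a simple structural signature (degree and triangle count). This signature is an assumption chosen for brevity; the cited role-discovery work uses richer features. The toy graph is hypothetical: two small clusters joined through a bridge node `m`.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical graph: hubs h1 and h2 sit in different clusters.
edges = [
    ("h1", "a"), ("h1", "b"), ("h1", "c"), ("a", "b"),
    ("h2", "x"), ("h2", "y"), ("h2", "z"), ("x", "y"),
    ("h1", "m"), ("m", "h2"),
]

adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def signature(node):
    """Return (degree, number of triangles through the node)."""
    neighbors = sorted(adj[node])
    triangles = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return (len(neighbors), triangles)

# Nodes with identical signatures are assigned the same role.
roles = defaultdict(list)
for node in sorted(adj):
    roles[signature(node)].append(node)

for sig, members in sorted(roles.items()):
    print(sig, members)
```

Here `h1` and `h2` share a role even though they are three hops apart and belong to different cohesive groups, which proximity alone would not capture.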
14. Goal: Find a mapping of nodes to a d-dimensional space that preserves
proximity and node similarity
Using structure + attributes (if any)
21. § Predict which pairs of nodes are likely to connect
§ Applications: social network analysis, biological networks,
terrorist networks, etc.
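One common way to use learned node features for this task is to score every unconnected pair by the similarity of its two embeddings and rank the pairs. The sketch below uses a dot product as the score; the 2-dimensional embeddings and the known edge are made-up values for illustration only, and the slides do not state which scoring function was used.

```python
from itertools import combinations

# Hypothetical embeddings (not results from any model).
embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.1, 0.9],
}
known_edges = {("a", "b")}  # pairs already connected

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

# Candidate links are all node pairs that are not already edges.
candidates = [p for p in combinations(sorted(embeddings), 2)
              if p not in known_edges]

# Highest-scoring pairs are predicted as most likely to connect.
ranked = sorted(candidates,
                key=lambda p: dot(embeddings[p[0]], embeddings[p[1]]),
                reverse=True)
print(ranked[0])  # ('c', 'd')
```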
22. DeepWalk (DW): Perozzi et al., KDD 2014
node2vec (N2V): Grover et al., KDD 2016
LINE: Tang et al., WWW 2015
23. Strong scaling results
[Figure: two panels plotting speedup (y-axis, 0 to 16) against number of processing units (x-axis: 1, 2, 4, 8, 12, 16); legend: socfb-MIT, bio-dmela, soc-gowalla, tech-RL-caida, web-wikipedia09]
Using Intel Xeon E5-2687W server, 16 cores
Motif Counting
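For reference, strong scaling keeps the problem size fixed and reports speedup as the single-unit runtime divided by the runtime on p units. The sketch below shows that calculation; the timings are hypothetical placeholders, not measurements from the slides.

```python
def speedup(t1, tp):
    """Speedup on p units: single-unit time divided by p-unit time."""
    return t1 / tp

def efficiency(t1, tp, p):
    """Parallel efficiency: speedup divided by the number of units."""
    return speedup(t1, tp) / p

# Hypothetical runtimes in seconds, keyed by number of processing units.
timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0, 16: 8.0}

for p, tp in timings.items():
    s = speedup(timings[1], tp)
    e = efficiency(timings[1], tp, p)
    print(f"p={p:2d}  speedup={s:5.2f}  efficiency={e:4.2f}")
```

A speedup curve that stays close to the diagonal (speedup equal to p) indicates near-linear scaling.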
24. § We propose a generic framework for learning representations
in large attributed graphs
§ Maps nodes based on structural similarity + proximity +
attributes (if any)
§ Learns universal features that can generalize across
networks/graphs
§ Useful for inductive/transfer learning
§ Scalable for large graphs
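One way to see why such features could transfer across graphs: if each vertex in a walk is replaced by a type computed from its structure or attributes, the resulting sequences no longer depend on node identities. The sketch below is an assumption-laden illustration, not the authors' implementation. The mapping function `degree_type`, the helper `typed_walk`, and both toy graphs are hypothetical, and degree is only one possible choice of mapping function.

```python
import random

def degree_type(adj, v):
    """Hypothetical mapping function: a vertex's type is its degree."""
    return f"deg{len(adj[v])}"

def typed_walk(adj, start, length, phi, rng):
    """Sample a uniform random walk, then map each vertex to its type."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return [phi(adj, v) for v in walk]

# Two star graphs with unrelated node identifiers.
graph_a = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
graph_b = {"p": ["q"], "q": ["p", "r", "s"], "r": ["q"], "s": ["q"]}

rng = random.Random(1)
walk_a = typed_walk(graph_a, 0, 4, degree_type, rng)
walk_b = typed_walk(graph_b, "q", 4, degree_type, rng)
print(walk_a)  # ['deg3', 'deg1', 'deg3', 'deg1']
print(walk_b)  # ['deg3', 'deg1', 'deg3', 'deg1']
```

Both graphs yield sequences over the same type vocabulary, so features learned for types on one graph can be looked up on the other.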
25. § Generalizing other deep graph models
§ Theoretical analysis
§ Choice of mapping functions
§ Impact of sampling strategy
§ Evaluation on other ML tasks
26. § Efficient estimation of word representations in vector space. ICLR 2013 [Mikolov et al.]
§ A Framework for Generalizing Graph-based Representation Learning Methods. arXiv:1709.04596 2017 [Ahmed et al.]
§ Role Discovery in Networks. TKDE 2015 [Rossi, Ahmed]
§ A Higher-order Latent Space Network Model. AAAI 2017 [Ahmed, Rossi, Willke, Zhou]
§ node2vec: Scalable Feature Learning for Networks. KDD 2016 [Grover, Leskovec]
§ DeepWalk: Online Learning of Social Representations. KDD 2014 [Perozzi, Al-Rfou, Skiena]
§ Efficient Graphlet Counting for Large Networks. ICDM 2015 [Ahmed et al.]
§ Graphlet Decomposition: Framework, Algorithms, and Applications. J. Know. & Info. 2016 [Ahmed et al.]
§ Network Motifs: Simple Building Blocks of Complex Networks. Science 2002 [Milo et al.]
§ Uncovering Biological Network Function via Graphlet Degree Signatures. Cancer Informatics 2008 [Milenković, Pržulj]
§ Graph Kernels. JMLR 2010 [Vishwanathan et al.]
§ The Structure and Function of Complex Networks. SIAM Review 2003 [Newman]
§ Biological network comparison using graphlet degree distribution. Bioinformatics 2007 [Pržulj]
§ Efficient Graphlet Kernels for Large Graph Comparison. AISTATS 2009 [Shervashidze et al.]
§ Local structure in social networks. Sociological Methodology 1976 [Holland, Leinhardt]
§ The strength of weak ties: A network theory revisited. Sociological Theory 1983 [Granovetter]