Structure of the presentation
What
High-level overview of the emerging field of geometric
deep learning (and graph deep learning)
How
Presentation focused on startup-style organizations
where everyone does a bit of everything and everyone
needs to understand a bit of everything. The CEO cannot
be the ‘idea guy’ who knows nothing about graphs
and geometric deep learning if you are operating in
this space
EFFECTUATION – THE BEST THEORY OF ENTREPRENEURSHIP YOU ACTUALLY FOLLOW,
WHETHER YOU’VE HEARD OF IT OR NOT
by Ricardo dos Santos
Brief Review of
Geometric Deep Learning
Geometric Deep Learning #1
Bronstein et al. (July 2017): “Geometric deep learning (
http://geometricdeeplearning.com/) is an umbrella term for emerging
techniques attempting to generalize (structured) deep neural models to non-
Euclidean domains, such as graphs and manifolds. The purpose of this article
is to overview different examples of geometric deep-learning problems and
present available solutions, key difficulties, applications, and future research
directions in this nascent field”
SCNN (2013)
GCNN/ChebNet (2016)
GCN (2016)
GNN (2009)
Geodesic CNN (2015)
Anisotropic CNN (2016)
MoNet (2016)
Localized SCNN (2015)
Geometric Deep Learning #2
Bronstein et al. (July 2017): “The non-Euclidean nature of data
implies that there are no such familiar properties as global
parameterization, common system of coordinates, vector space
structure, or shift-invariance. Consequently, basic operations like
convolution that are taken for granted in the Euclidean case are even
not well defined on non-Euclidean domains.”
“First attempts to generalize neural networks to graphs we are aware of
are due to Gori et al. (2005), who proposed a scheme combining
recurrent neural networks and random walk models. This approach
went almost unnoticed, re-emerging in a modern form in
Sukhbaatar et al. (2016) and Li et al. (2015) due to the renewed recent
interest in deep learning.”
“In a parallel effort in the computer vision and graphics community,
Masci et al. (2015) showed the first CNN model on meshed surfaces,
resorting to a spatial definition of the convolution operation based on
local intrinsic patches. Among other applications, such models were
shown to achieve state-of-the-art performance in finding
correspondence between deformable 3D shapes. Follow-up works
proposed different constructions of intrinsic patches on point clouds
(Boscaini et al. 2016a,b) and general graphs (Monti et al. 2016).”
In calculus, the notion of derivative describes
how the value of a function changes with an
infinitesimal change of its argument. One of the
big differences distinguishing classical calculus
from differential geometry is a lack of vector
space structure on the manifold, prohibiting us
from naïvely using expressions like f(x+dx). The
conceptual leap that is required to generalize
such notions to manifolds is the need to work
locally in the tangent space.
Physically, a tangent vector field can be
thought of as a flow of material on a manifold.
The divergence measures the net flow of a field
at a point, allowing to distinguish between field
‘sources’ and ‘sinks’. Finally, the Laplacian (or
Laplace-Beltrami operator in differential
geometric jargon) can be defined as the divergence of the
gradient, Δf = −div(∇f); intuitively, it measures how much the
value of f at a point differs from its average on an
infinitesimal neighborhood.
“A centerpiece of classical Euclidean signal processing is the property of the Fourier
transform diagonalizing the convolution operator, colloquially referred to as the
Convolution Theorem. This property allows to express the convolution f⋆g of two
functions in the spectral domain as the element-wise product of their Fourier transforms.
Unfortunately, in the non-Euclidean case we cannot even define the operation x-x’ on the
manifold or graph, so the notion of convolution does not directly extend to this case.”
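To make this concrete, here is a minimal NumPy sketch (illustrative, not from the paper) of the spectral workaround: the eigenvectors of the graph Laplacian take the role of the Fourier basis, so graph convolution can be defined through the Convolution Theorem rather than through shifts x − x′.

```python
import numpy as np

# Toy undirected graph given by a symmetric weight matrix W.
W = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 1.],
              [0., 1., 0., 1.],
              [0., 1., 1., 0.]])
L = np.diag(W.sum(axis=1)) - W     # combinatorial graph Laplacian

# Graph Fourier basis: Laplacian eigenvectors; eigenvalues act as frequencies.
lam, U = np.linalg.eigh(L)

def gft(f):                        # Graph Fourier Transform
    return U.T @ f

def igft(F):                       # inverse GFT
    return U @ F

# "Convolution" of two graph signals, defined via the Convolution Theorem:
# f * g := IGFT( GFT(f) elementwise-times GFT(g) )
f = np.array([1., 0., 0., 0.])     # a delta signal on vertex 0
g = igft(np.exp(-lam))             # heat-kernel low-pass filter as a graph signal
f_conv_g = igft(gft(f) * gft(g))
```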
Geometric Deep Learning #3
Bronstein et al. (July 2017): “We expect the following years to bring exciting new approaches
and results, and conclude our review with a few observations of current key difficulties and
potential directions of future research.”
Generalization: Generalizing
deep learning models to
geometric data requires not only
finding non-Euclidean
counterparts of basic building
blocks (such as convolutional
and pooling layers), but also
generalization across different
domains. Generalization
capability is a key requirement in
many applications, including
computer graphics, where a
model is learned on a training set
of non-Euclidean domains (3D
shapes) and then applied to
previously unseen ones.
Time-varying domains: An
interesting extension of geometric
deep learning problems discussed
in this review is coping with signals
defined over a dynamically
changing structure. In this case, we
cannot assume a fixed domain and
must track how these changes
affect signals. This could prove
useful to tackle applications such
as abnormal activity detection in
social or financial networks. In the
domain of computer graphics and
vision, potential applications deal
with dynamic shapes (e.g. 3D video
captured by a range sensor).
Computation: The final consideration is
a computational one. All existing deep
learning software frameworks are
primarily optimized for Euclidean data.
One of the main reasons for the
computational efficiency of deep
learning architectures (and one of the
factors that contributed to their
renaissance) is the assumption of
regularly structured data on 1D or 2D
grid, allowing to take advantage of
modern GPU hardware. Geometric data,
on the other hand, in most cases do not
have a grid structure, requiring different
ways to achieve efficient computations.
It seems that computational paradigms
developed for large-scale graph
processing are more adequate
frameworks for such applications.
Primer on
GRAPHs
Taylor and Wrana (2012)
doi: 10.1002/pmic.201100594
Graph theory is especially useful for network analysis
https://doi.org/10.1126/science.286.5439.509
Cited by 29,071 articles
https://doi.org/10.1038/30918
Cited by 33,772
Random rewiring procedure for interpolating between a
regular ring lattice and a random network, without
altering the number of vertices or edges in the graph.
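The rewiring experiment is a one-liner in NetworkX; a small sketch with illustrative parameters (the small-world regime is the small-p range where clustering stays high while path lengths collapse):

```python
import networkx as nx

# Rewiring probability p interpolates between a regular ring lattice (p=0)
# and a random network (p=1) while keeping node and edge counts fixed.
for p in [0.0, 0.01, 0.1, 1.0]:
    G = nx.connected_watts_strogatz_graph(n=1000, k=10, p=p, seed=42)
    print(f"p={p}: clustering={nx.average_clustering(G):.3f}, "
          f"avg path length={nx.average_shortest_path_length(G):.2f}")
```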
http://www.bbc.co.uk/newsbeat/article/35500398/how-facebook-updated-six-degrees-of-separation-its-now-357
https://research.fb.com/three-and-a-half-degrees-of-separation/
http://slideplayer.com/slide/9267536/
Graph theory: Common metrics and definitions
Graph-theoretic node
importance mining on
network topology
- Xue et al. (2017)
The graph-theoretic node importance mining methods based on
network topologies comprise two main categories: node
relevance and shortest path. The method of node relevance is
measured by degree analysis. The methods of shortest path that
aim at finding optimal spreading paths are measured by several
node importance analyses, e.g., betweenness, closeness
centrality, eigenvector centrality, Bonacich centrality and alter-based centrality.
Betweenness is used particularly for measurements of power
while closeness centrality and eigenvector centrality are
used particularly for measurements of centrality. Bonacich
centrality is an extension of eigenvector centrality which
measures node importance on both centrality and power. The
other mining methods for node importance based on network
topologies included in this review are via processes such as
node deleting, node contraction, and data mining and machine
learning embedded techniques. For heterogeneous network
structures, fusion methods integrate all the previously
mentioned measurements.
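Most of the measures listed above are single calls in NetworkX. A sketch on Zachary's karate club network (used again later in these slides); note that Bonacich-style centrality is approximated here with Katz centrality, which is my substitution:

```python
import networkx as nx

G = nx.karate_club_graph()

degree      = nx.degree_centrality(G)        # node relevance
betweenness = nx.betweenness_centrality(G)   # shortest-path based, measures "power"
closeness   = nx.closeness_centrality(G)
eigenvector = nx.eigenvector_centrality(G)
katz        = nx.katz_centrality(G, alpha=0.05)  # Katz-Bonacich style centrality

top5 = sorted(G, key=betweenness.get, reverse=True)[:5]
print("Top brokers by betweenness:", top5)
```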
28 February, 2013
Google’s Knowledge Graph: one step
closer to the semantic web?
By Andrew Isidoro
Knowledge Graph, a database of over 570m of the most
searched-for people, places and things (entities), including around
18bn cross-references.
The knowledge graph as the default data
model for learning on heterogeneous
knowledge
Wilcke, Xander; Bloem, Peter; de Boer, Victor
Data Science, vol. Preprint, no. Preprint, pp. 1-19, 2017
http://doi.org/10.3233/DS-170007
The FuhSen Architecture. High-level architecture
comprising (a) Mediator and wrappers architecture to
build the (b) knowledge graph on demand. The answer of a
keyword query corresponds to an RDF subject-molecule
that integrates RDF molecules collected from the
wrappers. (c) The components to enrich the results KG.
FuhSen: A Federated Hybrid
Search Engine for building a
knowledge graph on-demand
July 2016
https://doi.org/10.1007/978-3-319-48472-3_47
+ https://doi.org/10.1109/ICSC.2017.85
Ranking in time-varying complex networks
Ranking in evolving complex networks
Hao Liao, Manuel Sebastian Mariani, Matúš Medo, Yi-Cheng Zhang, Ming-Yang Zhou
Physics Reports Volume 689, 19 May 2017, Pages 1-54
https://doi.org/10.1016/j.physrep.2017.05.001
Top: The often-studied Zachary’s karate club network has 34 nodes
and 78 links (here visualized with the Gephi software). Bottom:
Ranking of the nodes in the Zachary karate club network by the
centrality metrics described in this section. Node labels on the
horizontal axis correspond to the node labels in the top panel.
For the APS citation data from the period 1893–2015
(560,000 papers in total), we compute the ranking of
papers according to various metrics— citation count c,
PageRank centrality p (with the teleportation
parameter α = 0.5), and rescaled PageRank R(p). The
figure shows the median ranking position of the top 1%
of papers from each year. The three curves show three
distinct patterns. For c, the median rank is stable until
approximately 1995; then it starts to grow because the
best young papers have not yet reached sufficiently
high citation counts. For p, the median rank grows
during the whole displayed time period because
PageRank applied on an acyclic time-ordered citation
network favors old papers. By contrast, the curve is
approximately flat for R(p) during the whole period
which confirms that the metric is not biased by paper
age and gives equal chances to all papers.
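For intuition, a toy version of the citation-ranking comparison above (hedged: rescaled PageRank is only described in a comment; NetworkX's alpha is the damping factor, which plays the role of the teleportation setting in the paper):

```python
import networkx as nx

# Toy acyclic citation network: an edge u -> v means "paper u cites paper v".
C = nx.DiGraph([("p2", "p1"), ("p3", "p1"), ("p3", "p2"),
                ("p4", "p2"), ("p4", "p3"), ("p5", "p4")])

c = dict(C.in_degree())            # citation count per paper
p = nx.pagerank(C, alpha=0.5)      # PageRank with strong teleportation

# Rescaled PageRank R(p) (not computed here) would z-score each paper's p
# against papers of similar age, removing the bias toward old papers.
print(sorted(p, key=p.get, reverse=True))
```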
An illustration of the difference between the first-order Markovian
(time-aggregated) and second-order network representation of
the same data. Panels A–B represent the destination cities (the
right-most column) of flows of passengers from Chicago to other
cities, given the previous location (the left-most column). When
including memory effects (panel B), the fraction of passengers
coming back to the original destination is large, in agreement with
our intuition. A similar effect is found for the network of academic
journals
information diffusion intro
Many graphs can be used to model or predict how information flows through the given graph.
● How influential are you with your Instagram posts, tweets, LinkedIn posts, etc.?
● How does a tweet affect the stock market, or in more general terms, how can causality be inferred from a graph?
● In practice, you also see heat diffusion methods applied to information diffusion (a minimal sketch follows below)
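A minimal sketch of the heat-diffusion view from the last bullet, assuming the linear model dx/dt = −Lx on a known graph:

```python
import numpy as np
import networkx as nx
from scipy.linalg import expm

G = nx.karate_club_graph()
L = nx.laplacian_matrix(G).toarray().astype(float)

x0 = np.zeros(G.number_of_nodes())
x0[0] = 1.0                          # information seeded at a single node

for t in [0.1, 1.0, 10.0]:
    xt = expm(-t * L) @ x0           # closed-form solution x(t) = e^{-tL} x(0)
    print(f"t={t}: mass remaining at seed = {xt[0]:.3f}")
```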
Random walks and diffusion on networks
Naoki Masuda, Mason A. Porter, Renaud Lambiotte
Physics Reports (Available online 31 August 2017)
https://doi.org/10.1016/j.physrep.2017.07.007
Fig. 12. The weary random walker retires from the network and heads off
into the distant sunset. [This picture was drawn by Yulian Ng.].
Inferring networks of diffusion and influence
Manuel Gomez Rodriguez, Jure Leskovec, Andreas Krause
KDD '10 Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
https://doi.org/10.1145/1835804.1835933
There are several interesting directions for future work.
Here we only used time difference to infer edges and thus
it would be interesting to utilize more informative features
(e.g., textual content of postings etc.) to more accurately
estimate the influence probabilities. Moreover, our work
considers static propagation networks, however real
influence networks are dynamic and thus it would be
interesting to relax this assumption. Last, there are many
other domains where our methodology could be useful:
inferring interaction networks in systems biology (protein-
protein and gene interaction networks), neuroscience
(inferring physical connections between neurons) and
epidemiology. We believe that our results provide a
promising step towards understanding complex processes
on networks based on partial observations.
information diffusion Social Networks #1
Nonlinear Dynamics of Information Diffusion in Social Networks
ACM Transactions on the Web (TWEB) Volume 11 Issue 2, May 2017 Article No. 11
https://doi.org/10.1145/3057741
Online Social Networks and information diffusion: The role of ego networks
Valerio Arnaboldi, Marco Conti, Andrea Passarella, Robin I.M. Dunbar
Online Social Networks and Media 1 (2017) 44–55
http://dx.doi.org/10.1016/j.osnem.2017.04.001
Data Driven Modeling of Continuous Time Information Diffusion in Social
Networks
Liang Liu ; Bin Chen ; Bo Qu ; Lingnan He ; Xiaogang Qiu
Data Science in Cyberspace (DSC), 2017 IEEE
https://doi.org/10.1109/DSC.2017.103
Online Bayesian Inference of Diffusion Networks
Shohreh Shaghaghian ; Mark Coates
IEEE Transactions on Signal and Information Processing over Networks ( Volume: 3, Issue: 3, Sept. 2017 )
https://doi.org/10.1109/TSIPN.2017.2731160
Modeling the reemergence of information diffusion in social network
Dingda Yang, Xiangwen Liao, Huawei Shen, Xueqi Cheng, Guolong Chen
Physica A: Statistical Mechanics and its Applications [Available online 1 September 2017]
http://dx.doi.org/10.1016/j.physa.2017.08.115
Information Diffusion in Online Social Networks: A Survey
Adrien Guille, Hakim Hacid, Cécile Favre, Djamel A. Zighed
ACM SIGMOD Volume 42 Issue 2, May 2013 Pages 17-28
https://doi.org/10.1145/2503792.2503797
information diffusion Social Networks #2
Literature Survey on Interplay of Topics, Information Diffusion and
Connections on Social Networks
Kuntal Dey, Saroj Kaushik, L. Venkata Subramaniam
(Submitted on 3 Jun 2017)
https://arxiv.org/abs/1706.00921
information diffusion scientific citation networks #1
Integration of Scholarly Communication Metadata
Using Knowledge Graphs
Afshin Sadeghi, Christoph Lange, Maria-Esther Vidal, Sören Auer
International Conference on Theory and Practice of Digital Libraries
TPDL 2017: Research and Advanced Technology for Digital Libraries pp 328-341
https://doi.org/10.1007/978-3-319-67008-9_26
Particularly, we demonstrate the benefits of exploiting semantic web
technology to reconcile data about authors, papers, and conferences.
A Recommendation System Based on Hierarchical
Clustering of an Article-Level Citation Network
Jevin D. West ; Ian Wesley-Smith ; Carl T. Bergstrom
IEEE Transactions on Big Data ( Volume: 2, Issue: 2, June 1 2016 )
https://doi.org/10.1109/TBDATA.2016.2541167
http://babel.eigenfactor.org/
The scholarly literature is expanding at a rate that necessitates
intelligent algorithms for search and navigation. For the most part, the
problem of delivering scholarly articles has been solved. If one knows the
title of an article, locating it requires little effort and, paywalls permitting,
acquiring a digital copy has become trivial. However, the navigational
aspect of scientific search - finding relevant, influential articles that
one does not know exist - is in its early development.
Big Scholarly Data: A Survey
Feng Xia ; Wei Wang ; Teshome Megersa Bekele ; Huan Liu
IEEE Transactions on Big Data ( Volume: 3, Issue: 1, March 1 2017 )
https://doi.org/10.1109/TBDATA.2016.2641460
ASNA - Academic Social Network Analysis
information diffusion scientific citation networks #2
Implicit Multi-Feature Learning for Dynamic Time
Series Prediction of the Impact of Institutions
Xiaomei Bai ; Fuli Zhang ; Jie Hou ; Feng Xia ; Amr Tolba ; Elsayed Elashkar
IEEE Access ( Volume: 5 )
https://doi.org/10.1109/ACCESS.2017.2739179
Predicting the impact of research institutions is an important tool for
decision makers, e.g., for resource allocation by funding bodies.
Despite significant effort to adopt quantitative indicators to measure
the impact of research institutions, little is known about how the
impact of institutions evolves over time.
The Role of Positive and Negative Citations in
Scientific Evaluation
Xiaomei Bai ; Ivan Lee ; Zhaolong Ning ; Amr Tolba ; Feng Xia
IEEE Access ( Volume: PP, Issue: 99 )
https://doi.org/10.1109/ACCESS.2017.2740226
Recommendation for Cross-
Disciplinary Collaboration
Based on Potential Research
Field Discovery
Wei Liang ; Xiaokang Zhou ; Suzhen Huang ;
Chunhua Hu ; Qun Jin
Advanced Cloud and Big Data (CBD), 2017
https://doi.org/10.1109/CBD.2017.67
The cross-disciplinary information is hidden
in tons of publications, and the relationships
between different fields are complicated,
which makes it challenging to recommend
cross-disciplinary collaboration for a
specific researcher.
Petteri: Whether to recommend “outliers”
i.e. unexpected combinations of fields, or
something outside your field that would be
useful to you. Or just the typical landmark
papers of your field? Depends on your
needs for sure.
https://iris.ai/
http://www.bibblio.org/learning-and-knowledge
In the future, we will further explore the relationships between the impact of
institutions and the features driving changes in that impact, to
enhance the prediction performance. In addition, this work is conducted only
on literature from eight top conferences in the Microsoft Academic
Graph (MAG) dataset; examining other conferences for the same observed
patterns could widen the significance of our findings.
information diffusion Finance, Quant trading, decision making
Information Diffusion, Cluster formation
and Entropy-based Network Dynamics
in Equity and Commodity Markets
Stelios Bekiros, Duc Khuong Nguyen, Leonidas Sandoval Junior, Gazi Salah Uddin
European Journal of Operational Research (2016)
http://dx.doi.org/10.1016/j.ejor.2016.06.052
https://www.prowler.io/
https://www.causalitylink.com/
https://www.forbes.com/sites/antoinegara/2017/02/28/kensho-sp-500-million-valuation-jpmorgan-morgan-stanley/#6fe4bb0b5cbf
Technology that brings transparency to complex systems
https://www.kensho.com/
Our platform uses artificial intelligence
to discover, extract and index events,
variables and relationships about
markets, sectors, industries and
equities. It absorbs news articles,
analysts’ point of view or equity-related materials as they are
published. Save time and get ahead by
letting AI do the repetitive reading for
you. Focus on new knowledge.
Analysis of Investment Relationships
Between Companies and Organizations
Based on Knowledge Graph
Xiaobo Hu, Xinhuai Tang, Feilong Tang
In: Barolli L., Enokido T. (eds) Innovative Mobile and Internet Services in
Ubiquitous Computing. IMIS 2017. Advances in Intelligent Systems and
Computing, vol 612
https://doi.org/10.1007/978-3-319-61542-4_20
A design for a common-sense
knowledge-enhanced decision-support
system: Integration of high-frequency
market data and real-time news
Kun Chen, Jian Yin, Sulin Pang
Expert Systems (June 2017) doi: 10.1111/exsy.12209
Compared with previous work, our model is the
first to incorporate broad common-sense
knowledge into a decision support system, thereby
improving the news analysis process through the
application of a graphic random-walk framework.
Prototype and experiments based on Hong Kong
stock market data have demonstrated that
common-sense knowledge is an important factor
in building financial decision models that
incorporate news information.
Dynamics of financial markets and
transaction costs: A graph-based study
Felipe Lillo, Rodrigo Valdés
Research in International Business and Finance
Volume 38, September 2016, Pages 455-465
Using financialization as a conceptual framework
to understand the current trading patterns of
financial markets, this work employs a market
graph model for studying the stock indexes of
geographically separated financial markets. By
using an edge creation condition based on a
transaction cost threshold, the resulting market
graph features a strong connectivity, some traces
of a power law in the degree distribution and an
intensive presence of cliques.
Ponzi scheme diffusion in complex
networks
Anding Zhu, Peihua Fu, Qinghe Zhang, Zhenyue Chen
Physica A: Statistical Mechanics and its Applications
Volume 479, 1 August 2017, Pages 128-136
https://doi.org/10.1016/j.physa.2017.03.015
“Intelligent knowledge graphs” with “actionable insights”
Model-Driven Analytics: Connecting Data,
Domain Knowledge, and Learning
Thomas Hartmann, Assaad Moawad, Francois Fouquet, Gregory Nain,
Jacques Klein, Yves Le Traon, Jean-Marc Jezequel
(Submitted on 5 Apr 2017)
https://arxiv.org/abs/1704.01320
Gaining profound insights from collected data of today's application domains like IoT, cyber-physical
systems, health care, or the financial sector is business-critical and can create the next multi-billion
dollar market. However, analyzing these data and turning it into valuable insights is a huge challenge.
This is often not alone due to the large volume of data but due to an incredibly high domain complexity,
which makes it necessary to combine various extrapolation and prediction methods to understand the
collected data. Model-driven analytics is a refinement process of raw data driven by a model reflecting
deep domain understanding, connecting data, domain knowledge, and learning.
Graph theory examples: Applications beyond typical networks
Construction (BIM): “Graph theory based
representation of building information models (BIM)
for access control applications”
Automation in Construction, Volume 68, August 2016, Pages 44-51
https://doi.org/10.1016/j.autcon.2016.04.001
IFC 4 model, IFC-SPF format.
Medical Imaging (OCT): “Improving Segmentation
of 3D Retina Layers Based on Graph Theory
Approach for Low Quality OCT Images”
Metrology and Measurement System Volume 23, Issue 2 (Jun 2016)
https://doi.org/10.1515/mms-2016-0016
Dijkstra shortest path algorithm
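The layer-segmentation idea above reduces to a shortest-path search on a pixel graph; a toy sketch where the edge-weighting scheme is my simplification, not the paper's:

```python
import numpy as np
import networkx as nx

img = np.random.rand(5, 8)                 # stand-in for an OCT B-scan intensity image
G = nx.grid_2d_graph(*img.shape)           # one node per pixel, 4-connected
for u, v in G.edges():
    G[u][v]["weight"] = 2.0 - img[u] - img[v]   # cheap to walk along bright pixels

# Trace a layer boundary as the cheapest left-to-right path through the image.
layer = nx.dijkstra_path(G, source=(2, 0), target=(2, img.shape[1] - 1))
print(layer)
```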
Risk Assessment: “A New Risk
Assessment Framework Using Graph
Theory for Complex ICT Systems”
MIST '16 Proceedings of the 8th ACM CCS International Workshop on Managing Insider Security Threats
https://doi.org/10.1145/2995959.2995969
Biodiversity management: “Multiscale
connectivity and graph theory highlight
critical areas for conservation under
climate change”
Ecological Applications (8 June 2016)
http://doi.org/10.1890/15-0925
Brain Imaging: “‘Small World’ architecture in
brain connectivity and hippocampal volume in
Alzheimer’s disease: a study via graph theory
from EEG data”
Brain Imaging and Behavior April 2017, Volume 11, Issue 2, pp
473–485 doi: 10.1007/s11682-016-9528-3
Small World trends in the two groups of subjects
Medical Imaging (OCT):
“Reconstruction of 3D surface maps from anterior
segment optical coherence tomography images using
graph theory and genetic algorithms”
Biomedical Signal Processing and Control
Volume 25, March 2016, Pages 91-98
https://doi.org/10.1016/j.bspc.2015.11.004
Cybersecurity: “Big Data Behavioral Analytics
Meet Graph Theory: On Effective Botnet
Takedowns”
IEEE Network ( Volume: 31, Issue: 1, January/February 2017 )
https://doi.org/10.1109/MNET.2016.1500116NM
Graph Signal Processing and quantitative graph theory
Defferrard et al. (2016): “The emerging field of Graph Signal Processing (GSP) aims at bridging the gap
between signal processing and spectral graph theory [Shuman et al. 2013], a blend between graph theory and
harmonic analysis. A goal is to generalize fundamental analysis operations for signals from regular grids to
irregular structures embodied by graphs. We refer the reader to Belkin and Niyogi 2008 for an introduction of
the field.”
Matthias Dehmer, Frank Emmert-Streib, Yongtang Shi
https://doi.org/10.1016/j.ins.2017.08.009
The main goal of quantitative graph theory is
the structural quantification of information
contained in complex networks by employing
a measurement approach based on numerical
invariants and comparisons. Furthermore, the
methods as well as the networks do not need to be
deterministic but can be statistical.
Shuman et al. 2013 | Perraudin and Vandergheynst 2016:
”the proposed Wiener regularization framework offers a
compelling way to solve traditional problems such as denoising,
regression or semi-supervised learning”
Experiments on the temperature of Molene. Top: A
realization of the stochastic graph signal (first measure).
Bottom center: the temperature of the Island of Brehat.
Bottom right: Recovery errors (inpainting error) for different
noise levels
Graph Fourier Transform (GFT)
The use of Graph Fourier Transform in image
processing: A new solution to classical problems
Francesco Verdoja. PhD Thesis, 2017
https://doi.org/10.1109/ICASSP.2017.7952886
On the Graph Fourier Transform for
Directed Graphs
Stefania Sardellitti ; Sergio Barbarossa ; Paolo Di Lorenzo
IEEE Journal of Selected Topics in Signal Processing ( Volume: 11, Issue: 6, Sept. 2017 )
https://doi.org/10.1109/JSTSP.2017.2726979
The analysis of signals defined over a graph is relevant in many applications, such as social and
economic networks, big data or biological networks, and so on. A key tool for analyzing these
signals is the so called Graph Fourier Transform (GFT). Alternative definitions of GFT have been
suggested in the literature, based on the eigen-decomposition of either the graph Laplacian or
adjacency matrix. In this paper, we address the general case of directed graphs and we propose an
alternative approach that builds the graph Fourier basis as the set of orthonormal vectors that
minimize a continuous extension of the graph cut size, known as the Lovasz extension.
Graph-based approaches have recently seen a spike of interest in the
image processing and computer vision communities, and many
classical problems are finding new solutions thanks to these
techniques. The Graph Fourier Transform (GFT), the equivalent of the
Fourier transform for graph signals, is used in many domains to
analyze and process data modeled by a graph.
In this thesis we present some classical image processing problems
that can be solved through the use of the GFT. We’ll focus our attention
on two main research areas: the first is image compression, where
the use of the GFT is finding its way into recent literature; we’ll propose
two novel ways to deal with the problem of graph weight
encoding. We’ll also propose approaches to reduce the overhead costs
of shape-adaptive compression methods.
The second research field is image anomaly detection; to date, the GFT
has never been proposed to solve this class of problems.
We’ll discuss here a novel technique and we’ll test its application on
hyperspectral and medical (PET tumor scan) images.
Graph Signal Processing #1
Adaptive Least Mean Squares Estimation of Graph
Signals
Paolo Di Lorenzo ; Sergio Barbarossa ; Paolo Banelli ; Stefania Sardellitti
IEEE Transactions on Signal and Information Processing over Networks
( Volume: 2, Issue: 4, Dec. 2016 )
https://doi.org/10.1109/TSIPN.2016.2613687
Distributed Adaptive Learning of Graph Signals
Paolo Di Lorenzo ; Sergio Barbarossa ; Paolo Banelli ; Stefania Sardellitti
IEEE Transactions on Signal Processing ( Volume: 65, Issue: 16, Aug. 15, 2017 )
https://doi.org/10.1109/TSP.2017.2708035
The aim of this paper is to propose a least mean squares (LMS) strategy for adaptive estimation of
signals defined over graphs. Assuming the graph signal to be band-limited, over a known bandwidth, the
method enables reconstruction, with guaranteed performance in terms of mean-square error, and tracking
from a limited number of observations over a subset of vertices.
Furthermore, to cope with the case where the bandwidth is not known beforehand, we propose a method
that performs a sparse online estimation of the signal support in the (graph) frequency domain, which
enables online adaptation of the graph sampling strategy. Finally, we apply the proposed method to build
the power spatial density cartography of a given operational region in a cognitive network environment.
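A batch, non-adaptive analogue of the core idea (my simplification; the paper's contribution is the online LMS version with tracking guarantees): a K-bandlimited graph signal is recovered from noisy samples on a vertex subset by least squares on the spectral coefficients.

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
L = nx.laplacian_matrix(G).toarray().astype(float)
lam, U = np.linalg.eigh(L)

K = 5
rng = np.random.default_rng(0)
x_true = U[:, :K] @ rng.standard_normal(K)        # band-limited ground truth

S = rng.choice(G.number_of_nodes(), size=15, replace=False)  # sampled vertices
y = x_true[S] + 0.01 * rng.standard_normal(len(S))           # noisy observations

coeffs, *_ = np.linalg.lstsq(U[S, :K], y, rcond=None)
x_hat = U[:, :K] @ coeffs                          # reconstruction on all vertices
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```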
“We apply the proposed distributed framework to power density cartography in cognitive
radio (CR) networks. We consider a 5G scenario, where a dense deployment of radio access
points (RAPs) is envisioned to provide a service environment characterized by very low latency
and high rate access. Each RAP collects data related to the transmissions of primary users
(PUs) at its geographical position, and communicates with other RAPs with the aim of
implementing advanced cooperative sensing techniques”
“This paper represents the first work that merges the well established field
of adaptation and learning over networks, and the emerging topic of
signal processing over graphs. Several interesting problems are still open,
e.g., distributed reconstruction in the presence of directed and/or
switching graph topologies, online identification of the graph signal support
from streaming data, distributed inference of the (possibly unknown)
graph signal topology, adaptation of the sampling strategy to time-varying
scenarios, optimization of the sampling probabilities, just to name a few.
We plan to investigate on these exciting problems in our future works”
Graph Signal Processing #2
Kernel Regression for Signals over Graphs
Arun Venkitaraman, Saikat Chatterjee, Peter Händel
(Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02191
Uncertainty Principles and Sparse
Eigenvectors of Graphs
Arun Venkitaraman, Saikat Chatterjee, Peter Händel
IEEE Transactions on Signal Processing ( Volume: 65, Issue: 20, Oct. 15, 2017 )
https://doi.org/10.1109/TSP.2017.2731299
We propose kernel regression for signals over graphs. The optimal regression coefficients are
learnt using a constraint that the target vector is a smooth signal over an underlying graph.
The constraint is imposed using a graph- Laplacian based regularization. We discuss how
the proposed kernel regression exhibits a smoothing effect, simultaneously achieving noise-
reduction and graph-smoothness. We further extend the kernel regression to simultaneously
learn the underlying graph and the regression coefficients.
Our hypothesis was that incorporating the graph smoothness constraint would help
kernel regression to perform better, particularly when we lack sufficient and reliable
training data. Our experiments illustrate that this is indeed the case in practice.
Through experiments we also conclude that graph signals carry sufficient
information about the underlying graph structure which may be extracted in the
regression setting even with moderately small number of samples in comparison with
the graph dimension. Thus, our approach helps both predict and infer the underlying
topology of the network or graph.
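In the spirit of that constraint, a hedged sketch (not the authors' exact formulation): kernel ridge regression of a vertex signal with an added graph-Laplacian penalty on the fitted values.

```python
import numpy as np

def fit_graph_kernel_regression(K, t, L, beta, ridge=1e-6):
    """K: symmetric kernel (Gram) matrix between vertices; t: observed target
    signal on the vertices; L: graph Laplacian; beta: smoothness trade-off.
    Solves min_a ||t - K a||^2 + beta * (K a)^T L (K a) in closed form."""
    n = K.shape[0]
    A = K @ K + beta * (K @ L @ K) + ridge * np.eye(n)
    return np.linalg.solve(A, K @ t)

# The fitted signal (smooth over the graph) is then f_hat = K @ a.
```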
When the graph has repeated eigenvalues, we explained that a graph Fourier basis (GFB) is not unique, and the derived lower bound can have different values depending on the selected GFB. We provided a constructive method to find a GFB that yields the smallest uncertainty bound. In order to find the signals that achieve the derived lower bound we considered sparse eigenvectors of the graph. We showed that the graph Laplacian has a 2-sparse eigenvector if and only if there exists a pair of nodes with the same neighbors. When this happens, the uncertainty bound is very low and the 2-sparse eigenvectors achieve this bound. We presented examples of both classical and real-world graphs with 2-sparse eigenvectors. We also discussed that, in some examples, the neighborhood structure has a meaningful interpretation.
Graph Signal Processing #3: Time-varying graphs
Kernel-Based Reconstruction of Space-Time
Functions on Dynamic Graphs
Daniel Romero ; Vassilis N. Ioannidis ; Georgios B. Giannakis
IEEE Journal of Selected Topics in Signal Processing ( Volume: 11, Issue: 6, Sept. 2017 )
https://doi.org/10.1109/JSTSP.2017.2726976
Filtering Random Graph Processes Over Random
Time-Varying Graphs
Kai Qiu ; Xianghui Mao ; Xinyue Shen ; Xiaohan Wang ; Tiejian Li ; Yuantao Gu
IEEE Journal of Selected Topics in Signal Processing ( Volume: 11, Issue: 6, Sept. 2017 )
https://doi.org/10.1109/JSTSP.2017.2726969
DSLR: distributed least-squares reconstruction; LMS: least mean-squares;
KKF: kernel Kalman filter; ECoG: electrocorticography;
NMSE: cumulative normalized mean-square error
This paper investigated kernel-based reconstruction of space-time functions on graphs. The adopted approach relied on the construction of an extended graph, which regards the time dimension just as a spatial dimension. Several kernel designs were introduced, together with a batch and an online function estimator. The latter is a kernel Kalman filter developed from a purely deterministic standpoint without any need to adopt any state-space model. Future research will deal with multi-kernel and distributed versions of the proposed algorithms.
Schemes tailored for time-evolving functions on
graphs include [Bach and Jordan 2004] and [
Mei and Moura 2016], which predict the
function values at time t given observations up
to time t − 1. However, these schemes assume
that the function of interest adheres to a
specific vector autoregression and all vertices
are observed at previous time instances.
Moreover, [Bach and Jordan 2004] requires
Gaussianity along with an ad hoc form of
stationarity.
However, many real-world graph signals are time-varying, and they evolve smoothly, so instead of the signals themselves
being bandlimited or smooth on graph, it is more reasonable that their temporal differences are smooth on graph. In
this paper, a new batch reconstruction method of time-varying graph signals is proposed by exploiting the smoothness
of the temporal difference signals, and the uniqueness of the solution to the corresponding optimization problem is
theoretically analyzed. Furthermore, driven by practical applications faced with real-time requirements, huge size of
data, lack of computing center, or communication difficulties between two non-neighboring vertices, an online
distributed method is proposed by applying local properties of the temporal difference operator and the graph
Laplacian matrix.
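A rough sketch of that objective (my own toy gradient-descent solver, not the paper's batch or online distributed algorithms): fit the observed entries while keeping the temporal differences smooth on the graph.

```python
import numpy as np

def reconstruct(Y, M, L, lam=1.0, steps=500, lr=0.05):
    """Y: observed signals (nodes x times, zeros where unobserved);
       M: 0/1 observation mask of the same shape; L: graph Laplacian.
       Minimizes ||M*(X-Y)||^2 + lam * sum_t d_t^T L d_t, d_t = X[:,t]-X[:,t-1]."""
    X = Y.copy().astype(float)
    for _ in range(steps):
        grad = M * (X - Y)            # data-fit term on observed entries
        D = np.diff(X, axis=1)        # temporal difference signals d_t
        LD = L @ D
        grad[:, 1:] += lam * LD       # gradient of d_t term w.r.t. X[:, t]
        grad[:, :-1] -= lam * LD      # gradient of d_{t+1} term w.r.t. X[:, t]
        X -= lr * grad
    return X
```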
In the future, we will further study the applications of smoothness of temporal difference signals, and may combine it with other properties of signals, such as low rank. Besides, it is also interesting to consider the situation where both the signal and the graph are time-varying.
Graph Signal Processing #4: Time-varying graphs
Signal Processing on Graphs: Causal Modeling of
Unstructured Data
Jonathan Mei, José M. F. Moura
(Submitted on 28 Feb 2015 (v1), last revised 8 Feb 2017 (this version, v6))
https://arxiv.org/abs/1503.00173
Learning directed Graph Shifts from High-
Dimensional Time Series
Lukas Nagel(June 2017)
Master Thesis, Institute of Telecommunications (TU Wien)
https://pdfs.semanticscholar.org/8822/526b7b2862f6374f5f950c89a14a7a931820.pdf
Many applications collect a large number of time series, for example, the financial data of companies quoted in a stock
exchange, the health care data of all patients that visit the emergency room of a hospital, or the temperature sequences
continuously measured by weather stations across the US. These data are often referred to as unstructured.
A first task in its analytics is to derive a low dimensional representation, a graph or discrete manifold, that describes well
the interrelations among the time series and their intrarelations across time. This paper presents a computationally
tractable algorithm for estimating this graph that structures the data. The resulting graph is directed and weighted,
possibly capturing causal relations, not just reciprocal correlations as in many existing approaches in the literature. A
convergence analysis is carried out. The algorithm is demonstrated on random graph datasets and real network time
series datasets, and its performance is compared to that of related methods. The adjacency matrices estimated with the
new method are close to the true graph in the simulated data and consistent with prior physical knowledge in the real
dataset tested.
Frequency ordering depending on the position of the eigenvalues λ in C. Both graphics are from Sandryhaila and Moura 2014.
Causal graph signal process. Visualization of the
information spreading through graph shifts for P3(A, c)
We want to apply the causal graph process estimation algorithm to
stock prices and especially point out some additional points of
failure we spotted.
In the shift matrix shown in Figure 4.9a, we observe that the stocks
number 2, 16 and 24 have many incoming connections. It appears
unlikely that this is due to some economic relations and points
towards a numerical problem.
As we were interested in potential interpretations of the shift recovered from the stock data, we chose to visualize the largest possible directions of the shift shown in Figure 4.11 as a graph in Figure 4.12. The only observation we could draw from the graph is that there are multiple bank stocks, which affect multiple other stocks. Otherwise, the connected companies show no common ownership structure nor even similar or related products.
The stocks example with no clear expectation did not lead to promising results. Despite this, we described two preprocessing steps, scaling and averaging, that could be applied before starting the estimation algorithm. It is unclear whether further tuning is needed or whether the domain of daily stock data cannot reasonably be modeled with causal graph processes, and we therefore leave this question open for future research.
Graph Wavelet transform vs. GFT #1
Compression of dynamic 3D point clouds using
subdivisional meshes and graph wavelet transforms
Aamir Anis ; Philip A. Chou ; Antonio Ortega
University of Southern California, Los Angeles, CA; † Microsoft Research, Redmond, WA
Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE
https://doi.org/10.1109/ICASSP.2016.7472901
The subdivisional structure also allows us to
obtain a sequence of bipartite graphs that
facilitate the use of GraphBior [
Narang et al. (2012)] to compute the wavelet
transform coefficients of the geometry and
color attributes.
Compact Support Biorthogonal Wavelet
Filterbanks for Arbitrary Undirected Graphs
Sunil K. Narang, Antonio Ortega
(Submitted on 30 Oct 2012 (v1), last revised 19 Nov 2012 (this version, v2))
https://arxiv.org/abs/1210.8129
In this paper, we provide a framework for compression of 3D point cloud sequences. Our approach involves
representing sets of frames by a consistently-evolving high-resolution subdivisional triangular mesh. This
representation helps us facilitate efficient implementations of motion estimation and graph wavelet transforms.
The subdivisional structure plays a crucial role in designing a simple hierarchical method for efficiently
estimating these meshes, and the application of Biorthogonal Graph Wavelet Filterbanks for compression.
Preliminary experimental results show promising performances of both the estimation and the compression
steps, and we believe this work shall open new avenues of research in this emerging field.
Comparison of graph wavelet designs in terms of key properties: zero highpass response for a constant graph signal (DC), critical sampling (CS), perfect reconstruction (PR), compact support (Comp), orthogonal expansion (OE), requires graph simplification (GS).
In this paper we have presented novel graph-wavelet filterbanks that provide a critically sampled representation with compactly supported basis functions.
The filterbanks come in two flavors: a) nonzeroDC filterbanks, and b) zeroDC filterbanks. The former filterbanks are designed as polynomials of the normalized graph Laplacian matrix, and the latter filterbanks are extensions of the former that provide a zero response for the highpass operators.
Preliminary results showed that the filterbanks are useful not only for arbitrary graphs but also for the standard regular signal processing domains. Extensions of this work will focus on the application of these filters to different scenarios, including, for example, social network analysis, sensor networks, etc.
Graph Wavelet transform vs. GFT #2
Bipartite Approximation for Graph
Wavelet Signal Decomposition
Jin Zeng ; Gene Cheung ; Antonio Ortega
IEEE Transactions on Signal Processing ( Volume: 65, Issue: 20, Oct. 15, 2017 )
https://doi.org/10.1109/TSP.2017.2733489
Splines and Wavelets on Circulant Graphs
Madeleine S. Kotzagiannidis, Pier Luigi Dragotti
(Submitted on 15 Mar 2016)
https://arxiv.org/abs/1603.04917
(a) Two-channel wavelet filterbank on a bipartite graph; (b) Kernels of H0, H1 in graphBior [Narang et al. (2012)] with filter length of 19.
Unlike previous works, our design of the two metrics relates directly to energy
compaction for bipartite subgraph decomposition. Comparison with the
state-of-the-art schemes validates our proposed metrics for energy compaction and
illustrates the efficiency of our approach. We are currently working on different
applications of graphBior with our bipartite approximation, e.g., graph-signal
denoising, which will benefit from the energy compaction in the wavelet domain.
In this paper, we have introduced novel families of wavelets and associated filterbanks on circulant graphs with vanishing moment properties, which reveal (e-)spline-like functions on graphs, and promote sparse multiscale representations.
Moreover, we have discussed generalizations to arbitrary graphs in the form of a multidimensional wavelet analysis scheme based on graph product decomposition, facilitating a sparsity-promoting generalization with the advantage of lower-dimensional processing. In our future work, we wish to further explore the sets of graph signals which can be annihilated with existing and/or evolved graph wavelets, as well as refine its extensions and relevance for arbitrary graphs.
Graphlet induced subgraphs of a large network
Estimation of Graphlet Statistics
Ryan A. Rossi, Rong Zhou, and Nesreen K. Ahmed
(Submitted on 6 Jan 2017 (v1), last revised 28 Feb 2017 (this version, v2))
https://arxiv.org/abs/1701.01772
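For orientation, the smallest graphlet statistic computed exactly (Rossi et al. estimate much richer graphlet counts by sampling, since exact counting only scales to small or sparse graphs):

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
# For a simple undirected graph, trace(A^3) counts each triangle 6 times.
triangles = int(np.trace(np.linalg.matrix_power(A, 3)) // 6)
print(triangles)
```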
Graph Computing Accelerations
Parallel Local Algorithms for Core, Truss,
and Nucleus Decompositions
Ahmet Erdem Sariyuce, C. Seshadhri, Ali Pinar
Sandia National Laboratories, University of California
(Submitted on 2 Apr 2017)
https://arxiv.org/abs/1704.00386
Finding the dense regions of a graph and relations among them is a fundamental task in network
analysis. Nucleus decomposition is a principled framework of algorithms that generalizes the k-
core and k-truss decompositions. It can leverage the higher-order structures to locate the dense
subgraphs with hierarchical relations. … We present a framework of local algorithms to obtain the
exact and approximate nucleus decompositions. Our algorithms are pleasingly parallel and can
provide approximations to explore time and quality trade-offs. Our shared-memory implementation
verifies the efficiency, scalability, and effectiveness of our algorithms on real-world networks. In
particular, using 24 threads, we obtain up to 4.04x and 7.98x speedups for k-truss and (3, 4)
nucleus decompositions.
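The k-core special case that nucleus decomposition generalizes is a one-liner in NetworkX (sequential peeling; the paper's point is obtaining the same decompositions with local, parallel algorithms):

```python
import networkx as nx

G = nx.karate_club_graph()
core = nx.core_number(G)        # largest k such that each node lies in a k-core
k_max = max(core.values())
print(k_max, sorted(nx.k_core(G, k=k_max).nodes()))
# NetworkX also provides nx.k_truss(G, k) for the k-truss decomposition.
```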
P-Laplacian on graphs
p-Laplacian Regularized Sparse Coding for Human
Activity Recognition
Weifeng Liu ; Zheng-Jun Zha ; Yanjiang Wang ; Ke Lu ; Dacheng Tao
IEEE Transactions on Industrial Electronics ( Volume: 63, Issue: 8, Aug. 2016 )
https://doi.org/10.1109/TIE.2016.2552147
On the game p-Laplacian on weighted
graphs with applications in image
processing and data clustering
A. Elmoataz, X. Desquesnes and M. Toutain
(3 July 2017) European Journal of Applied Mathematics
https://doi.org/10.1017/S0956792517000122
In this paper, we have introduced a new class of normalized p-Laplacian
operators as a discrete adaptation of the game-theoretic p-Laplacian on
weighted graphs. This class is based on a new partial difference operator
which interpolates between the normalized 2-Laplacian, 1-Laplacian and
∞-Laplacian on graphs. This operator is also connected to non-local average
operators such as the non-local mean, non-local median and non-local midrange.
It generalizes the normalized p-Laplacian on graphs for 1 ≤ p ≤ ∞. We have
shown the connections with local and non-local PDEs of p-Laplacian types
and the stochastic game Tug-of-War with noise (Peres et al. 2008). We have
proved existence and uniqueness of the Dirichlet problem involving operators
of this new class. Finally, we have illustrated the interest and behaviour of such
operators in some inverse problems in image processing and machine
learning.
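For reference, a minimal sketch of a discrete graph p-Laplacian in its common unnormalized form (an assumption on my part; the paper works with normalized, game-theoretic variants):

```python
import numpy as np

def p_laplacian(W, f, p=2.0, eps=1e-12):
    """(Delta_p f)(u) = sum_v W[u,v] |f(v)-f(u)|^(p-2) (f(v)-f(u))."""
    diff = f[None, :] - f[:, None]                # diff[u, v] = f(v) - f(u)
    mag = (np.abs(diff) + eps) ** (p - 2.0)       # eps guards 0^(negative) for p < 2
    return (W * mag * diff).sum(axis=1)

# For p = 2 this reduces (up to sign) to the usual graph Laplacian: Delta_2 f = -L f.
```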
The framework of human activity recognition. Firstly, we extract the representative features of human activity including SIFT, STIP and
MFCC. Then we concatenate the histograms formed by bags of each feature. Thirdly, we learn the sparse codes of each sample and
the corresponding dictionary simultaneously by p-Laplacian regularized sparse coding algorithm. Finally, we input the learned sparse
codes into classifiers i.e. support vector machines to conduct human activity recognition.
As a sparse representation, the proposed p-Laplacian regularized sparse coding algorithm can also be employed for modern industry using data-based techniques [Jung et al. 2015; Shen et al. 2015] and other computer vision applications such as video summary and visual tracking [Bai and Li 2014; Yu et al. 2016]. In the future, we will apply the proposed p-Laplacian regularized sparse coding in more practical implementations. We will also study the extensions to multiview learning and deep architecture construction for more attractive performance.
Sparse coding has achieved promising performance in classification. The
most prominent Laplacian regularized sparse coding employs Laplacian
regularization to preserve the manifold structure; however, Laplacian
regularization suffers from poor generalization. To tackle this problem,
we present a p-Laplacian regularized sparse coding algorithm by
introducing the nonlinear generalization of standard graph Laplacian to
exploit the local geometry. Compared to the conventional graph Laplacian,
the p-Laplacian has a tighter isoperimetric inequality, and the p-Laplacian
regularized sparse coding can achieve superior theoretical
evidence.
“Applied Laplacian” Mesh processing #1A
Spectral Mesh Processing
H. Zhang, O. Van Kaick, R. Dyer
Computer Graphics Forum 9 April 2010
http://dx.doi.org/10.1111/j.1467-8659.2010.01655.x
Graph Framework for Manifold-valued Data image processing
Nonlocal Inpainting of Manifold-valued Data on Finite
Weighted Graphs
Ronny Bergmann, Daniel Tenbrinck
(Submitted on 21 Apr 2017 (v1), last revised 12 Jul 2017 (this version, v2))
https://arxiv.org/abs/1704.06424
open source code: http://www.mathematik.uni-kl.de/imagepro/members/bergmann/mvirt/
A Graph Framework for Manifold-valued Data
Ronny Bergmann, Daniel Tenbrinck
(Submitted on 17 Feb 2017)
https://arxiv.org/abs/1702.05293
Recently, there has been a strong
ambition to translate models and
algorithms from traditional image
processing to non-Euclidean domains,
e.g., to manifold-valued data. While the
task of denoising has been extensively
studied in the last years, there was rarely
an attempt to perform image inpainting
on manifold-valued data. In this paper we
present a nonlocal inpainting method for
manifold-valued data given on a finite
weighted graph.
First numerical examples using a nonlocal graph
construction with patch-based similarity
measures demonstrate the capabilities and
performance of the inpainting algorithm applied
to manifold-valued images.
Despite an analytic investigation of the
convergence of the presented scheme, future
work includes further development of
numerical algorithms, as well as properties of
the ∞-Laplacian for manifold-valued vertex
functions on graphs.
Illustration of the basic
definitions and concepts on a
Riemannian manifold M.
In the following we present several examples illustrating the
large variety of problems that can be tackled using the
proposed manifold-valued graph framework. Furthermore,
we compare our framework for the special case of nonlocal
denoising of phase-valued data to a state-of-the-art method.
Finally, we demonstrate a real-world application from
denoising surface normals in digital elevation maps from
LiDAR data. Subsequently, we model manifold-data
measured on samples of an explicitly given surface and in
particular illustrate denoising of diffusion tensors measured
on a sphere. Finally, we investigate denoising of real DT-MRI
data from medical applications both on a regular pixel grid as
well as on an implicitly given surface. All algorithms were
implemented in MathWorks Matlab by extending the open
source software package
Manifold-valued Image Restoration Toolbox (MVIRT).
Reconstruction results of measured surface normals in digital
elevation maps (DEM) generated by light detection and ranging
(LiDAR) measurements of earth’s surface topology.
Reconstruction results of manifold-valued data given on the
implicit surface of the open Camino brain data set.
segmentation of graphs #1
Convex variational methods for multiclass data
segmentation on graphs
Egil Bae, Ekaterina Merkurjev
(Submitted on 4 May 2016 (v1), last revised 16 Feb 2017 (this version, v4))
https://arxiv.org/abs/1605.01443 | https://doi.org/10.1007/s10851-017-0713-9
Theoretical Analysis of Active Contours on
Graphs
Christos Sakaridis, Kimon Drakopoulos, Petros Maragos
(Submitted on 24 Oct 2016)
https://arxiv.org/abs/1610.07381
Detection of triangle on a random geometric graph. Edges are
omitted for illustration purposes. (a) Original triangle on graph (b)–
(f) Instances of active contour evolution at intervals of 60
iterations, with vertices in the contour’s interior shown in red and
the rest in blue (g) Final detection result after 300 iterations, using
green for true positives, blue for true negatives, red for false
positives and black for false negatives.
Experiments on 3D point clouds acquired by a LiDAR in outdoor scenes demonstrate that the
scenes can accurately be segmented into object classes such as vegetation, the ground plane
and regular structures. The experiments also demonstrate fast and highly accurate convergence
of the algorithms, and show that the approximation difference between the convex and original
problems vanishes or becomes extremely low in practice.
In the future, it would be interesting to investigate region
homogeneity terms for general unsupervised classification
problems. In addition to avoiding the problem of trivial global
minimizers, the region terms may improve the accuracy compared
to models based primarily on boundary terms. Region
homogeneity may for instance be defined in terms of the
eigendecomposition of the covariance matrix or graph Laplacian.
segmentation of graphs #2:
Scalable Motif-aware Graph Clustering
CE Tsourakakis, J Pachocki, Michael Mitzenmacher Harvard University, Cambridge, MA, USA
WWW '17 Proceedings of the 26th International Conference on World Wide Web
Pages 1451-1460
https://doi.org/10.1145/3038912.3052653
Coarsening Massive Influence Networks
for Scalable Diffusion Analysis
Naoto Ohsaka, Tomohiro Sonobe, Sumio Fujita, Ken-ichi Kawarabayashi
SIGMOD '17 Proceedings of the 2017 ACM International Conference on
Management of Data Pages 635-650
https://doi.org/10.1145/3035918.3064045
“superpixelization”/clustering
to speed-up computations
Higher-order organization of complex networks
Austin R. Benson, David F. Gleich, Jure Leskovec (Submitted on 26 Dec 2016)
https://arxiv.org/abs/1612.08447 (preprint of the Science paper)
https://doi.org/10.1126/science.aad9029
Theoretical results in the supplementary materials also explain why classes of hypergraph partitioning methods are more general than previously assumed and how motif-based clustering provides a rigorous framework for the special case of partitioning directed graphs. Finally, the higher-order network clustering framework is generally applicable to a wide range of network types, including directed, undirected, weighted, and signed networks.
Graph Summarization #1A
Graph Summarization: A Survey
Yike Liu, Abhilash Dighe, Tara Safavi, Danai Koutra
(Submitted on 14 Dec 2016 (v1), last revised 12 Apr 2017 (this version, v2))
https://arxiv.org/abs/1612.04883
The abundance of generated data and its velocity call for data summarization, one of the main
data mining tasks. … This survey focuses on summarizing interconnected data, otherwise known
as graphs or networks. … . In general, graph summarization or coarsening or aggregation
approaches seek to find a short representation of the input graph, often in the form of a summary
or sparsified graph, which reveals patterns in the original data and preserves specific structural or
other properties, depending on the application domain.
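One simple coarsening recipe in this sense, sketched with NetworkX (community detection plus a quotient graph; the methods covered by the survey are more principled):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()
parts = list(greedy_modularity_communities(G))   # groups of similar nodes
S = nx.quotient_graph(G, parts, relabel=True)    # summary graph of supernodes
print(f"{G.number_of_nodes()} nodes summarized into {S.number_of_nodes()} supernodes")
```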
Graph Summarization #1B
Table I: Qualitative comparison of static graph summarization techniques. The first six columns describe the type of the input graph (e.g. with weighted/directed edges, and one/multiple types of node entities), followed by three algorithm-specific properties (i.e., user parameters, algorithmic complexity (linear in the number of edges or higher), and type of output). The last column gives the final purpose of each approach. Notation: (1) ∗ indicates that the algorithm can be extended to handle the corresponding type of input, but the authors do not provide details in the paper; for complexity, ∗ indicates sub-linear; (2) + means that at least one parameter can be set by the user, but it is not required (i.e., the algorithm provides a default value). - Liu et al. (2017)
Point cloud resampling via graphs
Fast Resampling of 3D Point Clouds via Graphs
Siheng Chen ; Dong Tian ; Chen Feng ; Anthony Vetro ; Jelena Kovačević
Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE
https://doi.org/10.1109/ICASSP.2017.7952695
https://arxiv.org/abs/1702.06397
The proposed resampling strategy enhances the contours of a point cloud. Plots (a) and (b) resample 2% of the points from a 3D point cloud of a building containing 381,903 points. Plot (b) is more visually friendly than Plot (a). Note that the proposed resampling strategy is able to enhance any information depending on users' preferences.
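A hedged sketch of the mechanism (in the spirit of the paper, not its exact filter): build a k-nearest-neighbor graph over the points, score each point by a high-pass response of the coordinate signal, and keep the top-scoring points, which concentrate on contours.

```python
import numpy as np
from scipy.spatial import cKDTree

def contour_resample(points, k=8, keep_fraction=0.02):
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k + 1)             # nearest neighbors (self first)
    neighbor_mean = points[idx[:, 1:]].mean(axis=1)  # implicit averaging operator A
    score = np.linalg.norm(points - neighbor_mean, axis=1)  # ~ ||(I - A) x||, high-pass
    n_keep = max(1, int(keep_fraction * len(points)))
    return points[np.argsort(score)[-n_keep:]]
```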
2D Image Processing with graphs
Directional graph weight prediction for
image compression
Francesco Verdoja ; Marco Grangetto
Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE
https://doi.org/10.1109/ICASSP.2017.7952410
The experimental results showed that the
proposed technique is able to improve the
compression efficiency; as an example we
reported a Bjøntegaard Delta (BD) rate
reduction of about 30% over JPEG.
Future works will investigate the integration of
the proposed method in more advanced image
and video coding tools comprising adaptive
block sizes and richer set of intra prediction
modes.
Luminance coding in graph-based
representation of multiview images
Thomas Maugey ; Yung-Hsuan Chao ; Akshay Gadde ; Antonio Ortega ; Pascal Frossard
Image Processing (ICIP), 2014 IEEE
https://doi.org/10.1109/ICIP.2014.7025025
(a) Wavelet decomposition on graphs in GraphBior, where the shapes {circle, triangle, square, and cross} denote coefficients in the LL, LH, HL, HH subbands. (b) Parent-children relationship: a P node in the LH band of level l + 1 has five children from two views in level l, marked in blue. (c) The procedure of finding the children nodes in level l for the parent node in level l + 1 (…)
Background on
GRAPH Deep learning
Beyond the short
introduction from
the review above
Graph structure known or not?
GRAPH KNOWN
”Graph well defined, when the temperature
measurement positions are known, and
temperature measurement uncertainty is small”
- Perraudin and Vandergheynst 2016
GRAPH “Semi-KNOWN”
”In a way the structure is known, as we can quantify the graph signal as the number of citations with some journal impact-factor weighting, but does this really represent the impact of an article? Scientists are known to game the system and just respond to the metrics [*]. Are there alternative ways to improve the graph to better represent the impact of an article…”
GRAPH NOT KNOWN
“Point cloud measured with a terrestrial laser scanner is an unordered point cloud given on non-grid x,y,z coordinates. It is not trivial to define how the points are connected to each other.”
Bibliometric network analysis by Nees Jan van Eck
[*] See e.g.
Clauset, Aaron, Daniel B. Larremore, and Roberta Sinatra. "Data-driven
predictions in the science of science." Science 355.6324 (2017): 477-480. DOI:
10.1126/science.aal4217
Sinatra, Roberta, et al. "Quantifying the evolution of individual scientific
impact." Science 354.6312 (2016): aaf5239. DOI: 10.1126/science.aaf5239
Furlanello, Cesare, et al. "Towards a scientific blockchain framework for reproducible data analysis." arXiv preprint arXiv:1707.06552 (2017).
the R-factor, with R standing for reputation, reproducibility, responsibility, and
robustness, http://verumanalytics.io/
Overview of the segmentation method: (a) the initial LiDAR point cloud, (b)
height raster image, (c) patches formed with adjacent cells of the same
value, (d) hierarchized patches, (e) weighted graph, (f) graph partition, (g)
partition result on the raster, (h) segmented point cloud.
- Strimbu and Strimbu (2015)
Graphics and Media Lab (GML) is a part of the Department of Computational Mathematics and Cybernetics of M.V. Lomonosov Moscow State University.
http://graphics.cs.msu.ru/en/node/922
http://slideplayer.com/slide/8146222/
Convolutions for graphs #1
Deep Convolutional Networks on
Graph-Structured Data
Mikael Henaff, Joan Bruna, Yann LeCun
(Submitted on 16 Jun 2015)
https://arxiv.org/abs/1506.05163
https://github.com/mdeff/cnn_graph
However, as our results demonstrate, their extension poses significant challenges:
• Although the learning complexity requires O(1) parameters per feature map, the evaluation, both forward and backward, requires a multiplication by the Graph Fourier Transform, which costs O(N²) operations. This is a major difference with respect to traditional ConvNets, which require only O(N). Fourier implementations of ConvNets bring the complexity to O(N log N) thanks again to the specific symmetries of the grid. An open question is whether one can find approximate eigenbases of general Graph Laplacians using Givens decompositions similar to those of the FFT (see the numpy sketch after this list).
• Our experiments show that when the input graph structure is not known a priori, graph estimation is the statistical bottleneck of the model, requiring O(N²) for general graphs and O(MN) for M-dimensional graphs. Supervised graph estimation performs significantly better than unsupervised graph estimation based on low-order moments. Furthermore, we have verified that the architecture is quite sensitive to graph estimation errors. In the supervised setting, this step can be viewed in terms of a bootstrapping mechanism, where an initially unconstrained network is self-adjusted to become more localized and to share weights.
• Finally, the statistical assumptions of stationarity and compositionality are not always verified. In those situations, the constraints imposed by the model risk reducing its capacity for no reason. One possibility for addressing this issue is to insert fully-connected layers between the input and the spectral layers, such that data can be transformed into the appropriate statistical model. Another strategy, left for future work, is to relax the notion of weight sharing by introducing instead a commutation error ‖Wᵢ L − L Wᵢ‖ with the graph Laplacian, which puts a soft penalty on transformations that do not commute with the Laplacian, instead of imposing exact commutation as is the case in the spectral net.
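To make the O(N²) bottleneck concrete, here is a minimal numpy sketch of free-form spectral filtering via the graph Fourier transform; the function names and the random toy graph are illustrative, not from the paper.

```python
import numpy as np

def graph_laplacian(A):
    # Unnormalized combinatorial Laplacian L = D - A
    return np.diag(A.sum(axis=1)) - A

def spectral_graph_conv(x, A, theta):
    # Free-form spectral filtering as in the early spectral nets:
    # project the signal onto the Laplacian eigenbasis (graph Fourier
    # transform), scale every frequency by a learned coefficient,
    # project back. The two dense products with U are the O(N^2)
    # bottleneck discussed above.
    lam, U = np.linalg.eigh(graph_laplacian(A))  # O(N^3), done once
    x_hat = U.T @ x          # graph Fourier transform, O(N^2)
    y_hat = theta * x_hat    # element-wise spectral filter, O(N)
    return U @ y_hat         # inverse transform, O(N^2)

# Toy usage on a random symmetric graph
rng = np.random.default_rng(0)
N = 50
A = np.triu((rng.random((N, N)) < 0.1).astype(float), 1)
A = A + A.T
y = spectral_graph_conv(rng.standard_normal(N), A,
                        rng.standard_normal(N))
```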
We explore two areas of application to which it has not previously been possible to apply convolutional networks: text categorization and bioinformatics. Our results show that our method is capable of matching or outperforming large, fully-connected networks trained with dropout, using fewer parameters.
Our main contributions can be summarized as follows:
● We extend the ideas from Bruna et al. (2013) to large-scale
classification problems, specifically Imagenet Object
Recognition, text categorization and bioinformatics.
● We consider the most general setting where no prior information
on the graph structure is available, and propose unsupervised
and new supervised graph estimation strategies in combination
with the supervised graph convolutions.
Convolutions for graphs #2
Learning Convolutional Neural Networks
for Graphs
Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov ;
Proceedings of The 33rd International Conference on Machine
Learning, PMLR 48:2014-2023, 2016.
http://proceedings.mlr.press/v48/niepert16.html
A CNN with a receptive field of size 3x3. The field is moved over an image from
left to right and top to bottom using a particular stride (here: 1) and zero-
padding (here: none) (a). The values read by the receptive fields are
transformed into a linear layer and fed to a convolutional architecture (b). The
node sequence for which the receptive fields are created and the shapes of
the receptive fields are fully determined by the hyper-parameters.
An illustration of the proposed architecture. A node sequence is selected
from a graph via a graph labeling procedure. For some nodes in the sequence,
a local neighborhood graph is assembled and normalized. The normalized
neighborhoods are used as receptive fields and combined with existing CNN
components.
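As a rough sketch of the receptive-field idea (not the exact PATCHY-SAN procedure: the 1-dimensional Weisfeiler-Lehman labeling is replaced here by a simple BFS-distance/degree ranking, and the node attribute is just the degree):

```python
import numpy as np
import networkx as nx

def receptive_field(G, root, k):
    # Fixed-size receptive field around `root`: collect nodes
    # breadth-first, rank them by (BFS distance, degree), crop to k
    # nodes or pad with dummies. The ranking stands in for the 1-D
    # Weisfeiler-Lehman labeling used in the paper.
    dist = nx.single_source_shortest_path_length(G, root)
    nodes = sorted(dist, key=lambda v: (dist[v], -G.degree(v)))[:k]
    nodes += [None] * (k - len(nodes))           # dummy padding
    # One node-attribute channel (here simply the degree; 0 = dummy)
    return np.array([float(G.degree(v)) if v is not None else 0.0
                     for v in nodes])

G = nx.karate_club_graph()
# Node sequence selected by a global labeling (here: degree ranking)
sequence = sorted(G.nodes, key=G.degree, reverse=True)[:8]
fields = np.stack([receptive_field(G, v, k=9) for v in sequence])
print(fields.shape)  # (8, 9): 8 receptive fields of width k = 9
```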
The normalization is performed for each of the graphs induced on the neighborhood of a root node v (the red node; node colors indicate distance to
the root node). A graph labeling is used to rank the nodes and to create the normalized receptive fields, one of size k (here: k = 9) for node attributes
and one of size k × k for edge attributes. Normalization also includes cropping of excess nodes and padding with dummy nodes. Each vertex (edge)
attribute corresponds to an input channel with the respective receptive field.
Visualization of RBM features learned with 1-dimensional WL normalized receptive fields of size 9 for a torus (periodic lattice, top left), a preferential
attachment graph (Barabási & Albert 1999, bottom left), a co-purchasing network of political books (top right), and a random graph (bottom right).
Instances of these graphs with about 100 nodes are depicted on the left. A visual representation of the feature’s weights (the darker a pixel, the stronger
the corresponding weight) and 3 graphs sampled from the RBMs by setting all but the hidden node corresponding to the feature to zero. Yellow nodes
have position 1 in the adjacency matrices
“Directions for future work include the use of alternative neural network architectures such as recurrent neural networks (RNNs); combining different receptive field sizes; pretraining with restricted Boltzmann machines (RBMs) and autoencoders; and statistical relational models based on the ideas of the approach.”
Convolutions for graphs #3
Geometric deep learning on graphs
and manifolds using mixture model
CNNs
Federico Monti, Davide Boscaini, Jonathan Masci,
Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein
Submitted on 25 Nov 2016 (v1), last revised 6 Dec 2016
(this version, v3))
https://arxiv.org/abs/1611.08402
Left: intrinsic local polar coordinates ρ, θ on a manifold around a point marked in white. Right: patch operator weighting functions wᵢ(ρ, θ) used in different generalizations of convolution on the manifold (hand-crafted in GCNN and ACNN, learned in MoNet). All kernels are L∞-normalized; red curves represent the 0.5 level set.
Representation of images as graphs. Left: regular grid
(the graph is fixed for all images). Right: graph of superpixel
adjacency (different for each image). Vertices are shown as
red circles, edges as red lines.
Learning configuration used for the Cora and PubMed experiments.
Predictions obtained by applying MoNet to the Cora dataset. Marker fill color represents the predicted class; marker outline color represents the ground-truth class.
In this paper, we propose a unified framework allowing to generalize CNN architectures to non-Euclidean domains (graphs
and manifolds) and learn local, stationary, and compositional task-specific features. We show that various non-Euclidean CNN
methods previously proposed in the literature can be considered as particular instances of our framework. We test the
proposed method on standard tasks from the realms of image-, graph- and 3D shape analysis and show that it consistently
outperforms previous approaches.
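A minimal sketch of the MoNet ingredient, Gaussian-mixture weighting functions over edge pseudo-coordinates; names are illustrative, and the normalization and multi-channel bookkeeping of the full model are omitted:

```python
import numpy as np

def monet_patch_weights(u, mu, sigma_inv):
    # MoNet weighting functions: w_j(u) = exp(-0.5 (u - mu_j)^T
    # Sigma_j^{-1} (u - mu_j)), with u a pseudo-coordinate of an edge
    # (e.g. endpoint degrees on a graph, local polar coordinates on a
    # mesh). mu: (J, d) learned means; sigma_inv: (J, d, d) learned
    # inverse covariances. Returns the J kernel weights for one edge.
    diff = u - mu
    quad = np.einsum('jd,jde,je->j', diff, sigma_inv, diff)
    return np.exp(-0.5 * quad)

def monet_conv(X, edges, pseudo, mu, sigma_inv, W):
    # One (unnormalized) MoNet convolution: for each edge (i, k) the
    # neighbor feature X[k] is weighted by the J kernels and mixed
    # with per-kernel matrices W[j] of shape (d_in, d_out).
    out = np.zeros((X.shape[0], W.shape[2]))
    for e, (i, k) in enumerate(edges):
        w = monet_patch_weights(pseudo[e], mu, sigma_inv)
        for j in range(len(w)):
            out[i] += w[j] * (X[k] @ W[j])
    return out
```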
Convolutions for graphs #4
Convolutional Neural Networks on Graphs
with Fast Localized Spectral Filtering
Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst
Advances in Neural Information Processing Systems 29 (NIPS 2016)
https://arxiv.org/abs/1606.09375
https://github.com/mdeff/cnn_graph
https://youtu.be/cIA_m7vwOVQ
Architecture of a CNN on graphs and the four ingredients of a (graph)
convolutional layer.
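The localized spectral filter at the heart of this architecture can be sketched in a few lines: a K-th order Chebyshev expansion needs only K multiplications by the rescaled Laplacian instead of the O(N²) graph Fourier transform. A dense-numpy sketch, assuming a normalized Laplacian with λmax ≈ 2 and K ≥ 2:

```python
import numpy as np

def chebyshev_filter(L, x, theta, lmax=2.0):
    # K-th order Chebyshev approximation of a spectral filter:
    # y = sum_k theta[k] T_k(L_tilde) x with the recurrence
    # T_k = 2 L_tilde T_{k-1} - T_{k-2}. Only K (sparse, in the real
    # implementation) mat-vec products; no eigendecomposition and no
    # O(N^2) graph Fourier transform. Assumes len(theta) >= 2.
    N = L.shape[0]
    L_tilde = (2.0 / lmax) * L - np.eye(N)  # spectrum mapped to [-1, 1]
    Tx_prev, Tx = x, L_tilde @ x            # T_0 x and T_1 x
    y = theta[0] * Tx_prev + theta[1] * Tx
    for k in range(2, len(theta)):
        Tx_prev, Tx = Tx, 2.0 * (L_tilde @ Tx) - Tx_prev
        y += theta[k] * Tx
    return y
```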
It is however known that graph clustering is NP-hard [Bui and Jones, 1992] and
that approximations must be used. While there exist many clustering
techniques, e.g. the popular spectral clustering [von Luxburg, 2007], we are
most interested in multilevel clustering algorithms where each level produces a
coarser graph which corresponds to the data domain seen at a different
resolution.
Future work will investigate two directions. On one hand, we will enhance the proposed framework with newly developed tools in GSP. On the other hand, we will explore applications of this generic model to important fields where the data naturally lies on graphs, which may then incorporate external information about the structure of the data rather than artificially created graphs whose quality may vary, as seen in the experiments.
Another natural future approach, pioneered in [Henaff et al. 2015], would be to alternate the learning of the CNN parameters and the graph.
Convolutions for graphs #5
Top: schematic illustration of a standard CNN where patches of w×h pixels are convolved with D×E filters to map the D-dimensional input features to E-dimensional output features.
Middle: the same, but representing the CNN parameters as a set of M = w×h weight matrices, each of size D×E. Each weight matrix is associated with a single relative position in the input patch.
Bottom: our graph convolutional network, where each relative position in the input patch is associated in a soft manner with each of the M weight matrices using the function q(xᵢ, xⱼ).
Convolutions for graphs #6
CayleyNets: Graph Convolutional Neural
Networks with Complex Rational Spectral Filters
Ron Levie, Federico Monti, Xavier Bresson, Michael M. Bronstein
(Submitted on 22 May 2017)
https://arxiv.org/abs/1705.07664
The core ingredient of our model is a new class of parametric rational complex functions (Cayley polynomials) allowing to efficiently compute localized regular filters on graphs that specialize on frequency bands of interest. Our model scales linearly with the size of the input data for sparsely-connected graphs, can handle different constructions of Laplacian operators, and typically requires fewer parameters than previous models.
Filters (spatial domain, top and spectral domain, bottom) learned by
CayleyNet (left) and ChebNet (center, right) on the MNIST dataset.
Cayley filters are able to realize larger supports for the same order r.
Eigenvalues of the unnormalized Laplacian hΔᵤ of the 15-communities graph mapped onto the complex unit half-circle by means of the Cayley transform with spectral zoom values (left to right) h = 0.1, 1, and 10. The first 15 frequencies, carrying most of the information about the communities, are marked in red. Larger values of h zoom (right) onto the low-frequency band.
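The spectral zoom itself is easy to reproduce: the Cayley transform maps the real Laplacian spectrum onto the complex unit half-circle, and h dilates the spectrum before the mapping. A small sketch on toy eigenvalues (not the 15-communities graph):

```python
import numpy as np

def cayley_transform(lam, h):
    # C(h * lam) = (h*lam - i) / (h*lam + i) maps the non-negative
    # Laplacian spectrum onto the complex unit half-circle; the
    # spectral zoom h dilates the spectrum first, spreading the low
    # frequencies apart before filtering.
    z = h * lam
    return (z - 1j) / (z + 1j)

lam = np.linspace(0.0, 10.0, 200)           # toy eigenvalues
for h in (0.1, 1.0, 10.0):
    c = cayley_transform(lam, h)
    assert np.allclose(np.abs(c), 1.0)      # all on the unit circle
```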
Convolutions for graphs #7
Graph Convolutional Matrix Completion
Rianne van den Berg, Thomas N. Kipf, Max Welling
(Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02263
Left: rating matrix M with entries that correspond to user-item interactions (ratings between 1 and 5) or missing observations (0). Right: user-item interaction graph with bipartite structure. Edges correspond to interaction events; numbers on edges denote the rating a user has given to a particular item. The matrix completion task (i.e., predictions for unobserved interactions) can be cast as a link prediction problem and modeled using an end-to-end trainable graph auto-encoder.
Schematic of a forward pass through the GC-MC model, which is composed of a graph convolutional encoder [U, V] = f(X, M₁, . . . , M_R) that passes and transforms messages from user to item nodes, and vice versa, followed by a bilinear decoder model that predicts entries of the (reconstructed) rating matrix M = g(U, V), based on pairs of user and item embeddings.
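A sketch of the bilinear-decoder half of such a model, assuming one trainable matrix Q_r per rating level and a softmax over levels; the graph convolutional encoder and the paper's weight sharing are omitted:

```python
import numpy as np

def bilinear_decoder(U, V, Q, levels):
    # For each rating level r, score s_r(u, i) = u^T Q[r] i; a softmax
    # over levels gives p(M_ui = r) and the prediction is the expected
    # rating. U: (n_users, d), V: (n_items, d) encoder embeddings;
    # Q: (R, d, d) trainable per-level matrices; levels: (R,) values.
    s = np.einsum('ud,rde,ie->uir', U, Q, V)   # (users, items, R)
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)         # softmax over levels
    return p @ levels                          # expected rating

rng = np.random.default_rng(0)
U, V = rng.standard_normal((100, 8)), rng.standard_normal((50, 8))
Q = rng.standard_normal((5, 8, 8))
M_hat = bilinear_decoder(U, V, Q, np.arange(1.0, 6.0))
print(M_hat.shape)  # (100, 50): dense predicted rating matrix
```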
“Our model can be seen as a first step towards
modeling recommender systems where the
interaction data is integrated into other structured
modalities, such as a social network or a
knowledge graph.
As a next step, it would be interesting to investigate
how the differentiable message passing scheme of
our encoder model can be extended to such
structured data environments. We expect that
further approximations, e.g. subsampling of local
graph neighborhoods, will be necessary in order to
keep requirements in terms of computation and
memory in a feasible range.”
Convolutions for graphs #8
Graph Based Convolutional Neural Network
Michael Edwards, Xianghua Xie
(Submitted on 28 Sep 2016)
https://arxiv.org/abs/1609.08965
Graph based Convolutional Neural Network components. The GCNN is designed from an architecture of
graph convolution and pooling operator layers. Convolution layers generate O output feature maps
dependent on the selected O for that layer. Graph pooling layers will coarsen the current graph and graph
signal based on the selected vertex reduction method.
Two levels of graph pooling operation on regular and irregular grid with MNIST signal. From left: Regular grid, AMG level 1, AMG
level 2, Irregular grid, AMG level 1, AMG level 2.
Feature maps formed by a feed-forward pass of the regular domain. From left: Original image, Convolution round 1, Pooling round
1, Convolution round 2, Pooling round 2
Feature maps formed by a feed-forward pass of the irregular domain. From left: Original image, Convolution round 1, Pooling
round 1, Convolution round 2, Pooling round 2.
This study proposes a novel method of performing deep convolutional learning on the
irregular graph by coupling standard graph signal processing techniques and
backpropagation based neural network design.
Convolutions are performed in the spectral domain of the graph Laplacian and allow for the learning of
spatially localized features whilst handling the nontrivial irregular kernel design. Results are provided on
both a regular and irregular domain classification problem and show the ability to learn localized feature
maps across multiple layers of a network. A graph pooling method is provided that agglomerates
vertices in the spatial domain to reduce complexity and generalize the features learnt. A GPU implementation of the algorithm improves training and testing speed; however, further optimization is needed. Although the results on the regular grid are outperformed by a standard CNN architecture, this is understandable due to the direct use of a local kernel in the spatial domain.
The major contribution over standard CNNs, the ability to operate on irregular graphs, is not to be underestimated. Graph-based CNN requires costly forward and inverse graph Fourier transforms, and this requires some work to enhance usability in the community. Ongoing study into graph construction and reduction techniques is required to encourage uptake by a wider range of problem domains.
Convolutions for graphs #9
Generalizing CNNs for data structured on
locations irregularly spaced out
Jean-Charles Vialatte, Vincent Gripon, Grégoire Mercier
(Submitted on 3 Jun 2016 (v1), last revised 4 Jul 2017 (this version, v3))
https://arxiv.org/abs/1606.01166
In this paper, we have defined a generalized convolution operator. This operator makes it possible to transport the CNN paradigm to irregular domains. It retains the properties of a regular convolution operator: it is linear, locally supported, and uses the same kernel of weights for each local operation. The generalized convolution operator can then naturally be used instead of convolutional layers in a deep learning framework. Typically, the created model is well suited to input data that has an underlying graph structure.
The definition of this operator is flexible enough to allow adapting its weight-allocation map to any input domain, so that, depending on the case, the distribution of the kernel weights can be done in a way that is natural for this domain. However, in some cases there is no single natural way but multiple acceptable methods to define the weight allocation. In further work, we plan to study these methods. We also plan to apply the generalized operator to unsupervised learning tasks.
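A compact sketch of what such a weight-allocation map could look like in code: alloc[i] assigns each neighbor of node i to one of M shared weight matrices, the irregular-domain analogue of a pixel's relative position in a patch (illustrative names, not the paper's notation):

```python
import numpy as np

def generalized_conv(X, neighbors, alloc, W):
    # Generalized convolution on an irregular domain: node i sums the
    # features of its neighbors, each transformed by one of M shared
    # kernel matrices W[m] chosen by the weight-allocation map
    # alloc[i][n]. X: (N, d_in); W: (M, d_in, d_out); the operator is
    # linear, locally supported, and reuses the same kernel everywhere.
    out = np.zeros((X.shape[0], W.shape[2]))
    for i, nbrs in enumerate(neighbors):
        for n, m in zip(nbrs, alloc[i]):
            out[i] += X[n] @ W[m]
    return out
```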
Convolutions for graphs #10
Robust Spatial Filtering with Graph
Convolutional Neural Networks
Felipe Petroski Such, Shagan Sah, Miguel Dominguez, Suhas Pillai,
Chao Zhang, Andrew Michael, Nathan Cahill, Raymond Ptucha
(Submitted on 2 Mar 2017 (v1), last revised 14 Jul 2017 (this version, v3))
https://arxiv.org/abs/1703.00792
https://github.com/fps7806/Graph-CNN
Two types of graph datasets. Left: Homogeneous
datasets. All samples in a homogeneous graph data
have identical graph structure, but different vertex
values or “signals”. Right: Heterogeneous graph
samples. Heterogeneous graph samples can vary in
number of vertices, structure of edge connections,
and in the vertex values.
General vertex-edge domain Graph-CNN architecture. Convolution and pooling layers are cascaded into a deep network. FC are fully-
connected layers for graph classification. V is vertex set and A is adjacency matrix that define a graph.
Graph convolution and pooling setting. The convolution operation obtains a filtered representation of the graph after a multi-hop vertex filter; likewise, the pooling layer produces a compact representation of the graph.
Convolutions for graphs #11
A Generalization of Convolutional Neural
Networks to Graph-Structured Data
Yotam Hechtlinger, Purvasha Chakravarti, Jining Qin
(Submitted on 26 Apr 2017)
https://arxiv.org/abs/1704.08165
https://github.com/hechtlinger/graph_cnn
Visualization of the graph convolution of size 5. For a given node, the convolution is applied to the node and its 4 closest neighbors selected by the random walk. As the right figure demonstrates, the random walk can expand further into the graph to higher-degree neighbors. The convolution weights are shared according to the neighbors' closeness to the node and applied globally on all nodes.
Visualization of a row of Q^(k) on the graph generated over the 2-D grid at a node near the center, when connecting each node to its 8 adjacent neighbors. For k = 1, most of the weight is on the node itself, with smaller weights on the first-order neighbors; this corresponds to a standard 3 × 3 convolution. As k increases, the number of active neighbors also increases, giving greater weight to neighbors farther away while still keeping the local information.
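A sketch of how such a random-walk support can be selected, assuming each node's convolution support is its p most-visited neighbors under an accumulated k-step transition matrix (a simplification of the paper's powers-of-the-transition-matrix construction):

```python
import numpy as np

def random_walk_support(A, p, k):
    # Accumulated expected-visit weights of a k-step random walk:
    # P = D^{-1} A is the transition matrix and Q = P + P^2 + ... + P^k
    # measures multi-hop closeness. Each node's convolution support is
    # its p most-visited neighbors under Q.
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)
    Q, Pk = np.zeros_like(P), np.eye(A.shape[0])
    for _ in range(k):
        Pk = Pk @ P
        Q += Pk
    np.fill_diagonal(Q, -np.inf)              # exclude the node itself
    return np.argsort(-Q, axis=1)[:, :p]      # (N, p) neighbor indices
```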
We propose a generalization of convolutional neural networks from grid-structured data to
graph-structured data, a problem that is being actively researched by our community. Our novel
contribution is a convolution over a graph that can handle different graph structures as its
input. The proposed convolution contains many sought-after attributes; it has a natural and
intuitive interpretation, it can be transferred within different domains of knowledge, it is
computationally efficient and it is effective.
Furthermore, the convolution can be applied on standard regression or classification problems
by learning the graph structure in the data, using the correlation matrix or other methods.
Compared to a fully connected layer, the suggested convolution has significantly fewer parameters
while providing stable convergence and comparable performance. Our experimental results on the
Merck Molecular Activity data set and MNIST data demonstrate the potential of this approach.
Convolutional Neural Networks have already revolutionized the fields of computer vision, speech recognition and language processing. We think an important step forward is to extend them to other problems which have an inherent graph structure.
Autoencoders for graphs
Variational Graph Auto-Encoders
Thomas N. Kipf, Max Welling
(Submitted on 21 Nov 2016)
https://arxiv.org/abs/1611.07308
https://github.com/tkipf/gae
→ http://tkipf.github.io/graph-convolutional-networks/
Latent space of unsupervised VGAE model
trained on Cora citation network dataset.
Grey lines denote citation links. Colors
denote document class (not provided during
training).
Future work will investigate
better-suited prior
distributions (instead of
Gaussian here), more flexible
generative models and the
application of a stochastic
gradient descent algorithm for
improved scalability.
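The generative half of the model is just an inner-product decoder plus the usual reparameterization trick; a minimal numpy sketch with the GCN encoder omitted:

```python
import numpy as np

def reparameterize(mu, log_sigma, rng):
    # z = mu + sigma * eps, eps ~ N(0, I): keeps sampling from the
    # encoder's posterior differentiable w.r.t. mu and sigma.
    return mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)

def vgae_decode(Z):
    # Inner-product decoder: p(A_ij = 1 | z_i, z_j) = sigmoid(z_i.z_j)
    # over all node pairs, reconstructing the adjacency matrix from
    # the latent node embeddings Z of shape (N, d).
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))
```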
Modeling Relational Data with Graph Convolutional Networks
Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling
(Submitted on 17 Mar 2017 (v1), last revised 6 Jun 2017 (this version, v3))
https://arxiv.org/abs/1703.06103
In this work, we introduce relational GCNs (R-GCNs). R-GCNs are specifically designed to deal with highly multi-relational data,
characteristic of realistic knowledge bases. Our entity classification model, similarly to Kipf and Welling [see left], uses softmax
classifiers at each node in the graph. The classifiers take node representations supplied by an R-GCN and predict the labels. The
model, including R-GCN parameters, is learned by optimizing the cross-entropy loss. Our link prediction model can be regarded as
an autoencoder consisting of (1) an encoder: an R-GCN producing latent feature representations of entities, and (2) a decoder: a
tensor factorization model exploiting these representations to predict labeled edges. Though in principle the decoder can rely on any
type of factorization (or generally any scoring function), we use one of the simplest and most effective factorization methods: DistMult [Yang et al. 2014].
(a) R-GCN per-layer update for a single graph node (in light red). Activations from neighboring nodes (dark blue) are
gathered and then transformed for each relation type individually (for both in- and outgoing edges). The resulting
representation is accumulated in a (normalized) sum and passed through an activation function (such as the ReLU). This
per-node update can be computed in parallel with shared parameters across the whole graph. (b) Depiction of an R-GCN
model for entity classification with a per-node loss function. (c) Link prediction model with an R-GCN encoder
(interspersed with fully-connected/dense layers) and a DistMult decoder that takes pairs of hidden node representations
and produces a score for every (potential) edge in the graph. The loss is evaluated per edge.
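A dense-numpy sketch of the per-layer R-GCN update described in (a), without the basis or block-diagonal weight decompositions the paper introduces for parameter efficiency:

```python
import numpy as np

def rgcn_layer(H, adj_per_rel, W_rel, W_self):
    # R-GCN per-layer update: neighbor features are transformed by a
    # relation-specific matrix W_rel[r], normalized by the number of
    # r-neighbors, summed with a self-connection, and passed through
    # a ReLU. H: (N, d_in); adj_per_rel: one (N, N) 0/1 matrix per
    # relation (including inverse relations); W_rel: (R, d_in, d_out).
    out = H @ W_self
    for r, A in enumerate(adj_per_rel):
        c = np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # c_{i,r}
        out += (A / c) @ (H @ W_rel[r])
    return np.maximum(out, 0.0)  # ReLU
```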
Representation Learning For graphs #1
Inductive Representation Learning on Large Graphs
William L. Hamilton, Rex Ying, Jure Leskovec (Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02216
http://snap.stanford.edu/graphsage/
We propose a general framework, called GraphSAGE (SAmple and
aggreGatE), for inductive node embedding. Unlike embedding approaches that
are based on matrix factorization, we leverage node features (e.g., text
attributes, node profile information, node degrees) in order to learn an
embedding function that generalizes to unseen nodes. By incorporating node
features in the learning algorithm, we simultaneously learn the topological
structure of each node’s neighborhood as well as the distribution of node
features in the neighborhood. While we focus on feature-rich graphs (e.g.,
citation data with text attributes, biological data with functional/molecular
markers), our approach can also make use of structural features that are
present in all graphs (e.g., node degrees). Thus, our algorithm can also be
applied to graphs without node features (i.e., point clouds with only the xyz coordinates, without RGB texture, normals, etc.).
Low-dimensional vector embeddings of nodes in large graphs have proved
extremely useful as feature inputs for a wide variety of prediction and graph analysis
tasks. The basic idea behind node embedding approaches is to use dimensionality
reduction techniques to distill the high-dimensional information about a node’s
neighborhood into a dense vector embedding. These node embeddings can then be
fed to downstream machine learning systems and aid in tasks such as node
classification, clustering, and link prediction (e.g. LINE, see below).
However, previous works have focused on embedding nodes from a single fixed graph,
and many real-world applications require embeddings to be quickly generated for
unseen nodes, or entirely new (sub)graphs. This inductive capability is essential for
high-throughput, production machine learning systems, which operate on evolving
graphs and constantly encounter unseen nodes (e.g., posts on Reddit, users and videos
on Youtube). An inductive approach to generating node embeddings also facilitates
generalization across graphs with the same form of features: for example, one could
train an embedding generator on protein-protein interaction graphs derived from a
model organism, and then easily produce node embeddings for data collected on new
organisms using the trained model.
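A minimal sketch of one GraphSAGE layer with the mean aggregator; fixed-size neighbor sampling and a shared projection matrix (rather than a per-node embedding table) are what make the learned function inductive. Names are illustrative:

```python
import numpy as np

def graphsage_mean_layer(H, neighbors, W, rng, sample_size=5):
    # One GraphSAGE layer, mean aggregator: sample a fixed number of
    # neighbors, average their features, concatenate with the node's
    # own features, project with the shared matrix W of shape
    # (2d, d_out), and apply ReLU. Assumes every node has a neighbor.
    out = []
    for v, nbrs in enumerate(neighbors):
        idx = rng.choice(nbrs, size=min(sample_size, len(nbrs)),
                         replace=False)
        agg = H[idx].mean(axis=0)
        out.append(np.concatenate([H[v], agg]) @ W)
    return np.maximum(np.array(out), 0.0)
```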
LINE: Large-scale Information Network Embedding
Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, Qiaozhu Mei
(Submitted on 12 Mar 2015)
https://arxiv.org/abs/1503.03578
https://github.com/tangjianpku/LINE
Representation Learning For graphs #2
Skip-graph: Learning graph embeddings with an
encoder-decoder model
John Boaz Lee, Xiangnan Kong
04 Nov 2016 (modified: 11 Jan 2017) ICLR 2017 conference submission
https://openreview.net/forum?id=BkSqjHqxg&noteId=BkSqjHqxg
We introduced an unsupervised method, based on the encoder-decoder model, for
generating feature representations for graph-structured data. The model was
evaluated on the binary classification task on several real-world datasets. The
method outperformed several state-of-the-art algorithms on the tested datasets.
There are several interesting directions for future work. For instance, we can try
training multiple encoders on random walks generated using very different
neighborhood selection strategies. This may allow the different encoders to capture
different properties in the graphs. We would also like to test the approach using
different neural network architectures. Finally, it would be interesting to test the
method on other types of heterogeneous information networks.
Semi-supervised Learning For graphs
Neural Graph Machines: Learning Neural Networks Using Graphs
Thang D. Bui, Sujith Ravi, Vivek Ramavajjala
University of Cambridge, United Kingdom; Google Research, Mountain View, CA, USA
(Submitted on 14 Mar 2017)
https://arxiv.org/abs/1703.04818
We have revisited graph-augmentation training of neural networks and proposed Neural Graph Machines as a general framework for doing so. Its label propagation objective function (for semi-supervised CNNs see e.g. Tarvainen and Valpola 2017) encourages the neural networks to make accurate node-level predictions, as in vanilla neural network training, and also constrains the networks to learn similar hidden representations for nodes connected by an edge in the graph. Importantly, the objective can be trained by stochastic gradient descent and scaled to large graphs.
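A toy sketch of such a graph-augmented objective, assuming a cross-entropy term plus a squared-distance penalty on the hidden representations of adjacent nodes; the weighting schedule and the distance choice vary in the paper:

```python
import numpy as np

def graph_augmented_loss(logits, labels, H, edges, edge_w, alpha):
    # Supervised cross-entropy plus a graph penalty that pulls the
    # hidden representations H of edge-connected nodes together.
    # `alpha` trades prediction accuracy against smoothness over the
    # graph; any base network (FFNN, CNN, LSTM) can produce logits/H.
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    reg = sum(w * np.sum((H[u] - H[v]) ** 2)
              for (u, v), w in zip(edges, edge_w))
    return ce + alpha * reg
```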
We validated the efficacy of the graph-augmented objective on various tasks including bloggers' interest, text category and semantic intent classification problems, using a wide range of neural network architectures (FFNNs, CNNs and LSTM RNNs). The experimental results demonstrated that graph-augmented training almost always helps to find better neural networks that outperform other techniques in predictive performance, or even much smaller networks that are faster and easier to train. Additionally, the node-level input features can be combined with graph features as inputs to the neural networks. We showed that a neural network that simply takes the adjacency matrix of a graph and produces node labels can perform better than a recently proposed two-stage approach using sophisticated graph embeddings and a linear classifier. Our framework also excels when the neural network is small, or when there is limited supervision available.
While our objective can be applied to multiple graphs which come from different domains, we have not fully explored this aspect and leave it as future work. We expect that domain-specific networks can interact with the graphs to determine the importance of each domain/graph source in prediction. We also did not explore using graph regularisation for different hidden layers of the neural networks; we expect this to be key for the multi-graph transfer setting (Yosinski et al., 2014). Another possible future extension is to use our objective on directed graphs, that is, to control the direction of influence between nodes during training.
Recurrent Networks for graphs #1
Geometric Matrix Completion with Recurrent
Multi-Graph Neural Networks
Federico Monti, Michael M. Bronstein, Xavier Bresson
(Submitted on 22 Apr 2017)
https://arxiv.org/abs/1704.06803
Main contribution. In this work, we treat the matrix completion problem as deep learning on graph-structured data. We introduce a novel neural network architecture that is able to extract local stationary patterns from the high-dimensional spaces of users and items, and use these meaningful representations to infer the non-linear temporal diffusion mechanism of ratings. The spatial patterns are extracted by a new CNN architecture designed to work on multiple graphs. The temporal dynamics of the rating diffusion are produced by a Long Short-Term Memory (LSTM) recurrent neural network (RNN). To our knowledge, our work is the first application of graph-based deep learning to the matrix completion problem.
Recurrent GCNN (RGCNN) architecture using the full matrix completion model and operating simultaneously on the rows and columns of the matrix X. The output of the Multi-Graph CNN (MGCNN) module is a q-dimensional feature vector for each element of the input matrix. The number of parameters to learn is O(1) and the learning complexity is O(mn).
Separable Recurrent GCNN (sRGCNN) architecture using the factorized matrix completion model and operating separately on the rows and columns of the factors W, Hᵀ. The output of the GCNN module is a q-dimensional feature vector for each input row/column, respectively. The number of parameters to learn is O(1) and the learning complexity is O(m + n).
Evolution of the matrix X(t) with our architecture using full matrix completion model RGCNN (top) and factorized matrix completion model
sRGCNN (bottom). Numbers indicate the RMS error.
Absolute value of the first 8 spectral filters learnt by our bidimensional
convolution. On the left the first filter with the reference axes
associated to the row and column graph eigenvalues.
Recurrent Networks for graphs #2
Learning From Graph Neighborhoods Using
LSTMs
Rakshit Agrawal, Luca de Alfaro, Vassilis Polychronopoulos
(Submitted on 21 Nov 2016)
https://arxiv.org/abs/1611.06882
https://sites.google.com/view/ml-on-structures
→ https://github.com/ML-on-structures/blockchain-lstm
→ → Bitcoin blockchain data used in paper
“The approach is based on a multi-level architecture built from Long Short-Term
Memory neural nets (LSTMs); the LSTMs learn how to summarize the
neighborhood from data. We demonstrate the effectiveness of the proposed
technique on a synthetic example and on real-world data related to
crowdsourced grading, Bitcoin transactions, and Wikipedia edit reversions.”
The blockchain is the public immutable distributed ledger where Bitcoin transactions are recorded [20]. In Bitcoin, coins
are held by addresses, which are hash values; these address identifiers are used by their owners to anonymously hold
bitcoins, with ownership provable with public key cryptography. A Bitcoin transaction involves a set of source addresses,
and a set of destination addresses: all coins in the source addresses are gathered, and they are then sent in various
amounts to the destination addresses.
Mining data on the blockchain is challenging [Meiklejohn et al. 2013] due to the anonymity of addresses. We use data
from the blockchain to predict whether an address will spend the funds that were deposited to it.
We obtain a dataset of addresses by using a slice of the blockchain. In particular, we consider all the addresses where deposits happened in a short range of 101 blocks, from 200,000 to 200,100 (inclusive). These contain 15,709 unique addresses where deposits took place. Looking at the state of the blockchain 50,000 blocks later (which corresponds to roughly one year, as each block is mined on average every 10 minutes), 3,717 of those addresses still had funds sitting; we call these "hoarding addresses". The goal is to predict which addresses are hoarding addresses and which spent the funds. We randomly split the 15,709 addresses into a training set of 10,000 and a validation set of 5,709 addresses.
We built a graph with addresses as nodes, and transactions as edges. Each edge was labeled with features of the
transaction: its time, amount of funds transmitted, number of recipients, and so forth, for a total of 9 features. We
compared two different algorithms:
● Baseline: an informative guess; it guesses a label with a probability equal to its percentage in the training set.
● MLSL of depths 1, 2, 3. The output and memory sizes of the learners for the reported results are K2 = K3 = 3. Increasing these to 5 maintained virtually the same performance while increasing training time; using only 1 output and memory cell did not provide any gain in performance.
Quantitative Analysis of the Full Bitcoin Transaction Graph
Dorit Ron, Adi Shamir Financial Cryptography 2012
http://doi.org/10.1007/978-3-642-39884-1_2
Time-series analysis with graphs #1
Spectral Algorithms for Temporal Graph Cuts
Arlei Silva, Ambuj Singh, Ananthram Swami
(Submitted on 15 Feb 2017)
https://arxiv.org/abs/1702.04746
We propose novel formulations and algorithms for
computing temporal cuts using spectral graph theory,
multiplex graphs, divide-and-conquer and low-rank
matrix approximation. Furthermore, we extend our
formulation to dynamic graph signals, where cuts
also capture node values, as graph wavelets.
Experiments show that our solutions are accurate and
scalable, enabling the discovery of dynamic
communities and the analysis of dynamic graph
processes.
This work opens several lines for future investigation: (i) temporal cuts, as a general framework for solving problems involving dynamic data, can be applied in many scenarios; we are particularly interested to see how our method performs in computer vision tasks; (ii) perturbation theory can provide deeper theoretical insights into the properties of temporal cuts [Sole-Ribalta et al. 2013; Taylor et al. 2015]; finally, (iii) we want to study Cheeger inequalities [Chung 1996] for temporal cuts, as a means to better understand the performance of our algorithms.
Temporal graph cut for a primary school network. The cut, represented as node colors, reflects the
network dynamics, capturing major changes in the children’s interactions.
Active learning on Graphs
Active Learning for Graph Embedding
Hongyun Cai, Vincent W. Zheng, Kevin Chen-Chuan Chang
(Submitted on 15 May 2017)
https://arxiv.org/abs/1705.05085
https://github.com/vwz/AGE
In this paper, we proposed a novel active learning framework for graph embedding named Active Graph Embedding (AGE). Unlike traditional active learning algorithms, AGE processes data with structural information and learnt representations (node embeddings), and it is carefully designed to address the challenges brought by these two characteristics.
First, to exploit the graphical information, a graph-centrality-based measurement is considered in addition to the popular information-entropy-based and information-density-based query criteria.
Second, the active learning and graph embedding processes are run jointly, by posing the label query at the end of every epoch of the graph embedding training process. Moreover, time-sensitive weights are put on the three active learning query criteria, which focus on graph centrality at the beginning and shift the focus to the other two embedding-based criteria as the training process progresses (i.e., as more accurate embeddings are learnt).
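A sketch of that time-sensitive weighting; the linear schedule and the equal split between the two embedding-based criteria are illustrative assumptions, not the paper's exact schedule:

```python
import numpy as np

def age_query_scores(centrality, entropy, density, epoch, n_epochs):
    # Combine the three AGE criteria with time-sensitive weights:
    # trust graph centrality early (embeddings are still poor), then
    # shift weight to the embedding-based criteria as training
    # progresses. The query is the unlabeled node with the top score.
    gamma = epoch / max(n_epochs - 1, 1)       # 0 -> 1 over training
    w = np.array([1.0 - gamma, 0.5 * gamma, 0.5 * gamma])
    return w @ np.stack([centrality, entropy, density])   # (N,)
```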
Transfer learning on Graphs
Intrinsic Geometric Information Transfer
Learning on Multiple Graph-Structured
Datasets
Jaekoo Lee, Hyunjae Kim, Jongsun Lee, Sungroh Yoon
(Submitted on 15 Nov 2016 (v1), last revised 5 Dec 2016 (this version, v2))
https://arxiv.org/abs/1611.04687
Conventional CNN works on a regular grid domain (top); proposed
transfer learning framework for CNN, which can transfer intrinsic
geometric information obtained from a source graph domain to a
target graph domain (bottom).
Overview of the proposed method.
Conclusion: We have proposed a new transfer learning framework for deep learning on graph-structured data. Our approach can transfer the intrinsic geometric information learned from the graph representation of the source domain to the target domain. We observed that knowledge transfer between task domains is most effective when the source and target domains possess high similarity in their graph representations. We anticipate that adoption of our methodology will help extend the territory of deep learning to data in non-grid structure as well as to cases with limited quantity and quality of data. To prove this, we are planning to apply our approach to diverse datasets in different domains.
Transfer learning on Graphs #2
Deep Feature Learning for Graphs
Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed
(Submitted on 28 Apr 2017)
https://arxiv.org/abs/1704.08829
This paper presents a general graph representation learning framework called DeepGL for learning deep node and edge representations from large (attributed) graphs. In particular, DeepGL begins by deriving a set of base features (e.g., graphlet features) and automatically learns a multi-layered hierarchical graph representation where each successive layer leverages the output from the previous layer to learn features of a higher order. Contrary to previous work, DeepGL learns relational functions (each representing a feature) that generalize across networks and are therefore useful for graph-based transfer learning tasks. Moreover, DeepGL naturally supports attributed graphs, learns interpretable features, and is space-efficient (by learning sparse feature vectors).
Thus, features learned by DeepGL are interpretable and naturally generalize to across-network transfer learning tasks, as they can be derived on any arbitrary graph. The framework is flexible with many interchangeable components, expressive, interpretable, parallel, and both space- and time-efficient for large graphs, with runtime that is linear in the number of edges.
DeepGL has all the following desired properties:
● Effective for attributed graphs and across-network transfer learning tasks
● Space-efficient requiring up to 6× less memory
● Fast with up to 182× speedup in runtime
● Accurate with a mean improvement of 20% or more on many applications
● Parallel with strong scaling results.
Learning Graphs: learning the graph itself #1
Learning Graph While Training: An Evolving
Graph Convolutional Neural Network
Ruoyu Li, Junzhou Huang
(Submitted on 10 Aug 2017)
https://arxiv.org/abs/1708.04675
“In this paper, we propose a more general and flexible graph convolution network (EGCN) fed by batches of arbitrarily shaped data together with their evolving graph Laplacians, trained in a supervised fashion. Extensive experiments have been conducted to demonstrate the superior performance in terms of both the acceleration of parameter fitting and the significantly improved prediction accuracy on multiple graph-structured datasets.”
In this paper, we explore our approach primarily on
chemical molecular datasets, although the network
can be straightforwardly trained on other graph-
structured data, such as point cloud, social networks
and so on. Our contributions can be summarized as
follows:
● A novel spectral graph convolution layer boosted by
Laplacian learning (SGC-LL) has been proposed to
dynamically update the residual graph Laplacians via metric
learning for deep graph learning.
● Re-parametrization on the feature domain has been introduced in K-hop spectral graph convolution to enable our proposed deep graph learning and to grant graph CNNs a capability of feature extraction on graph data similar to that of classical CNNs on grid data.
● An evolving graph convolution network (EGCN) has
been designed to be fed by a batch of arbitrarily
shaped graph-structured data. The network is able to
construct and learn for each data sample the graph
structure that best serves the prediction part of
network. Extensive experimental results indicate the
benefits from the evolving graph structure of data.
Graph structure as the “signal” for prediction
DeepGraph: Graph Structure Predicts
Network Growth
Cheng Li, Xiaoxiao Guo, Qiaozhu Mei
(Submitted on 20 Oct 2016)
https://arxiv.org/abs/1610.06251
“Extensive experiments on five large collections of real-world networks demonstrate that the
proposed prediction model significantly improves the effectiveness of existing methods,
including linear or nonlinear regressors that use hand-crafted features, graph kernels, and
competing deep learning methods.”
Graph descriptor vs. adjacency matrix. We have described the process of converting an adjacency matrix into our graph descriptor, which is then passed through a deep neural network for further feature extraction. All computation in this process serves to obtain a more effective low-level representation of the topological structure information than the original adjacency matrix.
First, isomorphic graphs can be represented by many different adjacency matrices, while our graph descriptor provides a unique representation for those isomorphic graphs. The unique representation simplifies the neural network structures for network growth prediction.
Second, our graph descriptor provides similar representations for graphs with similar structures. This similarity is less well preserved in the adjacency matrix representation; such information loss could place a great burden on deep neural networks in growth prediction tasks.
Third, our graph descriptor is a universal graph structure representation which does not depend on vertex ordering or the number of vertices, while the adjacency matrix is not.
The motivation for adopting the Heat Kernel Signature (HKS) is its theoretically proven properties in representing graphs: HKS is an intrinsic and informative representation for graphs [31]. Intrinsicness means that isomorphic graphs map to the same HKS representation, and informativeness means that if two graphs have the same HKS representation, then they must be isomorphic graphs.
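The signature itself is short to state in code: given Laplacian eigenpairs (λi, φi), HKS(v, t) = Σi exp(−λi t) φi(v)². A minimal dense sketch:

```python
import numpy as np

def heat_kernel_signature(L, ts):
    # HKS(v, t) = sum_i exp(-lam_i * t) * phi_i(v)^2 for Laplacian
    # eigenpairs (lam_i, phi_i). Permutation-invariant: isomorphic
    # graphs yield the same multiset of per-node signatures.
    lam, Phi = np.linalg.eigh(L)
    return np.stack([(Phi ** 2) @ np.exp(-lam * t) for t in ts],
                    axis=1)   # (N, len(ts)) descriptor per node
```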
A meaningful future direction is to
integrate network structure with other
types of information, such as the content
of information cascades in the network. A
joint representation of multi-modal
information may maximize the
performance of particular prediction
tasks.
DBX First Quarter 2024 Investor Presentation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 

Geometric Deep Learning

  • 1.
  • 2. Structure of the presentation What High-level overview onemergingfieldofgeometric deeplearning(andgraphdeeplearning) How Presentation focusedonstartup-style organizations witheveryone doingabitofeverything,everyone needingtounderstandabit of everything.CEOcannot bethe ‘ideaguy’not knowinganythingabout graphs andgeometricdeeplearning,ifyouare operatingin thisspace EFFECTUATION – THE BEST THEORY OF ENTREPRENEURSHIP YOU ACTUALLY FOLLOW, WHETHER YOU’VE HEARD OF IT OR NOT by Ricardo dos Santos
  • 4. Geometric Deep Learning #1 Bronstein et al. (July 2017): “Geometric deep learning ( http://geometricdeeplearning.com/) is an umbrella term for e merging techniques attempting to generalize (structured) deep neural models to non- Euclidean domains, such as graphs and manifolds. The purpose of this article is to overview different examples of geometric deep-learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field” SCNN (2013) GCNN/ChebNet (2016) GCN (2016) GNN (2009) Geodesic CNN (2015) Anisotropic CNN (2016) MoNet (2016) Localized SCNN (2015)
  • 5. Geometric Deep Learning #2 Bronstein et al. (July 2017): “The non-Euclidean nature of data implies that there are no such familiar properties as global parameterization, common system of coordinates, vector space structure, or shift-invariance. Consequently, basic operations like convolution that are taken for granted in the Euclidean case are even not well defined on non-Euclidean domains.” “First attempts to generalize neural networks to graphs we are aware of are due to Mori et al. (2005) who proposed a scheme combining recurrent neural networks and random walk models. This approach went almost unnoticed, re-emerging in a modern form in Suhkbaatar et al. (2016) and Li et al. (2015) due to the renewed recent interest in deep learning.” “In a parallel effort in the computer vision and graphics community, Masci et al. (2015) showed the first CNN model on meshed surfaces, resorting to a spatial definition of the convolution operation based on local intrinsic patches. Among other applications, such models were shown to achieve state-of-the-art performance in finding correspondence between deformable 3D shapes. Followup works proposed different construction of intrinsic patches on point clouds Boscaini et al. (2016)a,b and general graphs Monti et al. (2016).” In calculus, the notion of derivative describes how the value of a function changes with an infinitesimal change of its argument. One of the big differences distinguishing classical calculus from differential geometry is a lack of vector space structure on the manifold, prohibiting us from naïvely using expressions like f(x+dx). The conceptual leap that is required to generalize such notions to manifolds is the need to work locally in the tangent space. Physically, a tangent vector field can be thought of as a flow of material on a manifold. The divergence measures the net flow of a field at a point, allowing to distinguish between field ‘sources’ and ‘sinks’. Finally, the Laplacian (or Laplace-Beltrami operator in differential geometric jargon) “A centerpiece of classical Euclidean signal processing is the property of the Fourier transform diagonalizing the convolution operator, colloquially referred to as the Convolution Theorem. This property allows to express the convolution f⋆g of two functions in the spectral domain as the element-wise product of their Fourier transforms. Unfortunately, in the non-Euclidean case we cannot even define the operation x-x’ on the manifold or graph, so the notion of convolution does not directly extend to this case.
  • 6. Geometric Deep Learning #3
Bronstein et al. (July 2017): “We expect the following years to bring exciting new approaches and results, and conclude our review with a few observations of current key difficulties and potential directions of future research.”
Generalization: Generalizing deep learning models to geometric data requires not only finding non-Euclidean counterparts of basic building blocks (such as convolutional and pooling layers), but also generalization across different domains. Generalization capability is a key requirement in many applications, including computer graphics, where a model is learned on a training set of non-Euclidean domains (3D shapes) and then applied to previously unseen ones.
Time-varying domains: An interesting extension of geometric deep learning problems discussed in this review is coping with signals defined over a dynamically changing structure. In this case, we cannot assume a fixed domain and must track how these changes affect signals. This could prove useful to tackle applications such as abnormal activity detection in social or financial networks. In the domain of computer graphics and vision, potential applications deal with dynamic shapes (e.g. 3D video captured by a range sensor).
Computation: The final consideration is a computational one. All existing deep learning software frameworks are primarily optimized for Euclidean data. One of the main reasons for the computational efficiency of deep learning architectures (and one of the factors that contributed to their renaissance) is the assumption of regularly structured data on 1D or 2D grid, allowing to take advantage of modern GPU hardware. Geometric data, on the other hand, in most cases do not have a grid structure, requiring different ways to achieve efficient computations. It seems that computational paradigms developed for large-scale graph processing are more adequate frameworks for such applications.
  • 7. Primer on GRAPHs Taylor and Wrana (2012) doi: 10.1002/pmic.201100594
  • 8. Graph theory especially useful for network analysis
Barabási & Albert, “Emergence of scaling in random networks”: https://doi.org/10.1126/science.286.5439.509 (cited by 29,071 articles)
Watts & Strogatz, “Collective dynamics of ‘small-world’ networks”: https://doi.org/10.1038/30918 (cited by 33,772)
Random rewiring procedure for interpolating between a regular ring lattice and a random network, without altering the number of vertices or edges in the graph.
How Facebook updated “six degrees of separation”: it’s now 3.57. http://www.bbc.co.uk/newsbeat/article/35500398/how-facebook-updated-six-degrees-of-separation-its-now-357 | https://research.fb.com/three-and-a-half-degrees-of-separation/ | http://slideplayer.com/slide/9267536/
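As a minimal, hedged sketch of the rewiring procedure just described (my illustration, not from the cited papers; it assumes Python and the networkx library, which the deck itself does not use, and the parameters n, k are arbitrary):

```python
import networkx as nx

# Sweep the rewiring probability p from a regular ring lattice
# (p = 0) towards a random network (p = 1); the node and edge
# counts stay fixed, as in the Watts-Strogatz rewiring procedure.
for p in [0.0, 0.01, 0.1, 1.0]:
    G = nx.connected_watts_strogatz_graph(n=1000, k=10, p=p, seed=42)
    C = nx.average_clustering(G)
    L = nx.average_shortest_path_length(G)
    # Small-world regime: clustering stays high while the average
    # shortest path length collapses already for small p.
    print(f"p={p:<5} clustering={C:.3f}  avg shortest path={L:.2f}")
```

The interesting regime is the middle of the sweep, where paths are already short (random-like) but clustering is still lattice-like.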
  • 9. Graph theory: common metrics and definitions
Graph-theoretic node importance mining on network topology, Xue et al. (2017): The graph-theoretic node importance mining methods based on network topologies comprise two main categories: node relevance and shortest path. The method of node relevance is measured by degree analysis. The methods of shortest path that aim at finding optimal spreading paths are measured by several node importance analyses, e.g., betweenness, closeness centrality, eigenvector centrality, Bonacich centrality and alter-based centrality. Betweenness is used particularly for measurements of power, while closeness centrality and eigenvector centrality are used particularly for measurements of centrality. Bonacich centrality is an extension of eigenvector centrality which measures node importance on both centrality and power. The other mining methods for node importance based on network topologies included in this review are via processes such as node deleting, node contraction, and data mining and machine learning embedded techniques. For heterogeneous network structures, fusion methods integrate all the previously mentioned measurements.
Google’s Knowledge Graph: one step closer to the semantic web? By Andrew Isidoro (28 February 2013). Knowledge Graph, a database of over 570m of the most searched-for people, places and things (entities), including around 18bn cross-references.
The knowledge graph as the default data model for learning on heterogeneous knowledge. Wilcke, Xander; Bloem, Peter; de Boer, Victor. Data Science, vol. Preprint, no. Preprint, pp. 1-19, 2017. http://doi.org/10.3233/DS-170007
FuhSen: A Federated Hybrid Search Engine for building a knowledge graph on-demand (July 2016). https://doi.org/10.1007/978-3-319-48472-3_47 + https://doi.org/10.1109/ICSC.2017.85 (figure via researchgate.net). The FuhSen architecture: a high-level architecture comprising (a) a mediator-and-wrappers architecture to build (b) the knowledge graph on demand; the answer to a keyword query corresponds to an RDF subject-molecule that integrates RDF molecules collected from the wrappers; (c) the components to enrich the results KG.
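The centrality families named above are one networkx call each; the sketch below (my toy comparison, not from Xue et al.) runs them on Zachary’s karate club, the standard 34-node test network that also appears on the next slide:

```python
import networkx as nx

# Zachary's karate club (34 nodes, 78 links) as a toy network for
# comparing the node-importance measures listed above.
G = nx.karate_club_graph()
metrics = {
    "degree (node relevance)": nx.degree_centrality(G),
    "betweenness (power)":     nx.betweenness_centrality(G),
    "closeness (centrality)":  nx.closeness_centrality(G),
    "eigenvector (centrality)": nx.eigenvector_centrality(G),
}
for name, scores in metrics.items():
    best = max(scores, key=scores.get)
    print(f"{name:25s} -> top node {best} ({scores[best]:.3f})")
```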
  • 10. Ranking in time-varying complex networks
Ranking in evolving complex networks. Hao Liao, Manuel Sebastian Mariani, Matúš Medo, Yi-Cheng Zhang, Ming-Yang Zhou. Physics Reports, Volume 689, 19 May 2017, Pages 1-54. https://doi.org/10.1016/j.physrep.2017.05.001
Figure (top): the often-studied Zachary’s karate club network has 34 nodes and 78 links (here visualized with the Gephi software). Figure (bottom): ranking of the nodes in the Zachary karate club network by the centrality metrics described in this section; node labels on the horizontal axis correspond to the node labels in the top panel.
For the APS citation data from the period 1893–2015 (560,000 papers in total), we compute the ranking of papers according to various metrics: citation count c, PageRank centrality p (with the teleportation parameter α = 0.5), and rescaled PageRank R(p). The figure shows the median ranking position of the top 1% of papers from each year. The three curves show three distinct patterns. For c, the median rank is stable until approximately 1995; then it starts to grow because the best young papers have not yet reached sufficiently high citation counts. For p, the median rank grows during the whole displayed time period because PageRank applied on an acyclic time-ordered citation network favors old papers. By contrast, the curve is approximately flat for R(p) during the whole period, which confirms that the metric is not biased by paper age and gives equal chances to all papers.
An illustration of the difference between the first-order Markovian (time-aggregated) and second-order network representation of the same data: panels A–B represent the destination cities (the right-most column) of flows of passengers from Chicago to other cities, given the previous location (the left-most column). When including memory effects (panel B), the fraction of passengers coming back to the original destination is large, in agreement with our intuition. A similar effect is found for the network of academic journals.
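To make the old-paper bias concrete, here is a hedged toy version of the experiment (my construction, not the APS data): a time-ordered citation DAG where new papers only cite older ones, ranked with PageRank at the same teleportation value; in networkx the damping factor alpha plays that role.

```python
import random
import networkx as nx

random.seed(0)
G = nx.DiGraph()
G.add_nodes_from(range(200))
for new in range(1, 200):
    # each "paper" cites up to 5 strictly older papers
    for old in random.sample(range(new), k=min(5, new)):
        G.add_edge(new, old)          # edge direction: new -> old

pr = nx.pagerank(G, alpha=0.5)        # teleportation parameter 0.5
top = sorted(pr, key=pr.get, reverse=True)[:5]
print("highest-PageRank papers (mostly low/old ids):", top)
```

On such an acyclic, time-ordered graph the top ids are dominated by old nodes, which is exactly the bias the rescaled PageRank R(p) above is designed to remove.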
  • 11. Information diffusion: intro
Many graphs can be modeled or used to predict how information flows in the given graph.
● How influential are you with your Instagram posts, tweets, LinkedIn posts, etc.?
● How does a tweet affect the stock market, or in more general terms, how can causality be inferred from a graph?
● In practice, heat diffusion methods are also applied to information diffusion.
Random walks and diffusion on networks. Naoki Masuda, Mason A. Porter, Renaud Lambiotte. Physics Reports (available online 31 August 2017). https://doi.org/10.1016/j.physrep.2017.07.007
Fig. 12. The weary random walker retires from the network and heads off into the distant sunset. [This picture was drawn by Yulian Ng.]
Inferring networks of diffusion and influence. Manuel Gomez Rodriguez, Jure Leskovec, Andreas Krause. KDD '10 Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. https://doi.org/10.1145/1835804.1835933 : “There are several interesting directions for future work. Here we only used time difference to infer edges, and thus it would be interesting to utilize more informative features (e.g., textual content of postings, etc.) to more accurately estimate the influence probabilities. Moreover, our work considers static propagation networks; however, real influence networks are dynamic and thus it would be interesting to relax this assumption. Last, there are many other domains where our methodology could be useful: inferring interaction networks in systems biology (protein-protein and gene interaction networks), neuroscience (inferring physical connections between neurons) and epidemiology. We believe that our results provide a promising step towards understanding complex processes on networks based on partial observations.”
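For orientation, the sketch below simulates one standard diffusion model, the independent cascade; this is my minimal example, not the specific models of the papers above, and the graph and probability p are arbitrary choices:

```python
import random
import networkx as nx

def independent_cascade(G, seeds, p, rng):
    """One run of the independent cascade model: every newly
    activated node gets a single chance to activate each of its
    inactive neighbours, independently with probability p."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        new = []
        for u in frontier:
            for v in G.neighbors(u):
                if v not in active and rng.random() < p:
                    active.add(v)
                    new.append(v)
        frontier = new
    return active

G = nx.barabasi_albert_graph(1000, 3, seed=0)
reached = independent_cascade(G, seeds=[0], p=0.05, rng=random.Random(1))
print(f"cascade from one seed reached {len(reached)} of {G.number_of_nodes()} nodes")
```

Averaging many such runs per seed set is the usual way to estimate influence spread before attempting the harder inverse problem (inferring the network from observed cascades) that Gomez Rodriguez et al. study.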
  • 12. Information diffusion: social networks #1
Nonlinear Dynamics of Information Diffusion in Social Networks. ACM Transactions on the Web (TWEB), Volume 11, Issue 2, May 2017, Article No. 11. https://doi.org/10.1145/3057741
Online Social Networks and information diffusion: The role of ego networks. Valerio Arnaboldi, Marco Conti, Andrea Passarella, Robin I.M. Dunbar. Online Social Networks and Media 1 (2017) 44–55. http://dx.doi.org/10.1016/j.osnem.2017.04.001
Data Driven Modeling of Continuous Time Information Diffusion in Social Networks. Liang Liu, Bin Chen, Bo Qu, Lingnan He, Xiaogang Qiu. Data Science in Cyberspace (DSC), 2017 IEEE. https://doi.org/10.1109/DSC.2017.103
Online Bayesian Inference of Diffusion Networks. Shohreh Shaghaghian, Mark Coates. IEEE Transactions on Signal and Information Processing over Networks (Volume 3, Issue 3, Sept. 2017). https://doi.org/10.1109/TSIPN.2017.2731160
Modeling the reemergence of information diffusion in social network. Dingda Yang, Xiangwen Liao, Huawei Shen, Xueqi Cheng, Guolong Chen. Physica A: Statistical Mechanics and its Applications (available online 1 September 2017). http://dx.doi.org/10.1016/j.physa.2017.08.115
Information Diffusion in Online Social Networks: A Survey. Adrien Guille, Hakim Hacid, Cécile Favre, Djamel A. Zighed. ACM SIGMOD Record, Volume 42, Issue 2, May 2013, Pages 17-28. https://doi.org/10.1145/2503792.2503797
  • 13. Information diffusion: social networks #2
Literature Survey on Interplay of Topics, Information Diffusion and Connections on Social Networks. Kuntal Dey, Saroj Kaushik, L. Venkata Subramaniam (submitted 3 Jun 2017). https://arxiv.org/abs/1706.00921
  • 14. Information diffusion: scientific citation networks #1
Integration of Scholarly Communication Metadata Using Knowledge Graphs. Afshin Sadeghi, Christoph Lange, Maria-Esther Vidal, Sören Auer. International Conference on Theory and Practice of Digital Libraries, TPDL 2017: Research and Advanced Technology for Digital Libraries, pp 328-341. https://doi.org/10.1007/978-3-319-67008-9_26 : “Particularly, we demonstrate the benefits of exploiting semantic web technology to reconcile data about authors, papers, and conferences.”
A Recommendation System Based on Hierarchical Clustering of an Article-Level Citation Network. Jevin D. West, Ian Wesley-Smith, Carl T. Bergstrom. IEEE Transactions on Big Data (Volume 2, Issue 2, June 2016). https://doi.org/10.1109/TBDATA.2016.2541167 | http://babel.eigenfactor.org/ : “The scholarly literature is expanding at a rate that necessitates intelligent algorithms for search and navigation. For the most part, the problem of delivering scholarly articles has been solved. If one knows the title of an article, locating it requires little effort and, paywalls permitting, acquiring a digital copy has become trivial. However, the navigational aspect of scientific search - finding relevant, influential articles that one does not know exist - is in its early development.”
Big Scholarly Data: A Survey. Feng Xia, Wei Wang, Teshome Megersa Bekele, Huan Liu. IEEE Transactions on Big Data (Volume 3, Issue 1, March 2017). https://doi.org/10.1109/TBDATA.2016.2641460 (ASNA: Academic Social Network Analysis)
  • 15. Information diffusion: scientific citation networks #2
Implicit Multi-Feature Learning for Dynamic Time Series Prediction of the Impact of Institutions. Xiaomei Bai, Fuli Zhang, Jie Hou, Feng Xia, Amr Tolba, Elsayed Elashkar. IEEE Access (Volume 5). https://doi.org/10.1109/ACCESS.2017.2739179 : “Predicting the impact of research institutions is an important tool for decision makers, such as resource allocation for funding bodies. Despite significant effort of adopting quantitative indicators to measure the impact of research institutions, little is known about how the impact of institutions evolves in time. [...] In the future, we will further explore the relationships between the impact of institutions and the features driving the impact of institutions change to enhance the prediction performance. In addition, this work is conducted only on literature from the eight top conferences based on the Microsoft Academic Graph (MAG) dataset; examining other conferences for the same observed patterns could widen the significance of our findings.”
The Role of Positive and Negative Citations in Scientific Evaluation. Xiaomei Bai, Ivan Lee, Zhaolong Ning, Amr Tolba, Feng Xia. IEEE Access (Volume PP, Issue 99). https://doi.org/10.1109/ACCESS.2017.2740226
Recommendation for Cross-Disciplinary Collaboration Based on Potential Research Field Discovery. Wei Liang, Xiaokang Zhou, Suzhen Huang, Chunhua Hu, Qun Jin. Advanced Cloud and Big Data (CBD), 2017. https://doi.org/10.1109/CBD.2017.67 : “The cross-disciplinary information is hidden in tons of publications, and the relationships between different fields are complicated, which makes it challenging to recommend cross-disciplinary collaboration for a specific researcher.”
Petteri: Whether to recommend “outliers”, i.e. unexpected combinations of fields, or something outside your field that would be useful to you? Or just the typical landmark papers of your field? Depends on your needs for sure. https://iris.ai/ | http://www.bibblio.org/learning-and-knowledge
  • 16. Information diffusion: finance, quant trading, decision making
Information Diffusion, Cluster formation and Entropy-based Network Dynamics in Equity and Commodity Markets. Stelios Bekiros, Duc Khuong Nguyen, Leonidas Sandoval Junior, Gazi Salah Uddin. European Journal of Operational Research (2016). http://dx.doi.org/10.1016/j.ejor.2016.06.052
https://www.prowler.io/ | https://www.causalitylink.com/
Kensho, “Technology that brings transparency to complex systems”, https://www.kensho.com/ : “Our platform uses artificial intelligence to discover, extract and index events, variables and relationships about markets, sectors, industries and equities. It absorbs news articles, analysts’ point-of-view or equity-related materials as they are published. Save time and get ahead by letting AI do the repetitive reading for you. Focus on new knowledge.” https://www.forbes.com/sites/antoinegara/2017/02/28/kensho-sp-500-million-valuation-jpmorgan-morgan-stanley/#6fe4bb0b5cbf
Analysis of Investment Relationships Between Companies and Organizations Based on Knowledge Graph. Xiaobo Hu, Xinhuai Tang, Feilong Tang. In: Barolli L., Enokido T. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing, IMIS 2017, Advances in Intelligent Systems and Computing, vol 612. https://doi.org/10.1007/978-3-319-61542-4_20
A design for a common-sense knowledge-enhanced decision-support system: Integration of high-frequency market data and real-time news. Kun Chen, Jian Yin, Sulin Pang. Expert Systems (June 2017). doi: 10.1111/exsy.12209 : “Compared with previous work, our model is the first to incorporate broad common-sense knowledge into a decision support system, thereby improving the news analysis process through the application of a graphic random-walk framework. Prototype and experiments based on Hong Kong stock market data have demonstrated that common-sense knowledge is an important factor in building financial decision models that incorporate news information.”
Dynamics of financial markets and transaction costs: A graph-based study. Felipe Lillo, Rodrigo Valdés. Research in International Business and Finance, Volume 38, September 2016, Pages 455-465 : “Using financialization as a conceptual framework to understand the current trading patterns of financial markets, this work employs a market graph model for studying the stock indexes of geographically separated financial markets. By using an edge creation condition based on a transaction cost threshold, the resulting market graph features a strong connectivity, some traces of a power law in the degree distribution and an intensive presence of cliques.”
Ponzi scheme diffusion in complex networks. Anding Zhu, Peihua Fu, Qinghe Zhang, Zhenyue Chen. Physica A: Statistical Mechanics and its Applications, Volume 479, 1 August 2017, Pages 128-136. https://doi.org/10.1016/j.physa.2017.03.015
  • 17. “Intelligent knowledge graphs” with “actionable insights”
Model-Driven Analytics: Connecting Data, Domain Knowledge, and Learning. Thomas Hartmann, Assaad Moawad, Francois Fouquet, Gregory Nain, Jacques Klein, Yves Le Traon, Jean-Marc Jezequel (submitted 5 Apr 2017). https://arxiv.org/abs/1704.01320
“Gaining profound insights from collected data of today's application domains like IoT, cyber-physical systems, health care, or the financial sector is business-critical and can create the next multi-billion dollar market. However, analyzing these data and turning them into valuable insights is a huge challenge. This is often due not to the large volume of data alone but to an incredibly high domain complexity, which makes it necessary to combine various extrapolation and prediction methods to understand the collected data. Model-driven analytics is a refinement process of raw data driven by a model reflecting deep domain understanding, connecting data, domain knowledge, and learning.”
  • 18. Graph theory example: applications beyond typical networks
Construction (BIM): “Graph theory based representation of building information models (BIM) for access control applications.” Automation in Construction, Volume 68, August 2016, Pages 44-51. https://doi.org/10.1016/j.autcon.2016.04.001 (IFC 4 model, IFC-SPF format)
Medical Imaging (OCT): “Improving Segmentation of 3D Retina Layers Based on Graph Theory Approach for Low Quality OCT Images.” Metrology and Measurement Systems, Volume 23, Issue 2 (Jun 2016). https://doi.org/10.1515/mms-2016-0016 (Dijkstra shortest path algorithm)
Medical Imaging (OCT): “Reconstruction of 3D surface maps from anterior segment optical coherence tomography images using graph theory and genetic algorithms.” Biomedical Signal Processing and Control, Volume 25, March 2016, Pages 91-98. https://doi.org/10.1016/j.bspc.2015.11.004
Risk Assessment: “A New Risk Assessment Framework Using Graph Theory for Complex ICT Systems.” MIST '16 Proceedings of the 8th ACM CCS International Workshop on Managing Insider Security Threats. https://doi.org/10.1145/2995959.2995969
Biodiversity management: “Multiscale connectivity and graph theory highlight critical areas for conservation under climate change.” Ecological Applications (8 June 2016). http://doi.org/10.1890/15-0925
Brain Imaging: “‘Small World’ architecture in brain connectivity and hippocampal volume in Alzheimer’s disease: a study via graph theory from EEG data.” Brain Imaging and Behavior, April 2017, Volume 11, Issue 2, pp 473–485. doi: 10.1007/s11682-016-9528-3 (Small World trends in the two groups of subjects)
Cybersecurity: “Big Data Behavioral Analytics Meet Graph Theory: On Effective Botnet Takedowns.” IEEE Network (Volume 31, Issue 1, January/February 2017). https://doi.org/10.1109/MNET.2016.1500116NM
  • 19. Graph Signal Processing and quantitative graph theory
Defferrard et al. (2016): “The emerging field of Graph Signal Processing (GSP) aims at bridging the gap between signal processing and spectral graph theory [Shuman et al. 2013], a blend between graph theory and harmonic analysis. A goal is to generalize fundamental analysis operations for signals from regular grids to irregular structures embodied by graphs. We refer the reader to Belkin and Niyogi 2008 for an introduction to the field.”
Quantitative graph theory. Matthias Dehmer, Frank Emmert-Streib, Yongtang Shi. https://doi.org/10.1016/j.ins.2017.08.009 : “The main goal of quantitative graph theory is the structural quantification of information contained in complex networks by employing a measurement approach based on numerical invariants and comparisons. Furthermore, the methods as well as the networks do not need to be deterministic but can be statistic.”
Perraudin and Vandergheynst 2016 (cf. Shuman et al. 2013): “the proposed Wiener regularization framework offers a compelling way to solve traditional problems such as denoising, regression or semi-supervised learning.”
Figure: experiments on the temperature of Molène. Top: a realization of the stochastic graph signal (first measure). Bottom center: the temperature of the island of Bréhat. Bottom right: recovery errors (inpainting error) for different noise levels.
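A bare-bones cousin of such regularization-based recovery, as a hedged sketch (my example in numpy/networkx, not the cited Wiener framework; the grid graph, noise level and tau are stand-ins): graph Tikhonov denoising solves min_x ||x - y||^2 + tau * x^T L x, with closed form x_hat = (I + tau L)^{-1} y.

```python
import numpy as np
import networkx as nx

G = nx.grid_2d_graph(20, 20)                      # toy "sensor network"
L = nx.laplacian_matrix(G).toarray().astype(float)
n = L.shape[0]

rng = np.random.default_rng(0)
w, U = np.linalg.eigh(L)
x_true = U[:, 1:4] @ rng.standard_normal(3)       # smooth (low-frequency) signal
y = x_true + 0.3 * rng.standard_normal(n)         # noisy measurements

tau = 2.0
x_hat = np.linalg.solve(np.eye(n) + tau * L, y)   # (I + tau*L)^{-1} y
print("noisy MSE   :", np.mean((y - x_true) ** 2))
print("denoised MSE:", np.mean((x_hat - x_true) ** 2))
```

The quadratic form x^T L x penalizes differences across edges, which is exactly the "smooth over the graph" prior that recurs throughout the GSP slides below.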
  • 20. Graph Fourier Transform (GFT)
The use of Graph Fourier Transform in image processing: A new solution to classical problems. Francesco Verdoja. PhD Thesis 2017. https://doi.org/10.1109/ICASSP.2017.7952886 : “Graph-based approaches have recently seen a spike of interest in the image processing and computer vision communities, and many classical problems are finding new solutions thanks to these techniques. The Graph Fourier Transform (GFT), the equivalent of the Fourier transform for graph signals, is used in many domains to analyze and process data modeled by a graph. In this thesis we present some classical image processing problems that can be solved through the use of GFT. We’ll focus our attention on two main research areas. The first is image compression, where the use of the GFT is finding its way in recent literature; we’ll propose two novel ways to deal with the problem of graph weight encoding, and we’ll also propose approaches to reduce overhead costs of shape-adaptive compression methods. The second research field is image anomaly detection; GFT has never been proposed to this date to solve this class of problems. We’ll discuss here a novel technique and we’ll test its application on hyperspectral and medical (PET tumor scan) images.”
On the Graph Fourier Transform for Directed Graphs. Stefania Sardellitti, Sergio Barbarossa, Paolo Di Lorenzo. IEEE Journal of Selected Topics in Signal Processing (Volume 11, Issue 6, Sept. 2017). https://doi.org/10.1109/JSTSP.2017.2726979 : “The analysis of signals defined over a graph is relevant in many applications, such as social and economic networks, big data or biological networks, and so on. A key tool for analyzing these signals is the so-called Graph Fourier Transform (GFT). Alternative definitions of GFT have been suggested in the literature, based on the eigen-decomposition of either the graph Laplacian or adjacency matrix. In this paper, we address the general case of directed graphs and we propose an alternative approach that builds the graph Fourier basis as the set of orthonormal vectors that minimize a continuous extension of the graph cut size, known as the Lovász extension.”
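For the undirected, Laplacian-based definition of the GFT mentioned above, the whole transform is one eigendecomposition; the sketch below (my numpy illustration, arbitrary random graph) computes it and does a crude spectral low-pass:

```python
import numpy as np
import networkx as nx

# GFT via the eigendecomposition L = U diag(w) U^T of the graph
# Laplacian (undirected case; directed graphs need the alternative
# constructions of Sardellitti et al. above).
G = nx.erdos_renyi_graph(50, 0.1, seed=2)
L = nx.laplacian_matrix(G).toarray().astype(float)
w, U = np.linalg.eigh(L)      # w: graph frequencies, U: Fourier basis

rng = np.random.default_rng(0)
x = rng.standard_normal(50)   # a graph signal
x_hat = U.T @ x               # forward GFT
x_rec = U @ x_hat             # inverse GFT
assert np.allclose(x, x_rec)

# Spectral low-pass filtering: keep only the k lowest graph frequencies.
k = 10
x_smooth = U[:, :k] @ x_hat[:k]
```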
  • 21. Graph Signal Processing #1
Adaptive Least Mean Squares Estimation of Graph Signals. Paolo Di Lorenzo, Sergio Barbarossa, Paolo Banelli, Stefania Sardellitti. IEEE Transactions on Signal and Information Processing over Networks (Volume 2, Issue 4, Dec. 2016). https://doi.org/10.1109/TSIPN.2016.2613687 : “The aim of this paper is to propose a least mean squares (LMS) strategy for adaptive estimation of signals defined over graphs. Assuming the graph signal to be band-limited over a known bandwidth, the method enables reconstruction, with guaranteed performance in terms of mean-square error, and tracking from a limited number of observations over a subset of vertices. Furthermore, to cope with the case where the bandwidth is not known beforehand, we propose a method that performs a sparse online estimation of the signal support in the (graph) frequency domain, which enables online adaptation of the graph sampling strategy. Finally, we apply the proposed method to build the power spatial density cartography of a given operational region in a cognitive network environment.”
Distributed Adaptive Learning of Graph Signals. Paolo Di Lorenzo, Sergio Barbarossa, Paolo Banelli, Stefania Sardellitti. IEEE Transactions on Signal Processing (Volume 65, Issue 16, Aug. 15, 2017). https://doi.org/10.1109/TSP.2017.2708035 : “We apply the proposed distributed framework to power density cartography in cognitive radio (CR) networks. We consider a 5G scenario, where a dense deployment of radio access points (RAPs) is envisioned to provide a service environment characterized by very low latency and high rate access. Each RAP collects data related to the transmissions of primary users (PUs) at its geographical position, and communicates with other RAPs with the aim of implementing advanced cooperative sensing techniques.” “This paper represents the first work that merges the well-established field of adaptation and learning over networks and the emerging topic of signal processing over graphs. Several interesting problems are still open, e.g., distributed reconstruction in the presence of directed and/or switching graph topologies, online identification of the graph signal support from streaming data, distributed inference of the (possibly unknown) graph signal topology, adaptation of the sampling strategy to time-varying scenarios, optimization of the sampling probabilities, just to name a few. We plan to investigate these exciting problems in our future works.”
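The core sampling idea above, reconstructing a bandlimited graph signal from observations on a vertex subset, can be shown in a few lines; this hedged sketch uses a one-shot least-squares fit rather than the paper's adaptive LMS iterations, and the graph, bandwidth B and sample set are my choices:

```python
import numpy as np
import networkx as nx

G = nx.random_geometric_graph(100, 0.2, seed=3)
L = nx.laplacian_matrix(G).toarray().astype(float)
w, U = np.linalg.eigh(L)

B = 8                                  # assumed (known) bandwidth
rng = np.random.default_rng(1)
x = U[:, :B] @ rng.standard_normal(B)  # bandlimited ground truth

S = rng.choice(100, size=40, replace=False)   # observed vertex subset
y = x[S] + 0.05 * rng.standard_normal(40)     # noisy samples

# Fit the B spectral coefficients from the sampled rows of U.
coef, *_ = np.linalg.lstsq(U[S, :B], y, rcond=None)
x_rec = U[:, :B] @ coef
print("reconstruction MSE:", np.mean((x_rec - x) ** 2))
```

The LMS strategy of the paper effectively performs this fit recursively as samples stream in, which is what enables tracking of time-varying signals.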
  • 22. Graph Signal Processing #2
Kernel Regression for Signals over Graphs. Arun Venkitaraman, Saikat Chatterjee, Peter Händel (submitted 7 Jun 2017). https://arxiv.org/abs/1706.02191 : “We propose kernel regression for signals over graphs. The optimal regression coefficients are learnt using a constraint that the target vector is a smooth signal over an underlying graph. The constraint is imposed using a graph-Laplacian based regularization. We discuss how the proposed kernel regression exhibits a smoothing effect, simultaneously achieving noise reduction and graph-smoothness. We further extend the kernel regression to simultaneously learn the underlying graph and the regression coefficients. Our hypothesis was that incorporating the graph smoothness constraint would help kernel regression to perform better, particularly when we lack sufficient and reliable training data. Our experiments illustrate that this is indeed the case in practice. Through experiments we also conclude that graph signals carry sufficient information about the underlying graph structure, which may be extracted in the regression setting even with a moderately small number of samples in comparison with the graph dimension. Thus, our approach helps both predict and infer the underlying topology of the network or graph.”
Uncertainty Principles and Sparse Eigenvectors of Graphs. Arun Venkitaraman, Saikat Chatterjee, Peter Händel. IEEE Transactions on Signal Processing (Volume 65, Issue 20, Oct. 15, 2017). https://doi.org/10.1109/TSP.2017.2731299 : “When the graph has repeated eigenvalues, we explained that the graph Fourier basis (GFB) is not unique, and the derived lower bound can have different values depending on the selected GFB. We provided a constructive method to find a GFB that yields the smallest uncertainty bound. In order to find the signals that achieve the derived lower bound we considered sparse eigenvectors of the graph. We showed that the graph Laplacian has a 2-sparse eigenvector if and only if there exists a pair of nodes with the same neighbors. When this happens, the uncertainty bound is very low and the 2-sparse eigenvectors achieve this bound. We presented examples of both classical and real-world graphs with 2-sparse eigenvectors. We also discussed that, in some examples, the neighborhood structure has a meaningful interpretation.”
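A simpler relative of the graph-Laplacian regularization used above is Laplacian-regularized least squares ("learning on graphs"), where labels observed on a few vertices are extended to the rest of the graph; this sketch is my simplification, not the kernel formulation of the paper, and the graph and gamma are arbitrary:

```python
import numpy as np
import networkx as nx

# f* = argmin_f  sum_{i in S} (f_i - y_i)^2 + gam * f^T L f
#    = (M + gam*L)^{-1} M y,  with M = diag(indicator of observed set S)
G = nx.connected_watts_strogatz_graph(60, 6, 0.1, seed=4)
L = nx.laplacian_matrix(G).toarray().astype(float)
n = G.number_of_nodes()

rng = np.random.default_rng(2)
S = rng.choice(n, size=15, replace=False)   # vertices with observed labels
M = np.zeros((n, n)); M[S, S] = 1.0
y = np.zeros(n); y[S] = rng.standard_normal(15)

gam = 0.5
f = np.linalg.solve(M + gam * L, M @ y)     # smooth extension of the labels
```

As in the paper's experiments, the regularizer matters most exactly when the observed set S is small relative to the graph dimension.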
  • 23. Graph Signal Processing #3: time-varying graphs
Kernel-Based Reconstruction of Space-Time Functions on Dynamic Graphs. Daniel Romero, Vassilis N. Ioannidis, Georgios B. Giannakis. IEEE Journal of Selected Topics in Signal Processing (Volume 11, Issue 6, Sept. 2017). https://doi.org/10.1109/JSTSP.2017.2726976 : “This paper investigated kernel-based reconstruction of space-time functions on graphs. The adopted approach relied on the construction of an extended graph, which regards the time dimension just as a spatial dimension. Several kernel designs were introduced together with a batch and an online function estimator. The latter is a kernel Kalman filter developed from a purely deterministic standpoint without any need to adopt a state-space model. Future research will deal with multi-kernel and distributed versions of the proposed algorithms. Schemes tailored for time-evolving functions on graphs include [Bach and Jordan 2004] and [Mei and Moura 2016], which predict the function values at time t given observations up to time t − 1. However, these schemes assume that the function of interest adheres to a specific vector autoregression and all vertices are observed at previous time instances. Moreover, [Bach and Jordan 2004] requires Gaussianity along with an ad hoc form of stationarity.”
Abbreviations: DSLR, distributed least-squares reconstruction; LMS, least mean squares; KKF, kernel Kalman filter; ECoG, electrocorticography; NMSE, cumulative normalized mean-square error.
Filtering Random Graph Processes Over Random Time-Varying Graphs. Kai Qiu, Xianghui Mao, Xinyue Shen, Xiaohan Wang, Tiejian Li, Yuantao Gu. IEEE Journal of Selected Topics in Signal Processing (Volume 11, Issue 6, Sept. 2017). https://doi.org/10.1109/JSTSP.2017.2726969 : “However, many real-world graph signals are time-varying, and they evolve smoothly, so instead of the signals themselves being bandlimited or smooth on the graph, it is more reasonable that their temporal differences are smooth on the graph. In this paper, a new batch reconstruction method of time-varying graph signals is proposed by exploiting the smoothness of the temporal difference signals, and the uniqueness of the solution to the corresponding optimization problem is theoretically analyzed. Furthermore, driven by practical applications faced with real-time requirements, huge size of data, lack of a computing center, or communication difficulties between two non-neighboring vertices, an online distributed method is proposed by applying local properties of the temporal difference operator and the graph Laplacian matrix. In the future, we will further study the applications of smoothness of temporal difference signals, and may combine it with other properties of signals, such as low rank. Besides, it is also interesting to consider the situation where both the signal and the graph are time-varying.”
  • 24. Graph Signal Processing #4: time-varying graphs
Signal Processing on Graphs: Causal Modeling of Unstructured Data. Jonathan Mei, José M. F. Moura (submitted 28 Feb 2015 (v1), last revised 8 Feb 2017 (v6)). https://arxiv.org/abs/1503.00173 : “Many applications collect a large number of time series, for example, the financial data of companies quoted in a stock exchange, the health care data of all patients that visit the emergency room of a hospital, or the temperature sequences continuously measured by weather stations across the US. These data are often referred to as unstructured. A first task in their analytics is to derive a low-dimensional representation, a graph or discrete manifold, that describes well the interrelations among the time series and their intrarelations across time. This paper presents a computationally tractable algorithm for estimating this graph that structures the data. The resulting graph is directed and weighted, possibly capturing causal relations, not just reciprocal correlations as in many existing approaches in the literature. A convergence analysis is carried out. The algorithm is demonstrated on random graph datasets and real network time series datasets, and its performance is compared to that of related methods. The adjacency matrices estimated with the new method are close to the true graph in the simulated data and consistent with prior physical knowledge in the real dataset tested.”
Figure captions: frequency ordering depending on the position of the eigenvalues λ in ℂ (both graphics from Sandryhaila and Moura 2014); causal graph signal process; visualization of the information spreading through graph shifts for P3(A, c).
Learning Directed Graph Shifts from High-Dimensional Time Series. Lukas Nagel (June 2017). Master’s Thesis, Institute of Telecommunications (TU Wien). https://pdfs.semanticscholar.org/8822/526b7b2862f6374f5f950c89a14a7a931820.pdf : “We want to apply the causal graph process estimation algorithm to stock prices and especially point out some additional points of failure we spotted. In the shift matrix shown in Figure 4.9a, we observe that the stocks number 2, 16 and 24 have many incoming connections. It appears unlikely that this is due to some economic relations and points towards a numerical problem. As we were interested in potential interpretations of the shift recovered from the stock data, we chose to visualize the largest possible directions of the shift shown in Figure 4.11 as a graph in Figure 4.12. The only observation we could draw from the graph is that there are multiple bank stocks, which affect multiple other stocks. Otherwise, the connected companies show no common ownership structure nor even similar or related products. The stocks example with no clear expectation did not lead to promising results. Despite this, we described with scaling and averaging two processing steps that could be applied before starting the estimation algorithm. It is unclear if further tuning were needed or whether the domain of daily stock data cannot reasonably be modeled with causal graph processes, and we, therefore, leave this question open for future research.”
  • 25. Graph wavelet transform vs. GFT #1
Compression of dynamic 3D point clouds using subdivisional meshes and graph wavelet transforms. Aamir Anis, Philip A. Chou, Antonio Ortega. University of Southern California, Los Angeles, CA; Microsoft Research, Redmond, WA. Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE. https://doi.org/10.1109/ICASSP.2016.7472901 : “In this paper, we provide a framework for compression of 3D point cloud sequences. Our approach involves representing sets of frames by a consistently-evolving high-resolution subdivisional triangular mesh. This representation helps us facilitate efficient implementations of motion estimation and graph wavelet transforms. The subdivisional structure plays a crucial role in designing a simple hierarchical method for efficiently estimating these meshes, and the application of biorthogonal graph wavelet filterbanks for compression. Preliminary experimental results show promising performances of both the estimation and the compression steps, and we believe this work shall open new avenues of research in this emerging field.” The subdivisional structure also allows us to obtain a sequence of bipartite graphs that facilitate the use of GraphBior [Narang et al. (2012)] to compute the wavelet transform coefficients of the geometry and color attributes.
Compact Support Biorthogonal Wavelet Filterbanks for Arbitrary Undirected Graphs. Sunil K. Narang, Antonio Ortega (submitted 30 Oct 2012, last revised 19 Nov 2012, v2). https://arxiv.org/abs/1210.8129 : “In this paper we have presented novel graph-wavelet filterbanks that provide a critically sampled representation with compactly supported basis functions. The filterbanks come in two flavors: a) nonzeroDC filterbanks, and b) zeroDC filterbanks. The former filterbanks are designed as polynomials of the normalized graph Laplacian matrix, and the latter filterbanks are extensions of the former to provide a zero response by the highpass operators. Preliminary results showed that the filterbanks are useful not only for arbitrary graphs but also for the standard regular signal processing domains. Extensions of this work will focus on the application of these filters to different scenarios, including, for example, social network analysis, sensor networks, etc.”
Table caption: comparison of graph wavelet designs in terms of key properties: zero highpass response for constant graph-signal (DC), critical sampling (CS), perfect reconstruction (PR), compact support (Comp), orthogonal expansion (OE), requires graph simplification (GS).
  • 26. Graph wavelet transform vs. GFT #2
Bipartite Approximation for Graph Wavelet Signal Decomposition. Jin Zeng, Gene Cheung, Antonio Ortega. IEEE Transactions on Signal Processing (Volume 65, Issue 20, Oct. 15, 2017). https://doi.org/10.1109/TSP.2017.2733489 : “Unlike previous works, our design of the two metrics relates directly to energy compaction for bipartite subgraph decomposition. Comparison with the state-of-the-art schemes validates our proposed metrics for energy compaction and illustrates the efficiency of our approach. We are currently working on different applications of graphBior with our bipartite approximation, e.g., graph-signal denoising, which will benefit from the energy compaction in the wavelet domain.” Figure: (a) two-channel wavelet filterbank on a bipartite graph; (b) kernels of H0, H1 in graphBior [Narang et al. (2012)] with filter length 19.
Splines and Wavelets on Circulant Graphs. Madeleine S. Kotzagiannidis, Pier Luigi Dragotti (submitted 15 Mar 2016). https://arxiv.org/abs/1603.04917 : “In this paper, we have introduced novel families of wavelets and associated filterbanks on circulant graphs with vanishing moment properties, which reveal (e-)spline-like functions on graphs, and promote sparse multiscale representations. Moreover, we have discussed generalizations to arbitrary graphs in the form of a multidimensional wavelet analysis scheme based on graph product decomposition, facilitating a sparsity-promoting generalization with the advantage of lower-dimensional processing. In our future work, we wish to further explore the sets of graph signals which can be annihilated with existing and/or evolved graph wavelets, as well as refine its extensions and relevance for arbitrary graphs.”
  • 27. Graphlets: induced subgraphs of a large network
Estimation of Graphlet Statistics. Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed (submitted 6 Jan 2017, last revised 28 Feb 2017, v2). https://arxiv.org/abs/1701.01772
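For orientation, the smallest graphlet statistics (triangles and wedges) can still be counted exactly on modest graphs; the sketch below does that with networkx (my example; the paper above is about estimating such statistics on graphs far too large for exact counting):

```python
import networkx as nx

G = nx.erdos_renyi_graph(500, 0.02, seed=5)

# nx.triangles counts, per vertex, the triangles through it;
# each triangle is seen at its three vertices, hence the // 3.
tri = sum(nx.triangles(G).values()) // 3
# A "wedge" (open triad) at a vertex of degree d: d*(d-1)/2 pairs.
wedges = sum(d * (d - 1) // 2 for _, d in G.degree())
print(f"triangles={tri}, wedges={wedges}, "
      f"global clustering={3 * tri / wedges:.3f}")
```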
  • 28. Graph computing accelerations
Parallel Local Algorithms for Core, Truss, and Nucleus Decompositions. Ahmet Erdem Sariyuce, C. Seshadhri, Ali Pinar. Sandia National Laboratories, University of California (submitted 2 Apr 2017). https://arxiv.org/abs/1704.00386 : “Finding the dense regions of a graph and relations among them is a fundamental task in network analysis. Nucleus decomposition is a principled framework of algorithms that generalizes the k-core and k-truss decompositions. It can leverage the higher-order structures to locate the dense subgraphs with hierarchical relations. … We present a framework of local algorithms to obtain the exact and approximate nucleus decompositions. Our algorithms are pleasingly parallel and can provide approximations to explore time and quality trade-offs. Our shared-memory implementation verifies the efficiency, scalability, and effectiveness of our algorithms on real-world networks. In particular, using 24 threads, we obtain up to 4.04x and 7.98x speedups for k-truss and (3, 4) nucleus decompositions.”
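The base case that k-truss and nucleus decompositions generalize, the k-core, is available directly in networkx via the standard peeling algorithm; a minimal illustration (my example, not the parallel algorithms of the paper):

```python
import networkx as nx

G = nx.barabasi_albert_graph(1000, 5, seed=6)
core = nx.core_number(G)            # vertex -> largest k with v in the k-core
kmax = max(core.values())
densest = nx.k_core(G, k=kmax)      # innermost (densest) region
print(f"max core index {kmax}; innermost core has "
      f"{densest.number_of_nodes()} nodes")
```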
  • 29. p-Laplacian on graphs
On the game p-Laplacian on weighted graphs with applications in image processing and data clustering. A. Elmoataz, X. Desquesnes, M. Toutain (3 July 2017). European Journal of Applied Mathematics. https://doi.org/10.1017/S0956792517000122 : “In this paper, we have introduced a new class of normalized p-Laplacian operators as a discrete adaptation of the game-theoretic p-Laplacian on weighted graphs. This class is based on a new partial difference operator which interpolates between the normalized 2-Laplacian, 1-Laplacian and ∞-Laplacian on graphs. This operator is also connected to non-local average operators such as the non-local mean, non-local median and non-local midrange. It generalizes the normalized p-Laplacian on graphs for 1 ≤ p ≤ ∞. We have shown the connections with local and non-local PDEs of p-Laplacian types and the stochastic game Tug-of-War with noise (Peres et al. 2008). We have proved existence and uniqueness of the Dirichlet problem involving operators of this new class. Finally, we have illustrated the interest and behaviour of such operators in some inverse problems in image processing and machine learning.”
p-Laplacian Regularized Sparse Coding for Human Activity Recognition. Weifeng Liu, Zheng-Jun Zha, Yanjiang Wang, Ke Lu, Dacheng Tao. IEEE Transactions on Industrial Electronics (Volume 63, Issue 8, Aug. 2016). https://doi.org/10.1109/TIE.2016.2552147 : “Sparse coding has achieved promising performance in classification. The most prominent Laplacian regularized sparse coding employs Laplacian regularization to preserve the manifold structure; however, Laplacian regularization suffers from poor generalization. To tackle this problem, we present a p-Laplacian regularized sparse coding algorithm by introducing the nonlinear generalization of the standard graph Laplacian to exploit the local geometry. Compared to the conventional graph Laplacian, the p-Laplacian has a tighter isoperimetric inequality and the p-Laplacian regularized sparse coding can achieve superior theoretical evidence.”
The framework of human activity recognition: first, we extract the representative features of human activity, including SIFT, STIP and MFCC. Then we concatenate the histograms formed by bags of each feature. Third, we learn the sparse codes of each sample and the corresponding dictionary simultaneously by the p-Laplacian regularized sparse coding algorithm. Finally, we input the learned sparse codes into classifiers, i.e. support vector machines, to conduct human activity recognition. “As a sparse representation, the proposed p-Laplacian regularized sparse coding algorithm can also be employed for modern industry using data-based techniques [Jung et al. 2015; Shen et al. 2015] and other computer vision applications such as video summary and visual tracking [Bai and Li 2014; Yu et al. 2016]. In the future, we will apply the proposed p-Laplacian regularized sparse coding for more practical implementations. We will also study the extensions to multiview learning and deep architecture construction for more attractive performance.”
  • 30. “Applied Laplacian”: mesh processing #1A
Spectral Mesh Processing. H. Zhang, O. Van Kaick, R. Dyer. Computer Graphics Forum, 9 April 2010. http://dx.doi.org/10.1111/j.1467-8659.2010.01655.x
  • 31. Graph framework for manifold-valued data: image processing
Nonlocal Inpainting of Manifold-valued Data on Finite Weighted Graphs. Ronny Bergmann, Daniel Tenbrinck (submitted 21 Apr 2017, last revised 12 Jul 2017, v2). https://arxiv.org/abs/1704.06424 ; open-source code: http://www.mathematik.uni-kl.de/imagepro/members/bergmann/mvirt/ : “Recently, there has been a strong ambition to translate models and algorithms from traditional image processing to non-Euclidean domains, e.g., to manifold-valued data. While the task of denoising has been extensively studied in the last years, there was rarely an attempt to perform image inpainting on manifold-valued data. In this paper we present a nonlocal inpainting method for manifold-valued data given on a finite weighted graph. First numerical examples using a nonlocal graph construction with patch-based similarity measures demonstrate the capabilities and performance of the inpainting algorithm applied to manifold-valued images. Despite an analytic investigation of the convergence of the presented scheme, future work includes further development of numerical algorithms, as well as properties of the ∞-Laplacian for manifold-valued vertex functions on graphs.”
A Graph Framework for Manifold-valued Data. Ronny Bergmann, Daniel Tenbrinck (submitted 17 Feb 2017). https://arxiv.org/abs/1702.05293 : “In the following we present several examples illustrating the large variety of problems that can be tackled using the proposed manifold-valued graph framework. Furthermore, we compare our framework for the special case of nonlocal denoising of phase-valued data to a state-of-the-art method. Finally, we demonstrate a real-world application from denoising surface normals in digital elevation maps from LiDAR data. Subsequently, we model manifold-data measured on samples of an explicitly given surface and in particular illustrate denoising of diffusion tensors measured on a sphere. Finally, we investigate denoising of real DT-MRI data from medical applications both on a regular pixel grid as well as on an implicitly given surface. All algorithms were implemented in Mathworks Matlab by extending the open-source software package Manifold-valued Image Restoration Toolbox (MVIRT).”
Figure captions: illustration of the basic definitions and concepts on a Riemannian manifold M; reconstruction results of measured surface normals in digital elevation maps (DEM) generated by light detection and ranging (LiDAR) measurements of the earth’s surface topology; reconstruction results of manifold-valued data given on the implicit surface of the open Camino brain data set.
  • 32. Segmentation of graphs #1
Convex variational methods for multiclass data segmentation on graphs. Egil Bae, Ekaterina Merkurjev (submitted 4 May 2016, last revised 16 Feb 2017, v4). https://arxiv.org/abs/1605.01443 | https://doi.org/10.1007/s10851-017-0713-9 : “Experiments on 3D point clouds acquired by a LiDAR in outdoor scenes demonstrate that the scenes can accurately be segmented into object classes such as vegetation, the ground plane and regular structures. The experiments also demonstrate fast and highly accurate convergence of the algorithms, and show that the approximation difference between the convex and original problems vanishes or becomes extremely low in practice. In the future, it would be interesting to investigate region homogeneity terms for general unsupervised classification problems. In addition to avoiding the problem of trivial global minimizers, the region terms may improve the accuracy compared to models based primarily on boundary terms. Region homogeneity may for instance be defined in terms of the eigendecomposition of the covariance matrix or graph Laplacian.”
Theoretical Analysis of Active Contours on Graphs. Christos Sakaridis, Kimon Drakopoulos, Petros Maragos (submitted 24 Oct 2016). https://arxiv.org/abs/1610.07381 : Detection of a triangle on a random geometric graph (edges omitted for illustration purposes): (a) original triangle on the graph; (b)–(f) instances of active contour evolution at intervals of 60 iterations, with vertices in the contour’s interior shown in red and the rest in blue; (g) final detection result after 300 iterations, using green for true positives, blue for true negatives, red for false positives and black for false negatives.
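The classical baseline behind these variational segmentation models is a spectral cut; as a hedged toy example (mine, not from either paper), the sign of the Fiedler vector, the Laplacian's second eigenvector, bipartitions a graph with two planted communities:

```python
import numpy as np
import networkx as nx

G = nx.connected_caveman_graph(2, 20)   # two dense cliques, weakly joined
L = nx.laplacian_matrix(G).toarray().astype(float)
w, U = np.linalg.eigh(L)
labels = (U[:, 1] > 0).astype(int)      # sign of the Fiedler vector
print("cluster sizes:", np.bincount(labels))
```

The convex multiclass models above can be read as refinements of this idea with fidelity and region-homogeneity terms added on top.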
  • 33. Segmentation of graphs #2: “superpixelization”/clustering to speed up computations
Scalable Motif-aware Graph Clustering. C. E. Tsourakakis, J. Pachocki, Michael Mitzenmacher. Harvard University, Cambridge, MA, USA. WWW '17 Proceedings of the 26th International Conference on World Wide Web, Pages 1451-1460. https://doi.org/10.1145/3038912.3052653
Coarsening Massive Influence Networks for Scalable Diffusion Analysis. Naoto Ohsaka, Tomohiro Sonobe, Sumio Fujita, Ken-ichi Kawarabayashi. SIGMOD '17 Proceedings of the 2017 ACM International Conference on Management of Data, Pages 635-650. https://doi.org/10.1145/3035918.3064045
Higher-order organization of complex networks. Austin R. Benson, David F. Gleich, Jure Leskovec (submitted 26 Dec 2016). https://arxiv.org/abs/1612.08447 (preprint of the Science paper: https://doi.org/10.1126/science.aad9029) : “Theoretical results in the supplementary materials also explain why classes of hypergraph partitioning methods are more general than previously assumed and how motif-based clustering provides a rigorous framework for the special case of partitioning directed graphs. Finally, the higher-order network clustering framework is generally applicable to a wide range of network types, including directed, undirected, weighted, and signed networks.”
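The core construction in Benson et al.'s motif-based clustering is short enough to sketch: for the triangle motif on an undirected graph with adjacency A, the motif adjacency W = (A @ A) * A counts, for each edge, the triangles it participates in, and one then clusters W instead of A so the cut breaks as few triangles as possible (toy graph and bipartition choice are mine):

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
W = (A @ A) * A                       # triangle-motif weighted adjacency

d = W.sum(axis=1)
d[d == 0] = 1.0                       # guard vertices in no triangle
Lm = np.diag(d) - W                   # motif Laplacian
w, U = np.linalg.eigh(Lm)
labels = (U[:, 1] > 0).astype(int)    # spectral bipartition of W
print("motif-based cluster sizes:", np.bincount(labels))
```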
• 34. Graph Summarization #1A Graph Summarization: A Survey Yike Liu, Abhilash Dighe, Tara Safavi, Danai Koutra (Submitted on 14 Dec 2016 (v1), last revised 12 Apr 2017 (this version, v2)) https://arxiv.org/abs/1612.04883 The abundance of generated data and its velocity call for data summarization, one of the main data mining tasks. … This survey focuses on summarizing interconnected data, otherwise known as graphs or networks. … In general, graph summarization (or coarsening, or aggregation) approaches seek to find a short representation of the input graph, often in the form of a summary or sparsified graph, which reveals patterns in the original data and preserves specific structural or other properties, depending on the application domain.
• 35. Graph Summarization #1B Table I: Qualitative comparison of static graph summarization techniques. The first six columns describe the type of the input graph (e.g. with weighted/directed edges, and one/multiple types of node entities), followed by three algorithm-specific properties (i.e., user parameters, algorithmic complexity (linear in the number of edges or higher), and type of output). The last column gives the final purpose of each approach. Notation: (1) ∗ indicates that the algorithm can be extended to handle the corresponding type of input, but the authors do not provide details in the paper; for complexity, ∗ indicates sub-linear; (2) + means that at least one parameter can be set by the user, but it is not required (i.e., the algorithm provides a default value). - Liu et al. (2017)
• 36. Point cloud resampling via graphs Fast Resampling of 3D Point Clouds via Graphs Siheng Chen ; Dong Tian ; Chen Feng ; Anthony Vetro ; Jelena Kovačević Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE https://doi.org/10.1109/ICASSP.2017.7952695 https://arxiv.org/abs/1702.06397 The proposed resampling strategy enhances the contours of a point cloud. Plots (a) and (b) resample 2% of the points from a 3D point cloud of a building containing 381,903 points. Plot (b) is more visually friendly than Plot (a). Note that the proposed resampling strategy is able to enhance any information depending on users' preferences.
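A simplified sketch in the spirit of the paper's contour-aware strategy (illustrative parameters, not the authors' code): score each point by a Haar-like high-pass graph filter response, i.e. how much it differs from the average of its k nearest neighbours, then sample with probability proportional to that score.

import numpy as np
from scipy.spatial import cKDTree

def contour_aware_resample(points, k=8, ratio=0.02, seed=0):
    rng = np.random.default_rng(seed)
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k + 1)          # column 0 is the point itself
    neighbour_mean = points[idx[:, 1:]].mean(axis=1)
    response = np.linalg.norm(points - neighbour_mean, axis=1)  # (I - A) x
    prob = response / response.sum()
    n_keep = max(1, int(ratio * len(points)))
    keep = rng.choice(len(points), size=n_keep, replace=False, p=prob)
    return points[keep]

pts = np.random.rand(1000, 3)          # stand-in for a scanned point cloud
subset = contour_aware_resample(pts)   # ~2% of the points, biased to contours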
• 37. 2D Image Processing with graphs Directional graph weight prediction for image compression Francesco Verdoja ; Marco Grangetto Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE https://doi.org/10.1109/ICASSP.2017.7952410 The experimental results showed that the proposed technique is able to improve the compression efficiency; as an example we reported a Bjøntegaard Delta (BD) rate reduction of about 30% over JPEG. Future works will investigate the integration of the proposed method in more advanced image and video coding tools comprising adaptive block sizes and a richer set of intra prediction modes. Luminance coding in graph-based representation of multiview images Thomas Maugey ; Yung-Hsuan Chao ; Akshay Gadde ; Antonio Ortega ; Pascal Frossard Image Processing (ICIP), 2014 IEEE https://doi.org/10.1109/ICIP.2014.7025025 (a) Wavelet decomposition on graphs in GraphBior, where the shapes {circle, triangle, square, and cross} denote coefficients in the LL, LH, HL, HH subbands. (b) Parent-children relationship: a P node in the LH band of level l + 1 has five children from two views in level l, marked in blue. (c) The procedure of finding the children nodes in level l for the parent node in level l + 1.
  • 38. Background on GRAPH Deep learning Beyond the short introduction from the review above
• 39. Graph structure known or not? GRAPH KNOWN "The graph is well defined when the temperature measurement positions are known and the temperature measurement uncertainty is small" - Perraudin and Vandergheynst 2016 GRAPH "Semi-KNOWN" "In a way the structure is known, as we can quantify the graph signal as the number of citations with some journal-impact-factor weighting, but does this really represent the impact of an article? Scientists are known to game the system and merely respond to the metrics [*]. Are there alternative ways to improve the graph so that it better represents the impact of an article?" GRAPH NOT KNOWN "A point cloud measured with a terrestrial laser scanner is an unordered point cloud given on non-grid x,y,z coordinates. It is not trivial to define how the points are connected to each other." Bibliometric network analysis by Nees Jan van Eck [*] See e.g. Clauset, Aaron, Daniel B. Larremore, and Roberta Sinatra. "Data-driven predictions in the science of science." Science 355.6324 (2017): 477-480. DOI: 10.1126/science.aal4217 Sinatra, Roberta, et al. "Quantifying the evolution of individual scientific impact." Science 354.6312 (2016): aaf5239. DOI: 10.1126/science.aaf5239 Furlanello, Cesare, et al. "Towards a scientific blockchain framework for reproducible data analysis." arXiv preprint arXiv:1707.06552 (2017). the R-factor, with R standing for reputation, reproducibility, responsibility, and robustness, http://verumanalytics.io/ Overview of the segmentation method: (a) the initial LiDAR point cloud, (b) height raster image, (c) patches formed with adjacent cells of the same value, (d) hierarchized patches, (e) weighted graph, (f) graph partition, (g) partition result on the raster, (h) segmented point cloud. - Strimbu and Strimbu (2015) Graphics and Media Lab (GML) is a part of the Department of Computational Mathematics and Cybernetics of M.V. Lomonosov Moscow State University. http://graphics.cs.msu.ru/en/node/922 http://slideplayer.com/slide/8146222/
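When the graph is not known, a common first step for point clouds is a k-nearest-neighbour graph with Gaussian edge weights; a minimal sketch (illustrative parameter choices) using scipy:

import numpy as np
from scipy.spatial import cKDTree

def knn_graph(xyz, k=6, sigma=1.0):
    # Symmetric adjacency: connect each point to its k nearest neighbours,
    # weighting edges by a Gaussian of the Euclidean distance.
    n = len(xyz)
    tree = cKDTree(xyz)
    dist, idx = tree.query(xyz, k=k + 1)   # first neighbour is the point itself
    W = np.zeros((n, n))
    for i in range(n):
        for d, j in zip(dist[i, 1:], idx[i, 1:]):
            W[i, j] = W[j, i] = np.exp(-d ** 2 / (2 * sigma ** 2))
    return W

xyz = np.random.rand(100, 3)               # stand-in for a laser scan
W = knn_graph(xyz)

The choices of k and sigma change the resulting graph considerably, which is exactly why the "graph not known" case is delicate.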
• 40. Convolutions for graphs #1 Deep Convolutional Networks on Graph-Structured Data Mikael Henaff, Joan Bruna, Yann LeCun (Submitted on 16 Jun 2015) https://arxiv.org/abs/1506.05163 https://github.com/mdeff/cnn_graph However, as our results demonstrate, their extension poses significant challenges: • Although the learning complexity requires O(1) parameters per feature map, the evaluation, both forward and backward, requires a multiplication by the Graph Fourier Transform, which costs O(N²) operations. This is a major difference with respect to traditional ConvNets, which require only O(N). Fourier implementations of ConvNets bring the complexity to O(N log N) thanks again to the specific symmetries of the grid. An open question is whether one can find approximate eigenbases of general Graph Laplacians using Givens' decompositions similar to those of the FFT. • Our experiments show that when the input graph structure is not known a priori, graph estimation is the statistical bottleneck of the model, requiring O(N²) for general graphs and O(MN) for M-dimensional graphs. Supervised graph estimation performs significantly better than unsupervised graph estimation based on low-order moments. Furthermore, we have verified that the architecture is quite sensitive to graph estimation errors. In the supervised setting, this step can be viewed in terms of a bootstrapping mechanism, where an initially unconstrained network self-adjusts to become more localized and to share weights. • Finally, the statistical assumptions of stationarity and compositionality are not always verified. In those situations, the constraints imposed by the model risk reducing its capacity for no reason. One possibility for addressing this issue is to insert fully connected layers between the input and the spectral layers, such that data can be transformed into the appropriate statistical model. Another strategy, left for future work, is to relax the notion of weight sharing by introducing instead a commutation error ‖Wi L − L Wi‖ with the graph Laplacian, which puts a soft penalty on transformations that do not commute with the Laplacian, instead of imposing exact commutation as is the case in the spectral net. We explore two areas of application to which it has not previously been possible to apply convolutional networks: text categorization and bioinformatics. Our results show that our method is capable of matching or outperforming large, fully-connected networks trained with dropout using fewer parameters. Our main contributions can be summarized as follows: ● We extend the ideas from Bruna et al. (2013) to large-scale classification problems, specifically Imagenet Object Recognition, text categorization and bioinformatics. ● We consider the most general setting where no prior information on the graph structure is available, and propose unsupervised and new supervised graph estimation strategies in combination with the supervised graph convolutions.
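The O(N²) evaluation cost comes from multiplying by the graph Fourier basis. A minimal numpy sketch of spectral filtering on a graph (illustrative names; the Laplacian eigendecomposition plays the role the FFT plays on grids):

import numpy as np

def gft_filter(W, x, g):
    # Filter a graph signal x as U g(Lambda) U^T x: forward graph Fourier
    # transform, multiply by the spectral response g, transform back.
    L = np.diag(W.sum(axis=1)) - W       # combinatorial Laplacian
    lam, U = np.linalg.eigh(L)           # graph Fourier basis (O(N^3), done once)
    return U @ (g(lam) * (U.T @ x))      # each filtering costs O(N^2)

W = np.random.rand(50, 50); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
x = np.random.randn(50)
y = gft_filter(W, x, lambda lam: np.exp(-2.0 * lam))   # heat-kernel low-pass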
• 41. Convolutions for graphs #2 Learning Convolutional Neural Networks for Graphs Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov ; Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2014-2023, 2016. http://proceedings.mlr.press/v48/niepert16.html A CNN with a receptive field of size 3x3. The field is moved over an image from left to right and top to bottom using a particular stride (here: 1) and zero-padding (here: none) (a). The values read by the receptive fields are transformed into a linear layer and fed to a convolutional architecture (b). The node sequence for which the receptive fields are created and the shapes of the receptive fields are fully determined by the hyper-parameters. An illustration of the proposed architecture. A node sequence is selected from a graph via a graph labeling procedure. For some nodes in the sequence, a local neighborhood graph is assembled and normalized. The normalized neighborhoods are used as receptive fields and combined with existing CNN components. The normalization is performed for each of the graphs induced on the neighborhood of a root node v (the red node; node colors indicate distance to the root node). A graph labeling is used to rank the nodes and to create the normalized receptive fields, one of size k (here: k = 9) for node attributes and one of size k × k for edge attributes. Normalization also includes cropping of excess nodes and padding with dummy nodes. Each vertex (edge) attribute corresponds to an input channel with the respective receptive field. Visualization of RBM features learned with 1-dimensional WL normalized receptive fields of size 9 for a torus (periodic lattice, top left), a preferential attachment graph (Barabási & Albert 1999, bottom left), a co-purchasing network of political books (top right), and a random graph (bottom right). Instances of these graphs with about 100 nodes are depicted on the left. A visual representation of the feature's weights (the darker a pixel, the stronger the corresponding weight) and 3 graphs sampled from the RBMs by setting all but the hidden node corresponding to the feature to zero. Yellow nodes have position 1 in the adjacency matrices. "Directions for future work include the use of alternative neural network architectures such as recurrent neural networks (RNNs); combining different receptive field sizes; pretraining with restricted Boltzmann machines (RBMs) and autoencoders; and statistical relational models based on the ideas of the approach."
• 42. Convolutions for graphs #3 Geometric deep learning on graphs and manifolds using mixture model CNNs Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein (Submitted on 25 Nov 2016 (v1), last revised 6 Dec 2016 (this version, v3)) https://arxiv.org/abs/1611.08402 Left: intrinsic local polar coordinates ρ, θ on the manifold around a point marked in white. Right: patch operator weighting functions wi(ρ, θ) used in different generalizations of convolution on the manifold (hand-crafted in GCNN and ACNN and learned in MoNet). All kernels are L∞-normalized; red curves represent the 0.5 level set. Representation of images as graphs. Left: regular grid (the graph is fixed for all images). Right: graph of superpixel adjacency (different for each image). Vertices are shown as red circles, edges as red lines. Learning configuration used for the Cora and PubMed experiments. Predictions obtained applying MoNet over the Cora dataset. Marker fill color represents the predicted class; marker outline color represents the groundtruth class. In this paper, we propose a unified framework allowing to generalize CNN architectures to non-Euclidean domains (graphs and manifolds) and learn local, stationary, and compositional task-specific features. We show that various non-Euclidean CNN methods previously proposed in the literature can be considered as particular instances of our framework. We test the proposed method on standard tasks from the realms of image-, graph- and 3D shape analysis and show that it consistently outperforms previous approaches.
• 43. Convolutions for graphs #4 Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst Advances in Neural Information Processing Systems 29 (NIPS 2016) https://arxiv.org/abs/1606.09375 https://github.com/mdeff/cnn_graph https://youtu.be/cIA_m7vwOVQ Architecture of a CNN on graphs and the four ingredients of a (graph) convolutional layer. It is however known that graph clustering is NP-hard [Bui and Jones, 1992] and that approximations must be used. While there exist many clustering techniques, e.g. the popular spectral clustering [von Luxburg, 2007], we are most interested in multilevel clustering algorithms where each level produces a coarser graph which corresponds to the data domain seen at a different resolution. Future works will investigate two directions. On one hand, we will enhance the proposed framework with newly developed tools in GSP. On the other hand, we will explore applications of this generic model to important fields where the data naturally lies on graphs, which may then incorporate external information about the structure of the data rather than artificially created graphs whose quality may vary, as seen in the experiments. Another natural future approach, pioneered in [Henaff et al. 2015], would be to alternate the learning of the CNN parameters and the graph.
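The "fast localized" part replaces the explicit Fourier transform with a degree-K Chebyshev polynomial in the rescaled Laplacian, applied via a three-term recurrence; for a sparse Laplacian each filtering then costs O(K|E|). A hedged numpy sketch (dense matrices for brevity; in practice lmax is only estimated and L is kept sparse):

import numpy as np

def cheb_filter(L, x, coeffs):
    # y = sum_k coeffs[k] T_k(L_tilde) x, with T_k(y) = 2 y T_{k-1} - T_{k-2};
    # assumes len(coeffs) >= 2.
    lmax = np.linalg.eigvalsh(L).max()
    L_t = (2.0 / lmax) * L - np.eye(len(L))     # spectrum rescaled to [-1, 1]
    t_prev, t_curr = x, L_t @ x                 # T_0 x and T_1 x
    y = coeffs[0] * t_prev + coeffs[1] * t_curr
    for c in coeffs[2:]:
        t_prev, t_curr = t_curr, 2 * L_t @ t_curr - t_prev
        y += c * t_curr
    return y

W = np.random.rand(30, 30); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
L = np.diag(W.sum(axis=1)) - W
y = cheb_filter(L, np.random.randn(30), coeffs=[0.5, -0.3, 0.1])

In ChebNet the coeffs become the trainable parameters of each filter.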
• 44. Convolutions for graphs #5 Top: Schematic illustration of a standard CNN where patches of w×h pixels are convolved with D×E filters to map the D-dimensional input features to E-dimensional output features. Middle: same, but representing the CNN parameters as a set of M = w×h weight matrices, each of size D×E. Each weight matrix is associated with a single relative position in the input patch. Bottom: our graph convolutional network, where each relative position in the input patch is associated in a soft manner to each of the M weight matrices using the function q(xi, xj).
• 45. Convolutions for graphs #6 CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters Ron Levie, Federico Monti, Xavier Bresson, Michael M. Bronstein (Submitted on 22 May 2017) https://arxiv.org/abs/1705.07664 The core ingredient of our model is a new class of parametric rational complex functions (Cayley polynomials) allowing to efficiently compute localized regular filters on graphs that specialize on frequency bands of interest. Our model scales linearly with the size of the input data for sparsely-connected graphs, can handle different constructions of Laplacian operators, and typically requires fewer parameters than previous models. Filters (spatial domain, top, and spectral domain, bottom) learned by CayleyNet (left) and ChebNet (center, right) on the MNIST dataset. Cayley filters are able to realize larger supports for the same order r. Eigenvalues of the unnormalized Laplacian ∆u of the 15-communities graph mapped onto the complex unit half-circle by means of the Cayley transform with spectral zoom values (left to right) h = 0.1, 1, and 10. The first 15 frequencies, carrying most of the information about the communities, are marked in red. Larger values of h zoom (right) in on the low-frequency band.
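A dense numpy sketch of a Cayley filter (illustrative only: the paper avoids the explicit matrix inverse with Jacobi iterations, so that the cost stays linear for sparse graphs):

import numpy as np

def cayley_filter(L, x, c0, c, h=1.0):
    # y = c0 x + 2 Re( sum_j c_j C^j x ), with the Cayley transform
    # C = (hL - iI)(hL + iI)^{-1}; h is the spectral zoom, c_j are complex.
    n = len(L)
    I = np.eye(n)
    C = np.linalg.solve(h * L + 1j * I, h * L - 1j * I)  # valid: the factors commute
    acc = np.zeros(n, dtype=complex)
    v = x.astype(complex)
    for cj in c:
        v = C @ v
        acc += cj * v
    return (c0 * x + 2 * acc).real

W = np.random.rand(20, 20); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
L = np.diag(W.sum(axis=1)) - W
y = cayley_filter(L, np.random.randn(20), c0=0.2, c=[0.1 + 0.05j, -0.02j], h=0.5)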
• 46. Convolutions for graphs #7 Graph Convolutional Matrix Completion Rianne van den Berg, Thomas N. Kipf, Max Welling (Submitted on 7 Jun 2017) https://arxiv.org/abs/1706.02263 Left: Rating matrix M with entries that correspond to user-item interactions (ratings between 1-5) or missing observations (0). Right: User-item interaction graph with bipartite structure. Edges correspond to interaction events, numbers on edges denote the rating a user has given to a particular item. The matrix completion task (i.e. predictions for unobserved interactions) can be cast as a link prediction problem and modeled using an end-to-end trainable graph auto-encoder. Schematic of a forward pass through the GC-MC model, which is comprised of a graph convolutional encoder [U, V] = f(X, M1, . . . , MR) that passes and transforms messages from user to item nodes, and vice versa, followed by a bilinear decoder model that predicts entries of the (reconstructed) rating matrix M = g(U, V), based on pairs of user and item embeddings. "Our model can be seen as a first step towards modeling recommender systems where the interaction data is integrated into other structured modalities, such as a social network or a knowledge graph. As a next step, it would be interesting to investigate how the differentiable message passing scheme of our encoder model can be extended to such structured data environments. We expect that further approximations, e.g. subsampling of local graph neighborhoods, will be necessary in order to keep requirements in terms of computation and memory in a feasible range."
• 47. Convolutions for graphs #8 Graph Based Convolutional Neural Network Michael Edwards, Xianghua Xie (Submitted on 28 Sep 2016) https://arxiv.org/abs/1609.08965 Graph based Convolutional Neural Network components. The GCNN is designed from an architecture of graph convolution and pooling operator layers. Convolution layers generate O output feature maps depending on the O selected for that layer. Graph pooling layers coarsen the current graph and graph signal based on the selected vertex reduction method. Two levels of graph pooling operation on regular and irregular grids with an MNIST signal. From left: Regular grid, AMG level 1, AMG level 2, Irregular grid, AMG level 1, AMG level 2. Feature maps formed by a feed-forward pass of the regular domain. From left: Original image, Convolution round 1, Pooling round 1, Convolution round 2, Pooling round 2. Feature maps formed by a feed-forward pass of the irregular domain. From left: Original image, Convolution round 1, Pooling round 1, Convolution round 2, Pooling round 2. This study proposes a novel method of performing deep convolutional learning on irregular graphs by coupling standard graph signal processing techniques and backpropagation-based neural network design. Convolutions are performed in the spectral domain of the graph Laplacian and allow for the learning of spatially localized features whilst handling the non-trivial irregular kernel design. Results are provided on both a regular and an irregular domain classification problem and show the ability to learn localized feature maps across multiple layers of a network. A graph pooling method is provided that agglomerates vertices in the spatial domain to reduce complexity and generalize the features learnt. GPU performance of the algorithm improves training and testing speed; however, further optimization is needed. Although the results on the regular grid are outperformed by the standard CNN architecture, this is understandable due to the direct use of a local kernel in the spatial domain. The major contribution over standard CNNs, the ability to operate on irregular graphs, is not to be underestimated. Graph based CNN requires costly forward and inverse graph Fourier transforms, and this requires some work to enhance usability in the community. Ongoing study into graph construction and reduction techniques is required to encourage uptake by a wider range of problem domains.
• 48. Convolutions for graphs #9 Generalizing CNNs for data structured on locations irregularly spaced out Jean-Charles Vialatte, Vincent Gripon, Grégoire Mercier (Submitted on 3 Jun 2016 (v1), last revised 4 Jul 2017 (this version, v3)) https://arxiv.org/abs/1606.01166 In this paper, we have defined a generalized convolution operator. This operator makes it possible to transport the CNN paradigm to irregular domains. It retains the properties of a regular convolution operator: namely, it is linear, locally supported, and uses the same kernel of weights for each local operation. The generalized convolution operator can then naturally be used instead of convolutional layers in a deep learning framework. Typically, the created model is well suited for input data that has an underlying graph structure. The definition of this operator is flexible enough that it allows adapting its weight-allocation map to any input domain, so that, depending on the case, the distribution of the kernel weights can be done in a way that is natural for that domain. However, in some cases there is no single natural way but multiple acceptable methods to define the weight allocation. In further work, we plan to study these methods. We also plan to apply the generalized operator to unsupervised learning tasks.
• 49. Convolutions for graphs #10 Robust Spatial Filtering with Graph Convolutional Neural Networks Felipe Petroski Such, Shagan Sah, Miguel Dominguez, Suhas Pillai, Chao Zhang, Andrew Michael, Nathan Cahill, Raymond Ptucha (Submitted on 2 Mar 2017 (v1), last revised 14 Jul 2017 (this version, v3)) https://arxiv.org/abs/1703.00792 https://github.com/fps7806/Graph-CNN Two types of graph datasets. Left: Homogeneous datasets. All samples in homogeneous graph data have an identical graph structure, but different vertex values or "signals". Right: Heterogeneous graph samples. Heterogeneous graph samples can vary in the number of vertices, the structure of edge connections, and the vertex values. General vertex-edge domain Graph-CNN architecture. Convolution and pooling layers are cascaded into a deep network. FC are fully-connected layers for graph classification. V is the vertex set and A is the adjacency matrix that define a graph. Graph convolution and pooling setting. The convolution operation obtains a filtered representation of the graph after a multi-hop vertex filter. Likewise, a pooling layer produces a compact representation of the graph.
• 50. Convolutions for graphs #11 A Generalization of Convolutional Neural Networks to Graph-Structured Data Yotam Hechtlinger, Purvasha Chakravarti, Jining Qin (Submitted on 26 Apr 2017) https://arxiv.org/abs/1704.08165 https://github.com/hechtlinger/graph_cnn Visualization of the graph convolution of size 5. For a given node, the convolution is applied on the node and its 4 closest neighbors selected by the random walk. As the right figure demonstrates, the random walk can expand further into the graph to higher-degree neighbors. The convolution weights are shared according to the neighbors' closeness to the node and applied globally on all nodes. Visualization of a row of Q(k) on the graph generated over the 2-D grid at a node near the center, when connecting each node to its 8 adjacent neighbors. For k = 1, most of the weight is on the node, with smaller weights on the first-order neighbors. This corresponds to a standard 3 × 3 convolution. As k increases, the number of active neighbors also increases, providing greater weight to neighbors farther away, while still keeping the local information. We propose a generalization of convolutional neural networks from grid-structured data to graph-structured data, a problem that is being actively researched by our community. Our novel contribution is a convolution over a graph that can handle different graph structures as its input. The proposed convolution contains many sought-after attributes; it has a natural and intuitive interpretation, it can be transferred within different domains of knowledge, it is computationally efficient and it is effective. Furthermore, the convolution can be applied on standard regression or classification problems by learning the graph structure in the data, using the correlation matrix or other methods. Compared to a fully connected layer, the suggested convolution has significantly fewer parameters while providing stable convergence and comparable performance. Our experimental results on the Merck Molecular Activity data set and MNIST data demonstrate the potential of this approach. Convolutional Neural Networks have already revolutionized the fields of computer vision, speech recognition and language processing. We think an important step forward is to extend them to other problems which have an inherent graph structure.
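A hedged numpy sketch of the neighbourhood-selection step (names illustrative): accumulate powers of the random-walk transition matrix and, for each node, keep the p nodes with the largest expected number of visits within k steps; the shared convolution weights are then applied to those neighbours in order of closeness.

import numpy as np

def random_walk_neighbourhoods(W, p=5, k=3):
    P = W / W.sum(axis=1, keepdims=True)   # transition matrix of the walk
    Q = np.zeros_like(P)
    Pi = np.eye(len(W))
    for _ in range(k + 1):                 # Q = P^0 + P^1 + ... + P^k
        Q += Pi
        Pi = Pi @ P
    # p highest-scoring nodes per row; the node itself is usually ranked first.
    return np.argsort(-Q, axis=1)[:, :p]

W = np.random.rand(10, 10); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
nbrs = random_walk_neighbourhoods(W)       # shape (10, p); weight j applies to nbrs[:, j]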
• 51. Autoencoders for graphs Variational Graph Auto-Encoders Thomas N. Kipf, Max Welling (Submitted on 21 Nov 2016) https://arxiv.org/abs/1611.07308 https://github.com/tkipf/gae → http://tkipf.github.io/graph-convolutional-networks/ Latent space of the unsupervised VGAE model trained on the Cora citation network dataset. Grey lines denote citation links. Colors denote document class (not provided during training). Future work will investigate better-suited prior distributions (instead of the Gaussian used here), more flexible generative models and the application of a stochastic gradient descent algorithm for improved scalability. Modeling Relational Data with Graph Convolutional Networks Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling (Submitted on 17 Mar 2017 (v1), last revised 6 Jun 2017 (this version, v3)) https://arxiv.org/abs/1703.06103 In this work, we introduce relational GCNs (R-GCNs). R-GCNs are specifically designed to deal with highly multi-relational data, characteristic of realistic knowledge bases. Our entity classification model, similarly to Kipf and Welling [see left], uses softmax classifiers at each node in the graph. The classifiers take node representations supplied by an R-GCN and predict the labels. The model, including R-GCN parameters, is learned by optimizing the cross-entropy loss. Our link prediction model can be regarded as an autoencoder consisting of (1) an encoder: an R-GCN producing latent feature representations of entities, and (2) a decoder: a tensor factorization model exploiting these representations to predict labeled edges. Though in principle the decoder can rely on any type of factorization (or generally any scoring function), we use one of the simplest and most effective factorization methods: DistMult [Yang et al. 2014]. (a) R-GCN per-layer update for a single graph node (in light red). Activations from neighboring nodes (dark blue) are gathered and then transformed for each relation type individually (for both in- and outgoing edges). The resulting representation is accumulated in a (normalized) sum and passed through an activation function (such as the ReLU). This per-node update can be computed in parallel with shared parameters across the whole graph. (b) Depiction of an R-GCN model for entity classification with a per-node loss function. (c) Link prediction model with an R-GCN encoder (interspersed with fully-connected/dense layers) and a DistMult decoder that takes pairs of hidden node representations and produces a score for every (potential) edge in the graph. The loss is evaluated per edge.
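A minimal numpy sketch of the two building blocks (untrained, single layer; the actual VGAE uses a two-layer GCN encoder producing a mean and a variance): the Kipf and Welling propagation rule and the inner-product edge decoder.

import numpy as np

def gcn_layer(A, X, W_param):
    # relu(D^{-1/2} (A + I) D^{-1/2} X W): renormalized one-hop propagation.
    A_hat = A + np.eye(len(A))
    d = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d[:, None] * d[None, :]
    return np.maximum(A_norm @ X @ W_param, 0)

def inner_product_decoder(Z):
    # Edge probability sigma(z_i . z_j) for every node pair.
    return 1.0 / (1.0 + np.exp(-Z @ Z.T))

A = (np.random.rand(12, 12) > 0.7).astype(float)
A = np.maximum(A, A.T); np.fill_diagonal(A, 0)
X = np.random.randn(12, 4)
Z = gcn_layer(A, X, np.random.randn(4, 2))   # toy latent node embeddings
P = inner_product_decoder(Z)                 # reconstructed adjacency probabilities

R-GCNs replace the single normalized adjacency by one adjacency and weight matrix per relation type, summed before the nonlinearity.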
• 52. Representation Learning For graphs #1 Inductive Representation Learning on Large Graphs William L. Hamilton, Rex Ying, Jure Leskovec (Submitted on 7 Jun 2017) https://arxiv.org/abs/1706.02216 http://snap.stanford.edu/graphsage/ We propose a general framework, called GraphSAGE (SAmple and aggreGatE), for inductive node embedding. Unlike embedding approaches that are based on matrix factorization, we leverage node features (e.g., text attributes, node profile information, node degrees) in order to learn an embedding function that generalizes to unseen nodes. By incorporating node features in the learning algorithm, we simultaneously learn the topological structure of each node's neighborhood as well as the distribution of node features in the neighborhood. While we focus on feature-rich graphs (e.g., citation data with text attributes, biological data with functional/molecular markers), our approach can also make use of structural features that are present in all graphs (e.g., node degrees). Thus, our algorithm can also be applied to graphs without node features (e.g. point clouds with only xyz-coordinates, without RGB texture, normals, etc.). Low-dimensional vector embeddings of nodes in large graphs have proved extremely useful as feature inputs for a wide variety of prediction and graph analysis tasks. The basic idea behind node embedding approaches is to use dimensionality reduction techniques to distill the high-dimensional information about a node's neighborhood into a dense vector embedding. These node embeddings can then be fed to downstream machine learning systems and aid in tasks such as node classification, clustering, and link prediction (e.g. LINE, see below). However, previous works have focused on embedding nodes from a single fixed graph, and many real-world applications require embeddings to be quickly generated for unseen nodes, or entirely new (sub)graphs. This inductive capability is essential for high-throughput, production machine learning systems, which operate on evolving graphs and constantly encounter unseen nodes (e.g., posts on Reddit, users and videos on YouTube). An inductive approach to generating node embeddings also facilitates generalization across graphs with the same form of features: for example, one could train an embedding generator on protein-protein interaction graphs derived from a model organism, and then easily produce node embeddings for data collected on new organisms using the trained model. LINE: Large-scale Information Network Embedding Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, Qiaozhu Mei (Submitted on 12 Mar 2015) https://arxiv.org/abs/1503.03578 https://github.com/tangjianpku/LINE
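A hedged numpy sketch of one GraphSAGE step with the mean aggregator (splitting the weight matrix into self and neighbour parts, which is equivalent to the concatenation used in the paper; names illustrative):

import numpy as np

def sage_mean_layer(H, neighbours, W_self, W_neigh):
    # h_v <- relu(W_self h_v + W_neigh mean_{u in N(v)} h_u), then L2-normalize.
    agg = np.stack([H[nb].mean(axis=0) if len(nb) else np.zeros(H.shape[1])
                    for nb in neighbours])
    out = np.maximum(H @ W_self + agg @ W_neigh, 0)
    return out / np.maximum(np.linalg.norm(out, axis=1, keepdims=True), 1e-12)

H = np.random.randn(5, 8)                        # input node features
neighbours = [[1, 2], [0], [0, 3, 4], [2], [2]]  # sampled neighbourhoods
H1 = sage_mean_layer(H, neighbours, np.random.randn(8, 16), np.random.randn(8, 16))

Because W_self and W_neigh are shared across nodes, the trained layer can be applied to nodes and graphs never seen during training, which is what makes the approach inductive.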
  • 53. Representation Learning For graphs #2 Skip-graph: Learning graph embeddings with an encoder-decoder model John Boaz Lee, Xiangnan Kong 04 Nov 2016 (modified: 11 Jan 2017) ICLR 2017 conference submission https://openreview.net/forum?id=BkSqjHqxg&noteId=BkSqjHqxg We introduced an unsupervised method, based on the encoder-decoder model, for generating feature representations for graph-structured data. The model was evaluated on the binary classification task on several real-world datasets. The method outperformed several state-of-the-art algorithms on the tested datasets. There are several interesting directions for future work. For instance, we can try training multiple encoders on random walks generated using very different neighborhood selection strategies. This may allow the different encoders to capture different properties in the graphs. We would also like to test the approach using different neural network architectures. Finally, it would be interesting to test the method on other types of heterogeneous information networks.
• 54. Semi-supervised Learning For graphs Neural Graph Machines: Learning Neural Networks Using Graphs Thang D. Bui, Sujith Ravi, Vivek Ramavajjala University of Cambridge, United Kingdom; Google Research, Mountain View, CA, USA (Submitted on 14 Mar 2017) https://arxiv.org/abs/1703.04818 We have revisited graph-augmentation training of neural networks and proposed Neural Graph Machines as a general framework for doing so. Its label propagation (for semi-supervised CNNs see e.g. Tarvainen and Valpola 2017) objective function encourages the neural networks to make accurate node-level predictions, as in vanilla neural network training, and also constrains the networks to learn similar hidden representations for nodes connected by an edge in the graph. Importantly, the objective can be trained by stochastic gradient descent and scaled to large graphs. We validated the efficacy of the graph-augmented objective on various tasks including bloggers' interest, text category and semantic intent classification problems, using a wide range of neural network architectures (FFNNs, CNNs and LSTM RNNs). The experimental results demonstrated that graph-augmented training almost always helps to find better neural networks that outperform other techniques in predictive performance, or even much smaller networks that are faster and easier to train. Additionally, the node-level input features can be combined with graph features as inputs to the neural networks. We showed that a neural network that simply takes the adjacency matrix of a graph and produces node labels can perform better than a recently proposed two-stage approach using sophisticated graph embeddings and a linear classifier. Our framework also excels when the neural network is small, or when there is limited supervision available. While our objective can be applied to multiple graphs which come from different domains, we have not fully explored this aspect and leave it as future work. We expect the domain-specific networks can interact with the graphs to determine the importance of each domain/graph source in prediction. We also did not explore using graph regularisation for different hidden layers of the neural networks; we expect this is key for the multi-graph transfer setting (Yosinski et al., 2014). Another possible future extension is to use our objective on directed graphs, that is, to control the direction of influence between nodes during training.
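A minimal sketch of the graph-augmented objective (illustrative; the paper distinguishes labelled-labelled, labelled-unlabelled and unlabelled-unlabelled edges with separate multipliers, collapsed into a single alpha here):

import numpy as np

def ngm_loss(logits, labels, labelled_mask, H, edges, edge_weights, alpha=0.1):
    # Supervised cross-entropy on labelled nodes plus a graph regularizer
    # alpha * sum_(u,v) w_uv ||h_u - h_v||^2 pulling neighbours together.
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    sup = -np.log(p[labelled_mask, labels[labelled_mask]] + 1e-12).mean()
    u, v = edges[:, 0], edges[:, 1]
    reg = (edge_weights * ((H[u] - H[v]) ** 2).sum(axis=1)).sum()
    return sup + alpha * reg

logits = np.random.randn(6, 3); labels = np.array([0, 2, 1, 0, 0, 1])
mask = np.array([True, True, False, False, True, False])   # semi-supervised
H = np.random.randn(6, 4)                                  # hidden representations
edges = np.array([[0, 2], [2, 3], [1, 4]]); w = np.ones(3)
loss = ngm_loss(logits, labels, mask, H, edges, w)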
• 55. Recurrent Networks for graphs #1 Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks Federico Monti, Michael M. Bronstein, Xavier Bresson (Submitted on 22 Apr 2017) https://arxiv.org/abs/1704.06803 Main contribution. In this work, we treat the matrix completion problem as deep learning on graph-structured data. We introduce a novel neural network architecture that is able to extract local stationary patterns from the high-dimensional spaces of users and items, and use these meaningful representations to infer the non-linear temporal diffusion mechanism of ratings. The spatial patterns are extracted by a new CNN architecture designed to work on multiple graphs. The temporal dynamics of the rating diffusion is produced by a Long Short-Term Memory (LSTM) recurrent neural network (RNN). To our knowledge, our work is the first application of graph-based deep learning to the matrix completion problem. Recurrent GCNN (RGCNN) architecture using the full matrix completion model and operating simultaneously on the rows and columns of the matrix X. The output of the Multi-Graph CNN (MGCNN) module is a q-dimensional feature vector for each element of the input matrix. The number of parameters to learn is O(1) and the learning complexity is O(mn). Separable Recurrent GCNN (sRGCNN) architecture using the factorized matrix completion model and operating separately on the rows and columns of the factors W, Hᵀ. The output of the GCNN module is a q-dimensional feature vector for each input row/column, respectively. The number of parameters to learn is O(1) and the learning complexity is O(m + n). Evolution of the matrix X(t) with our architecture using the full matrix completion model RGCNN (top) and the factorized matrix completion model sRGCNN (bottom). Numbers indicate the RMS error. Absolute value of the first 8 spectral filters learnt by our bidimensional convolution. On the left, the first filter with the reference axes associated to the row- and column-graph eigenvalues.
• 56. Recurrent Networks for graphs #2 Learning From Graph Neighborhoods Using LSTMs Rakshit Agrawal, Luca de Alfaro, Vassilis Polychronopoulos (Submitted on 21 Nov 2016) https://arxiv.org/abs/1611.06882 https://sites.google.com/view/ml-on-structures → https://github.com/ML-on-structures/blockchain-lstm → Bitcoin blockchain data used in the paper "The approach is based on a multi-level architecture built from Long Short-Term Memory neural nets (LSTMs); the LSTMs learn how to summarize the neighborhood from data. We demonstrate the effectiveness of the proposed technique on a synthetic example and on real-world data related to crowdsourced grading, Bitcoin transactions, and Wikipedia edit reversions." The blockchain is the public immutable distributed ledger where Bitcoin transactions are recorded [20]. In Bitcoin, coins are held by addresses, which are hash values; these address identifiers are used by their owners to anonymously hold bitcoins, with ownership provable with public key cryptography. A Bitcoin transaction involves a set of source addresses and a set of destination addresses: all coins in the source addresses are gathered, and they are then sent in various amounts to the destination addresses. Mining data on the blockchain is challenging [Meiklejohn et al. 2013] due to the anonymity of addresses. We use data from the blockchain to predict whether an address will spend the funds that were deposited to it. We obtain a dataset of addresses by using a slice of the blockchain. In particular, we consider all the addresses where deposits happened in a short range of 101 blocks, from 200,000 to 200,100 (inclusive). They contain 15,709 unique addresses where deposits took place. Looking at the state of the blockchain 50,000 blocks later (which corresponds to roughly one year, as a block is mined on average every 10 minutes), 3,717 of those addresses still had funds sitting in them: we call these "hoarding addresses". The goal is to predict which addresses are hoarding addresses, and which spent the funds. We randomly split the 15,709 addresses into a training set of 10,000 and a validation set of 5,709 addresses. We built a graph with addresses as nodes, and transactions as edges. Each edge was labeled with features of the transaction: its time, the amount of funds transmitted, the number of recipients, and so forth, for a total of 9 features. We compared two different algorithms: ● Baseline: an informed guess; it guesses a label with a probability equal to its percentage in the training set. ● MLSL of depths 1, 2, 3. The outputs and memory sizes of the learners for the reported results are K2 = K3 = 3. Increasing these to 5 maintained virtually the same performance while increasing training time. Using only 1 output and memory cell did not provide any gain in performance. Quantitative Analysis of the Full Bitcoin Transaction Graph Dorit Ron, Adi Shamir Financial Cryptography 2012 http://doi.org/10.1007/978-3-642-39884-1_2
• 57. Time-series analysis with graphs #1 Spectral Algorithms for Temporal Graph Cuts Arlei Silva, Ambuj Singh, Ananthram Swami (Submitted on 15 Feb 2017) https://arxiv.org/abs/1702.04746 We propose novel formulations and algorithms for computing temporal cuts using spectral graph theory, multiplex graphs, divide-and-conquer and low-rank matrix approximation. Furthermore, we extend our formulation to dynamic graph signals, where cuts also capture node values, as graph wavelets. Experiments show that our solutions are accurate and scalable, enabling the discovery of dynamic communities and the analysis of dynamic graph processes. This work opens several lines for future investigation: (i) temporal cuts, as a general framework for solving problems involving dynamic data, can be applied in many scenarios; we are particularly interested to see how our method performs in computer vision tasks; (ii) Perturbation Theory can provide deeper theoretical insights into the properties of temporal cuts [Sole-Ribalta et al. 2013; Taylor et al. 2015]; finally, (iii) we want to study Cheeger inequalities [Chung 1996] for temporal cuts, as a means to better understand the performance of our algorithms. Temporal graph cut for a primary school network. The cut, represented as node colors, reflects the network dynamics, capturing major changes in the children's interactions.
• 58. Active learning on Graphs Active Learning for Graph Embedding Hongyun Cai, Vincent W. Zheng, Kevin Chen-Chuan Chang (Submitted on 15 May 2017) https://arxiv.org/abs/1705.05085 https://github.com/vwz/AGE In this paper, we proposed a novel active learning framework for graph embedding named Active Graph Embedding (AGE). Unlike traditional active learning algorithms, AGE processes data with structural information and learnt representations (node embeddings), and it is carefully designed to address the challenges brought by these two characteristics. First, to exploit the graphical information, a graph-centrality-based measurement is considered in addition to the popular information-entropy-based and information-density-based query criteria. Second, the active learning and graph embedding processes are run jointly by posing the label query at the end of every epoch of the graph embedding training process. Moreover, time-sensitive weights are put on the three active learning query criteria, focusing on graph centrality at the beginning and shifting the focus to the other two embedding-based criteria as the training process progresses (i.e., as more accurate embeddings are learnt).
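A loose numpy sketch of such a time-sensitive query score (the exact weighting schedule and density measure in the paper differ; everything here is illustrative): early in training the centrality criterion dominates, later the entropy and embedding-density criteria take over.

import numpy as np

def age_scores(prob, embeddings, centrality, t, t_max):
    entropy = -(prob * np.log(prob + 1e-12)).sum(axis=1)
    d = np.linalg.norm(embeddings[:, None] - embeddings[None, :], axis=-1)
    density = -d.mean(axis=1)              # central in embedding space = dense
    rank = lambda s: np.argsort(np.argsort(s)) / (len(s) - 1.0)
    gamma = t / t_max                      # shifts focus as training progresses
    return ((gamma / 2) * rank(entropy) + (gamma / 2) * rank(density)
            + (1 - gamma) * rank(centrality))

prob = np.random.dirichlet(np.ones(3), size=10)  # current class predictions
emb = np.random.randn(10, 4)                     # current node embeddings
centrality = np.random.rand(10)                  # e.g. PageRank values
query_node = np.argmax(age_scores(prob, emb, centrality, t=2, t_max=10))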
• 59. Transfer learning on Graphs Intrinsic Geometric Information Transfer Learning on Multiple Graph-Structured Datasets Jaekoo Lee, Hyunjae Kim, Jongsun Lee, Sungroh Yoon (Submitted on 15 Nov 2016 (v1), last revised 5 Dec 2016 (this version, v2)) https://arxiv.org/abs/1611.04687 Conventional CNN works on a regular grid domain (top); the proposed transfer learning framework for CNNs can transfer intrinsic geometric information obtained from a source graph domain to a target graph domain (bottom). Overview of the proposed method. Conclusion We have proposed a new transfer learning framework for deep learning on graph-structured data. Our approach can transfer the intrinsic geometric information learned from the graph representation of the source domain to the target domain. We observed that knowledge transfer between task domains is most effective when the source and target domains possess high similarity in their graph representations. We anticipate that adoption of our methodology will help extend the territory of deep learning to data in non-grid structure as well as to cases with limited quantity and quality of data. To demonstrate this, we are planning to apply our approach to diverse datasets in different domains.
• 60. Transfer learning on Graphs #2 Deep Feature Learning for Graphs Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed (Submitted on 28 Apr 2017) https://arxiv.org/abs/1704.08829 This paper presents a general graph representation learning framework called DeepGL for learning deep node and edge representations from large (attributed) graphs. In particular, DeepGL begins by deriving a set of base features (e.g., graphlet features) and automatically learns a multi-layered hierarchical graph representation where each successive layer leverages the output from the previous layer to learn features of a higher order. Contrary to previous work, DeepGL learns relational functions (each representing a feature) that generalize across networks and are therefore useful for graph-based transfer learning tasks. Moreover, DeepGL naturally supports attributed graphs, learns interpretable features, and is space-efficient (by learning sparse feature vectors). Thus, features learned by DeepGL are interpretable and naturally generalize to across-network transfer learning tasks, as they can be derived on any arbitrary graph. The framework is flexible with many interchangeable components, expressive, interpretable, parallel, and is both space- and time-efficient for large graphs, with runtime that is linear in the number of edges. DeepGL has all the following desired properties: ● Effective for attributed graphs and across-network transfer learning tasks ● Space-efficient, requiring up to 6× less memory ● Fast, with up to 182× speedup in runtime ● Accurate, with a mean improvement of 20% or more on many applications ● Parallel, with strong scaling results.
• 61. Learning Graphs learning the graph itself #1 Learning Graph While Training: An Evolving Graph Convolutional Neural Network Ruoyu Li, Junzhou Huang (Submitted on 10 Aug 2017) https://arxiv.org/abs/1708.04675 “In this paper, we propose a more general and flexible graph convolution network (EGCN) fed by a batch of arbitrarily shaped data together with their evolving graph Laplacians, trained in a supervised fashion. Extensive experiments have been conducted to demonstrate the superior performance in terms of both the acceleration of parameter fitting and the significantly improved prediction accuracy on multiple graph-structured datasets.” In this paper, we explore our approach primarily on chemical molecular datasets, although the network can be straightforwardly trained on other graph-structured data, such as point clouds, social networks and so on. Our contributions can be summarized as follows: ● A novel spectral graph convolution layer boosted by Laplacian learning (SGC-LL) has been proposed to dynamically update the residual graph Laplacians via metric learning for deep graph learning. ● Re-parametrization on the feature domain has been introduced in K-hop spectral graph convolution to enable our proposed deep graph learning and to grant graph CNNs a feature-extraction capability on graph data similar to that of classical CNNs on grid data. ● An evolving graph convolution network (EGCN) has been designed to be fed by a batch of arbitrarily shaped graph-structured data. The network is able to construct and learn, for each data sample, the graph structure that best serves the prediction part of the network. Extensive experimental results indicate the benefits from the evolving graph structure of data.
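A hedged numpy sketch of the core idea behind learning the graph (not the authors' exact SGC-LL parametrization): edge weights come from a Gaussian kernel under a trainable Mahalanobis metric M = Wd Wdᵀ, so gradients can flow into the graph itself.

import numpy as np

def metric_laplacian(X, M):
    # w_ij = exp(-d_M(x_i, x_j)^2) with d_M^2 = (x_i - x_j)^T M (x_i - x_j).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum('ijk,kl,ijl->ij', diff, M, diff)
    W = np.exp(-d2)
    np.fill_diagonal(W, 0)
    return np.diag(W.sum(axis=1)) - W

X = np.random.randn(10, 5)           # node features, e.g. atoms of a molecule
Wd = 0.1 * np.random.randn(5, 5)     # would be a trainable parameter
L = metric_laplacian(X, Wd @ Wd.T)   # recomputed every step as Wd is learned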
• 62. Graph structure as the “signal” for prediction DeepGraph: Graph Structure Predicts Network Growth Cheng Li, Xiaoxiao Guo, Qiaozhu Mei (Submitted on 20 Oct 2016) https://arxiv.org/abs/1610.06251 “Extensive experiments on five large collections of real-world networks demonstrate that the proposed prediction model significantly improves the effectiveness of existing methods, including linear or nonlinear regressors that use hand-crafted features, graph kernels, and competing deep learning methods.” Graph descriptor vs. adjacency matrix. We have described the process of converting an adjacency matrix into our graph descriptor, which is then passed through a deep neural network for further feature extraction. All computation in this process is to obtain a more effective low-level representation of the topological structure information than the original adjacency matrix. First, isomorphic graphs can be represented by many different adjacency matrices, while our graph descriptor provides a unique representation for those isomorphic graphs. The unique representation simplifies the neural network structures for network growth prediction. Second, our graph descriptor provides similar representations for graphs with similar structures. The similarity of graphs is less well preserved in the adjacency-matrix representation. Such information loss could place a great burden on deep neural networks in growth prediction tasks. Third, our graph descriptor is a universal graph structure representation which does not depend on vertex ordering or the number of vertices, while the adjacency matrix does. The motivation for adopting the Heat Kernel Signature (HKS) is its theoretically proven properties in representing graphs: HKS is an intrinsic and informative representation for graphs [31]. Intrinsicness means that isomorphic graphs map to the same HKS representation, and informativeness means that if two graphs have the same HKS representation, then they must be isomorphic. A meaningful future direction is to integrate network structure with other types of information, such as the content of information cascades in the network. A joint representation of multi-modal information may maximize the performance of particular prediction tasks.
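The HKS itself is a short computation once the Laplacian eigenpairs are available; a minimal numpy sketch (a permutation-invariant descriptor additionally needs an aggregation over vertices, e.g. sorting or histogramming, which is omitted here):

import numpy as np

def heat_kernel_signature(W, ts):
    # HKS(v, t) = sum_k exp(-lambda_k t) phi_k(v)^2 over Laplacian eigenpairs.
    L = np.diag(W.sum(axis=1)) - W
    lam, phi = np.linalg.eigh(L)
    return (phi ** 2) @ np.exp(-np.outer(lam, ts))   # rows: vertices, cols: times

W = np.random.rand(15, 15); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
hks = heat_kernel_signature(W, ts=np.logspace(-1, 1, 8))   # (15, 8) descriptor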