Processing Reachability Queries
with Realistic Constraints on
Massive Networks
Hong Cheng
The Chinese University of Hong Kong
Massive Networks are Everywhere!
Graph Reachability
 Query: can node u reach node v in a directed graph?
[Figure: an example directed graph on nodes 1-15 with two example queries, $Q = (3, 9)$ and $Q = (1, 11)$; one is answered "Yes" and the other "No".]
Graph Reachability
 Has been studied extensively in the literature
 A comprehensive survey by Yu and Cheng [1]
 Main idea
 Find all strongly connected components (SCCs) in
a graph G
 Compress G into a DAG by replacing each SCC
with a node
 Compute the edge transitive closure on the DAG
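The three steps of the main idea can be sketched as follows. This is a minimal illustration, not the method of any particular paper in the survey: it assumes vertices are numbered 0..n-1, uses Kosaraju's algorithm for the SCCs, and stores the closure as plain Python sets, which is only practical for small graphs. All function names are hypothetical.

```python
from collections import defaultdict

def strongly_connected_components(n, edges):
    """Kosaraju's algorithm (iterative). Returns (comp, count), comp[v] = SCC id."""
    adj, radj = defaultdict(list), defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)
    # Pass 1: record vertices in order of DFS finishing time.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(adj[s]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if not seen[w]), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen[nxt] = True
                stack.append((nxt, iter(adj[nxt])))
    # Pass 2: sweep the reversed graph in decreasing finishing time.
    comp, count = [-1] * n, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = count
        stack = [s]
        while stack:
            node = stack.pop()
            for w in radj[node]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1
    return comp, count

def build_closure(n, edges):
    """Compress G into a DAG of SCCs and compute its transitive closure."""
    comp, k = strongly_connected_components(n, edges)
    dag = [set() for _ in range(k)]
    for u, v in edges:
        if comp[u] != comp[v]:
            dag[comp[u]].add(comp[v])
    # Kosaraju numbers SCCs in topological order, so going from high ids
    # to low ids visits every successor before its predecessors.
    reach = [None] * k
    for c in range(k - 1, -1, -1):
        reach[c] = {c}
        for d in dag[c]:
            reach[c] |= reach[d]
    return comp, reach

def reachable(comp, reach, u, v):
    return comp[v] in reach[comp[u]]

# Hypothetical example: a 3-cycle {0, 1, 2} feeding a chain 3 -> 4, plus 5 -> 4.
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (5, 4)]
comp, reach = build_closure(6, example_edges)
print(reachable(comp, reach, 0, 4), reachable(comp, reach, 4, 0))
```

After the closure is built, each query is a single set lookup; the cost is the closure's size, which is what more compact reachability indexes try to reduce.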
Graph Reachability with Realistic
Constraints
 General reachability query is not expressive
enough, and the answers may not be
meaningful or practically feasible.
 We, for the first time, study graph reachability
when realistic constraints are imposed.
 Weight constraint [VLDBJ’13]
 Distance constraint [PVLDB’12]
Weight Constraint Reachability (WCR)
 Input: a weighted undirected graph $G = (V, E, w)$
 Query: can node u reach node v with every edge
weight on the path satisfying a constraint c?
Example: $Q = (a, g, \le 4)$
A: Yes!
Applications of WCR
 QoS routing
 Is there a route from one node to another in a
communication network, such that each link has a
bandwidth ≥ x ?
 Trip planning
 Is there a route from one city to another in a road network,
such that each segment has a speed limit within [50, 80]
miles/hour?
 Distribution network
 Is there a feasible delivery route between two locations,
such that each intermediate warehouse, storage point or
distribution center has a proper handling capacity ≥ x ?
A Straightforward Solution
 Perform a BFS/DFS from node u that follows only edges satisfying the constraint, until it reaches
node v or no unvisited nodes are left
Example: $Q = (a, g, \le 4)$
$O(m + n)$ time!
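A minimal sketch of this baseline, assuming the graph is given as an adjacency dictionary and the constraint as a predicate on a single edge weight. The function name, the helper and the small example graph are hypothetical and are not taken from the slides.

```python
from collections import deque

def wcr_bfs(adj, u, v, ok):
    """adj: {node: [(neighbor, weight), ...]}; ok: predicate on an edge weight.
    Returns True if v can be reached from u using only edges with ok(weight)."""
    if u == v:
        return True
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for nb, w in adj.get(x, []):
            if nb in seen or not ok(w):
                continue          # skip visited nodes and violating edges
            if nb == v:
                return True
            seen.add(nb)
            queue.append(nb)
    return False

def undirected(edges):
    """Build an adjacency dictionary from (u, v, weight) triples."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    return adj

# Hypothetical graph: a-b (3), b-g (4), a-d (5).
adj = undirected([("a", "b", 3), ("b", "g", 4), ("a", "d", 5)])
print(wcr_bfs(adj, "a", "g", lambda w: w <= 4))
print(wcr_bfs(adj, "a", "g", lambda w: w <= 3))
```

Every query touches up to all n nodes and m edges, which is the cost the indexes on the following slides avoid.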
A Nice Property based on MST
 Theorem
Two vertices u and v are reachable w.r.t. the weight
constraint $\le y$ in G $\Leftrightarrow$ Vertices u and v are
reachable w.r.t. the constraint $\le y$ in the MST of G.
Example: $Q = (a, g, \le 4)$, $O(m + n) \rightarrow O(n)$ time
With this property, can we further
reduce the query time and how?
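The theorem can be checked on a small example. The sketch below builds a minimum spanning forest with Kruskal's algorithm and compares constrained reachability in G with constrained reachability in the tree. The graph is hypothetical, and the choice of Kruskal's algorithm is an assumption of this sketch, not something the slides specify.

```python
from collections import deque

def kruskal_mst(nodes, edges):
    """edges: list of (u, v, w). Returns the edges of a minimum spanning forest."""
    parent = {x: x for x in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:                        # edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

def reachable_le(edges, u, v, y):
    """BFS over an undirected edge list, using only edges with weight <= y."""
    adj = {}
    for a, b, w in edges:
        if w <= y:
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    seen, queue = {u}, deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for nb in adj.get(x, []):
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return False

# Hypothetical weighted graph with 4 nodes and 5 edges.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 1), ("b", "c", 4), ("a", "c", 2), ("c", "d", 3), ("b", "d", 6)]
tree = kruskal_mst(nodes, edges)
print(reachable_le(edges, "b", "d", 3), reachable_le(tree, "b", "d", 3))
print(reachable_le(edges, "b", "d", 2), reachable_le(tree, "b", "d", 2))
```

Because the tree has only n - 1 edges, the same traversal runs in O(n) on the MST instead of O(m + n) on G.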
Proof of Theorem
 Given $G = (V, E, w)$ and its MST $T$, for any vertices $u, v \in V$, denote
$e_{\max} = \arg\max_{e \in P_T(u,v)} w(e)$
 The removal of $e_{\max}$ creates two connected components $T_u$ and $T_v$.
 Define an edge cut in $G$ as
$C_{uv} = \{ e(a,b) \in E \mid a \in T_u, b \in T_v \}$
Then $e_{\max} \in C_{uv}$ and $w(e_{\max}) = \min_{e \in C_{uv}} w(e)$
according to the cut property of minimum spanning tree.
Proof of Theorem
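 For any path $P(u,v)$, we have $P(u,v) \cap C_{uv} \ne \emptyset$.
 For any $e' \in P(u,v) \cap C_{uv}$, we have
$w(e_{\max}) \le w(e') \le P(u,v)$,
where $P(u,v)$ on the right-hand side denotes the maximum edge weight on the path.
 Thus if $w(e_{\max}) > y$, we can conclude $u$ and $v$ are
not reachable w.r.t. the constraint $\le y$.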
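The Maximum Edge Weight on MST
For any $u \in T_1, v \in T_2$,
$P_T(u,v) = \max_{e \in P_T(u,v)} w(e) = w(e_{\max}) = 4$
Given $Q = (u, v, \le y)$, if $P_T(u,v) = 4 \le y$, then yes!
[Figure: removing $e_{\max}$ (weight 4) splits the MST into $T_1$ and $T_2$.]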
This Property can be Recursively Applied
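For any $u \in T_{11}, v \in T_{12}$,
$P_T(u,v) = \max_{e \in P_T(u,v)} w(e) = w(e_{\max}) = 3$
[Figure: inside $T_1$, removing its $e_{\max}$ (weight 3) splits it into $T_{11}$ and $T_{12}$.]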
Building a Hierarchical Edge Index
Edge Index Tree
Query on the Edge Index Tree
Given $Q = (u, v, \le y)$, we compute
$P_T(u,v) = w(LCA(u,v))$
where $LCA(u,v)$ is the lowest common ancestor of $u$ and $v$ in
the edge index tree.
$LCA(u,v)$ can be computed in $O(1)$ time based on an $O(n)$-size
index.
Then we only need to test whether $P_T(u,v) = w(LCA(u,v)) \le y$ or
not to answer $Q = (u, v, \le y)$.
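A minimal sketch of an edge index tree and the LCA-based query. Two assumptions are made here that the slides do not state: the tree is built bottom-up (MST edges in increasing weight, each edge becoming the parent of the two parts it joins), which yields the same LCA weights as recursively removing the maximum edge, and the LCA is found by walking parent pointers rather than with an O(1) LCA structure. All names and the example tree are hypothetical.

```python
def build_edge_index_tree(vertices, mst_edges):
    """mst_edges: (u, v, w) triples forming a tree or forest.
    Returns (parent, weight): parent pointers of the index tree and the
    weight stored at each internal node. Leaves are the graph vertices."""
    uf = {x: x for x in vertices}       # union-find over graph vertices
    top = {x: x for x in vertices}      # index-tree root of each component
    parent = {x: None for x in vertices}
    weight = {}

    def find(x):
        while uf[x] != x:
            uf[x] = uf[uf[x]]
            x = uf[x]
        return x

    for i, (u, v, w) in enumerate(sorted(mst_edges, key=lambda e: e[2])):
        ru, rv = find(u), find(v)
        node = ("edge", i)              # new internal node for this edge
        parent[node], weight[node] = None, w
        parent[top[ru]] = parent[top[rv]] = node
        uf[ru] = rv
        top[rv] = node
    return parent, weight

def max_weight_on_path(parent, weight, u, v):
    """w(LCA(u, v)) in the index tree; None if u and v are disconnected."""
    ancestors = set()
    x = u
    while x is not None:
        ancestors.add(x)
        x = parent[x]
    x = v
    while x is not None and x not in ancestors:
        x = parent[x]
    if x is None:
        return None
    return weight.get(x, float("-inf"))  # a leaf is returned only when u == v

def wcr_query(parent, weight, u, v, y):
    m = max_weight_on_path(parent, weight, u, v)
    return m is not None and m <= y

# Hypothetical MST, chosen so that the two example queries give Yes and No.
mst = [("a", "f", 2), ("f", "d", 3), ("d", "b", 1),
       ("b", "c", 4), ("c", "g", 2), ("f", "e", 1)]
parent, weight = build_edge_index_tree("abcdefg", mst)
print(wcr_query(parent, weight, "a", "g", 4))
print(wcr_query(parent, weight, "a", "d", 2))
```

Replacing the parent-pointer walk with a constant-time LCA structure over the same tree is what brings the query down to O(1).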
Query Processing: Examples
$Q = (a, g, \le 4)$ A: Yes!
$Q = (a, d, \le 2)$ A: No!
Complexity Analysis
Query Time    Index Size    Index Time
$O(1)$        $O(n)$        $O(n)$
These bounds hold for processing queries $Q = (u, v, \le y)$ or $Q = (u, v, \ge x)$.
It can be easily extended to process $Q = (u, v, [x, y])$.
Answering WCR with a Disk-Resident
Index
 What happens if the edge index tree is too large to
fit in memory?
 Problem: answering a query costs a large constant number of random
I/O accesses if we store the edge index tree on disk
 Our solution: design a disk-resident index and an
I/O efficient algorithm to answer a WCR query.
A Vertex Coding Idea
 We pick an “arbitrary” node of an MST as the root to
get a rooted MST.
$P_T(a,g) = \max\{P_T(a,b), P_T(b,g)\} = 4$
$P_T(a,e) = \max\{P_T(a,f), P_T(f,e)\} = 2$
$P_T(u,v) = \max\{P_T(u, LCA(u,v)), P_T(v, LCA(u,v))\}$
Vertex Coding
$code(a) = \{(b,3), (d,3), (f,2)\}$
$code(h) = \{(b,4), (c,4), (g,4)\}$
$P_T(a,h) = \max\{P_T(a,b), P_T(h,b)\} = \max\{3, 4\} = 4$
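A minimal sketch of vertex coding on a rooted tree. The example tree is hypothetical: its weights were picked so that the codes of a and h come out as on the slide. Two details are assumptions of this sketch: each vertex also stores an entry for itself (with weight minus infinity) so that ancestor/descendant pairs need no special case, and the LCA is not located explicitly, since the minimum over all common ancestors is attained at the LCA.

```python
def build_codes(tree_adj, root):
    """tree_adj: {node: [(neighbor, weight), ...]} for a tree.
    Returns code[u] = {ancestor: max edge weight on the tree path u..ancestor}."""
    code = {root: {root: float("-inf")}}
    stack = [root]
    while stack:
        x = stack.pop()
        for child, w in tree_adj[x]:
            if child in code:           # already visited: this is x's parent
                continue
            # Inherit the parent's code, raising every entry to at least w.
            code[child] = {anc: max(w, m) for anc, m in code[x].items()}
            code[child][child] = float("-inf")
            stack.append(child)
    return code

def path_max(code, u, v):
    """P_T(u, v): going above the LCA can only add edges, so the smallest
    value over all common ancestors is the one at the LCA."""
    common = code[u].keys() & code[v].keys()
    return min(max(code[u][a], code[v][a]) for a in common)

# Hypothetical rooted MST (root b).
edges = [("a", "f", 2), ("f", "d", 3), ("d", "b", 1), ("b", "c", 4),
         ("c", "g", 2), ("g", "h", 4), ("f", "e", 1)]
tree_adj = {}
for u, v, w in edges:
    tree_adj.setdefault(u, []).append((v, w))
    tree_adj.setdefault(v, []).append((u, w))

code = build_codes(tree_adj, "b")
print({k: m for k, m in code["a"].items() if k != "a"})
print(path_max(code, "a", "h"), path_max(code, "a", "e"))
```

Answering a query needs only code(u) and code(v), which is why the scheme suits a disk-resident index: two reads, provided each code fits in a page.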
A Complexity Issue in Vertex Coding
 We store the code for every vertex on the disk.
 Given a query $Q = (u, v, \le y)$, $code(u)$ and $code(v)$ are
read from the disk to compute $P_T(u,v)$.
 Space complexity: $O(n \cdot depth) \subseteq O(n^2)$
 Query I/O complexity: $O(\frac{depth}{B}) \subseteq O(\frac{n}{B})$, where B is
the page size
Bound the Tree Depth by Balancing
 We will balance the rooted MST.
 Definition (Median Node)
Given an MST $T$, a node $v \in V(T)$ is a median node
of $T$, if for each neighbor $v'$ of $v$, the following holds
(reading $T_{v'}$ as the subtree that contains $v'$ after $v$ is removed):
$|V(T_{v'})| \le \frac{|V(T)|}{2}$
 A median node always exists in a tree (a tree can have two such nodes, in which case either one can be used).
We use the median node of an MST as its root. For
each subtree underneath the root, we use the median
node concept to balance the subtree recursively.
Tree Balancing: Example
Theorem
The depth of the balanced tree is at most $\log_2 n$.
Corollary
code(u) for any node u contains at most $\log_2 n$ entries,
thus can fit into one page (i.e., $\log_2 n \le B$, where
B=1024 or 4096 bytes).
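The balancing step can be sketched as a recursive median-node decomposition: pick a median node of the current subtree, record for every node of that subtree the maximum edge weight on its MST path to the median, remove the median, and recurse on the pieces. This is one possible reading of the slides, not the authors' implementation. The names, the example tree and the self entry in each code are assumptions, and depths are counted from 0 at the root.

```python
import math

def balanced_codes(tree_adj):
    """tree_adj: {node: [(neighbor, weight), ...]} for a connected tree.
    Returns (code, depth): code[u] maps each ancestor of u in the balanced
    tree (u included) to the max edge weight on the MST path between them."""
    removed = set()
    code = {u: {} for u in tree_adj}
    depth = {}

    def median(start):
        # Collect start's component (list grows while we iterate over it).
        order, par = [start], {start: None}
        for x in order:
            for nb, _ in tree_adj[x]:
                if nb not in removed and nb not in par:
                    par[nb] = x
                    order.append(nb)
        size = {x: 1 for x in order}
        for x in reversed(order):
            if par[x] is not None:
                size[par[x]] += size[x]
        total = len(order)
        for x in order:
            heaviest = total - size[x]          # the part on x's parent side
            for nb, _ in tree_adj[x]:
                if nb not in removed and par.get(nb) == x:
                    heaviest = max(heaviest, size[nb])
            if 2 * heaviest <= total:
                return x

    work = [(next(iter(tree_adj)), 0)]
    while work:
        start, d = work.pop()
        c = median(start)
        depth[c] = d
        best = {c: float("-inf")}               # max edge weight from c
        stack = [c]
        while stack:
            x = stack.pop()
            code[x][c] = best[x]
            for nb, w in tree_adj[x]:
                if nb not in removed and nb not in best:
                    best[nb] = max(best[x], w)
                    stack.append(nb)
        removed.add(c)
        work.extend((nb, d + 1) for nb, _ in tree_adj[c] if nb not in removed)
    return code, depth

def path_max(code, u, v):
    common = code[u].keys() & code[v].keys()
    return min(max(code[u][a], code[v][a]) for a in common)

# Hypothetical MST with 8 nodes.
edges = [("a", "f", 2), ("f", "d", 3), ("d", "b", 1), ("b", "c", 4),
         ("c", "g", 2), ("g", "h", 4), ("f", "e", 1)]
tree_adj = {}
for u, v, w in edges:
    tree_adj.setdefault(u, []).append((v, w))
    tree_adj.setdefault(v, []).append((u, w))

code, depth = balanced_codes(tree_adj)
print(max(depth.values()), math.log2(len(tree_adj)))
print(path_max(code, "a", "h"), path_max(code, "a", "e"))
```

Each removal at least halves the subtree, so no node has more than about log2(n) ancestors, which is what keeps every code small enough for a single page.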
Complexity Analysis
          Query Time    Index Size       Index Time
Memory    $O(1)$        $O(n)$           $O(n)$
Disk      2 I/Os        $O(n \log n)$    $O(n \log n)$
These bounds hold for processing queries $Q = (u, v, \le y)$ or $Q = (u, v, \ge x)$.
It can be easily extended to process $Q = (u, v, [x, y])$.
Experiments
 Network statistics
Network                 |V|           |E|
Facebook New Orleans    63,731        440,384
USARN                   23,947,347    29,166,672
Experiment Settings
 2.67 GHz CPU, 12 GB memory; 10,000 test queries
 Memory-based methods
 BFS/DFS on graph
 MST-Index
 Edge-Index
 Disk-based methods
 External BFS/DFS on graph
 External MST
 Balanced Tree Index
Memory-based Algorithms: Query Time
Query Time in Microseconds ($10^{-6}$ seconds)
Network     DFS       BFS       MST-Index    Edge-Index
Facebook    1,098     1,429     1            1
USARN       32,462    30,868    1,382        4
Memory-based Algorithms: Index Size
Index Size in GB
Network     DFS     BFS     MST-Index    Edge-Index
Facebook    0.01    0.01    0.0008       0.0025
USARN       0.89    0.89    0.28         0.95
Memory-based Algorithms: Index Time
Index Time in Seconds
Network     DFS     BFS     MST-Index    Edge-Index
Facebook    0.4     0.4     0.03         0.06
USARN       33.7    33.7    9.9          39.2
Disk-based Algorithms: Query Time
Query Time in Microseconds ($10^{-6}$ seconds)
Network     Ext-DFS    Ext-BFS    Ext-MST    Balanced-Index
Facebook    31,368     48,152     772        11
USARN       294,521    64,471     422,810    18
Disk-based Algorithms: Index Size
Index Size in GB
Network     Ext-DFS    Ext-BFS    Ext-MST    Balanced-Index
Facebook    0.01       0.01       0.0008     0.0035
USARN       0.89       0.89       0.28       0.52
Disk-based Algorithms: Index Time
Index Time in Seconds
Network     Ext-DFS    Ext-BFS    Ext-MST    Balanced-Index
Facebook    0.6        0.6        0.048      0.146
USARN       48.8       48.8       12.2       118.8
Summary and Contribution
 The first study on WCR query
 Computing Weight Constraint Reachability in
Large Networks. The VLDB Journal, 22(3):275-
294, 2013.
 Design two novel and efficient solutions
 Memory: edge index tree for O(1) query time
 Disk: balanced tree + vertex coding for 2 I/O
query cost
K-Hop Reachability (K-Reach)
 Input: an unweighted directed graph
 Query: can node u reach node v via a path of length
no more than k?
$Q: a \to_3 f$
A: Yes!
$Q: a \to_3 g$
A: No!
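For reference, a k-hop query can always be answered without an index by a BFS that stops expanding after k hops. The sketch below is that baseline only, not the K-Reach index; the function name and the example graph are hypothetical.

```python
from collections import deque

def k_hop_reachable(adj, u, v, k):
    """adj: {node: [successor, ...]} of a directed graph.
    Returns True if v is reachable from u by a path of at most k edges."""
    if u == v:
        return True
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == k:          # do not expand beyond k hops
            continue
        for nb in adj.get(x, []):
            if nb not in dist:
                if nb == v:
                    return True
                dist[nb] = dist[x] + 1
                queue.append(nb)
    return False

# Hypothetical chain a -> b -> c -> f -> g.
adj = {"a": ["b"], "b": ["c"], "c": ["f"], "f": ["g"]}
print(k_hop_reachable(adj, "a", "f", 3))
print(k_hop_reachable(adj, "a", "g", 3))
```

On a large graph the k-hop neighborhood of u can already be huge for small k, which motivates precomputing an index.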
Applications of K-Reach
 In a wireless or sensor network, where a broadcasted
message may get lost during any hop, the probability
of reception degrades exponentially over multiple
hops.
 In social networks, the degree of acquaintance may
even decrease super-exponentially (i.e., two persons
may hardly know each other if they are just 3 hops
apart).
 K-Reach is helpful since it can model the level and
sphere of influence.
Vertex Cover
 A set of vertices $S \subseteq V$ is a vertex cover of a graph
$G = (V, E)$, if for every edge $(u,v) \in E$, we have
$\{u, v\} \cap S \ne \emptyset$.
 The problem of computing the minimum vertex cover
is NP-hard.
 But there is a polynomial time algorithm for
computing a 2-approximate minimum vertex cover.
A Vertex Cover-based Index
K-Reach Index
for k=3
Query Processing
 Given $Q: u \to_k v$, there are four cases:
 Case 1: $u \in S, v \in S$
 Case 2: $u \in S, v \notin S$
 Case 3: $u \notin S, v \in S$
 Case 4: $u \notin S, v \notin S$
Case 1: $u \in S, v \in S$
 Let k=3, $u = b \in S$
 if $v = g \in S$, we have $b \to_3 g$
 if $v = i \in S$, we have $b \to_3 i$
Case 2: $u \in S, v \notin S$
 Let k=3, $u = d \in S$
 if $v = h \notin S$, we have $d \to_3 h$
 if $v = j \notin S$, we have $d \to_3 j$
Case 3: $u \notin S, v \in S$
 Let k=3, $u = a \notin S$
 if $v = d \in S$, we have $a \to_3 d$
 if $v = g \in S$, we have $a \to_3 g$
Case 4: $u \notin S, v \notin S$
 Let k=3, $u = c \notin S$
 if $v = f \notin S$, we have $c \to_3 f$
 if $v = h \notin S$, we have $c \to_3 h$
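The four cases can be sketched in code as follows. This is a simplified illustration under stated assumptions, not the published index: for every cover vertex it stores the exact hop distance to each cover vertex within k hops, whereas the real K-Reach index may store this information more compactly. It relies on S being a vertex cover of the graph with edge directions ignored, so every neighbor of a vertex outside S lies in S. All names and the example graph are hypothetical.

```python
from collections import deque

def build_kreach_index(adj, cover, k):
    """index[u][w] = hop distance from u to w, for u, w in the cover and
    distance <= k (one BFS per cover vertex, truncated at depth k)."""
    index = {}
    for u in cover:
        dist = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            if dist[x] == k:
                continue
            for nb in adj.get(x, []):
                if nb not in dist:
                    dist[nb] = dist[x] + 1
                    queue.append(nb)
        index[u] = {w: d for w, d in dist.items() if w in cover}
    return index

def kreach_query(adj, radj, cover, index, u, v, k):
    """adj / radj: successor and predecessor lists of the directed graph."""
    if u == v:
        return True
    inf = float("inf")
    if u in cover and v in cover:        # Case 1: look up the pair directly
        return index[u].get(v, inf) <= k
    if u in cover:                       # Case 2: last hop enters v
        return any(index[u].get(w, inf) <= k - 1 for w in radj.get(v, []))
    if v in cover:                       # Case 3: first hop leaves u
        return any(index[w].get(v, inf) <= k - 1 for w in adj.get(u, []))
    return any(index[w1].get(w2, inf) <= k - 2   # Case 4: both hops
               for w1 in adj.get(u, []) for w2 in radj.get(v, []))

# Hypothetical chain a -> b -> c -> d -> e with cover S = {b, d}.
adj = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
radj = {"b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"]}
cover = {"b", "d"}
index = build_kreach_index(adj, cover, 3)
print(kreach_query(adj, radj, cover, index, "a", "d", 3))
print(kreach_query(adj, radj, cover, index, "a", "e", 3))
```

Only cover vertices carry index entries; a query from or to a vertex outside S is reduced to lookups among its neighbors, which is where the degree terms in the query cost come from.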
Complexity Analysis
 Index construction
 2-approximate minimum vertex cover: $O(m + n)$
 K-Reach index: $O(\sum_{u \in S} |G_k(u)|)$
 Query processing
 Case 1: $O(\log \deg_{out}(u, I))$
 Case 2: $O(\deg_{out}(u, I) + \deg_{in}(v, G))$
 Case 3: $O(\deg_{out}(u, G) + \deg_{in}(v, I))$
 Case 4: $O(\sum_{w \in outNei(u, G)} (\deg_{out}(w, I) + \deg_{in}(v, G)))$
Experiments
 For processing k-hop reachability queries
 For processing classic reachability queries
(setting k=n)
Network Statistics
K-Reach: Query Processing Time
Query Breakdown
Classic Reachability: Query Processing Time
Index Construction Time
Index Size
Overall Performance
Summary and Contribution
 The first study on K-Reach query
 K-Reach: Who is in Your Small World. Proceedings of
the VLDB Endowment, 5(11):1292-1303, 2012.
 An efficient vertex cover-based index can
answer both classic reachability and k-hop
reachability queries
Conclusions
 We study two graph reachability queries,
WCR and K-Reach, when realistic constraints
are imposed. This makes the answers to the
queries more meaningful and practically
useful in many applications.
 We exploit the nice property for each query
type and design efficient indices for
processing these two types of queries.
Joint work with (in alphabetical order)
 Lijun Chang
 James Cheng
 Miao Qiao
 Lu Qin
 Zechao Shang
 Haixun Wang
 Jeffrey Xu Yu
 Philip S. Yu
References
[1] Jeffrey Xu Yu, Jiefeng Cheng: Graph Reachability
Queries: A Survey. Managing and Mining Graph
Data 2010: 181-215
[2] Miao Qiao, Hong Cheng, Lu Qin, Jeffrey Xu Yu,
Philip S. Yu, Lijun Chang: Computing weight
constraint reachability in large networks. VLDB J.
22(3): 275-294 (2013)
[3] James Cheng, Zechao Shang, Hong Cheng,
Haixun Wang, Jeffrey Xu Yu: K-Reach: Who is in
Your Small World. PVLDB 5(11): 1292-1303 (2012)

Mais conteúdo relacionado

Mais procurados

Lattice Based Cryptography - GGH Cryptosystem
Lattice Based Cryptography - GGH CryptosystemLattice Based Cryptography - GGH Cryptosystem
Lattice Based Cryptography - GGH CryptosystemVarun Janga
 
N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...
N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...
N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...Masumi Shirakawa
 
A Signature Scheme as Secure as the Diffie Hellman Problem
A Signature Scheme as Secure as the Diffie Hellman ProblemA Signature Scheme as Secure as the Diffie Hellman Problem
A Signature Scheme as Secure as the Diffie Hellman Problemvsubhashini
 
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...Masahiro Suzuki
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaFerdinand Jamitzky
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsAlbert Bifet
 
Fast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in PracticeFast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in PracticeRakuten Group, Inc.
 
(DL hacks輪読)Bayesian Neural Network
(DL hacks輪読)Bayesian Neural Network(DL hacks輪読)Bayesian Neural Network
(DL hacks輪読)Bayesian Neural NetworkMasahiro Suzuki
 
[241]large scale search with polysemous codes
[241]large scale search with polysemous codes[241]large scale search with polysemous codes
[241]large scale search with polysemous codesNAVER D2
 
First Place Memocode'14 Design Contest Entry
First Place Memocode'14 Design Contest EntryFirst Place Memocode'14 Design Contest Entry
First Place Memocode'14 Design Contest EntryKevin Townsend
 
Large scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using sparkLarge scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using sparkMila, Université de Montréal
 
Svm map reduce_slides
Svm map reduce_slidesSvm map reduce_slides
Svm map reduce_slidesSara Asher
 
(研究会輪読) Weight Uncertainty in Neural Networks
(研究会輪読) Weight Uncertainty in Neural Networks(研究会輪読) Weight Uncertainty in Neural Networks
(研究会輪読) Weight Uncertainty in Neural NetworksMasahiro Suzuki
 
CNN Attention Networks
CNN Attention NetworksCNN Attention Networks
CNN Attention NetworksTaeoh Kim
 
Shor's discrete logarithm quantum algorithm for elliptic curves
 Shor's discrete logarithm quantum algorithm for elliptic curves Shor's discrete logarithm quantum algorithm for elliptic curves
Shor's discrete logarithm quantum algorithm for elliptic curvesXequeMateShannon
 
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...Preferred Networks
 

Mais procurados (20)

Lattice Based Cryptography - GGH Cryptosystem
Lattice Based Cryptography - GGH CryptosystemLattice Based Cryptography - GGH Cryptosystem
Lattice Based Cryptography - GGH Cryptosystem
 
N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...
N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...
N-gram IDF: A Global Term Weighting Scheme Based on Information Distance (WWW...
 
Aaex3 group2
Aaex3 group2Aaex3 group2
Aaex3 group2
 
A Signature Scheme as Secure as the Diffie Hellman Problem
A Signature Scheme as Secure as the Diffie Hellman ProblemA Signature Scheme as Secure as the Diffie Hellman Problem
A Signature Scheme as Secure as the Diffie Hellman Problem
 
presentation
presentationpresentation
presentation
 
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
(DL hacks輪読) How to Train Deep Variational Autoencoders and Probabilistic Lad...
 
C07.heaps
C07.heapsC07.heaps
C07.heaps
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cuda
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
 
Fast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in PracticeFast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in Practice
 
(DL hacks輪読)Bayesian Neural Network
(DL hacks輪読)Bayesian Neural Network(DL hacks輪読)Bayesian Neural Network
(DL hacks輪読)Bayesian Neural Network
 
[241]large scale search with polysemous codes
[241]large scale search with polysemous codes[241]large scale search with polysemous codes
[241]large scale search with polysemous codes
 
First Place Memocode'14 Design Contest Entry
First Place Memocode'14 Design Contest EntryFirst Place Memocode'14 Design Contest Entry
First Place Memocode'14 Design Contest Entry
 
Large scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using sparkLarge scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using spark
 
Svm map reduce_slides
Svm map reduce_slidesSvm map reduce_slides
Svm map reduce_slides
 
(研究会輪読) Weight Uncertainty in Neural Networks
(研究会輪読) Weight Uncertainty in Neural Networks(研究会輪読) Weight Uncertainty in Neural Networks
(研究会輪読) Weight Uncertainty in Neural Networks
 
CNN Attention Networks
CNN Attention NetworksCNN Attention Networks
CNN Attention Networks
 
Shor's discrete logarithm quantum algorithm for elliptic curves
 Shor's discrete logarithm quantum algorithm for elliptic curves Shor's discrete logarithm quantum algorithm for elliptic curves
Shor's discrete logarithm quantum algorithm for elliptic curves
 
Density based clustering
Density based clusteringDensity based clustering
Density based clustering
 
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
 

Destaque

Large Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
Large Graph Mining – Patterns, tools and cascade analysis by Christos FaloutsosLarge Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
Large Graph Mining – Patterns, tools and cascade analysis by Christos FaloutsosBigMine
 
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...BigMine
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 Big & Personal: the data and the models behind Netflix recommendations by Xa... Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...BigMine
 
Gut vernetzt: Skalierbares Graph Mining für Business Intelligence
Gut vernetzt: Skalierbares Graph Mining für Business IntelligenceGut vernetzt: Skalierbares Graph Mining für Business Intelligence
Gut vernetzt: Skalierbares Graph Mining für Business IntelligenceMartin Junghanns
 
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...Martin Junghanns
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like YouSalford Systems
 

Destaque (6)

Large Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
Large Graph Mining – Patterns, tools and cascade analysis by Christos FaloutsosLarge Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
Large Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
 
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 Big & Personal: the data and the models behind Netflix recommendations by Xa... Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 
Gut vernetzt: Skalierbares Graph Mining für Business Intelligence
Gut vernetzt: Skalierbares Graph Mining für Business IntelligenceGut vernetzt: Skalierbares Graph Mining für Business Intelligence
Gut vernetzt: Skalierbares Graph Mining für Business Intelligence
 
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You
 

Semelhante a Processing Reachability Queries with Realistic Constraints on Massive Networks by Hong Cheng

Sequential and parallel algorithm to find maximum flow on extended mixed netw...
Sequential and parallel algorithm to find maximum flow on extended mixed netw...Sequential and parallel algorithm to find maximum flow on extended mixed netw...
Sequential and parallel algorithm to find maximum flow on extended mixed netw...csandit
 
SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...
SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...
SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...cscpconf
 
Accelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPUAccelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPUDavide Nardone
 
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)MeetupDataScienceRoma
 
04 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x204 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x2MuradAmn
 
Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Usatyuk Vasiliy
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphstuxette
 
Compressed learning for time series classification
Compressed learning for time series classificationCompressed learning for time series classification
Compressed learning for time series classification學翰 施
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...IJMIT JOURNAL
 
International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)IJMIT JOURNAL
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...IJMIT JOURNAL
 
Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...
Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...
Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...Vladimir Kulyukin
 
MVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priorsMVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priorsElvis DOHMATOB
 
Graphical Model Selection for Big Data
Graphical Model Selection for Big DataGraphical Model Selection for Big Data
Graphical Model Selection for Big DataAlexander Jung
 

Semelhante a Processing Reachability Queries with Realistic Constraints on Massive Networks by Hong Cheng (20)

Sequential and parallel algorithm to find maximum flow on extended mixed netw...
Sequential and parallel algorithm to find maximum flow on extended mixed netw...Sequential and parallel algorithm to find maximum flow on extended mixed netw...
Sequential and parallel algorithm to find maximum flow on extended mixed netw...
 
SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...
SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...
SEQUENTIAL AND PARALLEL ALGORITHM TO FIND MAXIMUM FLOW ON EXTENDED MIXED NETW...
 
rgDefense
rgDefensergDefense
rgDefense
 
Accelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPUAccelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPU
 
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
 
04 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x204 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x2
 
Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
 
Compressed learning for time series classification
Compressed learning for time series classificationCompressed learning for time series classification
Compressed learning for time series classification
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...
 
International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...
 
cdrw
cdrwcdrw
cdrw
 
Dynamic programming
Dynamic programmingDynamic programming
Dynamic programming
 
Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...
Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...
Connect-the-Dots in a Graph and Buffon's Needle on a Chessboard: Two Problems...
 
MVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priorsMVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priors
 
Graphical Model Selection for Big Data
Graphical Model Selection for Big DataGraphical Model Selection for Big Data
Graphical Model Selection for Big Data
 
Scribed lec8
Scribed lec8Scribed lec8
Scribed lec8
 
04 greedyalgorithmsii
04 greedyalgorithmsii04 greedyalgorithmsii
04 greedyalgorithmsii
 
Triggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphsTriggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphs
 

Mais de BigMine

Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...
Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...
Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...BigMine
 
From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...
From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...
From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...BigMine
 
Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16
Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16
Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16BigMine
 
Big Data and Small Devices by Katharina Morik
Big Data and Small Devices by Katharina MorikBig Data and Small Devices by Katharina Morik
Big Data and Small Devices by Katharina MorikBigMine
 
Exact Data Reduction for Big Data by Jieping Ye
Exact Data Reduction for Big Data by Jieping YeExact Data Reduction for Big Data by Jieping Ye
Exact Data Reduction for Big Data by Jieping YeBigMine
 
Unexpected Challenges in Large Scale Machine Learning by Charles Parker
 Unexpected Challenges in Large Scale Machine Learning by Charles Parker Unexpected Challenges in Large Scale Machine Learning by Charles Parker
Unexpected Challenges in Large Scale Machine Learning by Charles ParkerBigMine
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...BigMine
 

Processing Reachability Queries with Realistic Constraints on Massive Networks by Hong Cheng

  • 1. Processing Reachability Queries with Realistic Constraints on Massive Networks. Hong Cheng, The Chinese University of Hong Kong
  • 2. Massive Networks are Everywhere!
  • 3. Graph Reachability. Query: can node u reach node v in a directed graph? The figure shows a 15-node directed graph with two example queries, $Q = (3, 9)$ and $Q = (1, 11)$, one answered Yes and the other No.
  • 4. Graph Reachability. Has been studied extensively in the literature; a comprehensive survey is given by Yu and Cheng [1]. Main idea: (1) find all strongly connected components (SCCs) in a graph G; (2) compress G into a DAG by replacing each SCC with a node; (3) compute the edge transitive closure on the DAG.
  • 5. Graph Reachability with Realistic Constraints. A general reachability query is not expressive enough, and its answers may not be meaningful or practically feasible. We, for the first time, study graph reachability when realistic constraints are imposed: a weight constraint [VLDBJ’13] and a distance constraint [PVLDB’12].
  • 6. Weight Constraint Reachability (WCR). Input: a weighted undirected graph $G = (V, E, w)$. Query: can node u reach node v with every edge weight on the path satisfying a constraint c? Example: $Q = (a, g, \le 4)$. A: Yes!
  • 7. Applications of WCR. QoS routing: is there a route from one node to another in a communication network, such that each link has a bandwidth ≥ x? Trip planning: is there a route from one city to another in a road network, such that each segment has a speed limit within [50, 80] miles/hour? Distribution network: is there a feasible delivery route between two locations, such that each intermediate warehouse, storage point or distribution center has a proper handling capacity ≥ x?
  • 8. A Straightforward Solution. Perform a BFS/DFS search from node u until it reaches node v or no unvisited nodes are left. Example: $Q = (a, g, \le 4)$. This takes $O(m + n)$ time!
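To make the straightforward solution of slide 8 concrete, here is a minimal Python sketch of a BFS that only follows edges satisfying the constraint ≤ y. The adjacency-list layout, the function names `build_adj` and `wcr_bfs`, and the toy graph are assumptions made for this illustration; they do not come from the slides.

```python
from collections import deque

def build_adj(edges):
    """Build an undirected adjacency list from (a, b, weight) triples."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    return adj

def wcr_bfs(adj, u, v, y):
    """Return True if v is reachable from u using only edges of weight <= y."""
    if u == v:
        return True
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for nbr, w in adj.get(x, ()):
            # Skip edges that violate the constraint and nodes already visited.
            if w <= y and nbr not in seen:
                if nbr == v:
                    return True
                seen.add(nbr)
                queue.append(nbr)
    return False

# Hypothetical toy graph (not the one drawn on the slides).
edges = [("a", "b", 3), ("b", "c", 5), ("c", "d", 2), ("a", "d", 7)]
adj = build_adj(edges)
```

Each query may touch every node and edge, which is the $O(m + n)$ cost that the index structures on the following slides are designed to avoid.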
  • 9. A Nice Property based on MST. Theorem: two vertices u and v are reachable w.r.t. the weight constraint ≤ y in G $\Leftrightarrow$ vertices u and v are reachable w.r.t. the constraint ≤ y in the MST of G. For $Q = (a, g, \le 4)$, the search cost drops from $O(m + n)$ to $O(n)$ time. With this property, can we further reduce the query time, and how?
  • 10. Proof of Theorem. Given $G = (V, E, w)$ and its MST $T$, for any vertices $u, v \in V$, denote $e_{\max} = \arg\max_{e \in P_T(u, v)} w(e)$. The removal of $e_{\max}$ creates two connected components $T_u$ and $T_v$. Define an edge cut in $G$ as $C_{uv} = \{e(a, b) \in E \mid a \in T_u, b \in T_v\}$. Then $e_{\max} \in C_{uv}$ and $w(e_{\max}) = \min_{e \in C_{uv}} w(e)$, according to the cut property of the minimum spanning tree.
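  • 11. Proof of Theorem. For any path $P(u, v)$, we have $P(u, v) \cap C_{uv} \ne \emptyset$. For any $e' \in P(u, v) \cap C_{uv}$, we have $w(e_{\max}) \le w(e') \le P(u, v)$, where $P(u, v)$ also stands for the maximum edge weight on the path. Thus if $w(e_{\max}) > y$, we can conclude that $u$ and $v$ are not reachable w.r.t. the constraint $\le y$.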
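The theorem on slide 9 can be checked empirically on a small graph. The sketch below builds an MST with Kruskal's algorithm (the slides do not say which MST algorithm the authors use) and compares weight-constrained connectivity in G and in the MST for every pair of vertices and every threshold. The helper names and the example graph are assumptions made for this illustration.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def kruskal_mst(n, edges):
    """Return the MST (or spanning forest) edges of an undirected graph."""
    parent = list(range(n))
    mst = []
    for a, b, w in sorted(edges, key=lambda e: e[2]):
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb
            mst.append((a, b, w))
    return mst

def reachable(n, edges, u, v, y):
    """True if u and v are connected using only edges of weight <= y."""
    parent = list(range(n))
    for a, b, w in edges:
        if w <= y:
            parent[find(parent, a)] = find(parent, b)
    return find(parent, u) == find(parent, v)

# Hypothetical 5-node graph given as (a, b, weight) triples.
G = [(0, 1, 4), (1, 2, 1), (0, 2, 3), (2, 3, 6), (3, 4, 2), (1, 4, 5)]
T = kruskal_mst(5, G)

# The answers on G and on its MST agree for every query (u, v, <= y).
same = all(
    reachable(5, G, u, v, y) == reachable(5, T, u, v, y)
    for u in range(5) for v in range(5) for y in range(8)
)
```

Because the MST has only n - 1 edges, answering the query on T instead of G is what turns the $O(m + n)$ search into an $O(n)$ one.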
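  • 12. The Maximum Edge Weight on MST. For any $u \in T_1, v \in T_2$, $P_T(u, v) = \max_{e \in P_T(u, v)} w(e) = w(e_{\max}) = 4$. Given $Q = (u, v, \le y)$, if $P_T(u, v) = 4 \le y$, then yes! (Figure: removing $e_{\max}$, of weight 4, splits the MST into $T_1$ and $T_2$.)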
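  • 13. This Property can be Recursively Applied. For any $u \in T_{11}, v \in T_{12}$, $P_T(u, v) = \max_{e \in P_T(u, v)} w(e) = w(e_{\max}) = 3$. (Figure: within $T_1$, removing $e_{\max}$, of weight 3, gives $T_{11}$ and $T_{12}$.)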
  • 14. Building a Hierarchical Edge Index (figure: the edge index tree).
  • 15. Query on the Edge Index Tree. Given $Q = (u, v, \le y)$, we compute $P_T(u, v) = w(LCA(u, v))$, where $LCA(u, v)$ is the lowest common ancestor of $u$ and $v$ in the edge index tree. $LCA(u, v)$ can be computed in $O(1)$ time using an $O(n)$-size index. Then we only need to test whether $P_T(u, v) = w(LCA(u, v)) \le y$ to answer $Q = (u, v, \le y)$.
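The following Python sketch illustrates the edge index tree of slides 12 to 15. It assumes the tree can be built bottom-up in Kruskal order, which should give the same hierarchy as repeatedly removing the maximum-weight MST edge when edge weights are distinct. For brevity it finds the LCA by walking parent pointers rather than with the $O(1)$-time, $O(n)$-size LCA index mentioned on the slide. All names and the example graph are hypothetical.

```python
def build_edge_index_tree(n, edges):
    """Leaves 0..n-1 are graph vertices; each internal node is one MST edge.

    Returns (tree_parent, weight): tree_parent[x] is the parent of x in the
    index tree (None for a root); weight[x] is the weight of internal node x.
    """
    size = 2 * n - 1
    top = list(range(size))       # union-find: current top tree node of a component
    tree_parent = [None] * size
    weight = [None] * size
    next_id = n

    def find(x):
        while top[x] != x:
            top[x] = top[top[x]]
            x = top[x]
        return x

    for a, b, w in sorted(edges, key=lambda e: e[2]):
        ra, rb = find(a), find(b)
        if ra != rb:              # (a, b) is an MST edge: it becomes the parent of both sides
            top[ra] = top[rb] = next_id
            tree_parent[ra] = tree_parent[rb] = next_id
            weight[next_id] = w
            next_id += 1
    return tree_parent, weight

def wcr_query(tree_parent, weight, u, v, y):
    """Answer Q = (u, v, <= y) by testing w(LCA(u, v)) <= y (naive LCA walk)."""
    if u == v:
        return True
    ancestors = set()
    x = u
    while x is not None:
        ancestors.add(x)
        x = tree_parent[x]
    x = v
    while x is not None and x not in ancestors:
        x = tree_parent[x]
    # x is None when u and v lie in different connected components.
    return x is not None and weight[x] <= y

# Hypothetical 5-node graph given as (a, b, weight) triples.
G = [(0, 1, 4), (1, 2, 1), (0, 2, 3), (2, 3, 6), (3, 4, 2), (1, 4, 5)]
tree_parent, weight = build_edge_index_tree(5, G)
```

In this example the root of the index tree is the weight-5 edge, so any query between {0, 1, 2} and {3, 4} is answered by comparing y with 5.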
  • 16. Query Processing: Examples. $Q = (a, g, \le 4)$, A: Yes! $Q = (a, d, \le 2)$, A: No!
  • 17. Complexity Analysis. Query time: $O(1)$; index size: $O(n)$; index time: $O(n)$, to process queries $Q = (u, v, \le y)$ or $Q = (u, v, \ge x)$. It can be easily extended to process $Q = (u, v, [x, y])$.
  • 18. Answering WCR with a Disk-Resident Index. What happens if the edge index tree is too large to fit in memory? Problem: if we store the edge index tree on disk, each query costs a large constant number of random I/O accesses. Our solution: design a disk-resident index and an I/O-efficient algorithm to answer a WCR query.
  • 19. A Vertex Coding Idea. We pick an “arbitrary” node of an MST as the root to get a rooted MST. Examples: $P_T(a, g) = \max\{P_T(a, b), P_T(b, g)\} = 4$ and $P_T(a, e) = \max\{P_T(a, f), P_T(f, e)\} = 2$. In general, $P_T(u, v) = \max\{P_T(u, LCA(u, v)), P_T(v, LCA(u, v))\}$.
  • 20. Vertex Coding. $code(a) = \{(b, 3), (d, 3), (f, 2)\}$ and $code(h) = \{(b, 4), (c, 4), (g, 4)\}$, so $P_T(a, h) = \max\{P_T(a, b), P_T(h, b)\} = \max\{3, 4\} = 4$.
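A possible reading of the vertex coding idea on slides 19 and 20, sketched in Python: each vertex stores, for every ancestor in the rooted MST, the maximum edge weight on the path up to that ancestor, and a query combines two codes. The dictionary layout and the names `build_codes` and `path_max` are assumptions, and the small rooted tree is hypothetical. Instead of locating the LCA explicitly, the sketch uses the fact that the LCA is the common ancestor that minimizes the larger of the two stored weights.

```python
def build_codes(parent):
    """parent maps each non-root node to (parent_node, edge_weight) in a rooted MST.

    code[u] maps every proper ancestor of u to the maximum edge weight on the
    tree path from u up to that ancestor.
    """
    nodes = set(parent) | {p for p, _ in parent.values()}
    code = {}
    for u in nodes:
        entries, best, x = {}, float("-inf"), u
        while x in parent:
            x, w = parent[x]
            best = max(best, w)
            entries[x] = best
        code[u] = entries
    return code

def path_max(code_u, code_v, u, v):
    """Maximum edge weight P_T(u, v) for distinct u, v in the same tree."""
    # Every common ancestor gives an upper bound; the LCA gives the smallest one.
    candidates = [max(code_u[x], code_v[x]) for x in code_u.keys() & code_v.keys()]
    if v in code_u:               # v is an ancestor of u
        candidates.append(code_u[v])
    if u in code_v:               # u is an ancestor of v
        candidates.append(code_v[u])
    return min(candidates)

# The two codes shown on slide 20.
code_a = {"b": 3, "d": 3, "f": 2}
code_h = {"b": 4, "c": 4, "g": 4}
slide_answer = path_max(code_a, code_h, "a", "h")

# A hypothetical rooted tree: b is the root, with chains b-f-a and b-g-h.
codes = build_codes({"f": ("b", 3), "a": ("f", 2), "g": ("b", 4), "h": ("g", 1)})
```

With the slide's codes, the only common ancestor is b, so the result is max{3, 4} = 4, matching the slide.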
  • 21. A Complexity Issue in Vertex Coding. We store the code for every vertex on the disk. Given a query $Q = (u, v, \le y)$, $code(u)$ and $code(v)$ are read from the disk to compute $P_T(u, v)$. Space complexity: $O(n \cdot depth) \subseteq O(n^2)$. Query I/O complexity: $O(depth / B) \subseteq O(n / B)$, where B is the page size.
  • 22. Bound the Tree Depth by Balancing. We will balance the rooted MST. Definition (Median Node): given an MST $T$, a node $v \in V(T)$ is a median node of $T$ if, for each neighbor $v'$ of $v$, $|V(T_{v'})| \le |V(T)| / 2$, where $T_{v'}$ denotes the subtree on the $v'$ side of $v$. The median node always and uniquely exists in a tree. We use the median node of an MST as its root. For each subtree underneath the root, we use the median node concept to balance the subtree recursively.
  • 23. Tree Balancing: Example. Theorem: the depth of the balanced tree is at most $\log_2 n$. Corollary: $code(u)$ for any node $u$ contains at most $\log_2 n$ entries, and thus can fit into one page (i.e., $\log_2 n \le B$, where B = 1024 or 4096 bytes).
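The median node of slide 22 can be found with one traversal that computes subtree sizes. The Python sketch below assumes the tree is given as an adjacency dictionary; the name `find_median` and the example trees are hypothetical. It returns the first node whose removal leaves no component larger than half the tree.

```python
def find_median(adj):
    """Return a node whose removal leaves components of size <= n / 2."""
    nodes = list(adj)
    n = len(nodes)
    root = nodes[0]
    parent = {root: None}
    order = []
    stack = [root]
    while stack:                          # iterative DFS, parents before children
        x = stack.pop()
        order.append(x)
        for nb in adj[x]:
            if nb != parent[x]:
                parent[nb] = x
                stack.append(nb)
    size = {x: 1 for x in nodes}
    for x in reversed(order):             # children are processed before parents
        if parent[x] is not None:
            size[parent[x]] += size[x]
    for x in nodes:
        largest = n - size[x]             # the component on the parent's side
        for nb in adj[x]:
            if nb != parent[x]:
                largest = max(largest, size[nb])
        if 2 * largest <= n:
            return x

# Hypothetical examples: a path 0-1-...-6 and a star centered at "c".
path = {i: [] for i in range(7)}
for i in range(6):
    path[i].append(i + 1)
    path[i + 1].append(i)
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
```

Rooting each subtree at its median and recursing halves the subtree size at every level, which is where the $\log_2 n$ depth bound of slide 23 comes from.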
  • 24. Complexity Analysis. Memory: query time $O(1)$, index size $O(n)$, index time $O(n)$. Disk: query time 2 I/Os, index size $O(n \log n)$, index time $O(n \log n)$. These bounds are for processing queries $Q = (u, v, \le y)$ or $Q = (u, v, \ge x)$. It can be easily extended to process $Q = (u, v, [x, y])$.
  • 25. Experiments. Network statistics: Facebook New Orleans, |V| = 63,731, |E| = 440,384; USARN, |V| = 23,947,347, |E| = 29,166,672.
  • 26. Experiment Settings. 2.67 GHz CPU, 12 GB memory, 10,000 test queries. Memory-based methods: BFS/DFS on graph, MST-Index, Edge-Index. Disk-based methods: External BFS/DFS on graph, External MST, Balanced Tree Index.
  • 27. Memory-based Algorithms: Query Time, in microseconds ($10^{-6}$ seconds). Facebook: DFS 1,098; BFS 1,429; MST-Index 1; Edge-Index 1. USARN: DFS 32,462; BFS 30,868; MST-Index 1,382; Edge-Index 4.
  • 28. Memory-based Algorithms: Index Size, in GB. Facebook: DFS 0.01; BFS 0.01; MST-Index 0.0008; Edge-Index 0.0025. USARN: DFS 0.89; BFS 0.89; MST-Index 0.28; Edge-Index 0.95.
  • 29. Memory-based Algorithms: Index Time, in seconds. Facebook: DFS 0.4; BFS 0.4; MST-Index 0.03; Edge-Index 0.06. USARN: DFS 33.7; BFS 33.7; MST-Index 9.9; Edge-Index 39.2.
  • 30. Disk-based Algorithms: Query Time, in microseconds ($10^{-6}$ seconds). Facebook: Ext-DFS 31,368; Ext-BFS 48,152; Ext-MST 772; Balanced-Index 11. USARN: Ext-DFS 294,521; Ext-BFS 64,471; Ext-MST 422,810; Balanced-Index 18.
  • 31. Disk-based Algorithms: Index Size, in GB. Facebook: Ext-DFS 0.01; Ext-BFS 0.01; Ext-MST 0.0008; Balanced-Index 0.0035. USARN: Ext-DFS 0.89; Ext-BFS 0.89; Ext-MST 0.28; Balanced-Index 0.52.
  • 32. Disk-based Algorithms: Index Time, in seconds. Facebook: Ext-DFS 0.6; Ext-BFS 0.6; Ext-MST 0.048; Balanced-Index 0.146. USARN: Ext-DFS 48.8; Ext-BFS 48.8; Ext-MST 12.2; Balanced-Index 118.8.
  • 33. Summary and Contribution. The first study on the WCR query: Computing Weight Constraint Reachability in Large Networks. The VLDB Journal, 22(3):275-294, 2013. We design two novel and efficient solutions. Memory: edge index tree for $O(1)$ query time. Disk: balanced tree + vertex coding for a query cost of 2 I/Os.
  • 34. K-Hop Reachability (K-Reach). Input: an unweighted directed graph. Query: can node u reach node v via a path of length no more than k? Examples: $Q: a \to_3 f$, A: Yes! $Q: a \to_3 g$, A: No!
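For reference, the k-hop query of slide 34 can be answered without any index by a depth-limited BFS. The Python sketch below is such a baseline, not the K-Reach index itself; the function name `k_hop_reachable` and the chain graph are assumptions made for this illustration.

```python
from collections import deque

def k_hop_reachable(adj, u, v, k):
    """True if v can be reached from u along at most k directed edges."""
    if u == v:
        return True
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == k:          # do not expand beyond k hops
            continue
        for nb in adj.get(x, ()):
            if nb not in dist:
                if nb == v:
                    return True
                dist[nb] = dist[x] + 1
                queue.append(nb)
    return False

# Hypothetical directed chain a -> b -> c -> d -> e.
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
```

On the chain, d is within 3 hops of a but e is not, so the same pair of endpoints can change its answer as k grows.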
  • 35. Applications of K-Reach. In a wireless or sensor network, where a broadcast message may get lost during any hop, the probability of reception degrades exponentially over multiple hops. In social networks, the degree of acquaintance may decrease even super-exponentially (i.e., two persons may hardly know each other if they are just 3 hops apart). K-Reach is helpful since it can model the level and sphere of influence.
  • 36. Vertex Cover. A set of vertices $S \subseteq V$ is a vertex cover of a graph $G = (V, E)$ if, for every edge $(u, v) \in E$, we have $\{u, v\} \cap S \ne \emptyset$. The problem of computing the minimum vertex cover is NP-hard, but there is a polynomial-time algorithm for computing a 2-approximate minimum vertex cover.
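One standard way to obtain a 2-approximate vertex cover is to take both endpoints of every edge in a greedily built maximal matching. The slides do not say which algorithm the authors use, so the Python sketch below is only an illustration of that textbook heuristic; the name `approx_vertex_cover` and the example edge lists are hypothetical.

```python
def approx_vertex_cover(edges):
    """Greedy maximal-matching heuristic: at most twice the minimum cover size."""
    cover = set()
    for u, v in edges:
        # An edge with no endpoint in the cover joins the matching;
        # both of its endpoints are added to the cover.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

def is_vertex_cover(edges, cover):
    """Check the definition: every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)

# Hypothetical inputs: a path 0-1-2-3 and a star centered at 0.
path_edges = [(0, 1), (1, 2), (2, 3)]
star_edges = [(0, 1), (0, 2), (0, 3)]
path_cover = approx_vertex_cover(path_edges)
star_cover = approx_vertex_cover(star_edges)
```

On the path the heuristic returns all four vertices while the optimum is {1, 2}, which shows that the factor of 2 can be tight.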
  • 37. A Vertex Cover-based Index (figure: the K-Reach index for k = 3).
  • 38. Query Processing. Given $Q: u \to_k v$, there are four cases. Case 1: $u \in S, v \in S$. Case 2: $u \in S, v \notin S$. Case 3: $u \notin S, v \in S$. Case 4: $u \notin S, v \notin S$.
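  • 39. Case 1: $u \in S, v \in S$. Let k = 3 and $u = b \in S$. If $v = g \in S$, the example query is $b \to_3 g$; if $v = i \in S$, the example query is $b \to_3 i$.
  • 40. Case 2: $u \in S, v \notin S$. Let k = 3 and $u = d \in S$. If $v = h \notin S$, the example query is $d \to_3 h$; if $v = j \notin S$, the example query is $d \to_3 j$.
  • 41. Case 3: $u \notin S, v \in S$. Let k = 3 and $u = a \notin S$. If $v = d \in S$, the example query is $a \to_3 d$; if $v = g \in S$, the example query is $a \to_3 g$.
  • 42. Case 4: $u \notin S, v \notin S$. Let k = 3 and $u = c \notin S$. If $v = f \notin S$, the example query is $c \to_3 f$; if $v = h \notin S$, the example query is $c \to_3 h$.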
  • 43. Complexity Analysis. Index construction: 2-approximate minimum vertex cover, $O(m + n)$; K-Reach index, $O(\sum_{u \in S} |G_k(u)|)$. Query processing: Case 1, $O(\log \deg_{out}(u, I))$; Case 2, $O(\deg_{out}(u, I) + \deg_{in}(v, G))$; Case 3, $O(\deg_{out}(u, G) + \deg_{in}(v, I))$; Case 4, $O(\sum_{w \in outNei(u, G)} (\deg_{out}(w, I) + \deg_{in}(v, G)))$.
  • 44. Experiments. For processing k-hop reachability queries, and for processing classic reachability queries (setting k = n).
  • 48. Classic Reachability: Query Processing Time
  • 52. Summary and Contribution. The first study on the K-Reach query: K-Reach: Who is in Your Small World. Proceedings of the VLDB Endowment, 5(11):1292-1303, 2012. An efficient vertex cover-based index can answer both classic reachability and k-hop reachability queries.
  • 53. Conclusions. We study two graph reachability queries, WCR and K-Reach, when realistic constraints are imposed. This makes the answers to the queries more meaningful and practically useful in many applications. We exploit the nice property for each query type and design efficient indices for processing these two types of queries.
  • 54. Joint work with (in alphabetical order): Lijun Chang, James Cheng, Miao Qiao, Lu Qin, Zechao Shang, Haixun Wang, Jeffrey Xu Yu, Philip S. Yu.
  • 55. References [1] Jeffrey Xu Yu, Jiefeng Cheng: Graph Reachability Queries: A Survey. Managing and Mining Graph Data 2010: 181-215 [2] Miao Qiao, Hong Cheng, Lu Qin, Jeffrey Xu Yu, Philip S. Yu, Lijun Chang: Computing weight constraint reachability in large networks. VLDB J. 22(3): 275-294 (2013) [3] James Cheng, Zechao Shang, Hong Cheng, Haixun Wang, Jeffrey Xu Yu: K-Reach: Who is in Your Small World. PVLDB 5(11): 1292-1303 (2012)