PARALLEL DATALOG REASONING IN RDFOX 
Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks, Dan Olteanu 
University of Oxford 
September 25, 2014
TABLE OF CONTENTS 
1 INTRODUCTION 
2 OUR APPROACH TO PARALLEL MATERIALISATION 
3 EVALUATION 
4 CONCLUSION 
Motik et al. Parallel Datalog Reasoning in RDFox 0/9
Introduction 
CURRENT TRENDS IN KNOWLEDGE/DATABASE SYSTEMS 
Materialisation is computationally intensive ⇒ natural to parallelise
mid-range laptops have 4 cores, servers with 16 cores are routine 
Price of RAM keeps falling 
128 GB is routine, systems with 1 TB are emerging 
in-memory databases: SAP’s HANA, Oracle’s TimesTen, YarcData’s Urika 
RDFOX SUMMARY 
RDFox: a new RDF store and reasoner 
http://www.cs.ox.ac.uk/isg/tools/RDFox/ 
store data in RAM 
effectively use modern multi-core architectures 
Core focus: materialisation reasoning with (recursive) datalog rules
e.g., ⟨x, rdf:type, y⟩ ∧ ⟨y, rdfs:subClassOf, z⟩ → ⟨x, rdf:type, z⟩
handle OWL 2 RL and the Semantic Web Rule Language (SWRL) 
explicitly store all implied triples so that queries can be evaluated directly 
used in a number of systems: Oracle’s database, OWLIM, WebPIE 
Used in projects for reasoning beyond OWL 2 RL 
‘combined approach’ for reasoning with OWL 2 EL 
PAGOdA: a reasoner for OWL 2 DL combining RDFox with HermiT
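To make "explicitly store all implied triples" concrete, here is a minimal sketch that materialises the rdfs:subClassOf rule from this slide by applying it until nothing new appears. It is a naive fixpoint written for illustration, not RDFox's algorithm, and the `ex:` triples and the function name are made-up assumptions.

```python
def materialise_subclass(triples):
    """Naive fixpoint for the rule
    <x, rdf:type, y> AND <y, rdfs:subClassOf, z> -> <x, rdf:type, z>.
    (Illustrative only; not how RDFox evaluates rules.)"""
    store = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot both body atoms so the store can be extended safely.
        types = [(s, o) for (s, p, o) in store if p == "rdf:type"]
        subclass = [(s, o) for (s, p, o) in store if p == "rdfs:subClassOf"]
        for x, y in types:
            for y2, z in subclass:
                if y == y2 and (x, "rdf:type", z) not in store:
                    store.add((x, "rdf:type", z))
                    changed = True
    return store


# Hypothetical example data.
data = {
    ("ex:felix", "rdf:type", "ex:Cat"),
    ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
}
closure = materialise_subclass(data)
```

After materialisation, a query for all instances of `ex:Animal` is a plain lookup over `closure`, with no reasoning at query time.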
Our Approach to Parallel Materialisation 
EXISTING APPROACHES TO PARALLEL MATERIALISATION 
Interquery parallelism: run independent rules in parallel 
degree of parallelism limited by the number of independent rules 
⇒ does not distribute workload to cores evenly
Intraquery parallelism: partition rule instantiations to N threads
e.g., constrain the body of rules evaluated by thread i to (x mod N = i)
⇒ static partitioning may not distribute workload well due to data skew
⇒ dynamic partitioning may incur an overhead due to load balancing
Goal: distribute workload to threads evenly and with minimum overhead 
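The data-skew problem of static partitioning can be seen in a few lines. This sketch assumes constants are encoded as integers and that each fact R(x, y) is assigned to thread x mod N; the data set with one heavily used subject is invented for illustration.

```python
from collections import Counter


def static_partition(facts, n_threads):
    """Count how many facts R(x, y) each thread receives when
    thread i handles exactly the facts with x mod N = i."""
    load = Counter()
    for x, _y in facts:
        load[x % n_threads] += 1
    return [load[i] for i in range(n_threads)]


# Hypothetical skewed data: subject 0 occurs in 97 facts, the others in one each.
facts = [(0, y) for y in range(97)] + [(1, 0), (2, 0), (3, 0)]
loads = static_partition(facts, 4)
print(loads)  # [97, 1, 1, 1]
```

One thread ends up with almost all the work, which is the imbalance the slide attributes to static partitioning.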
SOLUTION PART I: ALGORITHM
Rule: A(x) ∧ R(x, y) → A(y)
Initial table, in order: R(a,b), R(a,c), R(b,d), R(b,e), A(a), R(c,f), R(c,g)
For each fact:
match the fact to all body atoms to obtain subqueries
evaluate subqueries w.r.t. all previous facts
add results to the table
Trace (⇒ marks the current fact):
⇒ R(a,b)   current subquery: A(a)     no new facts
⇒ R(a,c)   current subquery: A(a)     no new facts
⇒ R(b,d)   current subquery: A(b)     no new facts
⇒ R(b,e)   current subquery: A(b)     no new facts
⇒ A(a)     current subquery: R(a,y)   adds A(b), A(c)
⇒ R(c,f)   current subquery: A(c)     no new facts
⇒ R(c,g)   current subquery: A(c)     no new facts
⇒ A(b)     current subquery: R(b,y)   adds A(d), A(e)
⇒ A(c)     current subquery: R(c,y)   adds A(f), A(g)
⇒ A(d)     current subquery: R(d,y)   no new facts
⇒ A(e)     current subquery: R(e,y)   no new facts
⇒ A(f)     current subquery: R(f,y)   no new facts
⇒ A(g)     current subquery: R(g,y)   no new facts
Final table: R(a,b), R(a,c), R(b,d), R(b,e), A(a), R(c,f), R(c,g), A(b), A(c), A(d), A(e), A(f), A(g)
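The walkthrough on this slide can be reproduced with a short sequential sketch. It hard-codes the single rule A(x) ∧ R(x, y) → A(y), represents facts as tuples (an assumption made here), and scans a plain list where a real system would use indexes; it illustrates the "match, evaluate over previous facts, append" loop and is not RDFox's implementation.

```python
def materialise(facts):
    """Sequential sketch for the rule A(x) AND R(x, y) -> A(y).

    Facts are tuples: ("A", x) or ("R", x, y). Each fact is matched to both
    body atoms; the resulting subquery is evaluated only over facts that
    precede the current one in the table."""
    table = list(facts)
    seen = set(table)
    i = 0
    while i < len(table):
        fact = table[i]
        previous = table[:i]
        derived = []
        if fact[0] == "A":
            # Fact matches A(x): the subquery is R(x, y) over earlier facts.
            x = fact[1]
            derived = [("A", f[2]) for f in previous
                       if f[0] == "R" and f[1] == x]
        else:
            # Fact matches R(x, y): the subquery is A(x) over earlier facts.
            _, x, y = fact
            if ("A", x) in previous:
                derived = [("A", y)]
        for d in derived:
            if d not in seen:  # add results to the table, without duplicates
                seen.add(d)
                table.append(d)
        i += 1
    return table


initial = [("R", "a", "b"), ("R", "a", "c"), ("R", "b", "d"), ("R", "b", "e"),
           ("A", "a"), ("R", "c", "f"), ("R", "c", "g")]
result = materialise(initial)
```

Restricting each subquery to earlier facts means a rule instance is found when the later of its two body facts is processed, so each instance is considered once.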
PARALLELISING COMPUTATION
Each thread extracts facts and evaluates subqueries independently
The number of subqueries is determined by the number of facts
ensures in practice that threads are equally loaded
Requires no thread synchronisation
⇒ We partition rule instances dynamically and with little overhead
EXAMPLE WHEN PARALLELISATION FAILS
A(x) ∧ R(x, y) → A(y)
Initial facts: R(a,b), R(b,c), R(c,d), R(d,e), A(a)
Derived one at a time, each from the previous one: A(b), A(c), A(d), A(e)
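Why the chain example defeats parallelisation can be shown by counting how many new A-facts become available at each step. The sketch below applies the rule round by round, which is a simplification chosen for illustration rather than the per-fact scheme of the slides; the function name and the tuple encoding of R-facts are assumptions.

```python
def derivation_rounds(edges, start):
    """Apply A(x) AND R(x, y) -> A(y) round by round, starting from A(start).

    Returns the number of new A-facts produced in each round. A round that
    yields a single fact leaves at most one thread with useful work."""
    known = {start}
    frontier = {start}
    sizes = []
    while frontier:
        new = {y for (x, y) in edges if x in frontier} - known
        if new:
            sizes.append(len(new))
        known |= new
        frontier = new
    return sizes


# R-facts of the failing example (a chain) and of the algorithm example (a tree).
chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
tree = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f"), ("c", "g")]

print(derivation_rounds(chain, "a"))  # [1, 1, 1, 1]
print(derivation_rounds(tree, "a"))   # [2, 4]
```

In the chain every round produces exactly one fact, so extra threads have nothing to do; in the tree the number of available facts grows with each round.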
SOLUTION PART II: INDEXING RDF DATA IN MAIN MEMORY 
The algorithm critically depends on:
matching atoms ⟨t1, t2, t3⟩ with each ti a constant or variable
continuous concurrent updates
Our RDF storage data structure:
hash-based indexes ⇒ naturally parallel data structure
‘mostly’ lock-free: at least one thread makes progress most of the time
compare: if a thread acquires a lock and dies, other threads are blocked
main benefit: performance is less susceptible to scheduling decisions 
Main technical challenge: reduce thread interference 
when A writes to a location cached by B, the cache of B is invalidated 
our updates ensure that threads (typically) write to different locations 
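As a rough picture of hash-based matching of atoms ⟨t1, t2, t3⟩, the sketch below keeps one hash index per combination of bound positions and answers a pattern with a single lookup. It is single-threaded and makes no attempt at lock-freedom or cache-friendly layout; the class, the choice of six indexes and the use of `None` for variables are assumptions for illustration, not RDFox's actual data structure.

```python
from collections import defaultdict


class TripleIndex:
    """Hash indexes over triples, one per combination of bound positions."""

    def __init__(self):
        self.triples = set()
        # Key: tuple of bound positions, e.g. (0,) when only t1 is a constant.
        self.indexes = {
            positions: defaultdict(list)
            for positions in [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
        }

    def add(self, triple):
        """Insert a triple; return False if it was already present."""
        if triple in self.triples:
            return False
        self.triples.add(triple)
        for positions, index in self.indexes.items():
            index[tuple(triple[p] for p in positions)].append(triple)
        return True

    def match(self, pattern):
        """Return all triples matching a 3-tuple pattern (None = variable)."""
        bound = tuple(i for i, t in enumerate(pattern) if t is not None)
        if len(bound) == 3:
            return [pattern] if pattern in self.triples else []
        if not bound:
            return list(self.triples)
        key = tuple(pattern[p] for p in bound)
        return list(self.indexes[bound].get(key, []))


idx = TripleIndex()
idx.add(("a", "R", "b"))
idx.add(("a", "R", "c"))
idx.add(("b", "R", "d"))
```

For example, `idx.match(("a", "R", None))` returns the two triples with subject `a`, found through the index on positions (0, 1).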
Evaluation 
CONCURRENCY OVERHEAD AND PARALLELISATION SPEEDUP 
RDFox: an RDF store developed at Oxford University 
http://www.cs.ox.ac.uk/isg/tools/RDFox/ 
[Figure: two plots of speedup (axis ticks 2 to 20) against number of threads (axis ticks 8 to 32). Left plot: ClarosL, ClarosLE, DBpediaL, DBpediaLE, LUBML01K, LUBMU01K. Right plot: UOBML01K, UOBMU010, LUBMLE 01K, LUBML05K, LUBMLE 05K, LUBMU05K.]
Small concurrency overhead; parallelisation pays off already with two threads 
Speedup continues to increase after we exhaust all physical cores 
⇒ hyperthreading and parallelism can compensate for CPU cache misses
COMPARISON WITH THE STATE OF THE ART
DBRDF: an implementation using RDBMS and known techniques
PostgreSQL (row store) or MonetDB (column store) as underlying engine
vertical partitioning (VP) or triple table (TT) storage model
OWLIM-Lite: a commercial RDF store by OntoText
[Figure 2: Results of our Empirical Evaluation. Table (a): speedup by number of threads used, sequential and parallel import times with the slowdown of the parallel version, and the number of triples, memory use and ‘Active’ percentage per dataset. Table (b): Comparison of SysX with DBRDF and OWLIM-Lite.]
Table (b) in Figure 2 compares SysX with DBRDF on PostgreSQL (PG) and MonetDB (MDB) with VP or TT scheme, and with OWLIM-Lite. Columns T show the times in seconds, and columns B/t show the number of bytes per triple. Import in DBRDF is about 20 times slower than in SysX.
Maximum speedup estimated as s(∞) = 1/(1 − p(∞)); assuming that p(∞) = p(32), we get s(∞) between 8.3 and 50. With 32 threads we thus already approach this maximum, which explains the flattening of the curves in Figure 2.
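The maximum-speedup estimate quoted on this slide, s(∞) = 1/(1 − p(∞)), has the form of Amdahl's law. The sketch below assumes that p is the parallelisable fraction of the work and uses the standard finite-thread form of the law, which the slide itself does not state; the function names are chosen here for illustration.

```python
def max_speedup(p):
    """s(inf) = 1 / (1 - p): speedup limit with unboundedly many threads."""
    return 1.0 / (1.0 - p)


def speedup(p, n_threads):
    """Amdahl's law for n threads (assumed form, not given on the slide)."""
    return 1.0 / ((1.0 - p) + p / n_threads)


def parallel_fraction(s_max):
    """Invert s(inf) = 1 / (1 - p) to recover p from a maximum speedup."""
    return 1.0 - 1.0 / s_max


# The slide's bounds on s(inf) correspond to these parallel fractions.
p_low = parallel_fraction(8.3)
p_high = parallel_fraction(50.0)
s32_low = speedup(p_low, 32)
s32_high = speedup(p_high, 32)
```

Under these assumptions p lies between roughly 0.88 and 0.98, and the predicted speedup at 32 threads is already a large share of the limit, which is consistent with the curves flattening.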
Conclusion 
RESEARCH DIRECTIONS 
Under development/already finished: 
Deal with owl:sameAs via rewriting 
Incremental deletions/additions 
Add a data/query/reasoning distribution layer 
Future work: 
Investigate potential for data compression 
Improve join cardinality estimation 
Improve query planning 
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Parallel Datalog Reasoning in RDFox Presentation

  • 1. PARALLEL DATALOG REASONING IN RDFOX Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks, Dan Olteanu University of Oxford September 25, 2014
  • 2. TABLE OF CONTENTS 1 INTRODUCTION 2 OUR APPROACH TO PARALLEL MATERIALISATION 3 EVALUATION 4 CONCLUSION Motik et al. Parallel Datalog Reasoning in RDFox 0/9
  • 3. Introduction TABLE OF CONTENTS 1 INTRODUCTION 2 OUR APPROACH TO PARALLEL MATERIALISATION 3 EVALUATION 4 CONCLUSION
  • 4. Introduction: CURRENT TRENDS IN KNOWLEDGE/DATABASE SYSTEMS. Materialisation is computationally intensive ⇒ natural to parallelise: mid-range laptops have 4 cores, servers with 16 cores are routine. Price of RAM keeps falling: 128 GB is routine, systems with 1 TB are emerging; in-memory databases: SAP’s HANA, Oracle’s TimesTen, YarcData’s Urika.
  • 5. Introduction: RDFOX SUMMARY. RDFox: a new RDF store and reasoner (http://www.cs.ox.ac.uk/isg/tools/RDFox/): store data in RAM; effectively use modern multi-core architectures. Core focus: materialisation reasoning with (recursive) datalog rules, e.g., ⟨x, rdf:type, y⟩ ∧ ⟨y, rdfs:subClassOf, z⟩ → ⟨x, rdf:type, z⟩; handle OWL 2 RL and the Semantic Web Rule Language (SWRL); explicitly store all implied triples so that queries can be evaluated directly; used in a number of systems: Oracle’s database, OWLIM, WebPIE. Used in projects for reasoning beyond OWL 2 RL: ‘combined approach’ for reasoning with OWL 2 EL; PAGOdA: a reasoner for OWL 2 DL combining RDFox with HermiT.
  • 6. Our Approach to Parallel Materialisation TABLE OF CONTENTS 1 INTRODUCTION 2 OUR APPROACH TO PARALLEL MATERIALISATION 3 EVALUATION 4 CONCLUSION
  • 7. Our Approach to Parallel Materialisation: EXISTING APPROACHES TO PARALLEL MATERIALISATION. Interquery parallelism: run independent rules in parallel; degree of parallelism limited by the number of independent rules ⇒ does not distribute workload to cores evenly. Intraquery parallelism: partition rule instantiations to N threads, e.g., constrain the body of rules evaluated by thread i to (x mod N = i) ⇒ static partitioning may not distribute workload well due to data skew ⇒ dynamic partitioning may incur an overhead due to load balancing. Goal: distribute workload to threads evenly and with minimum overhead.
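The data-skew problem of static partitioning mentioned on slide 7 can be illustrated with a small sketch. This is not RDFox code: the function name `static_partition_load`, the use of integer IDs for x, and the toy data are assumptions made purely for illustration.

```python
def static_partition_load(facts, n_threads):
    """Count how many R(x, y) facts each thread receives when thread i
    is statically assigned the facts with x mod N = i (x is an integer ID)."""
    load = [0] * n_threads
    for x, _y in facts:
        load[x % n_threads] += 1
    return load


# Hypothetical skewed data: resource 0 has 97 outgoing edges,
# resources 1, 2 and 3 have one outgoing edge each.
skewed = [(0, y) for y in range(97)] + [(1, 0), (2, 0), (3, 0)]
print(static_partition_load(skewed, 4))  # [97, 1, 1, 1]
```

With this input one thread receives 97 of the 100 facts, so three of the four threads would sit mostly idle.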
  • 8. Our Approach to Parallel Materialisation: SOLUTION PART I: ALGORITHM. Rule: A(x) ∧ R(x, y) → A(y). Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g). For each fact: match the fact to all body atoms to obtain subqueries; evaluate subqueries w.r.t. all previous facts; add results to the table. Current subquery: (none)
  • 9. SOLUTION PART I: ALGORITHM (⇒ marks the fact being processed). Facts: ⇒ R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g). Current subquery: A(a)
  • 10. SOLUTION PART I: ALGORITHM. Facts: R(a,b) ⇒ R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g). Current subquery: A(a)
  • 11. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) ⇒ R(b,d) R(b,e) A(a) R(c,f) R(c,g). Current subquery: A(b)
  • 12. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) ⇒ R(b,e) A(a) R(c,f) R(c,g). Current subquery: A(b)
  • 13. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) ⇒ A(a) R(c,f) R(c,g) A(b) A(c). Current subquery: R(a,y)
  • 14. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) ⇒ R(c,f) R(c,g) A(b) A(c). Current subquery: A(c)
  • 15. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) ⇒ R(c,g) A(b) A(c). Current subquery: A(c)
  • 16. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g) ⇒ A(b) A(c) A(d) A(e). Current subquery: R(b,y)
  • 17. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g) A(b) ⇒ A(c) A(d) A(e) A(f) A(g). Current subquery: R(c,y)
  • 18. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g) A(b) A(c) ⇒ A(d) A(e) A(f) A(g). Current subquery: R(d,y)
  • 19. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g) A(b) A(c) A(d) ⇒ A(e) A(f) A(g). Current subquery: R(e,y)
  • 20. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g) A(b) A(c) A(d) A(e) ⇒ A(f) A(g). Current subquery: R(f,y)
  • 21. SOLUTION PART I: ALGORITHM. Facts: R(a,b) R(a,c) R(b,d) R(b,e) A(a) R(c,f) R(c,g) A(b) A(c) A(d) A(e) A(f) ⇒ A(g). Current subquery: R(g,y)
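The walkthrough on slides 8 to 21 can be sketched as a short single-threaded program. This is a minimal illustration for the one rule A(x) ∧ R(x, y) → A(y), not the RDFox implementation; the function name `materialise`, the tuple encoding of facts, and the two helper indexes are assumptions.

```python
from collections import defaultdict


def materialise(facts):
    """Fact-at-a-time materialisation for the rule A(x) AND R(x, y) -> A(y).

    facts: list of tuples ("A", x) or ("R", x, y) in table order.
    Returns the table with all derived facts appended.
    """
    table = list(facts)
    in_table = set(table)        # used to avoid adding duplicates
    seen_a = set()               # x for every already processed A(x)
    seen_r = defaultdict(list)   # x -> [y, ...] for every processed R(x, y)
    pos = 0
    while pos < len(table):
        fact = table[pos]
        derived = []
        if fact[0] == "R":
            _, x, y = fact
            # Fact matched to body atom R(x, y): subquery A(x) over previous facts.
            if x in seen_a:
                derived.append(("A", y))
            seen_r[x].append(y)
        else:
            _, x = fact
            # Fact matched to body atom A(x): subquery R(x, y) over previous facts.
            derived.extend(("A", y) for y in seen_r[x])
            seen_a.add(x)
        for new_fact in derived:
            if new_fact not in in_table:
                in_table.add(new_fact)
                table.append(new_fact)   # derived facts are processed later
        pos += 1
    return table


initial = [("R", "a", "b"), ("R", "a", "c"), ("R", "b", "d"), ("R", "b", "e"),
           ("A", "a"), ("R", "c", "f"), ("R", "c", "g")]
result = materialise(initial)
print(result[len(initial):])
```

Because each subquery only looks at facts processed earlier, every rule instance is found exactly once: when the later of its two body facts is reached. On the slide data the derived facts are appended in the order A(b), A(c), A(d), A(e), A(f), A(g), as in the animation.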
  • 22. Our Approach to Parallel Materialisation: PARALLELISING COMPUTATION. Each thread extracts facts and evaluates subqueries independently. The number of subqueries is determined by the number of facts; this ensures in practice that threads are equally loaded. Requires no thread synchronisation ⇒ we partition rule instances dynamically and with little overhead. EXAMPLE WHEN PARALLELISATION FAILS. Rule: A(x) ∧ R(x, y) → A(y). Facts: R(a,b), R(b,c), R(c,d), R(d,e), A(a)
  • 23. EXAMPLE WHEN PARALLELISATION FAILS. Facts: R(a,b), R(b,c), R(c,d), R(d,e), A(a), A(b)
  • 24. EXAMPLE WHEN PARALLELISATION FAILS. Facts: R(a,b), R(b,c), R(c,d), R(d,e), A(a), A(b), A(c)
  • 25. EXAMPLE WHEN PARALLELISATION FAILS. Facts: R(a,b), R(b,c), R(c,d), R(d,e), A(a), A(b), A(c), A(d)
  • 26. EXAMPLE WHEN PARALLELISATION FAILS. Facts: R(a,b), R(b,c), R(c,d), R(d,e), A(a), A(b), A(c), A(d), A(e)
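Why the chain example defeats parallelisation can be seen by counting how many new facts become available at each step. The sketch below is a round-based simulation, assumed here only for illustration (it is not how RDFox schedules work); the function name `derivation_rounds` and the pair encoding of R facts are hypothetical.

```python
def derivation_rounds(r_facts, a_facts):
    """Apply A(x) AND R(x, y) -> A(y) in rounds.

    r_facts: list of (x, y) pairs for R(x, y); a_facts: list of x for A(x).
    Returns the list of new A facts derived in each round; facts derived in
    the same round do not depend on each other.
    """
    known = set(a_facts)
    frontier = set(a_facts)
    rounds = []
    while frontier:
        new = {y for (x, y) in r_facts if x in frontier} - known
        if new:
            rounds.append(sorted(new))
        known |= new
        frontier = new
    return rounds


chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
tree = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f"), ("c", "g")]
print(derivation_rounds(chain, ["a"]))  # [['b'], ['c'], ['d'], ['e']]
print(derivation_rounds(tree, ["a"]))   # [['b', 'c'], ['d', 'e', 'f', 'g']]
```

In the chain, each round yields a single fact that depends on the previous one, so additional threads have nothing to work on; in the tree-shaped data from the algorithm walkthrough, several facts become available at once.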
  • 27. Our Approach to Parallel Materialisation: SOLUTION PART II: INDEXING RDF DATA IN MAIN MEMORY. The algorithm critically depends on: matching atoms ⟨t1, t2, t3⟩ with each ti a constant or variable; continuous concurrent updates. Our RDF storage data structure: hash-based indexes ⇒ naturally parallel data structure; ‘mostly’ lock-free: at least one thread makes progress most of the time (compare: if a thread acquires a lock and dies, other threads are blocked); main benefit: performance is less susceptible to scheduling decisions. Main technical challenge: reduce thread interference: when A writes to a location cached by B, the cache of B is invalidated; our updates ensure that threads (typically) write to different locations.
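The first requirement on slide 27, matching atoms ⟨t1, t2, t3⟩ where each position is a constant or a variable, can be sketched with plain hash tables. This is a deliberately naive, single-threaded illustration: the class name `TripleIndex`, the use of `None` for a variable, and the choice to register each triple under all eight binding patterns are assumptions, and the sketch does not attempt to reproduce the lock-free behaviour or the actual index layout described in the talk.

```python
from collections import defaultdict
from itertools import product


class TripleIndex:
    """Hash-based lookup of triples by any combination of bound positions."""

    def __init__(self):
        # key: (t1, t2, t3) with None in variable positions -> matching triples
        self._index = defaultdict(list)

    def add(self, triple):
        """Insert a triple; return False if it was already present."""
        if self._index.get(tuple(triple)):
            return False
        triple = tuple(triple)
        # Register the triple under each of the 2^3 binding patterns.
        for mask in product((True, False), repeat=3):
            key = tuple(t if bound else None for t, bound in zip(triple, mask))
            self._index[key].append(triple)
        return True

    def match(self, t1=None, t2=None, t3=None):
        """Return all triples matching the atom; None stands for a variable."""
        return list(self._index.get((t1, t2, t3), []))


idx = TripleIndex()
idx.add(("a", "R", "b"))
idx.add(("a", "R", "c"))
idx.add(("b", "R", "d"))
print(idx.match("a", "R", None))   # triples with subject a and predicate R
print(idx.match(None, None, "d"))  # triples with object d
```

Storing every triple eight times trades memory for simplicity; it is used here only to keep each lookup a single hash probe.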
  • 28. Evaluation TABLE OF CONTENTS 1 INTRODUCTION 2 OUR APPROACH TO PARALLEL MATERIALISATION 3 EVALUATION 4 CONCLUSION
  • 29. Evaluation: CONCURRENCY OVERHEAD AND PARALLELISATION SPEEDUP. RDFox: an RDF store developed at Oxford University (http://www.cs.ox.ac.uk/isg/tools/RDFox/). [Figure: two speedup plots, axis ticks 8 to 32 and 2 to 20; first panel: ClarosL, ClarosLE, DBpediaL, DBpediaLE, LUBML 01K, LUBMU 01K; second panel: UOBML 01K, UOBMU 010, LUBMLE 01K, LUBML 05K, LUBMLE 05K, LUBMU 05K.] Small concurrency overhead; parallelisation pays off already with two threads. Speedup continues to increase after we exhaust all physical cores ⇒ hyperthreading and parallelism can compensate for CPU cache misses.
  • 30. Evaluation: COMPARISON WITH THE STATE OF THE ART. DBRDF: an implementation using RDBMS and known techniques; PostgreSQL (row store) or MonetDB (column store) as underlying engine; vertical partitioning (VP) or triple table (TT) storage model. OWLIM-Lite: a commercial RDF store by OntoText.
    [Figure 2: Results of our Empirical Evaluation (excerpt from the paper, partly covered by the slide text).
    Table (a), last visible row of the upper part: 32 | 127 19.5 | 285 17.5 | 24 6.6 | 602 15.1 | 7 10.1 | 71 13.4 | 10 13.4 | Mem | Mem | Mem | 168 16.3 | 31 17.1
    Table (a), middle part, values in column order:
    Seq. imp.: 61, 57, 331, 335, 421, 408, 410, 2610, 2587, 2643, 6, 798
    Par. imp.: 58 (-4.9%), 58 (1.7%), 334 (0.9%), 367 (9.5%), 415 (1.4%), 415 (1.7%), 415 (1.2%), 2710 (3.8%), 3553 (37%), 2733 (3.4%), 6 (0.0%), 783 (-1.9%)
    Triples: 95.5M, 555.1M, 118.3M, 1529.7M, 182.4M, 332.6M, 219.0M, 911.2M, 1661.0M, 1094.0M, 35.6M, 429.6M
    Memory: 4.2GB, 18.0GB, 6.1GB, 51.9GB, 9.3GB, 13.8GB, 11.1GB, 49.0GB, 75.5GB, 53.3GB, 1.1GB, 20.2GB
    Active: 93.3%, 98.8%, 28.1%, 94.5%, 66.5%, 90.3%, 85.3%, 66.5%, 90.3%, 85.3%, 99.7%, 99.1%
    Table (b): Comparison of SysX with DBRDF and OWLIM-Lite. Columns T and B/t for Import and Materialise under SysX, PG-VP and MDB-VP, for Import under PG-TT and MDB-TT, and for Import and Materialise under OWLIM-Lite; rows ClarosL, ClarosLE, DBpediaL, DBpediaLE, LUBML 01K, LUBMLE 01K, LUBMU 01K, UOBMU 010, UOBML 01K; some cells read Mem or Time instead of a number.]
    Visible text from the paper: ... averages over three runs. OWLIM-Lite ... materialisation and import, so we loaded each ... the test datalog program and once with ... subtracted the two times; Ontotext confirmed ... good materialisation time estimate. Table (a) in Figure 2 shows the speedup ... number of used threads. The middle part ... sequential and parallel import times, ... slowdown for the parallel version. The ... shows the number of triples, the memory ... speedup as s(∞) = 1/(1 − p(∞)); assuming that p(∞) = p(32), we get s(∞) between 8.3 and 50. With 32 threads we thus already approach this maximum, which explains the flattening of the curves in Figure 2. Table (b) in Figure 2 compares SysX with DBRDF on PostgreSQL (PG) and MonetDB (MDB) with VP or TT scheme, and with OWLIM-Lite. Columns T show the times in seconds, and columns B/t show the number of bytes per triple. Import in DBRDF is about 20 times slower than in SysX; we found out that half of the import time is used in ...
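The maximum-speedup bound of the form 1/(1 − p) quoted in the paper excerpt has the shape of Amdahl's law, where p is the fraction of the work that runs in parallel. The sketch below, which assumes that reading of p, shows what the quoted range of 8.3 to 50 implies; the function names are hypothetical.

```python
def max_speedup(p):
    """Upper bound on speedup when a fraction p of the work is parallel."""
    return 1.0 / (1.0 - p)


def parallel_fraction(s_max):
    """Invert the bound: the parallel fraction p that gives speedup s_max."""
    return 1.0 - 1.0 / s_max


for s in (8.3, 50):
    p = parallel_fraction(s)
    print(f"maximum speedup {s}: parallel fraction about {p:.3f}")
```

Under this reading, a bound of 8.3 corresponds to roughly 88% of the work being parallel and a bound of 50 to 98%, so even a small sequential share caps the achievable speedup.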
  • 31. Conclusion TABLE OF CONTENTS 1 INTRODUCTION 2 OUR APPROACH TO PARALLEL MATERIALISATION 3 EVALUATION 4 CONCLUSION
  • 32. Conclusion: RESEARCH DIRECTIONS. Under development/already finished: deal with owl:sameAs via rewriting; incremental deletions/additions; add a data/query/reasoning distribution layer. Future work: investigate potential for data compression; improve join cardinality estimation; improve query planning.