4. The Resource Description Framework (RDF)
EDBT 2013 Efficient Query Answering against Dynamic RDF Databases 4 / 35
⊲ graph-based data model
⊲ W3C standard
RDF Graph:
⊲ set of triples s p o, with s ∈ U ∪ B, p ∈ U, o ∈ U ∪ B ∪ L
U – URIs, B – blank nodes, L – literals (constants)
a triple states that the subject s has the property p with value the object o
⊲ built-in property: rdf:type
specifies the classes to which a resource belongs
Constructor          Triple          Relational notation
Class assertion      s rdf:type o    o(s)
Property assertion   s p o           p(s, o)
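A minimal Python sketch of the triple model and its relational notation. The helper names and the convention that literals are double-quoted strings are assumptions of this sketch, and the hasTitle triple is an assumed reading of the running example figure.

```python
RDF_TYPE = "rdf:type"

def is_blank(term):
    """Blank nodes (B) are written with the _: prefix, as in the slides."""
    return term.startswith("_:")

def is_literal(term):
    """Assumption of this sketch: literals (L) are strings wrapped in double quotes."""
    return term.startswith('"') and term.endswith('"')

def is_well_formed(triple):
    """Check s in U ∪ B, p in U, o in U ∪ B ∪ L."""
    s, p, o = triple
    return not is_literal(s) and not is_literal(p) and not is_blank(p)

def relational(triple):
    """Render a triple as o(s) for rdf:type, and as p(s, o) otherwise."""
    s, p, o = triple
    return f"{o}({s})" if p == RDF_TYPE else f"{p}({s}, {o})"

graph = {
    ("book1", RDF_TYPE, "Book"),
    ("book1", "hasTitle", '"Good Omens"'),  # assumed reading of the figure
}

for triple in sorted(graph):
    print(relational(triple))
```

A set of tuples mirrors the definition of an RDF graph as a set of triples; duplicates collapse automatically.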
8. Blank nodes
⊲ feature of RDF
⊲ support unknown URI/literal tokens
Example:
the country of _:b1 is Italy
the city of the same _:b1 is Genoa
the population of Genoa is an unspecified value _:b2
12. Running example
[Figure: RDF graph of the running example. Nodes: book1, “Good Omens”, “Neil Gaiman”, “Terry Pratchett”, Book, Publication, Language, English, _:b0, _:b1, writtenIn, hasLanguage. Edge labels: hasTitle, hasAuthor (×2), rdf:type (×2), translatedTo, writtenIn, rdfs:subClassOf (×2), rdfs:subPropertyOf, rdfs:domain, rdfs:range.]
13. RDF Schema (RDFS)
⊲ feature of RDF
⊲ enhance the descriptions in graphs
⊲ declare semantic constraints between classes and properties
Built-in properties:
⊲ subclass relationships: rdfs:subClassOf
⊲ subproperty relationships: rdfs:subPropertyOf
⊲ typing the first attribute (domain) of a property: rdfs:domain
⊲ typing the second attribute (range) of a property: rdfs:range
Constructor                Triple                    Relational notation
Subclass constraint        s rdfs:subClassOf o       s ⊆ o
Subproperty constraint     s rdfs:subPropertyOf o    s ⊆ o
Domain typing constraint   s rdfs:domain o           Π_domain(s) ⊆ o
Range typing constraint    s rdfs:range o            Π_range(s) ⊆ o
16. Open-world assumption and RDF entailment
The RDF data model is based on the open-world assumption.
→ constraints are deductive: they implicitly propagate tuples
Implicit triples
→ considered part of the graph even though they are not explicitly present
Entailment – reasoning mechanism
from a set of explicit triples and some entailment rules,
derive implicit information
Exhaustive application of entailment rules → saturation (a.k.a. closure)
The saturation of a graph is unique (up to blank node renaming).
Entailment is part of the RDF specification itself.
The semantics of an RDF graph is its saturation.
20. Entailment rules by example
1) Book rdfs:subClassOf Publication, book1 rdf:type Book
   → book1 rdf:type Publication
2) writtenIn rdfs:subPropertyOf hasLanguage, book1 writtenIn English
   → book1 hasLanguage English
3) writtenIn rdfs:domain Book, book1 writtenIn English
   → book1 rdf:type Book
4) writtenIn rdfs:range Language, book1 writtenIn English
   → English rdf:type Language
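The four rules illustrated above can be sketched as a single function that combines one data triple with one schema triple. This is an illustrative sketch; the function name entail_once and the tuple encoding are assumptions, not the authors' implementation.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def entail_once(data_triple, schema_triple):
    """Return the triple entailed by one data triple and one schema triple, or None."""
    s, p, o = data_triple
    a, constraint, b = schema_triple
    if constraint == SUBCLASS and p == TYPE and o == a:
        return (s, TYPE, b)   # rule 1: instances of a class belong to its superclass
    if constraint == SUBPROPERTY and p == a:
        return (s, b, o)      # rule 2: a property assertion also holds for the superproperty
    if constraint == DOMAIN and p == a:
        return (s, TYPE, b)   # rule 3: the subject is typed by the domain
    if constraint == RANGE and p == a:
        return (o, TYPE, b)   # rule 4: the object is typed by the range
    return None

written_in = ("book1", "writtenIn", "English")
print(entail_once(("book1", TYPE, "Book"), ("Book", SUBCLASS, "Publication")))
print(entail_once(written_in, ("writtenIn", SUBPROPERTY, "hasLanguage")))
print(entail_once(written_in, ("writtenIn", DOMAIN, "Book")))
print(entail_once(written_in, ("writtenIn", RANGE, "Language")))
```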
26. Basic Graph Pattern (BGP) Queries
⊲ subset of SPARQL
⊲ BGP – conjunction of triple patterns (or triples)
q(x̄):- t1, ..., tα
ti = si pi oi, with si, pi ∈ U ∪ B ∪ V and oi ∈ U ∪ B ∪ V ∪ L (V – variables)
x̄ ∈ V (distinguished variables)
query evaluation treats blank nodes in a query as
non-distinguished variables
Example:
q(x, y):- x hasAuthor z, x rdf:type y
≡
q(x, y):- x hasAuthor _:b0, x rdf:type y
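The equivalence above can be checked with a small BGP evaluator that treats blank nodes in the query as variables. This is a sketch under stated assumptions: variables are written with a leading "?", the function names are hypothetical, and the hasAuthor triples are an assumed reading of the running example figure.

```python
def is_variable(term):
    """Query variables ("?x") and query blank nodes ("_:b0") both act as variables."""
    return term.startswith("?") or term.startswith("_:")

def match(pattern, triple, binding):
    """Extend binding so that pattern maps onto triple; return None on a clash."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if is_variable(term):
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def evaluate(head, body, graph):
    """Evaluate q(head):- body against the explicit triples of graph."""
    bindings = [{}]
    for pattern in body:
        extended = []
        for binding in bindings:
            for triple in graph:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return {tuple(b[v] for v in head) for b in bindings}

graph = {
    ("book1", "hasAuthor", '"Neil Gaiman"'),      # assumed reading of the figure
    ("book1", "hasAuthor", '"Terry Pratchett"'),  # assumed reading of the figure
    ("book1", "rdf:type", "Book"),
}
with_variable = evaluate(("?x", "?y"), [("?x", "hasAuthor", "?z"), ("?x", "rdf:type", "?y")], graph)
with_blank = evaluate(("?x", "?y"), [("?x", "hasAuthor", "_:b0"), ("?x", "rdf:type", "?y")], graph)
print(with_variable == with_blank)
```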
30. Query answering
Problem:
query evaluation ≠ query answering
evaluating a query uses only the graph’s explicit triples,
which may lead to an incomplete answer set
the (complete) answer set is obtained by evaluating the query
against the graph’s saturation
Solution:
decouple RDF entailment from query evaluation
Perform a pre-processing step to deal with entailed triples:
⊲ on the database – data saturation
⊲ on the queries – query reformulation
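The gap between evaluation and answering can be seen on a two-triple graph. The sketch below is deliberately tiny: it closes the graph under the subclass rule only, and the function names are hypothetical.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def saturate_subclass(graph):
    """Close graph under the subclass rule only (a deliberately small entailment)."""
    result = set(graph)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(result):
            if p != TYPE:
                continue
            for (a, c, b) in list(result):
                if c == SUBCLASS and a == o and (s, TYPE, b) not in result:
                    result.add((s, TYPE, b))
                    changed = True
    return result

def instances_of(cls, graph):
    """Evaluate q(x):- x rdf:type cls against the triples in graph."""
    return {s for (s, p, o) in graph if p == TYPE and o == cls}

graph = {("book1", TYPE, "Book"), ("Book", SUBCLASS, "Publication")}
explicit_only = instances_of("Publication", graph)                # query evaluation
complete = instances_of("Publication", saturate_subclass(graph))  # query answering
print(explicit_only, complete)
```

Evaluation on the explicit triples misses book1, while evaluation on the saturated graph returns it.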
33. Data saturation vs. Query reformulation
Data saturation
Advantages:
⊲ straightforward
⊲ easy to implement
Drawbacks:
⊲ computation time
⊲ additional storage space
⊲ must be recomputed upon database updates
Example:
the YAGO2 dataset doubles in size when computing the RDFS-closure
→ 33M to 64M triples
Query reformulation
Advantages:
⊲ database saturation does not need to be (re)computed
Drawbacks:
⊲ every incoming query must be reformulated
⊲ reformulations can be prohibitively large
⊲ difficult to optimize
Example:
a single-atom query over YAGO2 can yield a union of > 300 000 queries
35. Contributions
1. The database (DB) fragment of RDF
extending previously studied fragments by the support of blank nodes
2. Novel BGP query answering techniques for this DB fragment
designed to work on top of any standard conjunctive query processor
(i) an efficient incremental RDF saturation maintenance algorithm
(ii) a novel reformulation-based query answering algorithm
3. Thorough performance comparison and analysis
39. The database (DB) fragment of RDF
⊲ restricts entailment to RDFS entailment
⊲ does not restrict graphs in any way
An RDF database: db = ⟨D, S⟩
D & S – disjoint sets of triples
D (RDF) – instance level → assertions
S (RDFS) – schema level → semantics
db = ⟨D, S⟩ for the running example:
[Figure: D, the instance graph. Nodes: book1, “Good Omens”, “Neil Gaiman”, “Terry Pratchett”, Book, English, _:b0, Language. Edge labels: hasTitle, hasAuthor (×2), rdf:type (×2), translatedTo, writtenIn.]
[Figure: S, the schema graph. Nodes: Book, Publication, Language, _:b1, writtenIn, hasLanguage. Edge labels: rdfs:subClassOf (×2), rdfs:subPropertyOf, rdfs:domain, rdfs:range.]
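The split of a database into D and S can be sketched by looking at the property of each triple. This is a minimal sketch: the function name split_database is hypothetical, and it assumes that schema triples are exactly those using one of the four RDFS built-in properties listed earlier.

```python
SCHEMA_PROPERTIES = {
    "rdfs:subClassOf",
    "rdfs:subPropertyOf",
    "rdfs:domain",
    "rdfs:range",
}

def split_database(triples):
    """Partition triples into (D, S): instance-level and schema-level sets."""
    schema = {t for t in triples if t[1] in SCHEMA_PROPERTIES}
    data = set(triples) - schema
    return data, schema

triples = {
    ("book1", "rdf:type", "Book"),
    ("book1", "writtenIn", "English"),
    ("Book", "rdfs:subClassOf", "Publication"),
    ("writtenIn", "rdfs:domain", "Book"),
}
D, S = split_database(triples)
print(len(D), len(S))
```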
43. Query reformulation algorithm
Reformulate(q, db)
⊲ fixpoint algorithm (13 reformulation rules)
⊲ reformulates q into a set of queries s.t.
the union of the evaluations of these queries
on db produces the correct answer
[Figure: schema S of the running example. Nodes: Book, Publication, Language, _:b1, writtenIn, hasLanguage. Edge labels: rdfs:subClassOf (×2), rdfs:subPropertyOf, rdfs:domain, rdfs:range.]
Reformulate(q, db) =
q(x, y):- x rdf:type y
∪
q(x, Publication):- x rdf:type Publication
∪
q(x, Publication):- x rdf:type Book
∪
q(x, Publication):- x writtenIn z
∪ . . . ∪
q(x, _:b1):- x rdf:type _:b1
∪ . . .
[Figure: schema S of the running example, together with the data triples English rdf:type Language and book1 rdf:type _:b1.]
q(x, _:b1):- x rdf:type _:b1
≡
q(x, _:b1):- x rdf:type z
Answer set: {⟨book1, _:b1⟩, ⟨English, _:b1⟩}
wrong answer
⊲ Reformulate(q, db) reformulates q into a set of queries s.t.
the union of the non-standard evaluations
of these queries on db produces the correct
answer
With non-standard evaluation, for q(x, _:b1):- x rdf:type _:b1:
Answer set: {⟨book1, _:b1⟩}
correct answer
⊲ size of the output: O((6 ∗ #db²)^#q)
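To give a feel for the fixpoint, here is a heavily simplified sketch that reformulates a single triple pattern by backward-chaining over the schema. It is not the 13-rule algorithm of the talk: it covers only atoms with a constant class or property, the names reformulate_atom and "?fresh" are hypothetical, and blank nodes are not handled.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def reformulate_atom(atom, schema):
    """Return the set of atoms whose union of evaluations answers the input atom."""
    todo, seen = [atom], {atom}
    while todo:
        s, p, o = todo.pop()
        rewritings = []
        for (a, c, b) in schema:
            if p == TYPE and c == SUBCLASS and b == o:
                rewritings.append((s, TYPE, a))      # instances of a subclass
            elif p == TYPE and c == DOMAIN and b == o:
                rewritings.append((s, a, "?fresh"))  # subjects of a property with this domain
            elif p == TYPE and c == RANGE and b == o:
                rewritings.append(("?fresh", a, s))  # objects of a property with this range
            elif c == SUBPROPERTY and b == p:
                rewritings.append((s, a, o))         # assertions on a subproperty
        for rewriting in rewritings:
            if rewriting not in seen:
                seen.add(rewriting)
                todo.append(rewriting)
    return seen

schema = {
    ("Book", SUBCLASS, "Publication"),
    ("writtenIn", SUBPROPERTY, "hasLanguage"),
    ("writtenIn", DOMAIN, "Book"),
    ("writtenIn", RANGE, "Language"),
}
union = reformulate_atom(("?x", TYPE, "Publication"), schema)
for atom in sorted(union):
    print(atom)
```

On this schema the sketch produces the three atoms that appear in the slide's union for Publication: x rdf:type Publication, x rdf:type Book, and x writtenIn z.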
55. Database saturation algorithm
Saturate(db)
⊲ fixpoint algorithm (4 saturation rules)
⊲ explicitly adds to db all its implicit triples
Saturate(db) = db ∪ { book1 rdf:type Publication, book1 rdf:type _:b1, book1 hasLanguage English, English rdf:type Language }
⊲ size of the output: O(#db²)
⊲ computation time: O(#db³)
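A saturation fixpoint over the four instance-level rules can be sketched as a worklist loop. This is an illustrative sketch, not the authors' implementation; the function name saturate is hypothetical, and the example leaves out the class _:b1 because its place in the schema cannot be read from the figure.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def saturate(data, schema):
    """Add to data every triple entailed through schema, until nothing new appears."""
    saturated = set(data)
    frontier = list(data)
    while frontier:
        s, p, o = frontier.pop()
        for (a, c, b) in schema:
            if c == SUBCLASS and p == TYPE and o == a:
                new = (s, TYPE, b)
            elif c == SUBPROPERTY and p == a:
                new = (s, b, o)
            elif c == DOMAIN and p == a:
                new = (s, TYPE, b)
            elif c == RANGE and p == a:
                new = (o, TYPE, b)
            else:
                continue
            if new not in saturated:  # set semantics: each triple is kept once
                saturated.add(new)
                frontier.append(new)
    return saturated

data = {("book1", TYPE, "Book"), ("book1", "writtenIn", "English")}
schema = {
    ("Book", SUBCLASS, "Publication"),
    ("writtenIn", SUBPROPERTY, "hasLanguage"),
    ("writtenIn", DOMAIN, "Book"),
    ("writtenIn", RANGE, "Language"),
}
implicit = saturate(data, schema) - data
for triple in sorted(implicit):
    print(triple)
```

Because a triple is only pushed on the worklist the first time it is derived, the loop terminates even if the schema contains cycles.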
What about updates?
60. Saturation maintenance algorithm
Saturate+(db)
⊲ multiset variant of Saturate(db)
⊲ allows saturation maintenance upon updates
Saturate+(db) = db ∪ { book1 rdf:type Book, book1 rdf:type Publication, book1 rdf:type _:b1, book1 hasLanguage English, English rdf:type Language }
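One possible reading of the multiset variant is to count how many times each triple is obtained, either explicitly or through a chain of rule applications. The sketch below follows that reading; it is an assumption of this sketch rather than the talk's definition, the function names are hypothetical, the schema is assumed acyclic, and _:b1 is left out.

```python
from collections import Counter

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def consequences(triple, schema):
    """All triples entailed in one step by a single data triple."""
    s, p, o = triple
    out = []
    for (a, c, b) in schema:
        if c == SUBCLASS and p == TYPE and o == a:
            out.append((s, TYPE, b))
        elif c == SUBPROPERTY and p == a:
            out.append((s, b, o))
        elif c == DOMAIN and p == a:
            out.append((s, TYPE, b))
        elif c == RANGE and p == a:
            out.append((o, TYPE, b))
    return out

def saturate_plus(data, schema):
    """Multiset saturation: one occurrence per derivation (acyclic schema assumed)."""
    multiset = Counter()
    for triple in data:
        stack = [triple]
        while stack:
            current = stack.pop()
            multiset[current] += 1
            stack.extend(consequences(current, schema))
    return multiset

data = {("book1", TYPE, "Book"), ("book1", "writtenIn", "English")}
schema = {
    ("Book", SUBCLASS, "Publication"),
    ("writtenIn", SUBPROPERTY, "hasLanguage"),
    ("writtenIn", DOMAIN, "Book"),
    ("writtenIn", RANGE, "Language"),
}
sat_m = saturate_plus(data, schema)
print(sat_m[("book1", TYPE, "Book")])
```

Here book1 rdf:type Book is counted twice, once as an explicit triple and once as a consequence of the domain constraint, which is why it shows up in Saturate+(db) although it is already in db.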
62. Example of instance insertion
[Figure: the saturated database of the running example after the insertion, now including French.]
To insert the triple:
book1 writtenIn French
First saturate the triple using db:
book1 hasLanguage French, book1 rdf:type Book, book1 rdf:type Publication, book1 rdf:type _:b1, French rdf:type Language
Then insert the explicit triple and the inferred ones in db.
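The two steps above can be sketched on a multiset stored as a Counter. This is a sketch under assumptions: the function names are hypothetical, counts are read as numbers of derivations, the schema is assumed acyclic, and _:b1 is left out because its place in the schema cannot be read from the figure.

```python
from collections import Counter

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def consequences(triple, schema):
    """All triples entailed in one step by a single data triple."""
    s, p, o = triple
    out = []
    for (a, c, b) in schema:
        if c == SUBCLASS and p == TYPE and o == a:
            out.append((s, TYPE, b))
        elif c == SUBPROPERTY and p == a:
            out.append((s, b, o))
        elif c == DOMAIN and p == a:
            out.append((s, TYPE, b))
        elif c == RANGE and p == a:
            out.append((o, TYPE, b))
    return out

def insert_instance(sat_m, triple, schema):
    """Saturate one new data triple using the schema, then add everything to sat_m."""
    stack = [triple]
    while stack:
        current = stack.pop()
        sat_m[current] += 1
        stack.extend(consequences(current, schema))

schema = {
    ("Book", SUBCLASS, "Publication"),
    ("writtenIn", SUBPROPERTY, "hasLanguage"),
    ("writtenIn", DOMAIN, "Book"),
    ("writtenIn", RANGE, "Language"),
}
# Multiset before the update: book1 is explicitly a Book, hence also a Publication.
sat_m = Counter({("book1", TYPE, "Book"): 1, ("book1", TYPE, "Publication"): 1})
insert_instance(sat_m, ("book1", "writtenIn", "French"), schema)
print(sorted(sat_m.items()))
```

Only the new triple is saturated; the rest of the stored saturation is left untouched.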
65. Example of schema deletion
[Figure: the saturated database of the running example after the deletion, without the rdfs:domain edge.]
To delete the triple:
writtenIn rdfs:domain Book
First infer affected data triples using db:
book1 rdf:type Book, book1 rdf:type Publication, book1 rdf:type _:b1
Then delete the explicit triple and the inferred ones from db.
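A schema deletion can be sketched the same way on a multiset of derivation counts: collect every occurrence that was obtained through the deleted constraint, then subtract. This is a sketch under assumptions: the function names are hypothetical, the schema is assumed acyclic, and _:b1 is left out.

```python
from collections import Counter

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def consequences(triple, schema):
    """All triples entailed in one step by a single data triple."""
    s, p, o = triple
    out = []
    for (a, c, b) in schema:
        if c == SUBCLASS and p == TYPE and o == a:
            out.append((s, TYPE, b))
        elif c == SUBPROPERTY and p == a:
            out.append((s, b, o))
        elif c == DOMAIN and p == a:
            out.append((s, TYPE, b))
        elif c == RANGE and p == a:
            out.append((o, TYPE, b))
    return out

def delete_schema_triple(sat_m, schema, constraint):
    """Return (new multiset, new schema) after removing one schema constraint."""
    remaining = schema - {constraint}
    affected = Counter()
    for triple, count in sat_m.items():
        # Occurrences derived directly through the deleted constraint ...
        for first in consequences(triple, {constraint}):
            stack = [first]
            while stack:
                current = stack.pop()
                affected[current] += count
                # ... and everything those occurrences entailed in turn.
                stack.extend(consequences(current, remaining))
    return sat_m - affected, remaining  # Counter subtraction drops counts <= 0

schema = {
    ("Book", SUBCLASS, "Publication"),
    ("writtenIn", SUBPROPERTY, "hasLanguage"),
    ("writtenIn", DOMAIN, "Book"),
    ("writtenIn", RANGE, "Language"),
}
sat_m = Counter({
    ("book1", "writtenIn", "English"): 1,
    ("book1", "hasLanguage", "English"): 1,
    ("English", TYPE, "Language"): 1,
    ("book1", TYPE, "Book"): 2,         # explicit + derived through rdfs:domain
    ("book1", TYPE, "Publication"): 2,  # one derivation per occurrence of Book
})
new_sat_m, new_schema = delete_schema_triple(sat_m, schema, ("writtenIn", DOMAIN, "Book"))
print(new_sat_m[("book1", TYPE, "Book")])
```

In this example book1 rdf:type Book survives with a count of 1, since it is still supported by the explicit triple.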
69. Experimental setup
• implementation in Java 1.6
• deployed on top of a PostgreSQL v8.5 server
• 6 indexes – all permutations of the (s, p, o) columns
• the spo index is clustered
• dictionary encoding
Graph characteristics and saturation times:
Graph                     Storage                        Barton        DBpedia      DBLP
#Schema                   in memory                      101           5,666        41
#Instance                 Triple(s, p, o)                34 × 10⁶      27 × 10⁶     8.4 × 10⁶
#Saturation               Sat(s, p, o)                   39 × 10⁶      30 × 10⁶     12 × 10⁶
Saturation increase (%)                                  14.91         10.65        41.05
#Multiset                 SatM(s, p, o, isExp, count)    73.5 × 10⁶    66 × 10⁶     18.7 × 10⁶
Multiset increase (%)                                    116.89        227.37       121.97
t_sat (s)                                                4,294         2,742        748
t_sat+ (s)                                               4,586         2,977        799
70. Query answering
• 26 hand-picked queries (between 1 and 10 triple patterns – 6 on average)
• similar query answering times on Sat and SatM
71. Graph updates
• no impact on reformulation
• saturation needs to maintain SatM
• insertions & deletions
• updates of one triple on the data and the schema
72. Saturation thresholds
The saturation threshold of a query q, st(q):
the smallest integer n s.t.
n × t_ref(q) > n × t_sat(q) + t_sat+
t_ref(q) – time to answer q through reformulation (using Triple)
t_sat(q) – time to answer q based on saturation (using SatM)
t_sat+ – time to saturate db (create SatM)
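The threshold follows directly from the inequality: n must exceed t_sat+ / (t_ref(q) − t_sat(q)). A minimal sketch, where the function name is hypothetical and the per-query times in the example are made up (only 799 s is the DBLP value of t_sat+ reported in the experimental setup):

```python
import math

def saturation_threshold(t_ref, t_sat, t_sat_plus):
    """Smallest integer n with n * t_ref > n * t_sat + t_sat_plus, or None."""
    gain = t_ref - t_sat  # time saved per run by answering on the saturated database
    if gain <= 0:
        return None  # reformulation is never slower, so saturation never pays off
    return math.floor(t_sat_plus / gain) + 1

# Hypothetical query: 3 s through reformulation, 1 s on the saturated database.
n = saturation_threshold(t_ref=3.0, t_sat=1.0, t_sat_plus=799.0)
print(n)
```

Taking floor(...) + 1 rather than a ceiling keeps the inequality strict when the ratio is an exact integer.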
74. Outline of the positioning of our work
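[Figure: related work positioned along two axes. Vertical axis: query language expressive power (relational conjunctive queries, BGP queries, SPARQL). Horizontal axis: RDF fragment expressive power (DL, DB). Plotted: [1, 3, 5], [4, 6, 7], [2], and this work.]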
[1] ADJIMAN, P., GOASDOUÉ, F., AND ROUSSET, M.-C. SomeRDFS in the semantic web. JODS 8 (2007).
[2] ARENAS, M., GUTIERREZ, C., AND PÉREZ, J. Foundations of RDF databases. In Reasoning Web (2009).
[3] CALVANESE, D., GIACOMO, G. D., LEMBO, D., LENZERINI, M., AND ROSATI, R. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Automated Reasoning (JAR) 39, 3 (2007).
[4] GOASDOUÉ, F., KARANASOS, K., LEBLAY, J., AND MANOLESCU, I. View selection in semantic web databases. PVLDB (2011).
[5] GOTTLOB, G., ORSI, G., AND PIERIS, A. Ontological queries: Rewriting and optimization. In ICDE (2011). Keynote.
[6] KAOUDI, Z., MILIARAKI, I., AND KOUBARAKIS, M. RDFS reasoning and query answering on DHTs. In ISWC (2008).
[7] URBANI, J., VAN HARMELEN, F., SCHLOBACH, S., AND BAL, H. QueryPIE: Backward reasoning for OWL Horst over very large knowledge bases. In ISWC (2011).
76. Conclusion
Summary:
⊲ RDF fragment (extending those studied in the literature)
⊲ novel saturation- and reformulation-based query answering techniques
robust to instance and schema updates
⊲ algorithms directly deployable on top of any RDBMS
⊲ thorough performance comparison and analysis
Future work:
An automated strategy to choose between the two techniques:
Saturate+(db) / Reformulate(q, db)
78. Thank you!
79. Open-world interpretation of RDFS constraints
Constraint interpretation:
⊲ closed-world assumption (CWA)
any fact not present in the database is assumed not to hold
database facts do not respect a constraint → inconsistency
R1 ⊆ R2 – any tuple in the relation R1 must also be in the relation R2
⊲ open-world assumption (OWA)
facts may hold even though they are not in the database
R1 ⊆ R2 – any tuple in the relation R1 is also in the relation R2
The RDF data model is based on OWA.
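The two readings of R1 ⊆ R2 can be contrasted in a few lines. This is an illustrative sketch with hypothetical function names: under CWA the constraint is checked, under OWA it is used to derive the missing tuples.

```python
def violations_cwa(r1, r2):
    """CWA: R1 ⊆ R2 is an integrity constraint; return the tuples that break it."""
    return r1 - r2

def complete_owa(r1, r2):
    """OWA: R1 ⊆ R2 is a deductive constraint; return R2 with the tuples of R1 added."""
    return r2 | r1

book = {("book1",)}
publication = set()

print(violations_cwa(book, publication))  # non-empty: the database is inconsistent
print(complete_owa(book, publication))    # book1 is also a Publication
```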
80. RDF meets Relational Database Management Systems (RDBMS)
RDF graphs:
incomplete relational databases based on V-tables
V-tables:
allow using variables in their tuples
using a variable multiple times allows expressing joins on unknown values
BGP query answering boils down to
conjunctive query evaluation on a saturated database.
81. Saturation (related work)
• J. Broekstra and A. Kampman
“Inferencing and truth maintenance in RDF Schema: Exploring a naive
practical approach”
in PSSS Workshop, 2003.
• B. Bishop, A. Kiryakov, D. Ognyanoff, I. Peikov, Z. Tashev, and R. Velkov
“OWLIM: A family of scalable semantic repositories”
Semantic Web, vol. 2, no. 1, 2011.
• C. Gutierrez, C. A. Hurtado, and A. A. Vaisman
“RDFS update: From theory to practice”
in ESWC, 2011.