This document discusses ontology-based data access (OBDA) using existential rules. OBDA allows querying over data sources using an ontology to infer new facts and provide a unified view to the user. Existential rules provide a logical framework for OBDA and generalize other rule languages like Datalog. However, entailment with existential rules is undecidable in general. The document outlines different decidable classes of existential rules defined by properties like the chase procedure halting, query rewriting being finite, or generated facts having a tree structure.
1. Ontology-based Data Access
with Existential Rules
Marie-Laure Mugnier
Université Montpellier 2
RuleML 2012
2. Ontology-‐based
Data
Access
(OBDA)
Answers ?
Knowledge Base
Query
Ontology
Data
Patient records Medical ontology
« Patient P suffers from myeloid leukemia and
has been prescribed drug D against hypertension »
Query: « find all cancer patients treated for high-blood pressure »
à Use ontological knowledge to infer all deducible answers
3. Ontology-‐based
Data
Access:
What
for?
■ Enrich the vocabulary of data sources
à Abstract from a specific database schema
■ Relate the vocabulary of different data sources
à Provide a unified view to the user
■ Allow inference of new facts
à Allow data incompleteness
4. Outline
n The existential rule framework for OBDA
rule-based, logic-based and graph-based
n Decidability, complexity and algorithmic issues
n A tool for combining decidable classes of rules
n Perspectives
5. Data
/
Facts
Relational Database RDF (Semantic Web) Etc.
parentOf Male Fem. F M
A B B A rdf:type rdf:type
A C … … ex:parent
ex:A ex:B
C ?
ex:parent
…… ex:C
Abstraction in first-order logic Or in graphs / hypergraphs
1 2
F A p B M
∃x( parentOf(A,B) ∧ parentOf(A,C) ∧
1
parentOf(C,x) ∧ F(A) ∧ M(B) )
p
2
1
2
C p
6. Ontology
(1)
Concepts Relations
Human sameFamilyAs
Male Female Adult
ancestorOf uncleOf
Father Mother
parentOf siblingOf
…
motherOf brotherOf
+ properties on concepts and relations:
The relation ancestorOf is transitive
The inverse of the relation fatherOf is functional
The concepts Male and Female are disjoint
The relation siblingOf can be defined from the relation parentOf
7. Ontology
(2)
Abstraction with rules in First-Order Logic
• Specialization relationships between concepts / relations
∀x (Male(x) à Human(x)) ∀x ∀y (parentOf(x,y) à ancestorOf(x,y))
• « ancestorOf is transitive »
∀x ∀y ∀z (ancestorOf(x,y) ∧ ancestorOf (y,z) à ancestorOf (x,z))
• « the inverse of fatherOf is functional »
∀x ∀y ∀z (fatherOf(y,x) ∧ fatherOf(z,x) à y = z)
• « Male et Female are disjoint »
∀x (Male(x) ∧ Female(x) à ⊥)
• Definition of siblingOf
∀x ∀y ∀z (parentOf(x,y) ∧ parentOf(x,z) à siblingOf(y,z))
∀x ∀y (siblingOf(x,y) à ∃ z parentOf(z,x) ∧ parentOf(z,y))
8. ExistenCal
Rules
« Value Invention »
∀X ∀Y ( B[X, Y] → ∃Z H[X, Z] ) X, Y, Z :
tuples of variables
Body Head
Any conjunction of
atoms (on variables
and constants)
∀x ∀y (siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)))
Simplified form: siblingOf(x,y) à parentOf(z,x) ∧ parentOf(z,y)
§ Same as Tuple Generating Dependencies (TGDs)
§ See also Datalog+/-
§ Same as the logical translation of Conceptual Graph rules
9. Value
invenCon
R = ∀x ∀y (siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)))
F = siblingOf(A,B)
x 2
A
1
P 1
h: body à F 1
S z h ={(x,A), (y,B)} S
2
P 1
2
y 2
B
A rule bodyà head is applicable to a fact F if
there is a homomorphism h from body to F
The resulting fact is F’= F ∪ h(head)
[with renaming existential variables of head ] A 2
1
P 1
F’= ∃ z0 (siblingOf(A,B) z0
∧ parentOf(z0,A) ∧ parentOf(z0,B)) S
2
P 1
B 2
10. ExistenCal
Rules
cover
«
lightweight
»
DescripCon
Logics
n New DLs tailored for ontology-based data access:
DL-Lite
EL
}
Core of the « tractable profiles » of OWL2
_
Human
⊑
∃parentOf
.Human
Human(x)
à
parentOf(y,x)
∧ Human
(y)
q Existential rules are strictly more expressive: x 2
1
P 1
siblingOf(x,y)
à
parentOf(z,x)
∧
parentOf(z,y)
S z
not
expressible
in
DL
2
P 1
y 2
q Non-bounded predicate arity provides more flexibility:
à direct correspondence with database relations
à adding contextual information is easy
11. Logical
[and
graphical]
framework
Answers ?
Knowledge Base
Existential Rules (Union of)
Conjunctive
Equality Rules Query
Facts
Constraints
(∨)
∃X
F[X]
Existential Rule: ∀X ∀Y ( B[X, Y] → ∃Z H[X, Z] )
Equality rule: ∀X (B[X] → x = e) with x,e var. or const.
Negative constraint: ¬ B or ∀X (B[X] → ⊥)
Positive constraint: same form as an existential rule
13. The
Conceptual
Graph
origins
n Conceptual Graphs introduced in [Sowa 76] [Sowa 84]
n Specific research line by Montpellier’s group since 1992
Graph-based knowledge representation and reasoning
«
Graph-‐Based
Knowledge
RepresentaFon:
ComputaFonal
FoundaFons
of
Conceptual
Graphs
»,
M.
Chein
&
M.-‐L.
Mugnier,
Springer,
2009
14. Conceptual
Graph
vocabulary:
1.
parFally
(pre-‐)ordered
set
of
concept
types
[screenshots from CoGui, http://www.lirmm.fr/cogui]
15.
Conceptual
Graph
vocabulary:
2.
parFally
(pre-‐)ordered
set
of
relaCons
with
their
signature
[any
relaFon
arity
allowed]
Logical
translaCon
of
the
preorders
and
signatures:
p<q ∀x1…xk ( p(x1…xk) → q(x1…xk) )
Signature of r ∀x1…xk ( p(x1…xk) → ti1(x1)…tik(xk))
16. Basic
Conceptual
Graph
Eva
y
[more generally: total order on
the edges incident to a relation node]
x
Logical
translaCon
(Φ)
:
existenCally
closed
conjuncCon
of
atoms
∃x ∃y (Girl(Eva) ∧ Child(x) ∧ Toy(y) ∧ Train(y)
∧ sisterOf(Eva,x) ∧ playWith(Eva,y) ∧ playWith(x,y))
Allows
to
represent
facts
and
conjuncCve
queries
17. Homomorphism
(with
vocabulary
preorders
integrated)
Fact
F
Query
Q
Logical
soundness
[Sowa
84]
and
completeness
[Chein
Mugnier
92]:
there
is
a
homomorphism
from
Q
to
F
iff
Φ(Q)
is
entailed
by
Φ(F)
and
Φ(vocabulary)
The
Basic
CG
fragment
restricted
to
binary
relaFons
is
equivalent
to
RDFS
[Baget
ISWC’05]
[Baget+
ICCS’10]
18. Richer
fragments
(nested
graphs,
rules,
constraints,
+
negaCon,
…)
¢ Rules
:
pairs
of
basic
conceptual
graphs
∀x ∀y (Human(x) ∧ Human(y) ∧ siblingOf(x,y)
à ∃ z (Adult(z) ∧ parentOf(z,x) ∧ parentOf(z,y)))
¢
Sound
and
complete
forward
chaining
and
backward
chaining
mechanisms
[Salvat
Mugnier
1996]
¢ PosiFve
and
NegaFve
constraints
¢ Several
ways
of
combining
rules
and
constraints
[Baget
Mugnier
JAIR
2002]
19. Outline
n The existential rule framework for OBDA
rule-based, logic-based and graph-based
n Decidability, complexity and algorithmic issues
n A tool for combining decidable classes of rules
n Perspectives
20. Let’s
focus
on
standard
existenCal
rules
Answers ?
Knowledge Base
Existential Rules
Conjunctive
Equality Rules Query
Facts
Constraints Q
K
=
(F,
R)
n Basic problem: Conjunctive Query Entailment
Given a KB K = (F, R) and a conjunctive query Q,
is Q entailed by K ?
21. Forward
versus
Backward
chaining
FC Fact saturation (« chase »)
R
Q
F
BC Query rewriting
F Q R
22. Forward chaining may not halt
R = Human(x) à parentOf(y,x) ∧ Human(y)
F = Human(A)
∧ Human(y1) ∧ parentOf(y1, A)
∧ Human(y2) ∧ parentOf(y2, y1)
Etc.
Human Human Human
2 1 2 1
A p p
[same non-halting trouble with backward chaining]
22
23. Decidability
Issues
n Entailment is not decidable
n Many decidable classes exhibited in databases and KR
n Three generic kinds of properties ensuring decidability:
- Saturation by Forward Chaining halts
- Query rewriting by Backward Chaining halts
- Saturation by Forward Chaining does not halt
but the generated facts have a tree-like structure
27. (ParCal)
inclusion
map
of
decidable
classes
w-sticky-join Finite query Tree-shaped
rewriting glut-fg saturation
w-sticky sticky-join
domain-r. jointly-fg
Finite saturation
sticky
weakly
wa-GRD jointly- frontier-guarded
acyclic
weakly- weakly- frontier-
acyclic guarded guarded
acyclic GRD
guarded frontier-1
Atomic-
atomic Body restricted
body
body to a single atom
datalog
E.g. Human(x) à
parentOf(y,x) ∧
Human(y)
Inclusion dependency
28. Main
classes
with
(infinite)
tree-‐shaped
saturaCon
Frontier: variables shared weakly Guard only affected variables
by the body and the head frontier from the frontier
guarded [Baget+ KR’10]
Guard only the frontier
[Baget+ KR’10] Guard only affected
variables
r(x,y) ∧ r(y,z) à frontier (possibly mapped on
r(y,u) ∧ r(z,u) weakly
guarded new existentials)
guarded
The frontier [Cali+ KR’08]
has size 1
[Baget+ IJCAI’09] [Cali+ KR’08] datalog
frontier guarded An atom in the body
1 guards all variables
from the body
r(x,y) ∧ r(y,z) ∧ r(x,z)
r(x,y) ∧ r(y,z) ∧ s(x,y,z) Atomic-
à r(z,u)
à r(y,u) ∧ r(z,u) body
29. Complexity
n Combined complexity Input: F, R, Q
Data complexity Input: F (R and Q are fixed)
n Desirable property in the context of large data:
polynomial data complexity
31. Towards
efficiency
in
pracCce
• Interest of query rewriting mechanisms: does not make the data grow
• However, the number of generated queries may be prohibitive in practice
A
B1(x) à A(x)
B2 (x) à B1(x) Q
=
A(x1)
∧
…
∧
A(xk)
B1
…
Bn(x) à Bn-1(x)
B2
Number
of
(conjuncFve)
rewriFngs:
nk
Bn
à Use
indexing
techniques
to
avoid
the
above
kind
of
blow-‐up
à Algorithms
combining
both
forward
and
backward
chaining
à Rewrite
into
more
compact
kinds
of
queries
32. Outline
n The existential rule framework for OBDA
rule-based, logic-based and graph-based
n Decidability, complexity and algorithmic issues
n A tool for combining decidable classes of rules
n Perspectives
33. Union
of
decidable
sets
of
rules
n Next question:
is the union of two decidable sets of rules still decidable ?
practically:
n can we safely merge several ontologies known to be decidable ?
n can we build a decidable hybrid language from two languages whose
semantics can be expressed by decidable subsets of rules ?
n Bad news:
Almost all classes are pairwise incompatible
n Next question:
which conditions on the interactions between rules ensure compatibility ?
34. A
tool
:
the
Graph
of
Rule
Dependency
R2 depends on R1 if applying R1 may trigger a new application of R2
i.e., there exists a fact F s.t. R1 is applicable to F but R2 is not
and there is an application of R1 to F leading to F’
s.t. R2 is applicable to F’
Body Head
R1
h 1 1
F Body
2 R2
Effective computation of dependencies with a unification test
35. Piece-‐based
unificaCon
n Existential variables make rule heads complex
à unification is more complex too
Atomic unification is not sufficient
R1= p(x) à r(x,y) ∧ r(y,z) ∧ r(z,x)
R2 does not depend on R1
R2 = r(u,v) ∧ r(v,u) à q(u)
1 2
u v
p r y 1 2
R1 1
R2 r
1
r r 2 1
x r
2 2 q
z
R2 depends on R1 iff there is a « piece-unifier » of body(R2) with head(R1)
36. Combining
decidable
classes
with
the
Graph
of
Rule
Dependencies
Rules
R1 R2 : R1 « may trigger » R2 (R2 depends on R1)
37. Combining
decidable
classes
with
the
Graph
of
Rule
Dependencies
If GRD(R) is without circuit then R is both fes (thus bts) and fus
fes = finite fact saturation
fus = finite query rewriting
bts = (possibly infinite) tree-shaped saturation
38. Combining
decidable
classes
with
the
Graph
of
Rule
Dependencies
If all strongly connected components of GRD(R) are fes
then R is fes
The same holds for fus (but not for bts)
ab (fus) fus
Datalog (fes) fg(bts)
fes dr (fus)
wa (fes)
39. Combining
decidable
classes
with
the
Graph
of
Rule
Dependencies
Let R1〉R2 be a partition of R s.t. no rule of R1 depends on a rule of R2
n If R1 is fes and R2 is bts, then R is bts
n If R1 is bts and R2 is fus, then R1〉R2 is decidable
Decidable
bts ab (fus) fus
Datalog (fes) fg(bts)
fes dr (fus)
wa (fes)
40. Combining
decidable
classes
with
the
Graph
of
Rule
Dependencies
Recommended algorithm:
Use FC-like algorithm on the bts subset à « saturated » fact F*
Use query rewriting with the fus subset à rewritten set Q
Check if a query in Q maps to F*
ab
fus
Datalog bts
fg
fes dr
wa
41. Conclusion
-‐
PerspecCves
n An emerging rule-based framework suitable to OBDA
§ simple,
§ expressive
§ flexible
n Currently:
§ A quite clear picture of decidable classes of rules with complexity
analysis
§ Effervescence around new algorithmic techniques
§ First implementations for very specific subclasses
n Main challenge: scalability