UML Class Diagrams (UCDs) are the best-known class-based formalism for conceptual modeling. They are used by software engineers to model the intensional structure of a system in terms of classes, attributes and operations, and to express constraints that must hold for every instance of the system. Reasoning over UCDs is of paramount importance in design, validation, maintenance and system analysis; however, for medium and large software projects, reasoning over UCDs may be impractical. Query answering, in particular, can be used to verify whether a (possibly incomplete) instance of the system modeled by the UCD, i.e., a snapshot, enjoys a certain property. In this work, we study the problem of querying UCD instances, and we relate it to query answering under guarded Datalog±, a powerful Datalog-based language for ontological modeling. We present an expressive and meaningful class of UCDs, named UCDLog, under which conjunctive query answering is tractable in the size of the instances.
Querying UML Class Diagrams - FoSSaCS 2012
1. Querying UML Class Diagrams
Georg Gottlob
Department of Computer Science
University of Oxford
joint work with Andrea Calì, Giorgio Orsi and Andreas Pieris
2. Major Modeling Formalisms for Data and Objects
[Figure: example fragments of the major modeling formalisms]
• Relational Data Dependencies
• UML Class Diagrams
• XML Schemas
• Description Logics, e.g., student ⊑ member, student ⊑ ¬professor, leads⁻ ⊑ works
• ER Diagrams
• Object Constraint Language (OCL), e.g.,
Context C1 inv: C1.allInstances -> forAll ( x1: C1 | C2.allInstances ->
forAll ( x2: C2 | x1=x2 implies x2.oclIsTypeOf(C)))
3. Datalog± : A Unifying Logical Framework
[Figure: Datalog and Datalog± in relation to the formalisms below]
• Description Logics (DL-Lite, EL,…)
• Relational Constraints (IDs, FKDs,…)
• Conceptual Models (UML, ER,…)
… providing: Logical foundations, semantics, decidability and
complexity results for reasoning and query-answering,
identification of tractable fragments, …
4. Datalog±
∀X∀Y φ(X,Y) → ψ(X)
• Extend Datalog with additional features such as:
• Existential quantification (∃): TGDs ∀X∀Y φ(X,Y) → ∃Z ψ(X,Z)
• Equality atoms (=): EGDs ∀X φ(X) → Xi = Xj
• Constant false (⊥): Negative constraints ∀X φ(X) → ⊥
• But query answering under Datalog[∃] is undecidable
[see, e.g., Beeri & Vardi, ICALP 81]
• Datalog[∃,=,⊥] is syntactically restricted → Datalog±
5. Restriction: Guardedness
• All ∀-variables occur in one body atom - guard atom
∀X∀Y∀Z R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W)
(R(X,Y,Z) is the guard)
• Models of finite treewidth ⇒ decidability of query answering
[Calì, G. & Kifer, KR 08] related to work by [Andréka, Németi & van Benthem] and [Grädel]
• Query answering is PTIME-complete in data complexity
[Calì, G. & Lukasiewicz, PODS 09]
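The guardedness condition is purely syntactic, so it can be checked rule by rule. The following is a minimal sketch, assuming a hypothetical encoding of atoms as (predicate, terms) pairs in which terms starting with an uppercase letter are variables; the helper names are illustrative and not part of any Datalog± tool.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical encoding: an atom is (predicate, terms);
# terms that start with an uppercase letter are variables.
Atom = Tuple[str, Tuple[str, ...]]

def variables(atom: Atom) -> Set[str]:
    """Return the variables occurring in an atom."""
    return {t for t in atom[1] if t[:1].isupper()}

def find_guard(body: List[Atom]) -> Optional[Atom]:
    """Return a body atom that contains every body variable, or None."""
    all_vars = set().union(*(variables(a) for a in body))
    for atom in body:
        if all_vars <= variables(atom):
            return atom
    return None

# Body of the rule on the slide: R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W)
guarded_body = [("R", ("X", "Y", "Z")), ("S", ("Y",)), ("P", ("X", "Z"))]
# A body with a join but no single atom covering X, Y and Z
unguarded_body = [("R", ("X", "Y")), ("S", ("Y", "Z"))]

print(find_guard(guarded_body))    # ('R', ('X', 'Y', 'Z'))
print(find_guard(unguarded_body))  # None
```

Only the rule body matters for the check; existential variables in the head are ignored.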
6. Reasoning over UML Class Diagrams
[Figure: UML class diagram with classes Company, Stock (attribute Index[0..1]:Str, operation getIndex():List), Person and Executive, and associations Competes, Issues, Member and Owns]
Satisfiability: Does the diagram admit at least one instantiation?
7. Reasoning over UML Class Diagrams
• Satisfiability - the diagram has a (possibly infinite) non-empty
instantiation
• Full Satisfiability - the diagram has a (possibly infinite) instantiation
where each class and association is non-empty
• Finite Satisfiability - the diagram has a finite instantiation
8. Reasoning over UML Class Diagrams
[Figure: class diagram over Person, Student and Worker with a {disjoint} constraint]
Finitely satisfiable, e.g., {Worker(john), Person(john)},
but not fully satisfiable - the class Student is necessarily empty
9. Complexity of Reasoning over UML Class Diagrams
• Satisfiability is EXPTIME-complete
[Berardi, Calvanese & De Giacomo, Artificial Intelligence 05]
• Full Satisfiability is EXPTIME-complete
[Artale, Calvanese & Ibáñez-García, ER 10]
• Finite Satisfiability is EXPTIME-complete
[implicit in Berardi, Calvanese & De Giacomo, Artificial Intelligence 05]
10. Querying UML Class Diagrams
Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
Which persons have a potential conflict of interest?
11. Querying UML Class Diagrams
Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
Conflict(P) ← Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
12. Querying UML Class Diagrams
Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
Does anybody have a potential conflict of interest?
13. Querying UML Class Diagrams
Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
Conflict ← Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
14. Querying UML Class Diagrams
Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
Derived: Person(john)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
Conflict ← Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
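The role of the derived fact Person(john) can be illustrated with a small sketch. It evaluates the conflict-of-interest query once over the bare snapshot and once after saturating it with two existential-free rules. The rule Executive(X) → Person(X) comes from the diagram; the typing rule Competes(X,Y) → Company(X) ∧ Company(Y) is an assumption made here so that the Company atoms of the query can be matched. The function names are hypothetical.

```python
# Snapshot from the slide, encoded as (predicate, arguments) facts.
facts = {
    ("Executive", ("john",)), ("Member", ("john", "LU")),
    ("Stock", ("BAY",)), ("Issues", ("BA", "BAY")),
    ("Owns", ("john", "BAY")), ("Competes", ("LU", "BA")),
}

def saturate(db):
    """Apply the existential-free rules until no new fact is derived."""
    db = set(db)
    while True:
        new = set()
        for pred, args in db:
            if pred == "Executive":   # Executive(X) -> Person(X)
                new.add(("Person", args))
            if pred == "Competes":    # assumed typing rule for Competes
                new.add(("Company", (args[0],)))
                new.add(("Company", (args[1],)))
        if new <= db:
            return db
        db |= new

def rel(db, name):
    """Return the set of argument tuples stored for one predicate."""
    return {args for pred, args in db if pred == name}

def conflicts(db):
    """Answers to Conflict(P) <- Person(P), Company(C1), Company(C2), Stock(S),
    Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)."""
    person, company, stock = rel(db, "Person"), rel(db, "Company"), rel(db, "Stock")
    competes = rel(db, "Competes")
    return {
        p
        for (p, s) in rel(db, "Owns")
        for (p2, c1) in rel(db, "Member") if p2 == p
        for (c2, s2) in rel(db, "Issues") if s2 == s
        if (c1, c2) in competes
        and (p,) in person and (c1,) in company
        and (c2,) in company and (s,) in stock
    }

print(conflicts(facts))            # set(): the snapshot alone has no Person fact
print(conflicts(saturate(facts)))  # {'john'}
```

Saturation suffices here only because no rule with an existential quantifier is needed for this query.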
17. Querying UML Class Diagrams: Existential Rules
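Instance: Group(DB)
[Figure: UML class diagram with classes Member, Student, Professor and Group (Student and Professor {disjoint}), associations WorksIn and Leads, and association class CLeads with attribute since: Date]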
Is there a professor who works in the database group?
18. Querying UML Class Diagrams: Existential Rules
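Instance: Group(DB)
[Figure: UML class diagram with classes Member, Student, Professor and Group (Student and Professor {disjoint}), associations WorksIn and Leads, and association class CLeads with attribute since: Date]
Ans ← Professor(P), WorksIn(P,DB)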
19. Querying UML Class Diagrams: Existential Rules
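Instance: Group(DB)
Derived: Leads(z1,DB), Professor(z1), R*(z1,DB,z2), CLeads(z2)
[Figure: UML class diagram with classes Member, Student, Professor and Group (Student and Professor {disjoint}), associations WorksIn and Leads, and association class CLeads with attribute since: Date]
Ans ← Professor(P), WorksIn(P,DB)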
20. Querying UML Class Diagrams: Existential Rules
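Instance: Group(DB)
Derived: Leads(z1,DB), Professor(z1), R*(z1,DB,z2), CLeads(z2), WorksIn(z1,DB)
[Figure: UML class diagram with classes Member, Student, Professor and Group (Student and Professor {disjoint}), associations WorksIn and Leads, and association class CLeads with attribute since: Date]
Ans ← Professor(P), WorksIn(P,DB)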
21. Querying UML Class Diagrams: Existential Rules
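Instance: Group(DB)
Derived: Leads(z1,DB), Professor(z1), R*(z1,DB,z2), CLeads(z2), WorksIn(z1,DB), …
[Figure: UML class diagram with classes Member, Student, Professor and Group (Student and Professor {disjoint}), associations WorksIn and Leads, and association class CLeads with attribute since: Date]
Homomorphism: {P → z1, DB → DB}
Ans ← Professor(P), WorksIn(P,DB)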
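The derivation above can be reproduced with a small sketch that invents a labelled null for the unknown group leader. The three rules used are an assumed reading of the diagram (every group has a leader, leaders are professors, and Leads specializes WorksIn); the association class CLeads is left out, and all function names are hypothetical.

```python
import itertools

def chase(db, max_rounds=10):
    """Apply the assumed rules round by round, inventing nulls z1, z2, ..."""
    db = set(db)
    fresh = (f"z{i}" for i in itertools.count(1))
    for _ in range(max_rounds):
        new = set()
        for pred, args in db:
            if pred == "Group":
                group = args[0]
                # Group(G) -> ∃P Leads(P,G); fire only if no leader is known yet
                if not any(p == "Leads" and a[1] == group for p, a in db):
                    new.add(("Leads", (next(fresh), group)))
            if pred == "Leads":
                new.add(("Professor", (args[0],)))  # Leads(P,G) -> Professor(P)
                new.add(("WorksIn", args))          # Leads(P,G) -> WorksIn(P,G)
        if new <= db:
            break
        db |= new
    return db

def professor_in(db, group):
    """Boolean query Ans <- Professor(P), WorksIn(P,group)."""
    return any(
        pred == "WorksIn" and args[1] == group and ("Professor", (args[0],)) in db
        for pred, args in db
    )

snapshot = {("Group", ("DB",))}
chased = chase(snapshot)
print(professor_in(snapshot, "DB"))  # False: the snapshot names no professor
print(professor_in(chased, "DB"))    # True: witnessed by the null z1
```

The query is entailed although no concrete professor is known: the variable P is mapped to the null z1.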
23. Querying UML Class Diagrams
Given a database D, a diagram Σ and a query Q:
D ∪ Σ ⊨ Q ⇔ ∀M (M ⊨ D ∪ Σ → M ⊨ Q)
where M ⊨ D ∪ Σ means M ⊇ D ∧ M ⊨ Σ
24. From Diagrams to First-Order Logic (Datalog±)
[Figure: UML class diagram with classes Company, Stock (attribute Index[0..1]:Str, operation getIndex():List), Person and Executive, and associations Competes, Issues, Member and Owns]
25. From Diagrams to First-Order Logic (Datalog±)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
∀X∀Y Member(X,Y) → Company(X) ∧ Executive(Y)
∀X Company(X) → ∃Y∃Z Member(X,Y) ∧ Member(X,Z) ∧ Y ≠ Z
∀X Executive(X) → ∃Y Member(Y,X)
[Satoh & Kaneiwa, TCS 10 and Berardi et al., AI 05]
26. From Diagrams to First-Order Logic (Datalog±)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
∀X Company(X) → ∃Y Issues(X,Y)
∀X Stock(X) → ∃Y Issues(Y,X)
∀X∀Y∀Z Stock(X) ∧ Issues(Y,X) ∧ Issues(Z,X) → Y = Z
∀X∀Y Stock(X) ∧ Index(X,Y) → Str(Y)
∀X∀Y Stock(X) ∧ getIndex(X,Y) → List(Y)
[Satoh & Kaneiwa, TCS 10 and Berardi et al., AI 05]
27. From Diagrams to First-Order Logic (Datalog±)
[Figure: UML class diagram with classes Company, Stock, Person and Executive, and associations Competes, Issues, Member and Owns]
∀X Executive(X) → Person(X)
[Satoh & Kaneiwa, TCS 10 and Berardi et al., AI 05]
28. Complexity of Query Answering
• EXPTIME-complete in combined complexity (everything is part of the input)
[implicit in Berardi et al., AI 05 and Lutz, IJCAR 08]
• coNP-complete in data complexity (the diagram and the query are fixed)
[implicit in Ortiz, Calvanese & Eiter, AAAI 06]
• Undecidable when diagrams are combined with arbitrary OCL
(Object Constraint Language) constraints
[folklore]
29. Research Challenge: Reduce High Complexity
• Diagrams often have very large instantiations
• Some applications require very large diagrams
• OCL constraints that are not expressible diagrammatically
lead to undecidability or high complexity
30. Our Goals
• Restrict UML class diagrams to achieve tractability of query
answering in data complexity
• Better understanding of combined complexity
• Add relevant OCL constraints without losing tractability of
query answering in data complexity
31. Lean UML Class Diagrams
• For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞}
• For each association A between classes C1 and C2, with multiplicities mL..mU and nL..nU:
- upper bounds mU, nU ∈ {1,∞},
- if A generalizes some other association, then mU = nU = ∞
• Completeness constraints are forbidden
[Figure: class C with subclasses C1 and C2, marked {complete}]
∀X C(X) → C1(X) ∨ C2(X)
35. Lean UML Class Diagrams: Example
[Figure: UML class diagram with classes Company, Stock (attribute Index[0..1]:Str, operation getIndex():List), Person and Executive, and associations Competes, Issues, Member and Owns]
36. Lean UML Class Diagrams: Example
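[Figure: UML class diagram with classes Member, Student, Professor and Group (Student and Professor {disjoint}), associations WorksIn and Leads, and association class CLeads with attribute since: Date]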
37. Add Some Non-Diagrammatic Constraints (OCL)
[Figure: class C with subclasses C1, C2 and C3; C2 and C3 are disjoint classes]
∀X C2(X) ∧ C3(X) → ⊥
We need negative constraints of the form
∀X C1(X) ∧ … ∧ Cn(X) → ⊥
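Negative constraints of this form can be checked directly on a set of class-membership facts. The sketch below assumes a hypothetical encoding in which each constraint is a frozenset of class names and the database contains only unary facts; the example data are made up for illustration.

```python
def violations(db, negative_constraints):
    """Return (object, constraint) pairs where an object belongs to every
    class of a negative constraint C1(X) ∧ … ∧ Cn(X) → ⊥."""
    classes_of = {}
    for cls, (obj,) in db:
        classes_of.setdefault(obj, set()).add(cls)
    return {
        (obj, nc)
        for obj, classes in classes_of.items()
        for nc in negative_constraints
        if nc <= classes
    }

# Illustrative data: Student and Professor are declared disjoint.
disjoint = [frozenset({"Student", "Professor"})]
db = {("Student", ("ann",)), ("Professor", ("ann",)), ("Student", ("bob",))}

print(violations(db, disjoint))  # ann violates the constraint, bob does not
```

In a full setting the check would be run on the facts entailed by the diagram, not only on the stored ones.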
38. Add Some Non-Diagrammatic Constraints (OCL)
[Figure: class C as a common subclass of C1 and C2 (most-specific class)]
∀X C1(X) ∧ C2(X) → C(X)
We need most-specific class constraints of the form
∀X C1(X) ∧ … ∧ Cn(X) → C(X)
39. Add Some Non-Diagrammatic Constraints (OCL)
[Figure: class Student with attribute Enrolled[0..1]:Course and class CS-Student with attribute Enrolled[0..1]:CS-Course]
Type the domain of Enrolled:
∀X∀Y CS-Course(X) ∧ Enrolled(Y,X) → CS-Student(Y)
We need domain-type constraints of the form
∀X∀Y C(X) ∧ Attr(Y,X) → T(Y)
The pullback rule will make a difference!
47. Data Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative
& most-specific class & domain-type constraints is PTIME-complete
Proof:
• in PTIME: reduction to assertions ∀X∀Y body(X,Y) → ∃Z head(X,Z),
where body(X,Y) has a guard-atom [Calì, G. & Lukasiewicz, PODS 09]
• PTIME-hardness (even without domain-type constraints): reduction
from Path System Accessibility
49. Combined Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams +
negative & most-specific class constraints is PSPACE-complete
50. Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model (Chase)
[Figure: tree-like chase of D with atoms C(x), A(y,z), B(y,z), T(z), R(a,v), S(v,w); the relevant part has polynomial depth]
Example rules:
∀X∀Y C(X) ∧ A(X,Y) → T(Y)
∀X∀Y A(X,Y) → C1(X) ∧ C2(Y)
∀X∀Y A(X,Y) → B(X,Y)
Example query:
Q = R(X1,X2) ∧ S(X2,X3) ∧ B(X4,X5) ∧ T(X5)
• With each current atom A we need to compute and memorize its type(A). This is in NP
64. Combined Complexity of Query Answering
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α_0 α_1 … α_{n-1}, assuming that it uses m = n^k cells
• Initialization rules
∀X initial(X) → initial-state(X)
∀X initial(X) → cell_0[α_0,1](X)
∀X initial(X) → cell_i[α_i,0](X), for each i ∈ {1,…,n-1}
∀X initial(X) → cell_i[0,0](X), for each i ∈ {n,…,m-1}
[Figure: class initial as a subclass of initial-state, cell_0[α_0,1], cell_1[α_1,0], …, cell_{n-1}[α_{n-1},0], cell_n[0,0], …, cell_{m-1}[0,0]]
65. Combined Complexity of Query Answering
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α_0 α_1 … α_{n-1}, assuming that it uses m = n^k cells
• Configuration generation rules
∀X initial(X) → config(X)
∀X config(X) → ∃Y succ(X,Y)
∀X∀Y config(X) ∧ succ(X,Y) → config(Y)
[Figure: class config with attribute succ[1..1]:config and subclass initial]
66. Combined Complexity of Query Answering
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α_0 α_1 … α_{n-1}, assuming that it uses m = n^k cells
• Rules to describe the transition from one configuration to another,
e.g., state transition rules
for each δ(⟨s1,α1⟩) = ⟨s2,α2,d⟩:
∀X∀Y s1-cell_i[α1,1](X) ∧ succ(X,Y) → state-s2(Y), for each i ∈ {0,…,m-1}
s1-cell_i[α1,1](X): in configuration X, which has state s1, the i-th
cell contains α1, and the cursor is over cell i
[Figure: class s1-cell_i[α1,1] with attribute succ[0..1]:state-s2]
67. Combined Complexity of Query Answering
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α_0 α_1 … α_{n-1}, assuming that it uses m = n^k cells
• Acceptance rule ∀X accept-state(X) → accept(X)
[Figure: class accept with subclass accept-state]
• Initial database D = {initial(c)} - c is the initial configuration
• Boolean CQ Q = accept(X)
• Turing machine accepts I iff D ∪ Σ ⊨ Q
68. Combined Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative &
most-specific class & domain-type constraints is EXPTIME-complete
Proof:
• in EXPTIME: reduction to assertions ∀X∀Y body(X,Y) → ∃Z head(X,Z), where
body(X,Y) has a guard-atom, and all the predicates are of bounded arity
[Calì, G. & Kifer, KR 08]
71. Further Restrictions
[Figure: class Student with attribute Enrolled[0..1]:Course and class CS-Student with attribute Enrolled[0..1]:CS-Course]
• Different classes have disjoint sets of attributes and operations
• Instead of ∀X∀Y Student(X) ∧ Enrolled(X,Y) → Course(Y)
we have ∀X∀Y Student-Enrolled(X,Y) → Course(Y)
• Data complexity in AC0 and combined complexity NP-complete
73. Complexity of Querying UML Class Diagrams
UML Formalism     Additional Constraints                       Data Complexity   Combined Complexity
Full              none                                         coNP-complete     EXPTIME-complete
Lean              + negative + specific-class + domain-type    PTIME-complete    EXPTIME-complete
Lean              + negative + specific-class                  PTIME-complete    PSPACE-complete
Restricted Lean   + negative + specific-class                  in AC0            NP-complete
74. Datalog± : A Unifying Logical Framework
[Figure: Datalog and Datalog± in relation to the formalisms below]
• Description Logics (DL-Lite, EL,…)
• Relational Constraints (IDs, FKDs,…)
• Conceptual Models (UML, ER,…)
… without losing tractable data complexity
76. Ontological Reasoning and Datalog
DL Assertion                                          Datalog Rule
Concept Inclusion: emp ⊑ person                       emp(X) → person(X)
Concept Product: sen-emp × emp ⊑ moreThan             sen-emp(X) ∧ emp(Y) → moreThan(X,Y)
(Inverse) Role Inclusion: reports⁻ ⊑ mgr              reports(X,Y) → mgr(Y,X)
Role Transitivity: trans(mgr)                         mgr(X,Y) ∧ mgr(Y,Z) → mgr(X,Z)
Participation: emp ⊑ ∃report                          emp(X) → ∃Y report(X,Y)
Disjointness: emp ⊓ customer ⊑ ⊥                      emp(X) ∧ customer(X) → ⊥
Functionality: funct(reports)                         reports(X,Y) ∧ reports(X,Z) → Y = Z
78. Guardedness
• All ∀-variables occur in one body atom - guard atom
∀X∀Y∀Z R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W)
(R(X,Y,Z) is the guard)
• Models of finite treewidth ⇒ decidability of query answering
[Calì, G. & Kifer, KR 08]
• Query answering is PTIME-complete in data complexity
[Calì, G. & Lukasiewicz, PODS 09]
• Properly extends ELH (same data complexity)
79. Ontology Querying
ELH: Popular DL with PTIME data complexity
[Baader, IJCAI 03 and Rosati, DL 07]
ELH TBox       Datalog± Representation
A ⊑ B          ∀X A(X) → B(X)
A ⊓ B ⊑ C      ∀X A(X) ∧ B(X) → C(X)
A ⊑ ∃R.B       ∀X A(X) → ∃Y R(X,Y) ∧ B(Y)
∃R.A ⊑ B       ∀X∀Y R(X,Y) ∧ A(Y) → B(X)
R ⊑ P          ∀X∀Y R(X,Y) → P(X,Y)
84. Linearity
• Just one atom in the body: ∀X∀Y R(X,Y) → ∃Z ψ(X,Z)
(R(X,Y) is the guard)
• Linear TGDs are trivially guarded
• Linear TGDs are first-order rewritable [Calì, G. & Lukasiewicz, PODS 09]
Query answering in AC0 data complexity
• Polynomial-size rewriting (SQL DL = NR Datalog) [G. & Schwentick, KR 12]
• Properly extends DL-Lite (same data complexity)
85. Ontology Querying
DL-Lite: Popular family of DLs with AC0 data complexity (OWL 2 QL)
[Calvanese, De Giacomo, Lembo, Lenzerini & Rosati, JAR 07]
DL-Lite TBox   Datalog± Representation
A ⊑ B          ∀X A(X) → B(X)
A ⊑ ∃R         ∀X A(X) → ∃Y R(X,Y)
∃R ⊑ A         ∀X∀Y R(X,Y) → A(X)
R ⊑ P          ∀X∀Y R(X,Y) → P(X,Y)
86. Finite Controllability
D ∪ Σ ⊨ Q ⇔ D ∪ Σ ⊨fin Q ?
• Holds for inclusion dependencies
[Rosati, PODS 06]
• Holds for guarded Datalog± (in fact, for the guarded fragment)
[Bárány, G. & Otto, LICS 10]
• Different from the finite-model property of the guarded fragment:
If D ∪ Σ ∪ {¬Q} has a model, then it has a finite one.
87. Finite Controllability and Lean UML Class Diagrams
• For each attribute assertion Attribute[i..j]:Type: j = ∞
• For each association A between C1 and C2, with multiplicities mL..mU and nL..nU: mU = nU = ∞
• Disjointness and negative constraints are forbidden
• Most-specific class and domain-type constraints are allowed
set of guarded TGDs ⇒ finite controllability holds
91. Datalog±: Summary of Complexity Results
              Data             Fixed         Combined
Guarded       PTIME-complete   NP-complete   2EXPTIME-complete
Linear        in AC0           NP-complete   PSPACE-complete
Sticky-join   in AC0           NP-complete   EXPTIME-complete
Same complexity with negative constraints and non-conflicting EGDs
92. Datalog±: Next Steps
[Figure: the Guarded, Linear and Sticky-join classes, labeled PTIME-complete and FO-rewritable, with an open question mark]
• … with disjunction in the head
• … finite-model reasoning
94. But…
• What about joins in rule bodies?
∀A∀D∀P runs(D,P) ∧ area(P,A) → ∃E employee(E,D,P,A)
• What about the DL assertion concept product?
∀E∀M elephant(E) ∧ mouse(M) → biggerThan(E,M)
No tree-like models guaranteed:
∀X∀Y R(X,Y) → ∃Z R(Y,Z)
∀X∀Y R(X,Y) → S(X) (infinitely many symbols in S)
∀X∀Y S(X) ∧ S(Y) → P(X,Y) (P forms an infinite clique)
98. Stickiness
• Properly generalizes inclusion dependencies
• Backward-resolution terminates
• Query answering is in AC0 in data complexity (first-order rewritability)
[Calì, G. & Pieris, VLDB 10]
• Properly extends DL-Lite (same data complexity)
99. Additional Features
• EGDs, e.g., ∀X∀Y∀Z reports(X,Y) ∧ reports(X,Z) → Y = Z
Non-Conflicting EGDs: do not interact with TGDs
Preliminary check without adding complexity
• Negative constraints, e.g., ∀X emp(X) ∧ customer(X) → ⊥
Check without adding complexity
Finite controllability does not hold:
D = {R(a,b)}
∀X∀Y R(X,Y) → ∃Z R(Y,Z)
∀X∀Y∀Z R(Y,X) ∧ R(Z,X) → Y = Z
Q ← R(A,a)
D ∪ Σ ⊭ Q but D ∪ Σ ⊨fin Q
101. Comparison with ER± Schemata
ER±: Extended ER formalism with AC0 data complexity
[Calì, G. & Pieris, Information Systems 2012]
Feature                    Lean UML    ER±
IS-A among classes
IS-A among associations
≥n participation           n ≥ 0       n ∈ {0,1}
≤1 participation
Permutation on IS-A
Values of attributes       complex     atomic
Operations
Attribute re-use
102. The Chase Procedure
Input: Database D, set of TGDs Σ
Output: A model of D ∪ Σ
D = {person(john)}
Σ: person(P) → ∃F father(F,P)
   father(F,P) → person(F)
chase(D,Σ) = D ∪ {father(z1,john), person(z1), father(z2,z1), …}
infinite instance
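The chase steps listed above can be replayed with a short sketch. Because the chase of this example never terminates, the sketch stops after a fixed number of labelled nulls; the cut-off parameter max_nulls and the function name are assumptions made for illustration, not part of the chase procedure itself.

```python
def chase_person_father(db, max_nulls=3):
    """Chase with  person(P) -> ∃F father(F,P)  and  father(F,P) -> person(F),
    stopping once max_nulls labelled nulls have been invented."""
    db = set(db)
    nulls = 0
    changed = True
    while changed:
        changed = False
        for pred, args in sorted(db):   # iterate over a snapshot of the facts
            if pred == "father" and ("person", (args[0],)) not in db:
                db.add(("person", (args[0],)))
                changed = True
            if pred == "person":
                p = args[0]
                has_father = any(q == "father" and a[1] == p for q, a in db)
                if not has_father and nulls < max_nulls:
                    nulls += 1
                    db.add(("father", (f"z{nulls}", p)))
                    changed = True
    return db

result = chase_person_father({("person", ("john",))}, max_nulls=2)
for fact in sorted(result):
    print(fact)
```

With max_nulls=2 the result contains father(z1,john), person(z1), father(z2,z1) and person(z2) in addition to person(john); the last person has no father yet, which is exactly where the infinite chase would continue.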
108. Query Answering via Chase
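[Figure: the chase C = chase(D,Σ) maps via homomorphisms h1, h2, … into every model M1, M2, … of D ∪ Σ; the query Q maps into C via a homomorphism h]
D ∪ Σ ⊨ Q ⇔ chase(D,Σ) ⊨ Q
[see, e.g., Deutsch, Nash & Remmel, PODS 08]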
109. Bounded Derivation-Depth Property (BDDP)
[Figure: chase(D,Σ) with an initial part P of constant depth w.r.t. D, into which the query Q maps]
chase(D,Σ) ⊨ Q ⇒ P ⊨ Q
BDDP ⇒ First-Order Rewritability
[Calì, G. & Lukasiewicz, PODS 09]
111. First-Order Rewritable TGDs
[Figure: the query Q is rewritten into a first-order query Q_Σ]
∀D: D ∪ Σ ⊨ Q ⇔ D ⊨ Q_Σ
Query answering is in AC0 in data complexity [Vardi, PODS 95]
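A toy instance of first-order rewriting can be sketched for the simplest case: atomic queries under concept inclusions of the form sub(X) → sup(X). The rewriting is then just the union of all predicates that entail the query predicate. The rule emp → person appears in the deck; the predicates manager and customer and the function names are hypothetical additions for illustration.

```python
def rewrite(query_pred, inclusions):
    """Backward-chain over rules sub(X) -> sup(X) and collect every predicate
    whose facts entail query_pred; the rewriting is their union."""
    result, frontier = {query_pred}, [query_pred]
    while frontier:
        target = frontier.pop()
        for sub, sup in inclusions:
            if sup == target and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result

def answer(db, preds):
    """Evaluate the rewritten union of atomic queries directly over the database."""
    return {args[0] for pred, args in db if pred in preds}

sigma = [("emp", "person"), ("manager", "emp")]  # manager is hypothetical
db = {("manager", ("ann",)), ("emp", ("bob",)), ("customer", ("carl",))}

q_sigma = rewrite("person", sigma)
print(sorted(q_sigma))              # ['emp', 'manager', 'person']
print(sorted(answer(db, q_sigma)))  # ['ann', 'bob']
```

The rules are used only at rewriting time; the database is then queried as is, which is what makes evaluation by a plain SQL engine possible.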
112. OMG UML (Unified Modeling Language)
Standard conceptual modeling tool for software design
[Figure: UML class diagram with classes Company, Stock (attribute Index[0..1]:Str, operation getIndex():List), Person and Executive, and associations Competes, Issues, Member and Owns]
Class Diagrams
Reasoning over UML models:
• Model checking
• Specification recovery
• Software maintenance