This document discusses access control for RDF graphs using abstract models. It presents an abstract access control model defined using abstract tokens and operators to model the computation of access labels for inferred RDF triples. The model supports dynamic datasets and policies. Experiments show that annotation time increases with the number of implied triples, while evaluation time increases linearly with the total number of triples. The abstract model approach allows different concrete access control policies to be applied to the same dataset.
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Access Control for RDF graphs using Abstract Models
1. ACCESS CONTROL FOR
RDF GRAPHS USING
ABSTRACT MODELS
Vassilis Papakonstantinou
(papv@ics.forth.gr)
Joint work with:
Maria Michou, Irini Fundulaki,
Giorgos Flouris, Grigoris Antoniou
SACMAT 2012
2. MOTIVATION
June 20-22, 2012
Why RDF Data?
RDF is the de-facto standard for publishing data in
the Linked Open Data Cloud
E-Science (astronomy, life
sciences, earth sciences)
Public Government Data
(US, UK, The Netherlands, … )
Social Networks
SACMAT-2012
DBPedia, CIA World FactBook, …
Why Access Control?
Crucial for sensitive content since it ensures the
2
selective exposure of information to different
classes of users
3. MAIN CONTRIBUTIONS
June 20-22, 2012
Fine-grained Access Control Model for RDF
defined at the level of RDF triples
focus on read-only permissions
with support for RDFS inference to infer new
knowledge
encodes how an access label has been computed
Supports dynamic datasets
SACMAT-2012
Supports dynamic access control policies
Implementation and experiments on top of
MonetDB and PostgreSQL
3
4. OUTLINE
June 20-22, 2012
Preliminaries: RDF and RDF Schema
Current models: Access Control Annotations
Our approach: Abstract Access Control Models
Implementation
Experiments
SACMAT-2012
4
5. RESOURCE DESCRIPTION
FRAMEWORK (RDF)
June 20-22, 2012
General-purpose language for representing
information in the Semantic Web
Information represented using triples
(s, p, o) [subject, predicate, object]
s, p, o: URIs or literals
Example: (&a, firstName, “Alice”)
SACMAT-2012
firstName
&a “Alice”
An entity being A property of the entity The value of the
described (first name) predicate
(the first name)
[subject] [predicate] [object] 5
6. RDF SCHEMA
June 20-22, 2012
RDF Schema is a Vocabulary Agent
Description Language
Used to define the vocabulary used
sc
in an RDF graph. (Class, Property,
subClassOf, subPropertyOf,
domain, range) Person sc
Semantics add simple reasoning
SACMAT-2012
sc
capabilities
e.g. inference rules for subClass or Student
subProperty relations
(sc rdfs:subClassOf)
6
7. CURRENT MODELS: ACCESS
CONTROL ANNOTATIONS
June 20-22, 2012
Access control provided at the level of RDF triples
Represented by RDF quadruples (s,p,o,l)
subject predicate object label
Student sc Person Accessible
Person sc Agent Inaccessible
SACMAT-2012
In implied triples semantics are applied directly to
give them labels
subject predicate object label
Acc.∧ Inacc.
Inaccessible 7
Student sc Agent
8. PROBLEMS OF ACCESS CONTROL
ANNNOTATIONS
June 20-22, 2012
Easy, but not amenable to changes
If one access label of one triple changes, it has
cascading effects to implied labels of other triples
Cannot know which labels/triples are affected
Re-computation of access labels is necessary (for the
entire dataset)
If the access label of one triple changes
If a triple is deleted, modified or added
SACMAT-2012
If the semantics according to which labels of inferred triples
are computed change
If the policy changes (e.g. a liberal policy becomes
conservative)
8
9. OUR APPROACH: ABSTRACT ACCESS
CONTROL MODELS
June 20-22, 2012
Abstract Access Control Model defined by a set
of abstract tokens and abstract operators to
model
computation of access labels of implied RDF triples
propagation of access labels
Access Control Authorizations associate triples in
the RDF/S graph with abstract tokens: quadruples
SACMAT-2012
RDFS inference rules for computing the access
labels of implied quadruples
Propagation rules to specify how access labels are
propagated along the subClassOf and
9
subPropertyOf relations
10. ABSTRACT ACCESS CONTROL
MODELS
June 20-22, 2012
Abstract Access Control Model defined by a set
of abstract tokens and abstract operators
⊙: binary operator over access tokens to model RDFS
inference
computes the label of implied RDF triples for the
subClassOf/subPropertyOf and type hierarchies
SACMAT-2012
(A1, sc, A2, l1) (A2, sc, A3, l2) (A1, sc, A3, l1 ⊙ l2)
10
11. ABSTRACT ACCESS CONTROL
MODELS
June 20-22, 2012
Abstract Access Control Model defined by a set
of abstract tokens and abstract operators
⊗ : unary operator over multi-sets of access tokens to
model propagation of access labels
propagates the access labels along the subclass/subproperty and
type hierarchies
the subclasses of a class inherit the label of its superclass, the
instances of a class inherit the label of its superclass, etc.
SACMAT-2012
(A1, type, class, l1) (A2, sc, A1, l2) (A2, type, class, l3) (A2, type, class, ⊗ (l1 ))
11
12. ANNOTATION - DETERMINING THE
ABSTRACT EXPRESSIONS (1/3)
June 20-22, 2012
Apply authorizations Authorizations (Query, access token)
we are going from A1: (construct {?x sc ?y}, at1)
triples to quadruples A2: (construct {?x type Student }, at2)
A3: (construct {?x type class}, at3)
A4: (construct {?x ?p Person}, at4)
id S p o id s p o l
t1 Student sc Person q1 Student sc Person at1
SACMAT-2012
t2 Person sc Agent q2 Person sc Agent at1
t3 &a type Student q3 &a type Student at2
t4 &a lastName “Smith” q4 &a lastName “Smith” ⊥
t5 Agent type Class q5 Agent type Class at3 12
q6 Student sc Person at4
13. ANNOTATION - DETERMINING THE
ABSTRACT EXPRESSIONS (2/3)
June 20-22, 2012
Apply RDFS id s p o l
inference rules q1 Student sc Person at1
New quadruples Person sc Agent at1
q2
produced
q3 &a type Student at2
R1 q6 Student sc Person at4
(A, sc, B, l1)
(A, sc, C, l1⊙l2) …
(B, sc, C, l2)
SACMAT-2012
q7 Student sc Agent q1 q2
R2
(x, type, A, l1) q8 Student sc Agent q6 q2
(x, type, B, l1⊙l2)
(A, sc, B, l2) q9 &a type Person q3 q1
q10 &a type Agent q3 (q1 q2 )
… 13
14. ANNOTATION - DETERMINING THE
ABSTRACT EXPRESSIONS (3/3)
June 20-22, 2012
Apply propagation id s p o l
rules Agent type Class at3
q5
Add new labels to
q10 &a type Agent q3 (q1 q2 )
existing triples
…
e.g. classes propagate
q11 &a type Agent ⊗q5
labels to their
…
instances and
SACMAT-2012
subclasses R5
R6 (B, type, class, l1)
(A, type, class, l1) (A, sc, B, l2) (x, type, class, ⊗l1)
(x, type, A, ⊗l1)
(x, type, A, l2) (A, type, class, l3)
14
15. EVALUATION - DETERMINING
ACCESSIBILITY
June 20-22, 2012
We have to define
Set of Concrete Tokens and a Mapping from
abstract to concrete tokens
Set of Concrete operators that implement the
abstract ones
Conflict resolution operator to resolve ambiguous
SACMAT-2012
labels
Access Function to decide when a triple is
accessible
15
17. EVALUATION FOR CP1 (1/3)
COMPUTE LABELS
June 20-22, 2012
Concrete policy 1 id s p o l
LP = {true, false} q1 Student sc Person at1
true
∧⊙
q2 Person sc Agent true
at1
IDL ⊗
q5 Agent type Class false
at3
∧⊕
q6 Student sc Person false
at4
q7 Student sc Agent true⊙q2
qtrue
1 ∧ true
Map abstract tokens
SACMAT-2012
q8 Student sc Agent q6⊙ true
false∧q2
false
to concrete q11 &a type Agent false
⊗q5
true at1, at2
false at3, at4
17
18. EVALUATION FOR CP1 (2/3)
AMBIGUOUS LABELS REMOVAL
June 20-22, 2012
Back from quadruples to triples
subject predicate object label
Student sc Person true
Student sc Person false
SACMAT-2012
subject predicate object label
Student sc Person true∧ false
false
18
19. EVALUATION FOR CP1 (3/3)
DETERMINING ACCESSIBILITY
June 20-22, 2012
The essence of access control:
subject predicate object label
Student sc Person false
Inaccessible
Person sc Agent true
Accessible
&a type Student true
Accessible
&a lastName “Smith” false
Inaccessible
SACMAT-2012
Agent type Class false
Inaccessible
19
20. PROS & CONS OF ABSTRACT ACCESS
CONTROL MODELS
June 20-22, 2012
Pros:
The same application can experiment with different
concrete policies over the same dataset
liberal vs conservative policies for different classes of users
Different applications can experiment with different
concrete policies for the same data
In the case of updates there is no need to re-
compute the inferred triples
SACMAT-2012
Cons:
overhead in the required storage space
algebraic expressions can become complex depending on the
structure of the dataset
20
21. IMPLEMENTATION
June 20-22, 2012
Used a relational schema to store quadruples
and their labels (including abstract expressions)
Using stored procedure mechanism through
which we perform annotation and evaluation
MonetDB
PostgreSQL
SACMAT-2012
21
22. EXPERIMENTS
June 20-22, 2012
Experiments
Experiment 1: annotation time (the time required
to compute the inferred triples with their labels and
the propagated labels)
Experiment 2: evaluation time (a) (the time
needed to compute for a concrete policy, the concrete
access labels of all RDF triples)
Experiment 3: evaluation time (b) (the time
SACMAT-2012
needed to compute for a concrete policy, the concrete
access label of a percentage of the RDF triples)
Datasets:
Synthetic Schemas produced with PowerGen
22
Real: CIDOC, GO
23. EXPERIMENTAL RESULTS
ANNOTATION TIME – MONETDB
(SYNTHETIC)
June 20-22, 2012
Annotation time
increases as the
number of implied
triples increases
Plunges are due to
changes in the
structure of the
SACMAT-2012
ontology
(reduction of the
depth)
152 Synthetic ontologies
100-1000 classes, 113-1635 properties, 124-50295 class instances 23
and 110-1321 property instances
Different depth for the sc and sp hierarchies (from 4 to 8)
24. EXPERIMENTAL RESULTS
EVALUATION TIME (FULL)
June 20-22, 2012
Evaluation time
increases linearly
as the number of
total triples
increases
MonetDB
outperforms
PostgreSQL
SACMAT-2012
Some of synthetic
datasets couldn’t
be evaluated
24
25. EXPERIMENTAL RESULTS
EVALUATION TIME (DATASET
PERCENTAGE) - MONETDB
June 20-22, 2012
Evaluation time
for largest dataset
that evaluated
successfully on
Experiment 2
Similar conclu-
sions as with
SACMAT-2012
Experiment 2
25
26. EXPERIMENTAL RESULTS - REAL
DATASETS
June 20-22, 2012
CIDOC
Annotation time
MonetDB: 69ms
PostgreSQL: 4000ms
Evaluation time (full)
MonetDB – CP1: 7775ms
MonetDB – CP2: 3923ms
GO
SACMAT-2012
Annotation time
MonetDB: 32s
PostgreSQL: 844s
Evaluation time (full) 26
Exceeded our set timeout
27. CONCLUSIONS
June 20-22, 2012
Proposed a new paradigm based on abstract
models and operators
Advantages
Flexibility and easy adaptation to change (no re-
computation necessary)
Easy experimentation with different access control
policies
SACMAT-2012
Disadvantages
Increased space requirements
Overhead at query time (for evaluation)
Suitable for dynamic datasets
27
29. EXPERIMENTAL RESULTS
ANNOTATION TIME –
POSTGRESQL(SYNTHETIC)
June 20-22, 2012
Annotation time
increases as the
number of implied
triples increases
One plunges are
due to change in
the structure of
SACMAT-2012
the ontology
(reduction of the
depth)
Up to 1000 classes, 1635 properties, 50167 class instances and 95
property instances before reaching the timeout. 29
30. IMPLEMENTATION
June 20-22, 2012
Used a relational schema to store quadruples
Quad(qid, s, p, o, propop, inferop, label)
inferop, propop: boolean values indicating whether the label
is obtained through propagation or inference
LabelStore(qid, qid_uses)
stores the access label of a triple
qid: the quadruple whose label is stored
qid_uses: the explict quadruple’s qid through which qid
SACMAT-2012
produced.
30
31. IMPLEMENTATION
id s p o l id s p o iop pop l
q1 Student sc Person at1 q1 Student sc Person f f at1
q2 Person sc Agent at1 q2 Person sc Agent f f at1
q3 &a type Student at2 q3 &a type Student f f at2
q5 Agent type Class at3 q5 Agent type Class f f at3
q6 Student sc Person at4 q6 Student sc Person f f at4
q7 Student sc Agent at1⊙at1 q7 Student sc Agent t f null
q10 &a type Agent at2⊙(at1⊙at1) q9 &a type Agent t f null
q11 &a type Agent ⊗at3 q10 Person Sc Agent f t null
Quadruples (Motivating example) Quad(qid,s,p,o,propop,inferop,label)
31
32. IMPLEMENTATION
June 20-22, 2012
id s p o l qid qid_uses
q1 Student sc Person at1 q7 q1
q2 Person sc Agent at1 q7 q2
q3 &a type Student at2 q10 q3
q5 Agent type Class at3 q10 q1
q6 Student sc Person at4 q10 q2
q7 Student sc Agent at1⊙at1 q11 q5
SACMAT-2012
q10 &a type Agent at2⊙(at1⊙at1)
q11 &a type Agent ⊗at3
Quadruples (Motivating example) Labelstore(qid,qid_uses)
32