ACCESS CONTROL FOR
 RDF GRAPHS USING
 ABSTRACT MODELS

     Vassilis Papakonstantinou
        (papv@ics.forth.gr)


           Joint work with:
   Maria Michou, Irini Fundulaki,
  Giorgos Flouris, Grigoris Antoniou

           SACMAT 2012
MOTIVATION




                                                                  June 20-22, 2012
   Why RDF Data?
       RDF is the de-facto standard for publishing data in
        the Linked Open Data Cloud
       E-Science (astronomy, life
         sciences, earth sciences)
       Public Government Data
         (US, UK, The Netherlands, … )
       Social Networks




                                                                  SACMAT-2012
       DBPedia, CIA World FactBook, …

   Why Access Control?
       Crucial for sensitive content since it ensures the
        selective exposure of information to different
        classes of users
MAIN CONTRIBUTIONS
   Fine-grained Access Control Model for RDF
     defined at the level of RDF triples
     focus on read-only permissions
     with support for RDFS inference to infer new knowledge
     encodes how an access label has been computed
 Supports dynamic datasets
 Supports dynamic access control policies
 Implementation and experiments on top of MonetDB and PostgreSQL
OUTLINE
 Preliminaries: RDF and RDF Schema
 Current models: Access Control Annotations
 Our approach: Abstract Access Control Models
 Implementation
 Experiments
RESOURCE DESCRIPTION FRAMEWORK (RDF)
 General-purpose language for representing information in the Semantic Web
 Information represented using triples
     (s, p, o) [subject, predicate, object]
     s, p, o: URIs or literals
     Example: (&a, firstName, “Alice”)

        &a  --- firstName --->  “Alice”

     [subject]    &a          an entity being described
     [predicate]  firstName   a property of the entity (first name)
     [object]     “Alice”     the value of the predicate (the first name)
RDF SCHEMA
   RDF Schema is a Vocabulary Description Language
       Used to define the vocabulary used in an RDF graph (Class, Property, subClassOf, subPropertyOf, domain, range)
   Semantics add simple reasoning capabilities
       e.g. inference rules for subClass or subProperty relations

   [Figure: class hierarchy Student sc Person sc Agent, where sc denotes rdfs:subClassOf]
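The reasoning capabilities mentioned on this slide can be sketched as a fixed-point computation. This is only a minimal illustration, not the authors' implementation: it assumes triples are plain Python tuples, reuses the `sc` and `type` abbreviations from the slides, and covers only subclass transitivity and type inheritance. The function name `rdfs_closure` is hypothetical.

```python
def rdfs_closure(triples):
    """Return the triples plus everything implied by two RDFS rules:
    (A sc B), (B sc C) => (A sc C) and (x type A), (A sc B) => (x type B)."""
    closed = set(triples)
    while True:
        new = set()
        for (a, p1, b) in closed:
            if p1 not in ("sc", "type"):
                continue
            for (b2, p2, c) in closed:
                if p2 == "sc" and b2 == b:
                    new.add((a, p1, c))
        if new <= closed:  # nothing new was derived: fixed point reached
            return closed
        closed |= new


graph = {
    ("Student", "sc", "Person"),
    ("Person", "sc", "Agent"),
    ("&a", "type", "Student"),
}
implied = rdfs_closure(graph) - graph
```

For the hierarchy in the figure, `implied` holds the three triples that are not stated explicitly: `Student sc Agent`, `&a type Person` and `&a type Agent`.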
CURRENT MODELS: ACCESS CONTROL ANNOTATIONS
 Access control provided at the level of RDF triples
 Represented by RDF quadruples (s, p, o, l)

      subject    predicate   object      label
      Student       sc       Person    Accessible
       Person       sc       Agent    Inaccessible

 For implied triples, the semantics are applied directly to give them labels

      subject    predicate   object      label
      Student       sc       Agent     Acc. ∧ Inacc. = Inaccessible
PROBLEMS OF ACCESS CONTROL ANNOTATIONS
   Easy, but not amenable to changes
     If the access label of one triple changes, it has cascading effects on the implied labels of other triples
     Cannot know which labels/triples are affected
     Re-computation of access labels is necessary (for the entire dataset)
         If the access label of one triple changes
         If a triple is deleted, modified or added
         If the semantics according to which labels of inferred triples are computed change
         If the policy changes (e.g. a liberal policy becomes conservative)
OUR APPROACH: ABSTRACT ACCESS CONTROL MODELS
   Abstract Access Control Model defined by a set of abstract tokens and abstract operators to model
     computation of access labels of implied RDF triples
     propagation of access labels
 Access Control Authorizations associate triples in the RDF/S graph with abstract tokens: quadruples
 RDFS inference rules for computing the access labels of implied quadruples
 Propagation rules to specify how access labels are propagated along the subClassOf and subPropertyOf relations
ABSTRACT ACCESS CONTROL MODELS
   Abstract Access Control Model defined by a set of abstract tokens and abstract operators
       ⊙: binary operator over access tokens to model RDFS inference
           computes the label of implied RDF triples for the subClassOf/subPropertyOf and type hierarchies

     (A1, sc, A2, l1), (A2, sc, A3, l2)  ⇒  (A1, sc, A3, l1 ⊙ l2)
ABSTRACT ACCESS CONTROL MODELS
   Abstract Access Control Model defined by a set of abstract tokens and abstract operators
       ⊗: unary operator over multi-sets of access tokens to model propagation of access labels
         propagates the access labels along the subclass/subproperty and type hierarchies
         the subclasses of a class inherit the label of their superclass, the instances of a class inherit the label of their class, etc.

     (A1, type, class, l1), (A2, sc, A1, l2), (A2, type, class, l3)  ⇒  (A2, type, class, ⊗(l1))
ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (1/3)
   Apply authorizations
       we are going from triples to quadruples

   Authorizations (Query, access token)
       A1: (construct {?x sc ?y}, at1)
       A2: (construct {?x type Student}, at2)
       A3: (construct {?x type class}, at3)
       A4: (construct {?x ?p Person}, at4)

   Triples                                   Quadruples
   id   s        p         o                 id   s        p         o          l
   t1   Student  sc        Person            q1   Student  sc        Person     at1
   t2   Person   sc        Agent             q2   Person   sc        Agent      at1
   t3   &a       type      Student           q3   &a       type      Student    at2
   t4   &a       lastName  “Smith”           q4   &a       lastName  “Smith”    ⊥
   t5   Agent    type      Class             q5   Agent    type      Class      at3
                                             q6   Student  sc        Person     at4
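The step from triples to quadruples can be sketched as simple pattern matching. This is a hedged illustration under several assumptions: each `construct` query is reduced to a single triple pattern in which terms starting with `?` are variables, repeated variables are not checked, `Class` is spelled with a capital in both the pattern and the data, and the string `"bottom"` stands in for the ⊥ label. The names `matches` and `annotate` are hypothetical.

```python
def matches(pattern, triple):
    """A pattern term starting with '?' is a variable and matches any value."""
    return all(p.startswith("?") or p == t for p, t in zip(pattern, triple))


def annotate(triples, authorizations, default="bottom"):
    """Turn each triple into one quadruple per matching authorization.
    A triple matched by no authorization gets the default label."""
    quads = []
    for triple in triples:
        tokens = [tok for pattern, tok in authorizations if matches(pattern, triple)]
        for tok in tokens or [default]:
            quads.append(triple + (tok,))
    return quads


authorizations = [
    (("?x", "sc", "?y"), "at1"),         # A1
    (("?x", "type", "Student"), "at2"),  # A2
    (("?x", "type", "Class"), "at3"),    # A3
    (("?x", "?p", "Person"), "at4"),     # A4
]
triples = [
    ("Student", "sc", "Person"),   # t1
    ("Person", "sc", "Agent"),     # t2
    ("&a", "type", "Student"),     # t3
    ("&a", "lastName", '"Smith"'), # t4
    ("Agent", "type", "Class"),    # t5
]
quads = annotate(triples, authorizations)
```

Triple t1 is matched by both A1 and A4, which is why the slide lists two quadruples (q1 and q6) for `Student sc Person`.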
ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (2/3)
   Apply RDFS inference rules
       New quadruples produced

   R1: (A, sc, B, l1), (B, sc, C, l2)  ⇒  (A, sc, C, l1 ⊙ l2)
   R2: (x, type, A, l1), (A, sc, B, l2)  ⇒  (x, type, B, l1 ⊙ l2)

   id    s        p      o         l
   q1    Student  sc     Person    at1
   q2    Person   sc     Agent     at1
   q3    &a       type   Student   at2
   q6    Student  sc     Person    at4
   …
   q7    Student  sc     Agent     q1 ⊙ q2
   q8    Student  sc     Agent     q6 ⊙ q2
   q9    &a       type   Person    q3 ⊙ q1
   q10   &a       type   Agent     q3 ⊙ (q1 ⊙ q2)
   …
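Rules R1 and R2 can be sketched as a loop that keeps deriving quadruples whose labels are unevaluated expressions. This is an illustrative sketch rather than the stored-procedure implementation described later: it assumes an acyclic `sc` hierarchy (otherwise the expressions would grow without bound), and it encodes `l1 ⊙ l2` as the hypothetical nested tuple `("inf", l1, l2)`.

```python
def infer(quads):
    """Apply R1 and R2 until no new quadruple appears.
    Labels stay abstract: ("inf", l1, l2) stands for the inference operator
    applied to l1 and l2. Assumes the sc hierarchy has no cycles."""
    result = set(quads)
    while True:
        new = set()
        for (a, p1, b, l1) in result:
            if p1 not in ("sc", "type"):
                continue
            for (b2, p2, c, l2) in result:
                if p2 == "sc" and b2 == b:
                    new.add((a, p1, c, ("inf", l1, l2)))
        if new <= result:
            return result
        result |= new


explicit = {
    ("Student", "sc", "Person", "at1"),  # q1
    ("Person", "sc", "Agent", "at1"),    # q2
    ("&a", "type", "Student", "at2"),    # q3
    ("Student", "sc", "Person", "at4"),  # q6
}
all_quads = infer(explicit)
```

Among the derived quadruples are q7, `("Student", "sc", "Agent", ("inf", "at1", "at1"))`, and q10, `("&a", "type", "Agent", ("inf", "at2", ("inf", "at1", "at1")))`. The sketch also produces the other derivations of the same triples that the slide abbreviates with "…".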
ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (3/3)
  Apply propagation rules
  Add new labels to existing triples
  e.g. classes propagate labels to their instances and subclasses

   R5: (B, type, class, l1), (A, sc, B, l2), (A, type, class, l3)  ⇒  (A, type, class, ⊗l1)
   R6: (A, type, class, l1), (x, type, A, l2)  ⇒  (x, type, A, ⊗l1)

   id    s       p      o        l
   q5    Agent   type   Class    at3
   q10   &a      type   Agent    q3 ⊙ (q1 ⊙ q2)
   …
   q11   &a      type   Agent    ⊗q5
   …
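Rule R6 (a class propagates its label to its instances) can be sketched as follows. The encoding is an assumption made for illustration: `("prop", l)` stands for ⊗l, `("inf", l1, l2)` for l1 ⊙ l2, and the function name `propagate_to_instances` is hypothetical. Rule R5 would follow the same pattern over `sc` edges.

```python
def propagate_to_instances(quads):
    """R6: (A, type, Class, l1), (x, type, A, l2) => (x, type, A, ("prop", l1)).
    Returns only the newly produced quadruples."""
    class_labels = [(s, l1) for (s, p, o, l1) in quads if p == "type" and o == "Class"]
    new = set()
    for cls, l1 in class_labels:
        for (x, p, o, _l2) in quads:
            if p == "type" and o == cls:
                new.add((x, "type", cls, ("prop", l1)))
    return new


quads = [
    ("Agent", "type", "Class", "at3"),                          # q5
    ("&a", "type", "Agent", ("inf", "at2", ("inf", "at1", "at1"))),  # q10
]
propagated = propagate_to_instances(quads)
```

Here `propagated` contains the single quadruple `("&a", "type", "Agent", ("prop", "at3"))`, which corresponds to q11 on the slide. The triple `&a type Agent` now carries two labels, one from inference and one from propagation.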
EVALUATION - DETERMINING ACCESSIBILITY
   We have to define
     Set of Concrete Tokens and a Mapping from abstract to concrete tokens
     Set of Concrete operators that implement the abstract ones
     Conflict resolution operator to resolve ambiguous labels
     Access Function to decide when a triple is accessible
CONCRETE ACCESS CONTROL POLICY
   Example: Concrete Policy 1
       Concrete tokens: LP = {true, false}
       Inference operator ⊙: conjunction (∧)
       Propagation operator ⊗: identity function (IDL)
       Conflict resolution operator ⊕: conjunction (∧)
       Access function: triples with label true are accessible, otherwise inaccessible
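A concrete policy can be pictured as a small object that interprets abstract expressions. The sketch below is one possible encoding, not the authors' code: the class `ConcretePolicy`, its field names, and the nested-tuple expression format (`("inf", l1, l2)` for ⊙, `("prop", l)` for ⊗) are all assumptions, and the propagation operator is simplified to take a single value rather than a multi-set.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ConcretePolicy:
    tokens: Dict[str, bool]                     # abstract token -> concrete token
    infer_op: Callable[[bool, bool], bool]      # implements the inference operator
    prop_op: Callable[[bool], bool]             # implements the propagation operator
    resolve_op: Callable[[bool, bool], bool]    # conflict resolution operator

    def evaluate(self, label):
        """Recursively map an abstract expression to a concrete label."""
        if isinstance(label, str):
            return self.tokens[label]
        if label[0] == "inf":
            return self.infer_op(self.evaluate(label[1]), self.evaluate(label[2]))
        if label[0] == "prop":
            return self.prop_op(self.evaluate(label[1]))
        raise ValueError(f"unknown expression: {label!r}")

    def accessible(self, concrete_label):
        """Access function of Concrete Policy 1: only 'true' grants access."""
        return concrete_label is True


cp1 = ConcretePolicy(
    tokens={"at1": True, "at2": True, "at3": False, "at4": False},
    infer_op=lambda a, b: a and b,    # conjunction
    prop_op=lambda a: a,              # identity
    resolve_op=lambda a, b: a and b,  # conjunction
)
q7 = cp1.evaluate(("inf", "at1", "at1"))   # Student sc Agent
q8 = cp1.evaluate(("inf", "at4", "at1"))   # Student sc Agent, via q6
q11 = cp1.evaluate(("prop", "at3"))        # &a type Agent
```

The token mapping used here (at1 and at2 to true, at3 and at4 to false) is the one shown on the evaluation slides for CP1.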
EVALUATION FOR CP1 (1/3) - COMPUTE LABELS
   Concrete policy 1
     LP = {true, false}
     ⊙: ∧
     ⊗: IDL
     ⊕: ∧
   Map abstract tokens to concrete
     at1, at2 → true
     at3, at4 → false

   id    s        p      o        abstract label   concrete label
   q1    Student  sc     Person   at1              true
   q2    Person   sc     Agent    at1              true
   q5    Agent    type   Class    at3              false
   q6    Student  sc     Person   at4              false
   q7    Student  sc     Agent    q1 ⊙ q2          true ∧ true = true
   q8    Student  sc     Agent    q6 ⊙ q2          false ∧ true = false
   q11   &a       type   Agent    ⊗q5              false
EVALUATION FOR CP1 (2/3) - AMBIGUOUS LABELS REMOVAL
   Back from quadruples to triples

      subject   predicate    object      label
      Student      sc        Person      true
      Student      sc        Person      false

      subject   predicate    object      label
      Student      sc        Person      true ∧ false = false
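Going back from quadruples to triples amounts to grouping the concrete labels by triple and folding them with the conflict resolution operator. A minimal sketch, assuming labels have already been evaluated to booleans and using the hypothetical function name `resolve`:

```python
from collections import defaultdict
from functools import reduce


def resolve(quads, resolve_op):
    """Collapse quadruples that share (s, p, o) into one labelled triple."""
    by_triple = defaultdict(list)
    for s, p, o, label in quads:
        by_triple[(s, p, o)].append(label)
    return {triple: reduce(resolve_op, labels) for triple, labels in by_triple.items()}


evaluated = [
    ("Student", "sc", "Person", True),   # from q1
    ("Student", "sc", "Person", False),  # from q6
    ("Person", "sc", "Agent", True),     # from q2
]
labels = resolve(evaluated, lambda a, b: a and b)  # conjunction, as in CP1
```

With conjunction as the resolution operator, `Student sc Person` ends up with the label `False`, as on the slide.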
EVALUATION FOR CP1 (3/3) - DETERMINING ACCESSIBILITY
   The essence of access control:

       subject   predicate    object       label    accessibility
       Student   sc           Person       false    Inaccessible
       Person    sc           Agent        true     Accessible
       &a        type         Student      true     Accessible
       &a        lastName     “Smith”      false    Inaccessible
       Agent     type         Class        false    Inaccessible
PROS & CONS OF ABSTRACT ACCESS CONTROL MODELS
   Pros:
       The same application can experiment with different concrete policies over the same dataset
           liberal vs conservative policies for different classes of users
       Different applications can experiment with different concrete policies for the same data
       In the case of updates there is no need to re-compute the inferred triples
   Cons:
       overhead in the required storage space
           algebraic expressions can become complex depending on the structure of the dataset
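The first and third advantages can be illustrated with a stored abstract expression that is re-evaluated under different policies, with no new inference pass. This sketch rests on assumptions: the slides do not define a "liberal" policy, so the disjunction used below is purely a hypothetical example, and the tuple encoding `("inf", l1, l2)` for l1 ⊙ l2 is an illustrative choice.

```python
def evaluate(label, tokens, infer_op):
    """Evaluate a stored abstract expression under a given token mapping."""
    if isinstance(label, str):
        return tokens[label]
    _, left, right = label
    return infer_op(evaluate(left, tokens, infer_op), evaluate(right, tokens, infer_op))


# Stored once at annotation time (q8 in the running example).
stored_label = ("inf", "at4", "at1")
tokens = {"at1": True, "at4": False}

# Same expression, two policies (the liberal one is hypothetical).
conservative = evaluate(stored_label, tokens, lambda a, b: a and b)
liberal = evaluate(stored_label, tokens, lambda a, b: a or b)

# A policy update: at4 becomes 'true'. Only evaluation is repeated.
after_update = evaluate(stored_label, {**tokens, "at4": True}, lambda a, b: a and b)
```

Under these assumptions `conservative` is `False`, while `liberal` and `after_update` are both `True`. The price is the one listed under Cons: the expression itself has to be stored.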
IMPLEMENTATION
   Used a relational schema to store quadruples and their labels (including abstract expressions)
   Used the stored procedure mechanism to perform annotation and evaluation
     MonetDB
     PostgreSQL
EXPERIMENTS
   Experiments
     Experiment 1: annotation time (the time required to compute the inferred triples with their labels and the propagated labels)
     Experiment 2: evaluation time (a) (the time needed to compute, for a concrete policy, the concrete access labels of all RDF triples)
     Experiment 3: evaluation time (b) (the time needed to compute, for a concrete policy, the concrete access labels of a percentage of the RDF triples)
   Datasets:
     Synthetic: schemas produced with PowerGen
     Real: CIDOC, GO
EXPERIMENTAL RESULTS - ANNOTATION TIME – MONETDB (SYNTHETIC)
   Annotation time increases as the number of implied triples increases
   Plunges are due to changes in the structure of the ontology (reduction of the depth)
   152 synthetic ontologies
        100-1000 classes, 113-1635 properties, 124-50295 class instances and 110-1321 property instances
   Different depth for the sc and sp hierarchies (from 4 to 8)
EXPERIMENTAL RESULTS - EVALUATION TIME (FULL)
   Evaluation time increases linearly as the number of total triples increases
   MonetDB outperforms PostgreSQL
   Some of the synthetic datasets could not be evaluated
EXPERIMENTAL RESULTS - EVALUATION TIME (DATASET PERCENTAGE) - MONETDB
   Evaluation time for the largest dataset that was evaluated successfully in Experiment 2
   Similar conclusions as with Experiment 2
EXPERIMENTAL RESULTS - REAL DATASETS
   CIDOC
       Annotation time
         MonetDB: 69 ms
         PostgreSQL: 4000 ms
       Evaluation time (full)
         MonetDB – CP1: 7775 ms
         MonetDB – CP2: 3923 ms
   GO
       Annotation time
         MonetDB: 32 s
         PostgreSQL: 844 s
       Evaluation time (full)
         Exceeded our set timeout
CONCLUSIONS
 Proposed a new paradigm based on abstract models and operators
 Advantages
     Flexibility and easy adaptation to change (no re-computation necessary)
     Easy experimentation with different access control policies
 Disadvantages
     Increased space requirements
     Overhead at query time (for evaluation)
 Suitable for dynamic datasets
Thank you!
EXPERIMENTAL RESULTS - ANNOTATION TIME – POSTGRESQL (SYNTHETIC)
   Annotation time increases as the number of implied triples increases
   One plunge is due to a change in the structure of the ontology (reduction of the depth)
   Up to 1000 classes, 1635 properties, 50167 class instances and 95 property instances before reaching the timeout
IMPLEMENTATION
   Used a relational schema to store quadruples
       Quad(qid, s, p, o, propop, inferop, label)
           inferop, propop: boolean values indicating whether the label is obtained through propagation or inference
       LabelStore(qid, qid_uses)
           stores the access label of a triple
              qid: the quadruple whose label is stored
              qid_uses: the qid of the explicit quadruple through which qid was produced
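The two relations can be tried out with an in-memory database. This is a sketch only: the slides report MonetDB and PostgreSQL with stored procedures, whereas the example below swaps in Python's built-in `sqlite3` for convenience, and the column types, keys and the sample query are assumptions. Only the table and column names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Quad (
    qid     TEXT PRIMARY KEY,
    s       TEXT NOT NULL,
    p       TEXT NOT NULL,
    o       TEXT NOT NULL,
    propop  INTEGER NOT NULL,  -- 1 if the label comes from propagation
    inferop INTEGER NOT NULL,  -- 1 if the label comes from inference
    label   TEXT               -- abstract token; NULL for derived quadruples
);
CREATE TABLE LabelStore (
    qid      TEXT NOT NULL REFERENCES Quad(qid),  -- quadruple whose label is stored
    qid_uses TEXT NOT NULL REFERENCES Quad(qid)   -- explicit quadruple it was produced from
);
""")
conn.executemany(
    "INSERT INTO Quad VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("q1", "Student", "sc", "Person", 0, 0, "at1"),
        ("q2", "Person", "sc", "Agent", 0, 0, "at1"),
        ("q7", "Student", "sc", "Agent", 0, 1, None),
    ],
)
conn.executemany("INSERT INTO LabelStore VALUES (?, ?)", [("q7", "q1"), ("q7", "q2")])

# Recover the abstract tokens that make up the label of q7.
rows = conn.execute(
    """
    SELECT e.label
    FROM LabelStore AS ls JOIN Quad AS e ON e.qid = ls.qid_uses
    WHERE ls.qid = ?
    ORDER BY ls.rowid
    """,
    ("q7",),
).fetchall()
tokens_for_q7 = [row[0] for row in rows]
```

The query returns `["at1", "at1"]`, the operands of the expression at1 ⊙ at1 that the motivating example assigns to q7.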
IMPLEMENTATION
   Quadruples (Motivating example)
   id    s        p      o         l
   q1    Student  sc     Person    at1
   q2    Person   sc     Agent     at1
   q3    &a       type   Student   at2
   q5    Agent    type   Class     at3
   q6    Student  sc     Person    at4
   q7    Student  sc     Agent     at1⊙at1
   q10   &a       type   Agent     at2⊙(at1⊙at1)
   q11   &a       type   Agent     ⊗at3

   Quad(qid, s, p, o, propop, inferop, label)
   id    s        p      o         iop  pop  l
   q1    Student  sc     Person    f    f    at1
   q2    Person   sc     Agent     f    f    at1
   q3    &a       type   Student   f    f    at2
   q5    Agent    type   Class     f    f    at3
   q6    Student  sc     Person    f    f    at4
   q7    Student  sc     Agent     t    f    null
   q9    &a       type   Agent     t    f    null
   q10   Person   sc     Agent     f    t    null
IMPLEMENTATION
   Quadruples (Motivating example)
   id    s        p      o         l
   q1    Student  sc     Person    at1
   q2    Person   sc     Agent     at1
   q3    &a       type   Student   at2
   q5    Agent    type   Class     at3
   q6    Student  sc     Person    at4
   q7    Student  sc     Agent     at1⊙at1
   q10   &a       type   Agent     at2⊙(at1⊙at1)
   q11   &a       type   Agent     ⊗at3

   LabelStore(qid, qid_uses)
   qid   qid_uses
   q7    q1
   q7    q2
   q10   q3
   q10   q1
   q10   q2
   q11   q5

 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 

Access Control for RDF graphs using Abstract Models

  • 1. ACCESS CONTROL FOR RDF GRAPHS USING ABSTRACT MODELS Vassilis Papakonstantinou (papv@ics.forth.gr) Joint work with: Maria Michou, Irini Fundulaki, Giorgos Flouris, Grigoris Antoniou SACMAT 2012
  • 2. MOTIVATION (SACMAT 2012, June 20-22, 2012)
      Why RDF data?
        - RDF is the de facto standard for publishing data in the Linked Open Data Cloud
        - E-science (astronomy, life sciences, earth sciences)
        - Public government data (US, UK, The Netherlands, …)
        - Social networks
        - DBpedia, CIA World Factbook, …
      Why access control?
        - Crucial for sensitive content, since it ensures the selective exposure of information to different classes of users
  • 3. MAIN CONTRIBUTIONS
      - Fine-grained access control model for RDF
          - defined at the level of RDF triples
          - focused on read-only permissions
          - supports RDFS inference to infer new knowledge
          - encodes how an access label has been computed
      - Supports dynamic datasets
      - Supports dynamic access control policies
      - Implementation and experiments on top of MonetDB and PostgreSQL
  • 4. OUTLINE
      - Preliminaries: RDF and RDF Schema
      - Current models: Access Control Annotations
      - Our approach: Abstract Access Control Models
      - Implementation
      - Experiments
  • 5. RESOURCE DESCRIPTION FRAMEWORK (RDF)
      - General-purpose language for representing information in the Semantic Web
      - Information is represented using triples (s, p, o) [subject, predicate, object]
      - s, p, o: URIs or literals
      - Example: (&a, firstName, “Alice”)
          - &a [subject]: the entity being described
          - firstName [predicate]: a property of the entity (first name)
          - “Alice” [object]: the value of the predicate (the first name)
  • 6. RDF SCHEMA
      - RDF Schema is a Vocabulary Description Language
      - Used to define the vocabulary used in an RDF graph (Class, Property, subClassOf, subPropertyOf, domain, range)
      - Semantics add simple reasoning capabilities, e.g. inference rules for subClass or subProperty relations
      - Example hierarchy (figure): Student sc Person, Person sc Agent (sc stands for rdfs:subClassOf)
  • 7. CURRENT MODELS: ACCESS CONTROL ANNOTATIONS
      - Access control is provided at the level of RDF triples
      - Represented by RDF quadruples (s, p, o, l)

          subject | predicate | object | label
          Student | sc        | Person | Accessible
          Person  | sc        | Agent  | Inaccessible

      - For implied triples, the semantics are applied directly to give them labels

          subject | predicate | object | label
          Student | sc        | Agent  | Acc. ∧ Inacc. → Inaccessible
  • 8. PROBLEMS OF ACCESS CONTROL ANNOTATIONS
      - Easy, but not amenable to changes
          - If the access label of one triple changes, the change cascades to the implied labels of other triples
          - There is no way to know which labels/triples are affected
      - Re-computation of access labels is necessary (for the entire dataset)
          - if the access label of one triple changes
          - if a triple is deleted, modified or added
          - if the semantics according to which labels of inferred triples are computed change
          - if the policy changes (e.g. a liberal policy becomes conservative)
  • 9. OUR APPROACH: ABSTRACT ACCESS CONTROL MODELS
      - Abstract Access Control Model defined by a set of abstract tokens and abstract operators to model
          - computation of access labels of implied RDF triples
          - propagation of access labels
      - Access Control Authorizations associate triples in the RDF/S graph with abstract tokens: quadruples
      - RDFS inference rules for computing the access labels of implied quadruples
      - Propagation rules to specify how access labels are propagated along the subClassOf and subPropertyOf relations
  • 10. ABSTRACT ACCESS CONTROL MODELS
      - Abstract Access Control Model defined by a set of abstract tokens and abstract operators
      - ⊙: binary operator over access tokens to model RDFS inference
          - computes the label of implied RDF triples for the subClassOf/subPropertyOf and type hierarchies
      - Rule: (A1, sc, A2, l1), (A2, sc, A3, l2) ⟹ (A1, sc, A3, l1 ⊙ l2)
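The inference rule on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names `Token` and `Infer` and the function `infer_sc` are hypothetical and do not come from the talk. The point it shows is that l1 ⊙ l2 is kept as an unevaluated expression rather than computed immediately.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Token:
    """An abstract access token such as at1 (hypothetical representation)."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Infer:
    """The unevaluated expression l1 ⊙ l2."""
    left: "Label"
    right: "Label"

    def __str__(self) -> str:
        return f"({self.left} ⊙ {self.right})"


Label = Union[Token, Infer]


def infer_sc(q1, q2):
    """(A1, sc, A2, l1), (A2, sc, A3, l2) => (A1, sc, A3, l1 ⊙ l2).

    Returns None when the two quadruples do not chain.
    """
    a1, p1, a2, l1 = q1
    b2, p2, a3, l2 = q2
    if p1 == "sc" and p2 == "sc" and a2 == b2:
        return (a1, "sc", a3, Infer(l1, l2))
    return None


student_person = ("Student", "sc", "Person", Token("at1"))
person_agent = ("Person", "sc", "Agent", Token("at1"))
implied = infer_sc(student_person, person_agent)
print(implied[3])
```

Because the label stays symbolic, the same implied quadruple can later be evaluated under different concrete policies.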
  • 11. ABSTRACT ACCESS CONTROL MODELS
      - Abstract Access Control Model defined by a set of abstract tokens and abstract operators
      - ⊗: unary operator over multi-sets of access tokens to model propagation of access labels
          - propagates the access labels along the subclass/subproperty and type hierarchies
          - the subclasses of a class inherit the label of their superclass, the instances of a class inherit the label of their class, etc.
      - Rule: (A1, type, class, l1), (A2, sc, A1, l2), (A2, type, class, l3) ⟹ (A2, type, class, ⊗(l1))
  • 12. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (1/3)
      - Apply authorizations: we go from triples to quadruples
      - Authorizations (query, access token)
          A1: (construct {?x sc ?y}, at1)
          A2: (construct {?x type Student}, at2)
          A3: (construct {?x type class}, at3)
          A4: (construct {?x ?p Person}, at4)

      Triples:
          id | s       | p        | o
          t1 | Student | sc       | Person
          t2 | Person  | sc       | Agent
          t3 | &a      | type     | Student
          t4 | &a      | lastName | “Smith”
          t5 | Agent   | type     | Class

      Quadruples:
          id | s       | p        | o       | l
          q1 | Student | sc       | Person  | at1
          q2 | Person  | sc       | Agent   | at1
          q3 | &a      | type     | Student | at2
          q4 | &a      | lastName | “Smith” | ⊥
          q5 | Agent   | type     | Class   | at3
          q6 | Student | sc       | Person  | at4
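A minimal sketch of this annotation step, assuming authorizations are simple triple patterns in which names starting with `?` are variables. The helper names `matches` and `annotate` are hypothetical, and the class name is written `Class` throughout so that authorization A3 matches triple t5. A triple matched by no authorization receives the default label ⊥.

```python
def matches(pattern, triple):
    """Return True if the triple fits the pattern; '?name' entries are variables."""
    binding = {}
    for pat, val in zip(pattern, triple):
        if pat.startswith("?"):
            # A variable must bind to the same value everywhere it occurs.
            if binding.setdefault(pat, val) != val:
                return False
        elif pat != val:
            return False
    return True


def annotate(triples, authorizations, default="⊥"):
    """Turn triples into quadruples, one per matching authorization."""
    quads = []
    for t in triples:
        tokens = [tok for pat, tok in authorizations if matches(pat, t)]
        for tok in tokens or [default]:
            quads.append((*t, tok))
    return quads


triples = [
    ("Student", "sc", "Person"),
    ("Person", "sc", "Agent"),
    ("&a", "type", "Student"),
    ("&a", "lastName", "Smith"),
    ("Agent", "type", "Class"),
]
authorizations = [
    (("?x", "sc", "?y"), "at1"),         # A1
    (("?x", "type", "Student"), "at2"),  # A2
    (("?x", "type", "Class"), "at3"),    # A3
    (("?x", "?p", "Person"), "at4"),     # A4
]
quads = annotate(triples, authorizations)
for q in quads:
    print(q)
```

Note that (Student, sc, Person) matches both A1 and A4, so it yields two quadruples with different tokens, as q1 and q6 do on the slide.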
  • 13. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (2/3)
      - Apply RDFS inference rules: new quadruples are produced
          R1: (A, sc, B, l1), (B, sc, C, l2) ⟹ (A, sc, C, l1 ⊙ l2)
          R2: (x, type, A, l1), (A, sc, B, l2) ⟹ (x, type, B, l1 ⊙ l2)

          id  | s       | p    | o       | l
          q1  | Student | sc   | Person  | at1
          q2  | Person  | sc   | Agent   | at1
          q3  | &a      | type | Student | at2
          q6  | Student | sc   | Person  | at4
          …
          q7  | Student | sc   | Agent   | q1 ⊙ q2
          q8  | Student | sc   | Agent   | q6 ⊙ q2
          q9  | &a      | type | Person  | q3 ⊙ q1
          q10 | &a      | type | Agent   | q3 ⊙ (q1 ⊙ q2)
          …
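Rules R1 and R2 can be sketched as a single pass over pairs of quadruples. This is an illustrative sketch, not the stored-procedure implementation used in the talk: labels are plain strings, the function name `derive` is hypothetical, and a naive pass like this one produces every derivable labelled quadruple, which may be more than the rows shown on the slide.

```python
def derive(quads):
    """Apply R1 and R2 once to every ordered pair; return only new quadruples.

    R1: (A, sc, B, l1), (B, sc, C, l2)   => (A, sc, C, l1 ⊙ l2)
    R2: (x, type, A, l1), (A, sc, B, l2) => (x, type, B, l1 ⊙ l2)
    """
    new = set()
    for s1, p1, o1, l1 in quads:
        for s2, p2, o2, l2 in quads:
            if p2 == "sc" and o1 == s2 and p1 in ("sc", "type"):
                new.add((s1, p1, o2, f"({l1} ⊙ {l2})"))
    return new - set(quads)


explicit = {
    ("Student", "sc", "Person", "at1"),  # q1
    ("Person", "sc", "Agent", "at1"),    # q2
    ("&a", "type", "Student", "at2"),    # q3
    ("Agent", "type", "Class", "at3"),   # q5
    ("Student", "sc", "Person", "at4"),  # q6
}
step1 = derive(explicit)          # contains q7, q8 and the (&a, type, Person) quadruples
step2 = derive(explicit | step1)  # a second pass reaches (&a, type, Agent)
for q in sorted(step1):
    print(q)
```

Repeating `derive` until it returns nothing new gives a fixpoint; that loop terminates only if the sc hierarchy is acyclic, which is assumed here.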
  • 14. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (3/3)
      - Apply propagation rules: add new labels to existing triples
          - e.g. classes propagate labels to their instances and subclasses
          R5: (B, type, class, l1), (A, sc, B, l2), (A, type, class, l3) ⟹ (A, type, class, ⊗l1)
          R6: (A, type, class, l1), (x, type, A, l2) ⟹ (x, type, A, ⊗l1)

          id  | s     | p    | o     | l
          q5  | Agent | type | Class | at3
          q10 | &a    | type | Agent | q3 ⊙ (q1 ⊙ q2)
          …
          q11 | &a    | type | Agent | ⊗q5
          …
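Propagation from a class to its instances (rule R6) can be sketched as below. The function name `propagate_to_instances` is hypothetical and labels are plain strings. The sketch adds a new quadruple carrying ⊗l1 for each instance triple and leaves the existing quadruples in place.

```python
def propagate_to_instances(quads):
    """R6: (A, type, Class, l1), (x, type, A, l2) => (x, type, A, ⊗l1)."""
    # Collect the labels attached to each class declaration.
    class_labels = {}
    for s, p, o, label in quads:
        if p == "type" and o == "Class":
            class_labels.setdefault(s, []).append(label)

    # Every instance of a declared class receives the propagated class label.
    new = set()
    for s, p, o, _ in quads:
        if p == "type" and o in class_labels:
            for l1 in class_labels[o]:
                new.add((s, p, o, f"⊗{l1}"))
    return new


quads = {
    ("Agent", "type", "Class", "at3"),               # q5
    ("&a", "type", "Agent", "(at2 ⊙ (at1 ⊙ at1))"),  # q10
}
propagated = propagate_to_instances(quads)
print(propagated)
```

The result corresponds to q11 on the slide: (&a, type, Agent) gains a second label, ⊗at3, next to the inferred one.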
  • 15. EVALUATION - DETERMINING ACCESSIBILITY
      We have to define:
      - a set of concrete tokens and a mapping from abstract to concrete tokens
      - a set of concrete operators that implement the abstract ones
      - a conflict resolution operator to resolve ambiguous labels
      - an access function to decide when a triple is accessible
  • 16. CONCRETE ACCESS CONTROL POLICY
      Example: Concrete Policy 1 (CP1)
      - Concrete tokens: LP = {true, false}
      - Inference operator ⊙: conjunction (∧)
      - Propagation operator ⊗: identity function (IDL)
      - Conflict resolution operator ⊕: conjunction (∧)
      - Access function: triples with label true are accessible; all others are inaccessible
  • 17. EVALUATION FOR CP1 (1/3): COMPUTE LABELS
      - Concrete policy 1: LP = {true, false}, ⊙ is ∧, ⊗ is IDL, ⊕ is ∧
      - Map abstract tokens to concrete ones: at1, at2 map to true; at3, at4 map to false

          id  | s       | p    | o      | abstract label | concrete label
          q1  | Student | sc   | Person | at1            | true
          q2  | Person  | sc   | Agent  | at1            | true
          q5  | Agent   | type | Class  | at3            | false
          q6  | Student | sc   | Person | at4            | false
          q7  | Student | sc   | Agent  | q1 ⊙ q2        | true ∧ true = true
          q8  | Student | sc   | Agent  | q6 ⊙ q2        | false ∧ true = false
          q11 | &a      | type | Agent  | ⊗q5            | false
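Evaluating abstract expressions under a concrete policy can be sketched as a small recursive interpreter. The tuple encoding of expressions and the names `Policy`, `evaluate`, `CP1` and `LIBERAL` are assumptions made for illustration. `LIBERAL` in particular is a hypothetical policy, not the CP2 mentioned in the experiments.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Policy:
    mapping: Dict[str, bool]             # abstract token -> concrete token
    infer: Callable[[bool, bool], bool]  # concrete version of ⊙
    propagate: Callable[[bool], bool]    # concrete version of ⊗


def evaluate(expr, policy):
    """Evaluate ('tok', name), ('infer', e1, e2) or ('prop', e) under a policy."""
    kind = expr[0]
    if kind == "tok":
        return policy.mapping[expr[1]]
    if kind == "infer":
        return policy.infer(evaluate(expr[1], policy), evaluate(expr[2], policy))
    if kind == "prop":
        return policy.propagate(evaluate(expr[1], policy))
    raise ValueError(f"unknown expression kind: {kind}")


tokens = {"at1": True, "at2": True, "at3": False, "at4": False}
CP1 = Policy(tokens, infer=lambda a, b: a and b, propagate=lambda a: a)
# Hypothetical alternative: same tokens, but inference uses disjunction.
LIBERAL = Policy(tokens, infer=lambda a, b: a or b, propagate=lambda a: a)

q7 = ("infer", ("tok", "at1"), ("tok", "at1"))
q8 = ("infer", ("tok", "at4"), ("tok", "at1"))
q11 = ("prop", ("tok", "at3"))

for name, expr in [("q7", q7), ("q8", q8), ("q11", q11)]:
    print(name, evaluate(expr, CP1), evaluate(expr, LIBERAL))
```

Switching policies only re-runs `evaluate` over the stored expressions; the annotation step does not need to be repeated.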
  • 18. EVALUATION FOR CP1 (2/3): AMBIGUOUS LABELS REMOVAL
      - Back from quadruples to triples

          subject | predicate | object | label
          Student | sc        | Person | true
          Student | sc        | Person | false

        is resolved with ⊕ (here ∧) into

          subject | predicate | object | label
          Student | sc        | Person | true ∧ false = false
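The conflict resolution step can be sketched as grouping the concrete labels by triple and folding them with ⊕. The function name `resolve` is hypothetical; the default `combine` is conjunction, as in CP1.

```python
from functools import reduce


def resolve(quads, combine=lambda a, b: a and b):
    """Collapse quadruples that share a triple into one label per triple using ⊕."""
    by_triple = {}
    for s, p, o, label in quads:
        by_triple.setdefault((s, p, o), []).append(label)
    return {t: reduce(combine, labels) for t, labels in by_triple.items()}


concrete_quads = [
    ("Student", "sc", "Person", True),   # from at1
    ("Student", "sc", "Person", False),  # from at4
    ("Person", "sc", "Agent", True),     # from at1
]
resolved = resolve(concrete_quads)
for triple, label in resolved.items():
    print(triple, "accessible" if label else "inaccessible")
```

Passing a different `combine` function, for example disjunction, would model a policy in which one granting label is enough.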
  • 19. EVALUATION FOR CP1 (3/3): DETERMINING ACCESSIBILITY
      - The essence of access control:

          subject | predicate | object  | label | result
          Student | sc        | Person  | false | Inaccessible
          Person  | sc        | Agent   | true  | Accessible
          &a      | type      | Student | true  | Accessible
          &a      | lastName  | “Smith” | false | Inaccessible
          Agent   | type      | Class   | false | Inaccessible
  • 20. PROS & CONS OF ABSTRACT ACCESS CONTROL MODELS
      Pros:
      - The same application can experiment with different concrete policies over the same dataset
          - liberal vs conservative policies for different classes of users
      - Different applications can experiment with different concrete policies for the same data
      - In the case of updates there is no need to re-compute the inferred triples
      Cons:
      - overhead in the required storage space
      - algebraic expressions can become complex, depending on the structure of the dataset
  • 21. IMPLEMENTATION
      - A relational schema stores the quadruples and their labels (including abstract expressions)
      - Annotation and evaluation are performed through the stored procedure mechanism of
          - MonetDB
          - PostgreSQL
  • 22. EXPERIMENTS
      - Experiment 1: annotation time (the time required to compute the inferred triples with their labels and the propagated labels)
      - Experiment 2: evaluation time (a) (the time needed to compute, for a concrete policy, the concrete access labels of all RDF triples)
      - Experiment 3: evaluation time (b) (the time needed to compute, for a concrete policy, the concrete access labels of a percentage of the RDF triples)
      - Datasets:
          - Synthetic: schemas produced with PowerGen
          - Real: CIDOC, GO
  • 23. EXPERIMENTAL RESULTS: ANNOTATION TIME – MONETDB (SYNTHETIC)
      - Annotation time increases as the number of implied triples increases
      - Plunges are due to changes in the structure of the ontology (reduction of the depth)
      - 152 synthetic ontologies
          - 100-1000 classes, 113-1635 properties, 124-50295 class instances and 110-1321 property instances
          - Different depths for the sc and sp hierarchies (from 4 to 8)
  • 24. EXPERIMENTAL RESULTS: EVALUATION TIME (FULL)
      - Evaluation time increases linearly as the total number of triples increases
      - MonetDB outperforms PostgreSQL
      - Some of the synthetic datasets could not be evaluated
  • 25. EXPERIMENTAL RESULTS: EVALUATION TIME (DATASET PERCENTAGE) - MONETDB
      - Evaluation time for the largest dataset that was evaluated successfully in Experiment 2
      - Similar conclusions as in Experiment 2
  • 26. EXPERIMENTAL RESULTS - REAL DATASETS
      CIDOC
      - Annotation time: MonetDB 69 ms; PostgreSQL 4000 ms
      - Evaluation time (full): MonetDB – CP1 7775 ms; MonetDB – CP2 3923 ms
      GO
      - Annotation time: MonetDB 32 s; PostgreSQL 844 s
      - Evaluation time (full): exceeded our set timeout
  • 27. CONCLUSIONS
      - Proposed a new paradigm based on abstract models and operators
      - Advantages
          - Flexibility and easy adaptation to change (no re-computation necessary)
          - Easy experimentation with different access control policies
      - Disadvantages
          - Increased space requirements
          - Overhead at query time (for evaluation)
      - Suitable for dynamic datasets
  • 29. EXPERIMENTAL RESULTS: ANNOTATION TIME – POSTGRESQL (SYNTHETIC)
      - Annotation time increases as the number of implied triples increases
      - The one plunge is due to a change in the structure of the ontology (reduction of the depth)
      - Up to 1000 classes, 1635 properties, 50167 class instances and 95 property instances before reaching the timeout
  • 30. IMPLEMENTATION
      - A relational schema stores the quadruples
      - Quad(qid, s, p, o, propop, inferop, label)
          - inferop, propop: boolean values indicating whether the label is obtained through inference or propagation
      - LabelStore(qid, qid_uses)
          - stores the access label of a triple
          - qid: the quadruple whose label is stored
          - qid_uses: the qid of the explicit quadruple through which qid was produced
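The two relations can be sketched with Python's built-in `sqlite3` module. This is only a stand-in for illustration: the talk's implementation runs on MonetDB and PostgreSQL, and the column types, keys and sample rows below are assumptions. The query shows how the explicit tokens behind an inferred quadruple can be recovered by joining LabelStore back to Quad.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Quad (
        qid     TEXT PRIMARY KEY,
        s       TEXT, p TEXT, o TEXT,
        propop  INTEGER,  -- 1 if the label was obtained through propagation
        inferop INTEGER,  -- 1 if the label was obtained through inference
        label   TEXT      -- abstract token; NULL for implied quadruples
    );
    CREATE TABLE LabelStore (
        qid      TEXT REFERENCES Quad(qid),  -- quadruple whose label is stored
        qid_uses TEXT REFERENCES Quad(qid)   -- explicit quadruple it was built from
    );
    """
)
conn.executemany(
    "INSERT INTO Quad VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("q1", "Student", "sc", "Person", 0, 0, "at1"),
        ("q2", "Person", "sc", "Agent", 0, 0, "at1"),
        ("q7", "Student", "sc", "Agent", 0, 1, None),  # implied by q1 and q2
    ],
)
conn.executemany(
    "INSERT INTO LabelStore VALUES (?, ?)",
    [("q7", "q1"), ("q7", "q2")],
)

# Recover the explicit tokens that make up the label of q7.
rows = conn.execute(
    """
    SELECT e.label
    FROM LabelStore AS ls
    JOIN Quad AS e ON e.qid = ls.qid_uses
    WHERE ls.qid = ?
    ORDER BY ls.rowid
    """,
    ("q7",),
).fetchall()
tokens = [r[0] for r in rows]
print(tokens)
```

Keeping only references to explicit quadruples means that changing the token of q1 requires no update to the stored label of q7.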
  • 31. IMPLEMENTATION
      Quadruples (motivating example):
          id  | s       | p    | o       | l
          q1  | Student | sc   | Person  | at1
          q2  | Person  | sc   | Agent   | at1
          q3  | &a      | type | Student | at2
          q5  | Agent   | type | Class   | at3
          q6  | Student | sc   | Person  | at4
          q7  | Student | sc   | Agent   | at1⊙at1
          q10 | &a      | type | Agent   | at2⊙(at1⊙at1)
          q11 | &a      | type | Agent   | ⊗at3

      Quad(qid, s, p, o, propop, inferop, label):
          id  | s       | p    | o       | iop | pop | l
          q1  | Student | sc   | Person  | f   | f   | at1
          q2  | Person  | sc   | Agent   | f   | f   | at1
          q3  | &a      | type | Student | f   | f   | at2
          q5  | Agent   | type | Class   | f   | f   | at3
          q6  | Student | sc   | Person  | f   | f   | at4
          q7  | Student | sc   | Agent   | t   | f   | null
          q9  | &a      | type | Agent   | t   | f   | null
          q10 | Person  | sc   | Agent   | f   | t   | null
  • 32. IMPLEMENTATION
      Quadruples (motivating example):
          id  | s       | p    | o       | l
          q1  | Student | sc   | Person  | at1
          q2  | Person  | sc   | Agent   | at1
          q3  | &a      | type | Student | at2
          q5  | Agent   | type | Class   | at3
          q6  | Student | sc   | Person  | at4
          q7  | Student | sc   | Agent   | at1⊙at1
          q10 | &a      | type | Agent   | at2⊙(at1⊙at1)
          q11 | &a      | type | Agent   | ⊗at3

      LabelStore(qid, qid_uses):
          qid | qid_uses
          q7  | q1
          q7  | q2
          q10 | q3
          q10 | q1
          q10 | q2
          q11 | q5