Using OWL in
Closed World Applications

         Evren Sirin, CTO
        Clark & Parsia, LLC
      evren@clarkparsia.com

Validating Linked Data with OWL


  1. Using OWL in Closed World Applications
      Evren Sirin, CTO, Clark & Parsia, LLC
      evren@clarkparsia.com
  2. Who are we?
      • Clark & Parsia is a semantic software startup
        – HQ in Washington, DC & office in Boston
      • Provides software development and integration services
      • Specializing in Semantic Web, web services, and advanced AI technologies for federal and enterprise customers
      http://clarkparsia.com/ Twitter: @candp
  3. Some Applications
      • Customer and product data
        – Find which customer would be interested in buying a certain product
      • System and component descriptions
        – Configure components to build a desired system
      • Workforce and employee data
        – Locate employees with desired expertise
      • Patient history and drug data
        – Detect and prevent potentially harmful drug interactions
  4. Common Theme
      • There is data and lots of it!
      • Adding semantics to the data helps a lot
        – Sometimes simple taxonomies, but other times, complex ontologies
      • We have complete knowledge about the domain
      • Errors in the data cause problems
        – Failures in applications, errors in decision making, potential loss of revenue, security vulnerabilities, etc.
  5. Data Validation
      • Fundamental data management problem
        – Verify data integrity and correctness
        – Enforce validity of updates
      • Relevant in many scenarios
        – Storing data for stand-alone applications
        – Exchanging data in distributed settings
      • Solved (to some degree) in RDBMSs
        – Harder to achieve as data semantics increase and/or more expressive integrity conditions are required
  6. Disclaimer
      • Data validity not important for every use case
        – Invalid data may be fine for an application
        – Invalidity may even be a requirement
      • Focus of this talk is cases where data consistency and integrity are crucial
  7. Roadmap for an App
      • How to build one of these applications?
        – Represent data as RDF triples
          • First step for accomplishing data integration and analysis
        – Enrich data with more semantics (RDFS, OWL)
          • Infer implicit information from explicit assertions
        – Ensure data validity
          • Detect errors in the data
        – Do something cool with the data
          • Obviously...
  8. Reasoning Example
      • Input ontology
          Schema:
            # Every manager is an employee
            Manager subClassOf Employee
          Instance data:
            # Person0853 is a manager
            Person0853 type Manager
      • Output inferences
            # Person0853 is an employee
            Person0853 type Employee
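The inference step in the reasoning example can be sketched in a few lines of Java. This is only an illustration of the single rule "x type C, C subClassOf D implies x type D" over an in-memory set of triples; the class name `SubclassInferenceSketch`, the `Triple` record, and the plain-string vocabulary are assumptions made here and are not part of the talk or of any reasoner API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: one RDFS-style rule applied to a fixpoint, not a real reasoner.
class SubclassInferenceSketch {
    record Triple(String s, String p, String o) {}

    /** Returns only the triples newly derived by "x type C, C subClassOf D => x type D". */
    static Set<Triple> infer(Set<Triple> input) {
        Set<Triple> all = new LinkedHashSet<>(input);
        boolean changed = true;
        while (changed) {                       // repeat until no new triple is added
            changed = false;
            for (Triple t : List.copyOf(all)) {
                if (!t.p().equals("type")) continue;
                for (Triple sc : List.copyOf(all)) {
                    if (sc.p().equals("subClassOf") && sc.s().equals(t.o())) {
                        changed |= all.add(new Triple(t.s(), "type", sc.o()));
                    }
                }
            }
        }
        all.removeAll(input);                   // drop the asserted triples, keep the inferred ones
        return all;
    }

    public static void main(String[] args) {
        Set<Triple> input = Set.of(
            new Triple("Manager", "subClassOf", "Employee"),   // schema
            new Triple("Person0853", "type", "Manager"));      // instance data
        System.out.println(infer(input));
    }
}
```

With the slide's two input statements, the only derived triple is `Person0853 type Employee`.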
  11. Validating RDF Data
      • Common misunderstanding
        – RDFS/OWL is to RDF what XML Schema is to XML
        – Describe integrity conditions in RDFS or OWL
          • Typing constraints: RDFS domain/range
          • Participation constraints: OWL some values restrictions
          • Uniqueness constraints: OWL cardinality restriction
        – Use a reasoner to find inconsistencies
      • Problem: Open World Assumption
  12. Closed vs. Open World
      • Two different views on truth:
        – CWA: Any statement that is not known to be true is false
        – OWA: A statement is false only if it is known to be false
      • Used in different contexts
        – Databases use CWA because (typically) they contain complete information
        – Ontologies use OWA because (typically) they don't... that is, they contain incomplete information
      • Data validation results significantly different when using CWA instead of OWA
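The two views on truth can be made concrete with a small Java sketch. It assumes statements are plain strings and that "known to be false" is given as an explicit set; the class name `WorldAssumptionSketch` and the three-valued `Truth` enum are illustrative choices made here, not something defined in the talk.

```java
import java.util.Set;

// Hypothetical sketch contrasting the closed and open world assumptions.
class WorldAssumptionSketch {
    enum Truth { TRUE, FALSE, UNKNOWN }

    /** CWA: any statement that is not known to be true is false. */
    static Truth closedWorld(Set<String> knownTrue, String statement) {
        return knownTrue.contains(statement) ? Truth.TRUE : Truth.FALSE;
    }

    /** OWA: a statement is false only if it is known to be false. */
    static Truth openWorld(Set<String> knownTrue, Set<String> knownFalse, String statement) {
        if (knownTrue.contains(statement)) return Truth.TRUE;
        if (knownFalse.contains(statement)) return Truth.FALSE;
        return Truth.UNKNOWN;                   // missing information is not a contradiction
    }

    public static void main(String[] args) {
        Set<String> knownTrue = Set.of("Person085 supervises Person173");
        String query = "Person085 type Manager";
        System.out.println("CWA: " + closedWorld(knownTrue, query));
        System.out.println("OWA: " + openWorld(knownTrue, Set.of(), query));
    }
}
```

The same missing statement is simply false under CWA, while under OWA it stays unknown, which is why an OWL reasoner does not report it as an error.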
  13. Typing Constraint
      • Only managers can supervise employees
      • Input ontology
        o supervises domain Manager
        o Person085 supervises Person173

                      OWA                         CWA
        Consistent    true                        false
        Reason        Infer that                  Assume that
                      Person085 type Manager      Person085 type not Manager
  14. Participation Constraint
      • Each supervisor must supervise at least one employee
      • Input axioms
        o Supervisor subClassOf supervises some Employee
        o Person085 type Supervisor

                      OWA                         CWA
        Consistent    true                        false
        Reason        Infer that                  Assume that
                      Person085 supervises _:b    Person085 supervises _:b
                      _:b type Employee           does not exist
  15. Uniqueness Constraint
      • Employees can have at most one supervisor
      • Input axioms
        o supervises InverseFunctional
        o Person085 supervises Person173
        o Person632 supervises Person173

                      OWA                            CWA
        Consistent    true                           false
        Reason        Infer that                     Assume that
                      Person085 sameAs Person632     Person085 sameAs Person632
                                                     does not hold
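The closed-world reading of the typing, participation, and uniqueness constraints above can be sketched as three plain checks over an in-memory triple set. This is a simplified illustration, not the IC semantics from the talk: it ignores inference entirely, and the uniqueness check assumes that different names denote different individuals. The class name `ClosedWorldChecksSketch`, the `Triple` record, and the method names are all hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical sketch: closed-world checks on asserted triples only (no reasoning).
class ClosedWorldChecksSketch {
    record Triple(String s, String p, String o) {}

    static boolean hasType(Set<Triple> data, String x, String cls) {
        return data.contains(new Triple(x, "type", cls));
    }

    /** Typing: every subject of `property` must be explicitly typed as `cls`. */
    static Set<String> typingViolations(Set<Triple> data, String property, String cls) {
        return data.stream()
            .filter(t -> t.p().equals(property) && !hasType(data, t.s(), cls))
            .map(Triple::s)
            .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Participation: every instance of `cls` needs a `property` value typed as `valueCls`. */
    static Set<String> participationViolations(Set<Triple> data, String cls,
                                               String property, String valueCls) {
        return data.stream()
            .filter(t -> t.p().equals("type") && t.o().equals(cls))
            .map(Triple::s)
            .filter(x -> data.stream().noneMatch(t ->
                t.s().equals(x) && t.p().equals(property) && hasType(data, t.o(), valueCls)))
            .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Uniqueness: objects of `property` with more than one distinct subject (names assumed unique). */
    static Set<String> uniquenessViolations(Set<Triple> data, String property) {
        Map<String, Set<String>> subjectsByObject = new TreeMap<>();
        for (Triple t : data) {
            if (t.p().equals(property)) {
                subjectsByObject.computeIfAbsent(t.o(), k -> new TreeSet<>()).add(t.s());
            }
        }
        Set<String> violating = new TreeSet<>();
        subjectsByObject.forEach((o, subjects) -> { if (subjects.size() > 1) violating.add(o); });
        return violating;
    }

    public static void main(String[] args) {
        Set<Triple> data = Set.of(
            new Triple("Person085", "supervises", "Person173"),
            new Triple("Person632", "supervises", "Person173"),
            new Triple("Person085", "type", "Supervisor"));
        System.out.println(typingViolations(data, "supervises", "Manager"));
        System.out.println(participationViolations(data, "Supervisor", "supervises", "Employee"));
        System.out.println(uniquenessViolations(data, "supervises"));
    }
}
```

Each check reports the individuals that an OWL reasoner under OWA would instead explain away by inferring a type, an anonymous employee, or a `sameAs` link.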
  16. Workarounds for CW
      • Manually close the world
        – Declare all individuals different from each other
        – Count existing property values and add a max cardinality restriction
        – Make all disjointness statements explicit and add negated types to individuals
      • Drawbacks
        – Can be computationally expensive
        – Likely to be error-prone
  17. Problem Summary
      • Definitions in an OWL schema may have two purposes
        – Infer new statements
        – Check if existing statements are valid
      • Using OWA for validation is undesirable
        – Not always but in many cases
      • In a problem domain we may have:
        – Complete knowledge about some parts of the domain
        – Incomplete knowledge about the other parts
  18. Integrity Constraint Solution
      • We defined an alternative semantics for OWL
        – Integrity Constraint (IC) semantics use CWA
        – Can be combined with regular inference axioms
      • Ontology developer chooses which axioms will be interpreted with...
        – OWA: regular OWL axiom, or
        – CWA: integrity constraint
  19. IC Extension
      • Syntax specification
        – How do we syntactically say an axiom is an IC and not a regular OWL axiom?
      • Semantics specification
        – How do we exactly interpret an IC?
      • Validation algorithm
        – Given the semantics how do we check for IC violations?
  20. IC Syntax
      • Similar approach to using owl:imports
      • Define a new annotation property in a new namespace
            Ont1 owl:imports Ont2
            Ont1 ic:imports IC1
      • Backward compatible, requires minimum change in tools
  21. IC Semantics
      • OWL semantics based on model theory
        – Similar to First Order Logic
        – Formal, precise, and unambiguous
      • IC semantics specification
        – Extends OWL model theory
        – Changes a couple of basic definitions; everything else follows
      • Details published in technical papers
        – We are submitting a W3C member submission soon
  22. Use Case: SKOS
      • Simple Knowledge Organization System (SKOS)
      • SKOS provides a model for expressing the basic structure and content of concept schemes
        – Thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, etc.
      • SKOS data model specification
        – Informal (Text): http://www.w3.org/TR/skos-reference/
        – Formal (OWL): http://www.w3.org/2004/02/skos/core.rdf
  23. SKOS Example
      skos-reference.ttl:
            # SKOS reference ontology that contains inference rules
            skos:broaderTransitive Transitive
            skos:broader subPropertyOf skos:broaderTransitive
      skos-constraints.ttl:
            # Constraints from SKOS reference expressed as ICs
            skos:related propertyDisjointWith skos:broaderTransitive
      skos-invalid.ttl:
            # SKOS data that violates the SKOS data model
            [] a owl:Ontology ; owl:imports skos-reference.ttl ;
                                ic:imports skos-constraints.ttl .
            A skos:broader B ; skos:related C .
            B skos:broader C .
  24. Explanation
      VIOLATION: A violates related propertyDisjointWith broaderTransitive
         INFERRED: A related C
            ASSERTED: A related C
         INFERRED: A broaderTransitive C
            ASSERTED: A broader B
            ASSERTED: B broader C
            ASSERTED: broader subPropertyOf broaderTransitive
            ASSERTED: broaderTransitive Transitive
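The violation in this explanation combines an open-world inference step (deriving `broaderTransitive` from `broader`) with a closed-world check (no pair may be both `related` and `broaderTransitive`). A minimal Java sketch of that combination follows. It assumes the data is given as two sets of asserted pairs and, for brevity, ignores the symmetry of `skos:related`; the class `SkosDisjointnessSketch` and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: infer broaderTransitive, then check the disjointness constraint.
class SkosDisjointnessSketch {
    record Pair(String from, String to) {}

    /** Inference: broader subPropertyOf broaderTransitive, and broaderTransitive is transitive. */
    static Set<Pair> broaderTransitive(Set<Pair> broader) {
        Set<Pair> closure = new HashSet<>(broader);
        boolean changed = true;
        while (changed) {                       // naive fixpoint; fine for small examples
            changed = false;
            for (Pair a : List.copyOf(closure)) {
                for (Pair b : List.copyOf(closure)) {
                    if (a.to().equals(b.from())) {
                        changed |= closure.add(new Pair(a.from(), b.to()));
                    }
                }
            }
        }
        return closure;
    }

    /** Constraint: pairs that are both related and (inferred) broaderTransitive are violations. */
    static Set<Pair> violations(Set<Pair> broader, Set<Pair> related) {
        Set<Pair> result = new HashSet<>(related);
        result.retainAll(broaderTransitive(broader));
        return result;
    }

    public static void main(String[] args) {
        Set<Pair> broader = Set.of(new Pair("A", "B"), new Pair("B", "C"));
        Set<Pair> related = Set.of(new Pair("A", "C"));
        System.out.println(violations(broader, related));
    }
}
```

On the data from the SKOS example, the only violating pair is A and C, matching the explanation above.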
  25. Another SKOS Example
      skos-xl.ttl:
            # SKOS-XL ontology with a cardinality restriction
            skosxl:Label subClassOf skosxl:literalForm cardinality 1
      skos-data.ttl:
            # SKOS data that violates the SKOS data model
            [] a owl:Ontology ; owl:imports skos-xl.ttl .
            A skosxl:labelRelation LabelA .
            LabelA type skosxl:Label .
      Result: Consistent
  26. Another SKOS Example
      skos-xl.ttl:
            # SKOS-XL ontology with a cardinality restriction
            skosxl:Label subClassOf skosxl:literalForm cardinality 1
      skos-data.ttl:
            # SKOS data that violates the SKOS data model
            [] a owl:Ontology ; owl:imports skos-xl.ttl ;
                                ic:imports skos-xl.ttl .
            A skosxl:labelRelation LabelA .
            LabelA type skosxl:Label .
      Result: IC Violation
  27. Linked Data Application
      • Large amounts of instance data
      • Validate before publishing/consuming LOD
      • Instance data + Inference axioms + Constraints
        – Infer new facts using inference axioms with OWA
        – Validate data using constraints with CWA
        – Inference axioms and constraints are both expressed in OWL
  28. Validation Algorithm
      • An automated translation algorithm
      • Automatically maps an OWL IC to ...
        – A SPARQL query, or
        – A RIF rule
      • Many different implementation possibilities
      • Off-the-shelf tools can be used for IC validation
  29. SPARQL Translation
      Supervisor subClassOf supervises some Employee

            SELECT * {
               ?x type Supervisor.
               NOT EXISTS {
                  ?x supervises ?y.
                  ?y type Employee.
               }
            }
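The translation on this slide can be sketched as a small query builder for the one axiom shape shown, "C subClassOf p some D". This is not the translation algorithm of the Pellet IC validator; it is a hand-written illustration, and the class `IcToSparqlSketch`, the method `participationQuery`, and the `http://example.org/` IRIs are assumptions. The generated query uses the `FILTER NOT EXISTS` spelling from the SPARQL 1.1 standard rather than the bare `NOT EXISTS` shown on the slide.

```java
// Hypothetical sketch: build a violation-finding SPARQL query for "C subClassOf p some D".
class IcToSparqlSketch {
    static final String RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>";

    /** Selects every ?x of type cls that has no property value of type valueCls. */
    static String participationQuery(String cls, String property, String valueCls) {
        return String.join("\n",
            "SELECT ?x WHERE {",
            "  ?x " + RDF_TYPE + " <" + cls + "> .",
            "  FILTER NOT EXISTS {",
            "    ?x <" + property + "> ?y .",
            "    ?y " + RDF_TYPE + " <" + valueCls + "> .",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        String ns = "http://example.org/";      // placeholder namespace for the example
        System.out.println(participationQuery(ns + "Supervisor", ns + "supervises", ns + "Employee"));
    }
}
```

Any answer to the generated query is a constraint violation, so an empty result means the data satisfies the constraint.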
  30. RIF Translation
      Supervisor subClassOf supervises some Employee

            Forall ?x ?y (
              invalid() :- And (
                ?x[type -> Supervisor]
                Naf And (
                  ?x[supervises -> ?y]
                  ?y[type -> Employee] )))
  31. Solution Summary
      • Separate ICs from regular OWL axioms
        – No new syntax
        – Import-based mechanism
      • Alternative semantics for ICs
        – Extends OWL model theory
        – Provides the meanings of ICs formally
      • Validation algorithm
        – Translate ICs to another formalism
        – SPARQL or RIF engines can be used
  32. Performance
      • Using ICs can improve performance!
      • Expressive OWL reasoning is not easy
      • Profiles of OWL defined for tractable reasoning
        – OWL 2 QL, OWL 2 EL, OWL 2 RL
        – Less expressive but more efficient
      • Modeling some OWL axioms as ICs may reduce the overall expressivity
  33. Prototype
      • Pellet IC validator
        – Translates ICs into SPARQL queries automatically
        – Executes SPARQL queries with Pellet
        – Query results show constraint violations
        – Automatically explains constraint violations
      • Free download
        – http://clarkparsia.com/pellet/icv
  34. Code Example
            // Create an inferencing model using the Pellet reasoner.
            // `r` is not defined on the slide; presumably it is set up earlier.
            InfModel dataModel = ModelFactory.createInfModel(r);

            // Load the schema and instance data into Pellet
            dataModel.read( "file:data.rdf" );
            dataModel.read( "file:schema.owl" );

            // Create the IC validator and associate it with the dataset
            JenaICValidator validator = new JenaICValidator(dataModel);

            // Load the constraints into the IC validator
            validator.getConstraints().read("file:constraints.owl");

            // Get the constraint violations
            Iterator<ConstraintViolation> violations = validator.getViolations();
  35. Next Steps
      • W3C Member submission for IC semantics
      • Robust IC validator implementation
        – Incremental validation
        – Multi-threaded validation
      • Support for IC editing
      • Integration with PelletDb
        – Scalable reasoning + validation
  36. References
      • Evren Sirin, Michael Smith, Evan Wallace. Opening, Closing Worlds - On Integrity Constraints. OWL: Experiences and Directions Workshop (OWLED '08), October 2008.
      • Evren Sirin, Jiao Tao. Towards Integrity Constraints in OWL. OWL: Experiences and Directions Workshop (OWLED '09), October 2009.
      • Jiao Tao, Evren Sirin, Jie Bao, Deborah L. McGuinness. Integrity Constraints in OWL. To appear, The 24th AAAI Conference on Artificial Intelligence (AAAI '10), July 2010.
  37. Questions
