1. Using OWL in Closed World Applications
Evren Sirin, CTO
Clark & Parsia, LLC
evren@clarkparsia.com
2. Who are we?
• Clark & Parsia is a semantic software startup
– HQ in Washington, DC & office in Boston
• Provides software development and integration
services
• Specializing in Semantic Web, web services, and
advanced AI technologies for federal and
enterprise customers
http://clarkparsia.com/
Twitter: @candp
3. Some Applications
• Customer and product data
– Find which customer would be interested in buying a
certain product
• System and component descriptions
– Configure components to build a desired system
• Workforce and employee data
– Locate employees with desired expertise
• Patient history and drug data
– Detect and prevent potentially harmful drug interactions
4. Common Theme
• There is data and lots of it!
• Adding semantics to the data helps a lot
– Sometimes simple taxonomies, other times complex ontologies
• We have complete knowledge about the domain
• Errors in the data cause problems
– Failures in applications, errors in decision making,
potential loss of revenue, security vulnerabilities, etc.
5. Data Validation
• Fundamental data management problem
– Verify data integrity and correctness
– Enforce validity of updates
• Relevant in many scenarios
– Storing data for stand-alone applications
– Exchanging data in distributed settings
• Solved (to some degree) in RDBMSs
– Harder to achieve as data semantics increase and/or
more expressive integrity conditions are required
6. Disclaimer
• Data validity is not important for every use case
– Invalid data may be fine for an application
– Invalidity may even be a requirement
• Focus of this talk is cases where data consistency
and integrity are crucial
7. Roadmap for an App
• How to build one of these applications?
– Represent data as RDF triples
• First step for accomplishing data integration and analysis
– Enrich data with more semantics (RDFS, OWL)
• Infer implicit information from explicit assertions
– Ensure data validity
• Detect errors in the data
– Do something cool with the data
• Obviously...
10. Reasoning Example
• Input ontology
# Schema: every manager is an employee
Manager subClassOf Employee
# Instance data: Person0853 is a manager
Person0853 type Manager
• Output inferences
# Person0853 is an employee
Person0853 type Employee
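The inference above can be sketched in a few lines of plain Java. This is only an illustration of the subclass rule, not how Pellet works internally; the class and method names (SubclassInference, inferTypes) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Pellet or Jena.
final class SubclassInference {
    // Returns the asserted type plus every superclass reachable via subClassOf.
    static Set<String> inferTypes(String assertedType, Map<String, List<String>> subClassOf) {
        Set<String> types = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(assertedType);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            // Only expand a class the first time it is seen (guards against cycles).
            if (types.add(current)) {
                for (String parent : subClassOf.getOrDefault(current, List.of())) {
                    pending.push(parent);
                }
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Schema: Manager subClassOf Employee
        Map<String, List<String>> schema = Map.of("Manager", List.of("Employee"));
        // Instance data: Person0853 type Manager
        System.out.println("Person0853 type " + inferTypes("Manager", schema));
    }
}
```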
11. Validating RDF Data
• Common misunderstanding
– RDFS/OWL is to RDF what XML Schema is to XML
– Describe integrity conditions in RDFS or OWL
• Typing constraints - RDFS domain/range
• Participation constraints - OWL someValuesFrom restrictions
• Uniqueness constraints - OWL cardinality restrictions
– Use a reasoner to find inconsistencies
• Problem: Open World Assumption
12. Closed vs. Open World
• Two different views on truth:
– CWA: Any statement that is not known to be true is false
– OWA: A statement is false only if it is known to be false
• Used in different contexts
– Databases use CWA because (typically) they contain
complete information
– Ontologies use OWA because (typically) they don't...
that is, they contain incomplete information
• Data validation results differ significantly between
CWA and OWA
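A minimal sketch of the two views, assuming statements are simple subject/property/object lists and ignoring entailment entirely. The class WorldAssumptions and its methods are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; real reasoners also consider entailed statements.
final class WorldAssumptions {
    enum Truth { TRUE, FALSE, UNKNOWN }

    // CWA: any statement that is not known to be true is false.
    static Truth closedWorld(Set<List<String>> knownTrue, List<String> statement) {
        return knownTrue.contains(statement) ? Truth.TRUE : Truth.FALSE;
    }

    // OWA: a statement is false only if it is known to be false.
    static Truth openWorld(Set<List<String>> knownTrue, Set<List<String>> knownFalse,
                           List<String> statement) {
        if (knownTrue.contains(statement)) {
            return Truth.TRUE;
        }
        if (knownFalse.contains(statement)) {
            return Truth.FALSE;
        }
        return Truth.UNKNOWN;
    }

    public static void main(String[] args) {
        Set<List<String>> knownTrue = Set.of(List.of("Person0853", "type", "Manager"));
        Set<List<String>> knownFalse = Set.of();
        List<String> query = List.of("Person0853", "supervises", "Person173");
        System.out.println("CWA: " + closedWorld(knownTrue, query));
        System.out.println("OWA: " + openWorld(knownTrue, knownFalse, query));
    }
}
```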
13. Typing Constraint
• Only managers can supervise employees
• Input ontology
o supervises domain Manager
o Person085 supervises Person173
             OWA                         CWA
Consistent   true                        false
Reason       Infer that                  Assume that
             Person085 type Manager      Person085 type not Manager
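The two readings of the domain axiom can be sketched as follows. This is an illustrative simplification over asserted triples only; TypingConstraint and its methods are hypothetical names, not part of any library.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; triples are [subject, property, object] lists.
final class TypingConstraint {
    // OWA reading: every subject of the property is inferred to belong to the domain class.
    static Set<String> inferredDomainMembers(Set<List<String>> triples, String property) {
        Set<String> members = new TreeSet<>();
        for (List<String> t : triples) {
            if (t.get(1).equals(property)) {
                members.add(t.get(0));
            }
        }
        return members;
    }

    // CWA reading: a subject without an explicit type assertion violates the constraint.
    static Set<String> violations(Set<List<String>> triples, String property, String domainClass) {
        Set<String> result = new TreeSet<>();
        for (String subject : inferredDomainMembers(triples, property)) {
            if (!triples.contains(List.of(subject, "type", domainClass))) {
                result.add(subject);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<List<String>> data = Set.of(List.of("Person085", "supervises", "Person173"));
        System.out.println("OWA infers type Manager for: " + inferredDomainMembers(data, "supervises"));
        System.out.println("CWA violations: " + violations(data, "supervises", "Manager"));
    }
}
```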
14. Participation Constraint
• Each supervisor must supervise at least
one employee
• Input axioms
o Supervisor subClassOf supervises some Employee
o Person085 type Supervisor
             OWA                         CWA
Consistent   true                        false
Reason       Infer that                  Assume that
             Person085 supervises _:b    Person085 supervises _:b
             _:b type Employee           does not exist
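Under the closed world reading, the participation constraint becomes a search for supervisors with no known supervised employee. A minimal sketch over asserted triples, with hypothetical names (ParticipationConstraint, violations, hasFiller):

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; triples are [subject, property, object] lists.
final class ParticipationConstraint {
    // CWA check for "cls subClassOf property some filler":
    // every instance of cls needs a known property value that is a known filler instance.
    static Set<String> violations(Set<List<String>> triples, String cls,
                                  String property, String filler) {
        Set<String> result = new TreeSet<>();
        for (List<String> t : triples) {
            boolean isInstance = t.get(1).equals("type") && t.get(2).equals(cls);
            if (isInstance && !hasFiller(triples, t.get(0), property, filler)) {
                result.add(t.get(0));
            }
        }
        return result;
    }

    private static boolean hasFiller(Set<List<String>> triples, String subject,
                                     String property, String filler) {
        for (List<String> t : triples) {
            if (t.get(0).equals(subject) && t.get(1).equals(property)
                    && triples.contains(List.of(t.get(2), "type", filler))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<List<String>> data = Set.of(List.of("Person085", "type", "Supervisor"));
        System.out.println("CWA violations: "
                + violations(data, "Supervisor", "supervises", "Employee"));
    }
}
```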
15. Uniqueness Constraint
• Employees can have at most one supervisor
• Input axioms
o supervises InverseFunctional
o Person085 supervises Person173
o Person632 supervises Person173
             OWA                          CWA
Consistent   true                         false
Reason       Infer that                   Assume that
             Person085 sameAs Person632   Person085 sameAs Person632
                                          does not hold
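A sketch of the closed world check for this constraint, assuming distinct names denote distinct individuals. Under OWA the same groups would instead be merged through sameAs. UniquenessConstraint and its methods are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration; triples are [subject, property, object] lists.
final class UniquenessConstraint {
    // Groups the subjects of the property by the object they point to.
    static Map<String, Set<String>> subjectsByObject(Set<List<String>> triples, String property) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (List<String> t : triples) {
            if (t.get(1).equals(property)) {
                result.computeIfAbsent(t.get(2), k -> new TreeSet<>()).add(t.get(0));
            }
        }
        return result;
    }

    // CWA reading: an object with more than one distinct subject violates the constraint.
    static Map<String, Set<String>> violations(Set<List<String>> triples, String property) {
        Map<String, Set<String>> result = subjectsByObject(triples, property);
        result.values().removeIf(subjects -> subjects.size() < 2);
        return result;
    }

    public static void main(String[] args) {
        Set<List<String>> data = Set.of(
                List.of("Person085", "supervises", "Person173"),
                List.of("Person632", "supervises", "Person173"));
        System.out.println("CWA violations: " + violations(data, "supervises"));
    }
}
```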
16. Workarounds for CW
• Manually close the world
– Declare all individuals different from each other
– Count existing property values and add a max
cardinality restriction
– Make all disjointness statements explicit and add
negated types to individuals
• Drawbacks
– Can be computationally expensive
– Likely to be error-prone
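To illustrate why closing the world by hand gets expensive, the sketch below generates pairwise differentFrom statements for a list of individuals. The pairwise encoding is an assumption made for illustration (OWL also offers the more compact owl:AllDifferent); CloseWorldManually is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Pellet or Jena.
final class CloseWorldManually {
    // Emits one differentFrom statement per unordered pair: n * (n - 1) / 2 in total.
    static List<String> allDifferent(List<String> individuals) {
        List<String> axioms = new ArrayList<>();
        for (int i = 0; i < individuals.size(); i++) {
            for (int j = i + 1; j < individuals.size(); j++) {
                axioms.add(individuals.get(i) + " differentFrom " + individuals.get(j));
            }
        }
        return axioms;
    }

    public static void main(String[] args) {
        List<String> people = List.of("Person085", "Person173", "Person632");
        allDifferent(people).forEach(System.out::println);
    }
}
```

The number of statements grows quadratically with the number of individuals, and every update to the data requires regenerating them.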
17. Problem Summary
• Definitions in an OWL schema may have two
purposes
– Infer new statements
– Check if existing statements are valid
• Using OWA for validation is undesirable
– Not always but in many cases
• In a problem domain we may have:
– Complete knowledge about some parts of the domain
– Incomplete knowledge about the other parts
18. Integrity Constraint Solution
• We defined an alternative semantics for OWL
– Integrity Constraint (IC) semantics use CWA
– Can be combined with regular inference axioms
• Ontology developer chooses which axioms will
be interpreted with...
– OWA - regular OWL axiom, or
– CWA - integrity constraint
19. IC Extension
• Syntax specification
– How do we syntactically say an axiom is an IC and
not a regular OWL axiom?
• Semantics specification
– How exactly do we interpret an IC?
• Validation algorithm
– Given the semantics, how do we check for IC
violations?
20. IC Syntax
• Similar approach to using owl:imports
• Define a new annotation property in a new
namespace
Ont1 owl:imports Ont2
Ont1 ic:imports IC1
• Backward compatible, requires minimal changes
to tools
21. IC Semantics
• OWL semantics based on model theory
– Similar to First Order Logic
– Formal, precise, and unambiguous
• IC semantics specification
– Extends OWL model theory
– Change a couple of basic definitions; everything else
follows
• Details published in technical papers
– We are submitting a W3C member submission soon
22. Use Case: SKOS
• Simple Knowledge Organization System (SKOS)
• SKOS provides a model for expressing the basic
structure and content of concept schemes
– Thesauri, classification schemes, subject heading lists,
taxonomies, folksonomies, etc.
• SKOS data model specification
– Informal (Text): http://www.w3.org/TR/skos-reference/
– Formal (OWL): http://www.w3.org/2004/02/skos/core.rdf
23. SKOS Example
# skos-reference.ttl: SKOS reference ontology that contains inference rules
skos:broaderTransitive Transitive
skos:broader subPropertyOf skos:broaderTransitive
# skos-constraints.ttl: constraints from SKOS reference expressed as ICs
skos:related propertyDisjointWith skos:broaderTransitive
# skos-invalid.ttl: SKOS data that violates the SKOS data model
[] a owl:Ontology ; owl:imports skos-reference.ttl ;
ic:imports skos-constraints.ttl .
A skos:broader B ; skos:related C .
B skos:broader C .
24. Explanation
VIOLATION: A violates related propertyDisjointWith broaderTransitive
  INFERRED: A related C
    ASSERTED: A related C
  INFERRED: A broaderTransitive C
    ASSERTED: A broader B
    ASSERTED: B broader C
    ASSERTED: broader subPropertyOf broaderTransitive
    ASSERTED: broaderTransitive Transitive
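The explanation combines an inference step (OWA) with a constraint check (CWA). A minimal sketch of that combination, assuming each property is stored as a set of [subject, object] pairs; SkosDisjointnessCheck is a hypothetical name and the code is not taken from the Pellet IC validator.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; each pair is a [subject, object] list.
final class SkosDisjointnessCheck {
    // Inference step: broader subPropertyOf broaderTransitive, which is transitive,
    // so broaderTransitive is the transitive closure of the broader pairs.
    static Set<List<String>> broaderTransitive(Set<List<String>> broader) {
        Set<List<String>> closure = new HashSet<>(broader);
        boolean changed = true;
        while (changed) {
            changed = false;
            List<List<String>> snapshot = List.copyOf(closure);
            for (List<String> ab : snapshot) {
                for (List<String> bc : snapshot) {
                    if (ab.get(1).equals(bc.get(0))
                            && closure.add(List.of(ab.get(0), bc.get(1)))) {
                        changed = true;
                    }
                }
            }
        }
        return closure;
    }

    // Validation step: related propertyDisjointWith broaderTransitive,
    // so any pair present in both properties is a violation.
    static Set<List<String>> violations(Set<List<String>> broader, Set<List<String>> related) {
        Set<List<String>> result = new HashSet<>(related);
        result.retainAll(broaderTransitive(broader));
        return result;
    }

    public static void main(String[] args) {
        Set<List<String>> broader = Set.of(List.of("A", "B"), List.of("B", "C"));
        Set<List<String>> related = Set.of(List.of("A", "C"));
        System.out.println("Violations: " + violations(broader, related));
    }
}
```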
25. Another SKOS Example
# skos-xl.ttl: SKOS-XL ontology with a cardinality restriction
skosxl:Label subClassOf skosxl:literalForm cardinality 1
# skos-data.ttl: SKOS data that violates the SKOS data model
[] a owl:Ontology ; owl:imports skos-xl.ttl .
A skosxl:labelRelation LabelA .
LabelA type skosxl:Label .
Result: Consistent
26. Another SKOS Example
# skos-xl.ttl: SKOS-XL ontology with a cardinality restriction
skosxl:Label subClassOf skosxl:literalForm cardinality 1
# skos-data.ttl: SKOS data that violates the SKOS data model
[] a owl:Ontology ; owl:imports skos-xl.ttl ;
ic:imports skos-xl.ttl .
A skosxl:labelRelation LabelA .
LabelA type skosxl:Label .
Result: IC Violation
27. Linked Data Application
• Large amounts of instance data
• Validate before publishing/consuming Linked Open Data (LOD)
• Instance data + Inference axioms + Constraints
– Infer new facts using inference axioms with OWA
– Validate data using constraints with CWA
– Inference axioms and constraints are both expressed
in OWL
28. Validation Algorithm
• An automated translation algorithm
• Automatically maps an OWL IC to ...
– A SPARQL query, or
– A RIF rule
• Many different implementation possibilities
• Off-the-shelf tools can be used for IC validation
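As a rough sketch of the SPARQL option, the method below builds an ASK query for an IC of the form "C subClassOf p some D" using SPARQL 1.1 FILTER NOT EXISTS. The query shape is an assumption made for illustration and is not taken from the Pellet IC validator; IcToSparql and the example.org IRIs are hypothetical.

```java
// Hypothetical helper for illustration; the query shape is an assumption,
// not the translation used by any particular tool.
final class IcToSparql {
    // Builds an ASK query that is true when some instance of subClass
    // has no property value that is an instance of filler (an IC violation).
    static String someValuesFromToSparql(String subClass, String property, String filler) {
        return String.join("\n",
                "ASK WHERE {",
                "  ?x a <" + subClass + "> .",
                "  FILTER NOT EXISTS {",
                "    ?x <" + property + "> ?y .",
                "    ?y a <" + filler + "> .",
                "  }",
                "}");
    }

    public static void main(String[] args) {
        // Supervisor subClassOf supervises some Employee
        System.out.println(someValuesFromToSparql(
                "http://example.org/Supervisor",
                "http://example.org/supervises",
                "http://example.org/Employee"));
    }
}
```

Running such a query through an engine that answers with inferred facts lets inference axioms and constraints work together, as described on the Linked Data slide.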
30. RIF Translation
Supervisor subClassOf supervises some Employee
Forall ?x ?y (
  invalid() :- And (
    ?x[type -> Supervisor]
    Naf And (
      ?x[supervises -> ?y]
      ?y[type -> Employee] )))
31. Solution Summary
• Separate ICs from regular OWL axioms
– No new syntax
– Import-based mechanism
• Alternative semantics for ICs
– Extends OWL model theory
– Provides the meanings of ICs formally
• Validation algorithm
– Translate ICs to another formalism
– SPARQL or RIF engines can be used
32. Performance
• Using ICs can improve performance!
• Expressive OWL reasoning is not easy
• Profiles of OWL defined for tractable reasoning
– OWL 2 QL, OWL 2 EL, OWL 2 RL
– Less expressive but more efficient
• Modeling some OWL axioms as ICs may reduce
the overall expressivity needed for reasoning
33. Prototype
• Pellet IC validator
– Translates ICs into SPARQL queries automatically
– Executes SPARQL queries with Pellet
– Query results show constraint violations
– Explains constraint violations automatically
• Free download
– http://clarkparsia.com/pellet/icv
34. Code Example
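// Jena 2.x and Pellet imports (JenaICValidator and ConstraintViolation come from Pellet ICV)
import java.util.Iterator;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import org.mindswap.pellet.jena.PelletReasonerFactory;
// Create an inferencing model backed by the Pellet reasoner
InfModel dataModel = ModelFactory.createInfModel(
    PelletReasonerFactory.theInstance().create(),
    ModelFactory.createDefaultModel());
// Load the instance data and the schema into Pellet
dataModel.read("file:data.rdf");
dataModel.read("file:schema.owl");
// Create the IC validator and associate it with the dataset
JenaICValidator validator = new JenaICValidator(dataModel);
// Load the constraints into the IC validator
validator.getConstraints().read("file:constraints.owl");
// Get the constraint violations
Iterator<ConstraintViolation> violations =
    validator.getViolations();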
35. Next Steps
• W3C Member submission for IC semantics
• Robust IC validator implementation
– Incremental validation
– Multi-threaded validation
• Support for IC editing
• Integration with PelletDb
– Scalable reasoning + validation
36. References
• Evren Sirin, Michael Smith, Evan Wallace
Opening, Closing Worlds - On Integrity Constraints
OWL: Experiences and Directions Workshop
(OWLED '08), October 2008.
• Evren Sirin, Jiao Tao
Towards Integrity Constraints in OWL
OWL: Experiences and Directions Workshop
(OWLED '09), October 2009.
• Jiao Tao, Evren Sirin, Jie Bao, Deborah L. McGuinness
Integrity Constraints in OWL
To appear, The 24th AAAI Conference on Artificial
Intelligence (AAAI '10), July 2010.