Validating RDF data:
Challenges and perspectives
Jose Emilio Labra Gayo
WESO Research Group (WEb Semantics Oviedo)
University of Oviedo, Spain
About me…
Founded WESO (Web Semantics Oviedo) research group
Practical applications of semantic technologies since 2004
Several domains: e-Government, e-Health
Some books:
"Web semántica" (in Spanish), 2012
"Validating RDF data", 2017
…and software:
SHaclEX (Scala library, implements ShEx & SHACL)
RDFShape (RDF playground)
HTML version: http://book.validatingrdf.com
Examples: https://github.com/labra/validatingRDFBookExamples
Contents
Short intro to semantic Web & knowledge graphs
Short overview of RDF
Understanding the RDF validation problem
2 technologies: ShEx/SHACL
Challenges & perspectives
Semantic Web
Vision of the Web as web of data
Not only pages, but also data: linked data
Data understandable by machines
Related with:
Big Data
Artificial intelligence
Knowledge representation
Example: Wikipedia vs Wikidata
Tim Berners-Lee
Source: Wikipedia
Source: http://www.internetlivestats.com/total-number-of-websites/
Date: 05/04/2019 (19:05h)
Who consumes Web information?
People
We access through some device
...and Machines
They show us web pages (browsers)
...but also manage information (bots)
Filtering content
Making suggestions
...
"If Google doesn't understand your Web page, it almost doesn't exist"
Web information
[Diagram labels: Reality, Mental model, Publish, Web documents (posts, tweets, web pages, ...), Sensors, Generate, User content generation & filtering, Information retrieval]
Web information must be machine-processable
Semantic web technologies focus on how to achieve this goal
People vs Machines
Machines:
Programmed for some tasks
Predictable (no mistakes*)
No problem with repetitive tasks
Problems understanding the context
People:
Creativity, imagination
Unpredictable (we make mistakes)
We get tired of repetitive tasks
Understanding based on context
*when well programmed
Information understandable by machines?
Example: "Oviedo has a temperature of 36 degrees"
Problem: Ambiguity and context identification
Oviedo ? It can be... A city in Spain
...or a city in Florida, USA
...or a soccer player http://www.wikidata.org/entity/Q325997
http://wikidata.org/entity/Q14317
http://www.wikidata.org/entity/Q1813449
...has a temperature of... https://www.wikidata.org/wiki/Property:P2076
Identifiers (URIs) in Wikidata
Machine representation
http://www.wikidata.org/entity/Q14317
http://www.wikidata.org/prop/direct/P2076
"36"^^<http://www.w3.org/2001/XMLSchema#integer>
Knowledge representation for machines
Simple statement (triple, 3 elements): subject, predicate, object
subject: http://www.wikidata.org/entity/Q14317
predicate: http://www.wikidata.org/prop/direct/P2076
object: "36"^^<http://www.w3.org/2001/XMLSchema#integer>
With prefixes: wd:Q14317 wdt:P2076 "36"^^xsd:integer
prefix wd: <http://www.wikidata.org/entity/>
prefix wdt: <http://www.wikidata.org/prop/direct/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
Prefix declarations help simplify long URIs
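As a rough illustration of what a prefix declaration does, the following Python sketch expands prefixed names into full URIs. The `PREFIXES` table and the `expand` helper are hypothetical names introduced here, and the `wdt:` entry assumes the `http://` form of the Wikidata namespace.

```python
# Hypothetical prefix table mirroring the declarations on the slide.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name: str) -> str:
    """Turn a prefixed name such as 'wd:Q14317' into a full URI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return PREFIXES[prefix] + local


for name in ("wd:Q14317", "wdt:P2076", "xsd:integer"):
    print(name, "->", expand(name))
```

RDF parsers perform this expansion themselves; the sketch only makes the substitution visible.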
Knowledge graphs
[Graph: wd:Q14317 (Oviedo) has wdt:P2076 (temperature) 36^^xsd:integer, wdt:P1082 (population) 220020^^xsd:integer and wdt:P1376 (capital) wd:Q3934 (Asturias); wd:Q10514 (Fernando Alonso) has wdt:P19 (birth place) wd:Q14317, wdt:P569 (birth date) "1981-07-29"^^xsd:date and wdt:P18 (picture) ...jpg]
Wikidata statistics (17-07-2019)
57,462,853 entities
728,979,397 triples
Source: https://www.wikidata.org/wiki/Wikidata:Statistics
Wikidata is an example of a knowledge graph
There are a lot more:
Open: DBpedia, BabelNet, etc
Proprietary: Google, Facebook, Microsoft, etc. also have knowledge graphs
Web information & knowledge graphs
[Diagram: the Web information picture again (reality, mental model, Web documents, sensors, user content, information retrieval), now including knowledge graphs]
Knowledge graphs are a key part of the Web nowadays
RDF
RDF (Resource Description Framework)
Allows representing knowledge graphs
RDF data model = graph as a set of statements
Predicates = URIs
Several syntaxes
Turtle, N-Triples, JSON-LD,...
RDF syntaxes: Turtle
prefix wd: <http://www.wikidata.org/entity/>
prefix wdt: <http://www.wikidata.org/prop/direct/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
wd:Q14317 wdt:P1082 220020 ;
          wdt:P2076 36 ;
          wdt:P1376 wd:Q3934 .
wd:Q10514 wdt:P19 wd:Q14317 ;
          wdt:P569 "1981-07-29"^^xsd:date ;
          wdt:P18 <picture.jpg> .
RDF syntaxes: N-Triples
<http://www.wikidata.org/entity/Q14317> <http://www.wikidata.org/prop/direct/P1082> "220020"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.wikidata.org/entity/Q14317> <http://www.wikidata.org/prop/direct/P1376> <http://www.wikidata.org/entity/Q3934> .
<http://www.wikidata.org/entity/Q14317> <http://www.wikidata.org/prop/direct/P2076> "36"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.wikidata.org/entity/Q10514> <http://www.wikidata.org/prop/direct/P18> <picture.jpg> .
<http://www.wikidata.org/entity/Q10514> <http://www.wikidata.org/prop/direct/P19> <http://www.wikidata.org/entity/Q14317> .
<http://www.wikidata.org/entity/Q10514> <http://www.wikidata.org/prop/direct/P569> "1981-07-29"^^<http://www.w3.org/2001/XMLSchema#date> .
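N-Triples is regular enough that writing it by hand is a handful of lines. The sketch below is only an illustration under assumptions: the tuple encoding of terms and the helper names `term_to_nt` and `to_ntriples` are invented here, and only typed literals and IRIs are covered (no blank nodes or language tags).

```python
WD = "http://www.wikidata.org/entity/"
WDT = "http://www.wikidata.org/prop/direct/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def term_to_nt(term):
    """Serialize ("iri", uri) or ("lit", lexical form, datatype uri)."""
    if term[0] == "iri":
        return f"<{term[1]}>"
    _, lexical, datatype = term
    # Escape the backslash first, then quotes and line breaks.
    escaped = (lexical.replace("\\", "\\\\").replace('"', '\\"')
               .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"^^<{datatype}>'


def to_ntriples(triples):
    """One statement per line, each terminated by ' .'."""
    return "\n".join(
        f"{term_to_nt(s)} {term_to_nt(p)} {term_to_nt(o)} ."
        for s, p, o in sorted(triples)
    )


triples = [
    (("iri", WD + "Q14317"), ("iri", WDT + "P1082"), ("lit", "220020", XSD + "integer")),
    (("iri", WD + "Q14317"), ("iri", WDT + "P1376"), ("iri", WD + "Q3934")),
]
print(to_ntriples(triples))
```

Because every line is a complete statement, N-Triples files can be concatenated, sorted or split without breaking the data.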
RDF syntaxes: JSON-LD
{
  "@context" : "http://...wikidata.json",
  "@graph" : [ {
    "@id" : "wd:Q10514",
    "wdt:P18" : "picture.jpg",
    "wdt:P19" : "wd:Q14317",
    "wdt:P569" : "1981-07-29"
  }, {
    "@id" : "wd:Q14317",
    "wdt:P1082" : 220020,
    "wdt:P1376" : "wd:Q3934",
    "wdt:P2076" : 36
  } ]
}
RDF syntaxes: RDF/XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:wdt="http://www.wikidata.org/prop/direct/"
         xmlns:wd="http://www.wikidata.org/entity/"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdf:Description rdf:about="http://www.wikidata.org/entity/Q10514">
    <wdt:P18 rdf:resource="picture.jpg"/>
    <wdt:P19>
      <rdf:Description rdf:about="http://www.wikidata.org/entity/Q14317">
        <wdt:P1082 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">220020</wdt:P1082>
        <wdt:P1376 rdf:resource="http://www.wikidata.org/entity/Q3934"/>
        <wdt:P2076 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">36</wdt:P2076>
      </rdf:Description>
    </wdt:P19>
    <wdt:P569 rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1981-07-29</wdt:P569>
  </rdf:Description>
</rdf:RDF>
RDF graphs are composable
RDF helps information integration
[Graph: the Wikidata graph about wd:Q14317 and wd:Q10514 combined with a second graph about :alice (schema:name "Alice Cooper"), :bob (schema:name "Robert Smith") and :carol (schema:name "Carol King"), connected through schema:knows and schema:birthPlace arcs]
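Since an RDF graph is a set of triples, composing two graphs amounts to a set union. The Python sketch below illustrates this with plain tuples; the specific arcs chosen for `people_graph` (for example `:alice schema:birthPlace wd:Q14317`) are assumptions made for the example, and blank nodes, which would need renaming before a merge, are left out.

```python
# Two small graphs sharing the node wd:Q14317 (Oviedo).
wikidata_graph = {
    ("wd:Q14317", "wdt:P1082", "220020"),
    ("wd:Q10514", "wdt:P19", "wd:Q14317"),
}
people_graph = {
    (":alice", "schema:birthPlace", "wd:Q14317"),  # assumed arc
    (":alice", "schema:knows", ":bob"),            # assumed arc
    ("wd:Q10514", "wdt:P19", "wd:Q14317"),         # duplicate statement
}

# Merging is set union: duplicate statements collapse automatically.
merged = wikidata_graph | people_graph


def subjects_linked_to(graph, node):
    """Subjects that have any arc pointing to the given node."""
    return sorted({s for s, _, o in graph if o == node})


print(len(merged))
print(subjects_linked_to(merged, "wd:Q14317"))
```

The shared URI is what makes the integration work: both graphs talk about the same node without any prior coordination.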
RDF data model
Graph = set of statements (triples)
Predicates (arcs) = URIs
Subjects: URIs*
Objects: URIs or literals*
*Exception: blank nodes (next slide)
RDF data model: Blank nodes
A statement about something
Example: "Alice knows someone who knows Bob"
[Graph: :alice schema:knows an unnamed node, which schema:knows :bob]
:alice schema:knows _:x .
_:x schema:knows :bob .
Notation (in Turtle)
:alice schema:knows [
  schema:knows :bob
] .
∃x ( :alice :knows x ∧ x :knows :bob )
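The existential reading of a blank node can be checked mechanically: the statement holds if some node, whatever its label, fills the middle position. A minimal Python sketch, with the hypothetical helper `knows_someone_who_knows` and triples encoded as plain tuples:

```python
KNOWS = "schema:knows"

graph = {
    (":alice", KNOWS, "_:x"),
    ("_:x", KNOWS, ":bob"),
}


def knows_someone_who_knows(graph, a, b):
    """True if there exists an x with (a knows x) and (x knows b)."""
    candidates = {o for s, p, o in graph if s == a and p == KNOWS}
    return any((x, KNOWS, b) in graph for x in candidates)


print(knows_someone_who_knows(graph, ":alice", ":bob"))
print(knows_someone_who_knows(graph, ":bob", ":alice"))
```

The label `_:x` never appears in the question being asked, which mirrors the fact that blank node labels are local to a document and carry no meaning of their own.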
RDF data model: Literals
Objects can also be literals
Literals contain a lexical form and a datatype
Common datatypes: XML Schema primitive datatypes
If not specified, a literal has type xsd:string
:bob schema:name "Robert"^^xsd:string ;
     schema:age "18"^^xsd:integer ;
     schema:birthDate "2010-04-12"^^xsd:date .
Equivalent abbreviated form:
:bob schema:name "Robert" ;
     schema:age 18 ;
     schema:birthDate "2010-04-12"^^xsd:date .
RDF data model: Language tagged strings
String literals can be qualified by a language tag
They have datatype rdf:langString
:spain rdfs:label "Spain"@en ;
       rdfs:label "España"@es .
...and that's all
Yes, the RDF Data model is simple
RDF ecosystem
SPARQL
Vocabularies
Ontologies and inference
SPARQL
SPARQL = SPARQL Protocol and RDF Query Language
Similar to SQL, but for RDF
Example: People born in Oviedo
SELECT ?name ?birthDate ?picture WHERE {
  ?person wdt:P19 wd:Q14317 ;
          rdfs:label ?name ;
          wdt:P569 ?birthDate ;
          wdt:P18 ?picture .
  FILTER (lang(?name) = 'es')
}
[Diagram: the query's graph pattern (?person, ?name, ?birthDate, ?picture) matched against the data graph, where wd:Q10514 is one match for ?person]
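Under the hood, a query like this is graph pattern matching: each triple pattern is matched against the data and variable bindings are carried from one pattern to the next. The Python sketch below imitates that for the query above. It is a toy matcher, not how a SPARQL engine is implemented; the sample data (including the Spanish label) and the `match` helper are assumptions for illustration.

```python
def match(patterns, graph, binding=None):
    """Yield variable bindings that satisfy every triple pattern."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for pattern_term, value in zip(first, triple):
            if pattern_term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(pattern_term, value) != value:
                    ok = False
                    break
            elif pattern_term != value:
                ok = False
                break
        if ok:
            yield from match(rest, graph, candidate)


graph = {
    ("wd:Q10514", "wdt:P19", "wd:Q14317"),
    ("wd:Q10514", "rdfs:label", '"Fernando Alonso"@en'),
    ("wd:Q10514", "rdfs:label", '"Fernando Alonso"@es'),
    ("wd:Q10514", "wdt:P569", '"1981-07-29"^^xsd:date'),
    ("wd:Q10514", "wdt:P18", "<picture.jpg>"),
    ("wd:Q14317", "wdt:P1376", "wd:Q3934"),
}
patterns = [
    ("?person", "wdt:P19", "wd:Q14317"),
    ("?person", "rdfs:label", "?name"),
    ("?person", "wdt:P569", "?birthDate"),
    ("?person", "wdt:P18", "?picture"),
]
# The FILTER on the language tag is applied after matching.
results = [b for b in match(patterns, graph) if b["?name"].endswith("@es")]
print(results)
```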
Shared entities and vocabularies
Using URIs as global identifiers allows entities and properties to be reused
Lots of vocabularies: domain specific or general purpose
Examples:
schema.org: properties that major search engines can recognize
Linked Open Vocabularies
Inference and ontologies
Some vocabularies define properties that can infer new information
RDF Schema: Classes, subclasses, etc.
OWL: Ontologies using description logics
Trade-off between expressiveness and complexity
[Diagram: :alice rdf:type :Student, :Student rdfs:subClassOf :Person, :Person owl:sameAs :HumanBeing; further rdf:type arcs for :alice are inferred]
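The RDF Schema part of this inference can be sketched in a few lines: if a node has type C and C is a subclass of D, the node also has type D. The Python below applies that single rule until nothing new is derived. It is an illustrative sketch with the hypothetical helper `infer_types`, covering only rdfs:subClassOf and not owl:sameAs or any other entailment rule.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

graph = {
    (":alice", TYPE, ":Student"),
    (":Student", SUBCLASS, ":Person"),
}


def infer_types(graph):
    """Add rdf:type triples implied by rdfs:subClassOf, up to a fixpoint."""
    inferred = set(graph)
    while True:
        new = {
            (node, TYPE, parent)
            for node, p, cls in inferred if p == TYPE
            for child, q, parent in inferred if q == SUBCLASS and child == cls
        }
        if new <= inferred:   # nothing new was derived
            return inferred
        inferred |= new


closure = infer_types(graph)
print(sorted(closure - graph))
```

Repeating the rule until a fixpoint is reached is what lets the same code follow subclass chains of any length.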
RDF, the good parts...
RDF as an integration language
RDF as a lingua franca for semantic web and linked data
RDF flexibility & integration
Data can be adapted to multiple environments
Open and reusable data by default
RDF for knowledge representation
RDF data stores & SPARQL
RDF, the other parts
Consuming & producing RDF
Multiple syntaxes: Turtle, RDF/XML, JSON-LD, ...
Embedding RDF in HTML
Describing and validating RDF content
Producer Consumer
Why describe & validate RDF?
For producers
Developers can understand the contents they are going to produce
Ensure they produce the expected structure
Advertise and document the structure
Generate interfaces
For consumers
Understand the contents
Verify the structure before processing it
Query generation & optimization
Producer Consumer
Shapes
Similar technologies
Technology | Schema
Relational databases | DDL
XML | DTD, XML Schema, RelaxNG, Schematron
JSON | JSON Schema
RDF | ? (fill that gap)
Understanding the problem
Identifying the shape of graphs...
Shapes can describe the form of a node (node constraint)
...and the number of possible arcs incoming/outgoing from a node
...and the possible values associated with those arcs
RDF node that represents a User:
:alice schema:name "Alice";
       schema:knows :bob, :carol .
Shape of the RDF node:
IRI with schema:name: string, exactly 1
         schema:knows: IRI, 0, 1, ...
In ShEx:
<UserShape> IRI {
  schema:name xsd:string ;
  schema:knows IRI *
}
Understanding the problem
Repeated properties
The same property can be used for different purposes in the same data
Example: A product must have 2 codes with different structure
:product schema:productID "isbn:123-456-789";
schema:productID "code456" .
A practical example from FHIR
See: http://hl7-fhir.github.io/observation-example-bloodpressure.ttl.html
Understanding the problem
Shapes ≠ types
Nodes in RDF graphs can have 0, 1 or many rdf:type declarations
A type can be used in multiple contexts, e.g. foaf:Person
Nodes are not necessarily annotated with discriminating types
Nodes with type :Person can represent friends, students, patients,...
Different meanings and different structure depending on context
Specific validation constraints for different contexts
Understanding the problem
RDF validation ≠ ontology definition ≠ instance data
Ontologies are usually focused on domain entities
RDF validation is focused on RDF graph features (lower level)
Different levels:
Ontology
schema:knows a owl:ObjectProperty ;
  rdfs:domain schema:Person ;
  rdfs:range schema:Person .
Constraints (RDF validation)
<User> IRI {
  schema:name xsd:string ;
  schema:knows IRI
}
A user must have only two properties:
schema:name of value xsd:string
schema:knows with an IRI value
Instance data
:alice schema:name "Alice";
       schema:knows :bob .
Why not use SPARQL to validate?
Pros:
Expressive
Ubiquitous
Cons:
Expressive
Idiomatic: many ways to encode the same constraint
Example: define a SPARQL query to check that there must be one schema:name, which must be an xsd:string, and one schema:gender, which must be schema:Male or schema:Female
ASK {
  { SELECT ?Person {
      ?Person schema:name ?o .
    } GROUP BY ?Person HAVING (COUNT(*)=1)
  }
  { SELECT ?Person {
      ?Person schema:name ?o .
      FILTER ( isLiteral(?o) &&
               datatype(?o) = xsd:string )
    } GROUP BY ?Person HAVING (COUNT(*)=1)
  }
  { SELECT ?Person (COUNT(*) AS ?c1) {
      ?Person schema:gender ?o .
    } GROUP BY ?Person HAVING (COUNT(*)=1)
  }
  { SELECT ?Person (COUNT(*) AS ?c2) {
      ?Person schema:gender ?o .
      FILTER ((?o = schema:Female ||
               ?o = schema:Male))
    } GROUP BY ?Person HAVING (COUNT(*)=1)
  }
  FILTER (?c1 = ?c2)
}
Previous/other approaches
SPIN, by TopQuadrant http://spinrdf.org/
SPARQL templates, Influenced SHACL
Stardog ICV: http://docs.stardog.com/icv/icv-specification.html
OWL with UNA and CWA
OSLC resource shapes: https://www.w3.org/Submission/shapes/
Vocabulary for RDF validation
Dublin Core Application profiles (K. Coyle, T. Baker)
http://dublincore.org/documents/dc-dsp/
RDF Data Descriptions (Fischer et al)
http://ceur-ws.org/Vol-1330/paper-33.pdf
RDFUnit (D. Kontokostas)
http://aksw.org/Projects/RDFUnit.html
...
ShEx and SHACL
2013 RDF Validation Workshop
Conclusions of the workshop:
There is a need for a higher-level, concise language for RDF validation
ShEx initially proposed (v 1.0)
2014 W3C Data Shapes WG chartered
2017 SHACL accepted as W3C recommendation
2017 ShEx 2.0 released as Community group draft
2018 ShEx adopted by Wikidata
Short intro to ShEx
ShEx (Shape Expressions Language)
Concise and human-readable language for RDF validation & description
Syntax similar to SPARQL, Turtle
Semantics inspired by regular expressions & RelaxNG
2 syntaxes: Compact and RDF/JSON-LD
Official info: http://shex.io
Semantics: http://shex.io/shex-semantics/, primer: http://shex.io/shex-primer
ShEx implementations and playgrounds
Libraries:
shex.js: Javascript
SHaclEX: Scala (Jena/RDF4j)
PyShEx: Python
shex-java: Java
Ruby-ShEx: Ruby
Online demos & playgrounds
ShEx-simple
RDFShape
ShEx-Java
ShExValidata
Simple example
Nodes conforming to <User> shape must:
• Be IRIs
• Have exactly one schema:name with a value of type xsd:string
• Have zero or more schema:knows whose values conform to <User>
prefix schema: <http://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
<User> IRI {
  schema:name xsd:string ;
  schema:knows @<User> *
}
Prefix declarations as in Turtle/SPARQL
RDF Validation using ShEx
Schema
<User> IRI {
  schema:name xsd:string ;
  schema:knows @<User> *
}
Data
:alice schema:name "Alice" ;
       schema:knows :alice .
:bob schema:knows :alice ;
     schema:name "Robert".
:carol schema:name "Carol", "Carole" .
:dave schema:name 234 .
:emily foaf:name "Emily" .
:frank schema:name "Frank" ;
       schema:email <mailto:frank@example.org> ;
       schema:knows :alice, :bob .
:grace schema:name "Grace" ;
       schema:knows :alice, _:1 .
_:1 schema:name "Unknown" .
Shape map and validation results
:alice@<User> ✓
:bob@<User> ✓
:carol@<User> ✗ (two schema:name values)
:dave@<User> ✗ (schema:name is not an xsd:string)
:emily@<User> ✗ (no schema:name)
:frank@<User> ✓
:grace@<User> ✗ (_:1 is not an IRI, so it does not conform to <User>)
Try it (RDFShape): https://goo.gl/97bYdv
Try it (ShExDemo): https://goo.gl/Y8hBsW
Validation process
Input: RDF data, ShEx schema, Shape map
Output: Result shape map
ShEx Schema
:User {
  schema:name xsd:string ;
  schema:knows @:User *
}
RDF data
:alice schema:name "Alice" ;
       schema:knows :alice .
:bob schema:knows :alice ;
     schema:name "Robert".
:carol schema:name "Carol", "Carole" .
Shape map
:alice@:User, :bob@:User, :carol@:User
[ShEx Validator]
Result shape map
:alice@:User,
:bob@:User,
:carol@!:User
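To make the process concrete, here is a hand-written Python check for the single <User> shape, run over the seven nodes of the earlier "RDF Validation using ShEx" slide. It is a sketch under assumptions, not a ShEx implementation: the tuple encoding of terms and the helper `conforms_to_user` are invented for the example, and recursive references are handled by assuming that a node already being checked conforms.

```python
# Terms: ("iri", name), ("bnode", label) or ("lit", lexical form, datatype).
def iri(name):
    return ("iri", name)


def lit(text, datatype="xsd:string"):
    return ("lit", text, datatype)


NAME, KNOWS = iri("schema:name"), iri("schema:knows")


def conforms_to_user(node, graph, assumed=frozenset()):
    """<User> IRI { schema:name xsd:string ; schema:knows @<User> * }"""
    if node in assumed:          # recursive reference back to a node in progress
        return True
    if node[0] != "iri":         # node constraint: must be an IRI
        return False
    names = [o for s, p, o in graph if s == node and p == NAME]
    if len(names) != 1 or names[0][0] != "lit" or names[0][2] != "xsd:string":
        return False
    friends = [o for s, p, o in graph if s == node and p == KNOWS]
    return all(conforms_to_user(f, graph, assumed | {node}) for f in friends)


graph = {
    (iri(":alice"), NAME, lit("Alice")), (iri(":alice"), KNOWS, iri(":alice")),
    (iri(":bob"), KNOWS, iri(":alice")), (iri(":bob"), NAME, lit("Robert")),
    (iri(":carol"), NAME, lit("Carol")), (iri(":carol"), NAME, lit("Carole")),
    (iri(":dave"), NAME, lit("234", "xsd:integer")),
    (iri(":emily"), iri("foaf:name"), lit("Emily")),
    (iri(":frank"), NAME, lit("Frank")),
    (iri(":frank"), iri("schema:email"), iri("mailto:frank@example.org")),
    (iri(":frank"), KNOWS, iri(":alice")), (iri(":frank"), KNOWS, iri(":bob")),
    (iri(":grace"), NAME, lit("Grace")),
    (iri(":grace"), KNOWS, iri(":alice")), (iri(":grace"), KNOWS, ("bnode", "_:1")),
    (("bnode", "_:1"), NAME, lit("Unknown")),
}
shape_map = [":alice", ":bob", ":carol", ":dave", ":emily", ":frank", ":grace"]
result = {n: conforms_to_user(iri(n), graph) for n in shape_map}
print(result)
```

The shape is open, so :frank's extra schema:email is ignored, while every schema:knows value must itself conform, which is why :grace fails.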
Example with more ShEx features
:AdultPerson EXTRA rdf:type {
rdf:type [ schema:Person ] ;
:name xsd:string ;
:age MinInclusive 18 ;
:gender [:Male :Female] OR xsd:string ;
:address @:Address ? ;
:worksFor @:Company + ;
}
:Address CLOSED {
:addressLine xsd:string {1,3} ;
:postalCode /[0-9]{5}/ ;
:state @:State ;
:city xsd:string
}
:Company {
:name xsd:string ;
:state @:State ;
:employee @:AdultPerson * ;
}
:State /[A-Z]{2}/
:alice rdf:type :Student, schema:Person ;
:name "Alice" ;
:age 20 ;
:gender :Male ;
:address [
:addressLine "Bancroft Way" ;
:city "Berkeley" ;
:postalCode "55123" ;
:state "CA"
] ;
:worksFor [
:name "Company" ;
:state "CA" ;
:employee :alice
] .
Try it: https://tinyurl.com/yd5hp9z4
More info about ShEx
See:
ShEx by Example (slides):
https://figshare.com/articles/ShExByExample_pptx/6291464
ShEx chapter from Validating RDF data book:
http://book.validatingrdf.com/bookHtml010.html
Short intro to SHACL
W3C recommendation:
https://www.w3.org/TR/shacl/ (July 2017)
RDF vocabulary
2 parts: SHACL-Core, SHACL-SPARQL
SHACL implementations
Name | Parts | Language - Library | Comments
Topbraid SHACL API | SHACL Core, SPARQL | Java (Jena) | Used by TopBraid composer
SHACL playground | SHACL Core | Javascript (rdflib.js) | http://shacl.org/playground/
SHACLEX | SHACL Core | Scala (Jena, RDF4j) | http://rdfshape.weso.es
pySHACL | SHACL Core, SPARQL | Python (rdflib) | https://github.com/RDFLib/pySHACL
Corese SHACL | SHACL Core, SPARQL | Java (STTL) | http://wimmics.inria.fr/corese
RDFUnit | SHACL Core, SPARQL | Java (Jena) | https://github.com/AKSW/RDFUnit
Basic example
Try it (RDFShape): https://goo.gl/T1uuzv
Shapes graph
prefix : <http://example.org/>
prefix sh: <http://www.w3.org/ns/shacl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix schema: <http://schema.org/>
:UserShape a sh:NodeShape ;
  sh:targetNode :alice, :bob, :carol ;
  sh:nodeKind sh:IRI ;
  sh:property :hasName,
              :hasEmail .
:hasName sh:path schema:name ;
  sh:minCount 1;
  sh:maxCount 1;
  sh:datatype xsd:string .
:hasEmail sh:path schema:email ;
  sh:minCount 1;
  sh:maxCount 1;
  sh:nodeKind sh:IRI .
Data graph
:alice schema:name "Alice Cooper" ;
       schema:email <mailto:alice@mail.org> .
:bob schema:firstName "Bob" ;
     schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
       schema:email "carol@mail.org" .
Result: :alice conforms; :bob fails (no schema:name) and :carol fails (schema:email is a literal, not an IRI)
Same example with blank nodes
Try it (RDFShape): https://goo.gl/ukY5vq
Shapes graph
prefix : <http://example.org/>
prefix sh: <http://www.w3.org/ns/shacl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix schema: <http://schema.org/>
:UserShape a sh:NodeShape ;
  sh:targetNode :alice, :bob, :carol ;
  sh:nodeKind sh:IRI ;
  sh:property [
    sh:path schema:name ;
    sh:minCount 1; sh:maxCount 1;
    sh:datatype xsd:string ;
  ] ;
  sh:property [
    sh:path schema:email ;
    sh:minCount 1; sh:maxCount 1;
    sh:nodeKind sh:IRI ;
  ] .
Data graph
:alice schema:name "Alice Cooper" ;
       schema:email <mailto:alice@mail.org> .
:bob schema:firstName "Bob" ;
     schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
       schema:email "carol@mail.org" .
Result: :alice conforms; :bob fails (no schema:name) and :carol fails (schema:email is a literal, not an IRI)
Some definitions about SHACL
Shape: collection of targets and constraint components
Targets: specify which nodes in the data graph must conform to a shape
Constraint components: Determine how to validate a node
Shape
target declarations
constraint components
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
. . .
Validation Report
The output of the validation process is a list of violation errors
No errors ⇒ RDF conforms to shapes graph
[ a sh:ValidationReport ;
sh:conforms false ;
sh:result [
a sh:ValidationResult ;
sh:focusNode :bob ;
sh:message
"MinCount violation. Expected 1, obtained: 0" ;
sh:resultPath schema:name ;
sh:resultSeverity sh:Violation ;
sh:sourceConstraintComponent
sh:MinCountConstraintComponent ;
sh:sourceShape :hasName
] ;
...
[ a sh:ValidationReport ;
sh:conforms true
].
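The following Python sketch shows how a report of this kind could be assembled for the basic example. It is a simplified illustration, not a SHACL processor: shapes are plain dictionaries rather than RDF, only minCount, maxCount, datatype and nodeKind are checked, and the `validate` helper and its result keys are assumptions that borrow the SHACL term names.

```python
def iri(name):
    return ("iri", name)


def lit(text, datatype="xsd:string"):
    return ("lit", text, datatype)


PROPERTY_SHAPES = [
    {"name": ":hasName", "path": iri("schema:name"),
     "minCount": 1, "maxCount": 1, "datatype": "xsd:string"},
    {"name": ":hasEmail", "path": iri("schema:email"),
     "minCount": 1, "maxCount": 1, "nodeKind": "iri"},
]


def validate(graph, targets, property_shapes):
    """Return a dict shaped like a validation report."""
    results = []

    def report(node, shape, component):
        results.append({"focusNode": node[1], "resultPath": shape["path"][1],
                        "sourceShape": shape["name"],
                        "sourceConstraintComponent": component})

    for node in targets:
        for shape in property_shapes:
            values = [o for s, p, o in graph if s == node and p == shape["path"]]
            if len(values) < shape.get("minCount", 0):
                report(node, shape, "sh:MinCountConstraintComponent")
            if "maxCount" in shape and len(values) > shape["maxCount"]:
                report(node, shape, "sh:MaxCountConstraintComponent")
            for v in values:
                if "datatype" in shape and (v[0] != "lit" or v[2] != shape["datatype"]):
                    report(node, shape, "sh:DatatypeConstraintComponent")
                if "nodeKind" in shape and v[0] != shape["nodeKind"]:
                    report(node, shape, "sh:NodeKindConstraintComponent")
    return {"conforms": not results, "results": results}


graph = {
    (iri(":alice"), iri("schema:name"), lit("Alice Cooper")),
    (iri(":alice"), iri("schema:email"), iri("mailto:alice@mail.org")),
    (iri(":bob"), iri("schema:firstName"), lit("Bob")),
    (iri(":bob"), iri("schema:email"), iri("mailto:bob@mail.org")),
    (iri(":carol"), iri("schema:name"), lit("Carol")),
    (iri(":carol"), iri("schema:email"), lit("carol@mail.org")),
}
targets = [iri(":alice"), iri(":bob"), iri(":carol")]
validation_report = validate(graph, targets, PROPERTY_SHAPES)
print(validation_report["conforms"])
for r in validation_report["results"]:
    print(r["focusNode"], r["resultPath"], r["sourceConstraintComponent"])
```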
SHACL processor
Input: shapes graph
:UserShape a sh:NodeShape ;
  sh:targetNode :alice, :bob, :carol ;
  sh:nodeKind sh:IRI ;
  sh:property :hasName,
              :hasEmail .
:hasName sh:path schema:name ;
  sh:minCount 1;
  sh:maxCount 1;
  sh:datatype xsd:string .
. . .
Input: data graph
:alice schema:name "Alice Cooper" ;
       schema:email <mailto:alice@mail.org>.
:bob schema:name "Bob" ;
     schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
       schema:email <mailto:carol@mail.org> .
Output of the SHACL processor: validation report
[ a sh:ValidationReport ;
  sh:conforms true
].
SHACL Core built-in constraint components
Type | Constraints
Cardinality | minCount, maxCount
Types of values | class, datatype, nodeKind
Values | node, in, hasValue, property
Range of values | minInclusive, maxInclusive, minExclusive, maxExclusive
String based | minLength, maxLength, pattern
Language based | languageIn, uniqueLang
Logical constraints | not, and, or, xone
Closed shapes | closed, ignoredProperties
Property pair constraints | equals, disjoint, lessThan, lessThanOrEquals
Non-validating constraints | name, description, order, group
Qualified shapes | qualifiedValueShape, qualifiedValueShapesDisjoint, qualifiedMinCount, qualifiedMaxCount
Longer example
:AdultPerson EXTRA a {
a [ schema:Person ] ;
:name xsd:string ;
:age MinInclusive 18 ;
:gender [:Male :Female] OR xsd:string ;
:address @:Address ? ;
:worksFor @:Company + ;
}
:Address CLOSED {
:addressLine xsd:string {1,3} ;
:postalCode /[0-9]{5}/ ;
:state @:State ;
:city xsd:string
}
:Company {
:name xsd:string ;
:state @:State ;
:employee @:AdultPerson * ;
}
:State /[A-Z]{2}/
:AdultPerson a sh:NodeShape ;
sh:property [
sh:path rdf:type ;
sh:qualifiedValueShape [
sh:hasValue schema:Person
];
sh:qualifiedMinCount 1 ;
sh:qualifiedMaxCount 1 ;
] ;
sh:targetNode :alice ;
sh:property [ sh:path :name ;
sh:minCount 1; sh:maxCount 1;
sh:datatype xsd:string ;
] ;
sh:property [ sh:path :gender ;
sh:minCount 1; sh:maxCount 1;
sh:in (:Male :Female);
] ;
sh:property [ sh:path :age ;
sh:minCount 1; sh:maxCount 1;
sh:minInclusive 18
] ;
sh:property [ sh:path :address ;
sh:node :Address ;
sh:maxCount 1
] ;
sh:property [ sh:path :worksFor ;
sh:node :Company ;
sh:minCount 1
].
:Address a sh:NodeShape ;
sh:closed true ;
sh:property [ sh:path :addressLine;
sh:datatype xsd:string ;
sh:minCount 1 ; sh:maxCount 3
] ;
sh:property [ sh:path :postalCode ;
sh:pattern "[0-9]{5}" ;
sh:minCount 1 ; sh:maxCount 1
] ;
sh:property [ sh:path :city ;
sh:datatype xsd:string ;
sh:minCount 1 ; sh:maxCount 1
] ;
sh:property [ sh:path :state ;
sh:node :State ;
] .
:Company a sh:NodeShape ;
sh:property [ sh:path :name ;
sh:datatype xsd:string
] ;
sh:property [
sh:path :state ;
sh:node :State
] ;
sh:property [ sh:path :employee ;
sh:node :AdultPerson ;
] .
:State a sh:NodeShape ;
sh:pattern "[A-Z]{2}" .
In ShEx
In SHACL
It's recursive!!! (recursion is not well defined in SHACL)
Implementation dependent feature
Try it: https://tinyurl.com/ycl3mkzr
More info about SHACL
SHACL by example (slides):
https://figshare.com/articles/SHACL_by_example/6449645
SHACL chapter at Validating RDF data book
http://book.validatingrdf.com/bookHtml011.html
Some challenges and perspectives
Theoretical foundations of ShEx/SHACL
Inferring shapes from data
Validation Usability
RDF Stream validation
Schema ecosystems
Wikidata
Solid
Theoretical foundations of ShEx/SHACL
Conversion between ShEx and SHACL
SHaclEX library converts subsets of both
Challenges
Recursion and negation
Performance and algorithmic complexity
Detect useful subsets of the languages
Convert to SPARQL
Schema/data mapping
[Diagram: shapes in ShEx ↔ shapes in SHACL]
Inference of Shapes from Data
Useful use case in practice
Knowledge Graph summarization
Some prototypes:
ShExer, RDFShape, ShapeArchitect
Try it with RDFShape:
https://tinyurl.com/y8pjcbyf
[Figure: shape expression generated for wd:Q51613194; shapes are inferred from RDF data]
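One simple way to picture shape inference is to count, for a set of sample nodes, how many values each property has and turn the observed minimum and maximum into a cardinality. The Python sketch below does just that. It is a naive illustration with hypothetical helpers (`infer_shape`, `cardinality`) and made-up sample data, and it does not reproduce how ShExer, RDFShape or ShapeArchitect work.

```python
from collections import Counter

graph = {
    (":alice", "schema:name", '"Alice"'),
    (":alice", "schema:knows", ":bob"),
    (":alice", "schema:knows", ":carol"),
    (":bob", "schema:name", '"Robert"'),
}


def infer_shape(graph, nodes):
    """Map each property to the (min, max) number of values seen per node."""
    counts = {n: Counter(p for s, p, _ in graph if s == n) for n in nodes}
    predicates = sorted({p for c in counts.values() for p in c})
    return {
        p: (min(counts[n][p] for n in nodes), max(counts[n][p] for n in nodes))
        for p in predicates
    }


def cardinality(lo, hi):
    """Render a (min, max) pair in ShEx-like notation."""
    return {(1, 1): "", (0, 1): "?"}.get((lo, hi), f"{{{lo},{hi}}}")


shape = infer_shape(graph, [":alice", ":bob"])
for predicate, (lo, hi) in shape.items():
    print(predicate, cardinality(lo, hi))
```

A shape inferred this way only describes the sample it was computed from, so it is best treated as a draft for a person to review.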
Validation usability
Learning from users
Early adopters: WebIndex, HL7 FHIR, Eclipse Lyo, GenWiki,…
Improve error information/visualization/navigation/repairing
Authoring/visualization tools
Propose annotation sets
UI generation
Error reporting/suggestion (SHOULD/MUST/…)
Shapes
RDF Stream validation
Validation of RDF streams
Challenges:
Incremental validation
Named graphs
Addition/removal of triples
[Diagram: a sensor emits a stream of RDF data; a validator checks it against shapes and emits a stream of validation results]
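As a small illustration of incremental validation, the sketch below keeps a running count of schema:name values per node, so each added or removed triple updates the verdict without revalidating the whole graph. The class `IncrementalNameCheck` is hypothetical and covers a single cardinality constraint; named graphs and richer shapes are out of scope.

```python
from collections import Counter


class IncrementalNameCheck:
    """Tracks 'exactly one schema:name' per node as triples arrive and leave."""

    def __init__(self):
        self.graph = set()
        self.name_count = Counter()

    def add(self, triple):
        if triple not in self.graph:          # ignore duplicate statements
            self.graph.add(triple)
            if triple[1] == "schema:name":
                self.name_count[triple[0]] += 1

    def remove(self, triple):
        if triple in self.graph:
            self.graph.remove(triple)
            if triple[1] == "schema:name":
                self.name_count[triple[0]] -= 1

    def conforms(self, node):
        return self.name_count[node] == 1


check = IncrementalNameCheck()
history = []
check.add((":carol", "schema:name", '"Carol"'))
history.append(check.conforms(":carol"))      # one name so far
check.add((":carol", "schema:name", '"Carole"'))
history.append(check.conforms(":carol"))      # two names
check.remove((":carol", "schema:name", '"Carole"'))
history.append(check.conforms(":carol"))      # back to one name
print(history)
```

Each update costs constant time here; the open question on the slide is how far that property can be kept for full ShEx or SHACL schemas.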
Schema ecosystems: Wikidata
In May, 2019, Wikidata announced ShEx adoption
New namespace for schemas
Example: https://www.wikidata.org/wiki/EntitySchema:E2
It opens lots of opportunities/challenges
Schema evolution and comparison
Schema ecosystems: Solid project
SOLID (SOcial Linked Data): Promoted by Tim Berners-Lee
Goal: Re-decentralize the Web
Separate data from apps
Give users more control over their data
Internally using linked data & RDF
Shapes needed for interoperability
See:
https://ruben.verborgh.org/blog/2019/06/17/shaping-linked-data-apps/
[Diagram: a data pod, shapes A, B and C, and apps 1, 2 and 3]
Conclusions
RDF as a basis for knowledge graphs
Explicit schemas can help improve data quality
2 languages proposed: ShEx/SHACL
Lots of new challenges and opportunities
Validating and Describing Linked Data Portals using RDF Shape Expressions
 
Sem webmaubeuge
Sem webmaubeugeSem webmaubeuge
Sem webmaubeuge
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
 
Semantic Web
Semantic WebSemantic Web
Semantic Web
 
Linked open data sandwich
Linked open data sandwichLinked open data sandwich
Linked open data sandwich
 
MWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdfMWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdf
 
Culture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data LandCulture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data Land
 
Semantic Web talk TEMPLATE
Semantic Web talk TEMPLATESemantic Web talk TEMPLATE
Semantic Web talk TEMPLATE
 
Linked opendata parisemantique.fr - 24062011
Linked opendata   parisemantique.fr - 24062011Linked opendata   parisemantique.fr - 24062011
Linked opendata parisemantique.fr - 24062011
 
Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012
 
RDFa Tutorial
RDFa TutorialRDFa Tutorial
RDFa Tutorial
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
 
Introduction to linked data and the semantic web
Introduction to linked data and the semantic webIntroduction to linked data and the semantic web
Introduction to linked data and the semantic web
 
SWT Lecture Session 2 - RDF
SWT Lecture Session 2 - RDFSWT Lecture Session 2 - RDF
SWT Lecture Session 2 - RDF
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
 

Mais de Jose Emilio Labra Gayo

Introducción a la investigación/doctorado
Introducción a la investigación/doctoradoIntroducción a la investigación/doctorado
Introducción a la investigación/doctoradoJose Emilio Labra Gayo
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data qualityJose Emilio Labra Gayo
 
Legislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesLegislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesJose Emilio Labra Gayo
 
Como publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosComo publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosJose Emilio Labra Gayo
 
Arquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorArquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorJose Emilio Labra Gayo
 
RDF Validation Future work and applications
RDF Validation Future work and applicationsRDF Validation Future work and applications
RDF Validation Future work and applicationsJose Emilio Labra Gayo
 

Mais de Jose Emilio Labra Gayo (20)

Publicaciones de investigación
Publicaciones de investigaciónPublicaciones de investigación
Publicaciones de investigación
 
Introducción a la investigación/doctorado
Introducción a la investigación/doctoradoIntroducción a la investigación/doctorado
Introducción a la investigación/doctorado
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data quality
 
Wikidata
WikidataWikidata
Wikidata
 
Legislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesLegislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologies
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
Introducción a la Web Semántica
Introducción a la Web SemánticaIntroducción a la Web Semántica
Introducción a la Web Semántica
 
2017 Tendencias en informática
2017 Tendencias en informática2017 Tendencias en informática
2017 Tendencias en informática
 
19 javascript servidor
19 javascript servidor19 javascript servidor
19 javascript servidor
 
Como publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosComo publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazados
 
16 Alternativas XML
16 Alternativas XML16 Alternativas XML
16 Alternativas XML
 
XSLT
XSLTXSLT
XSLT
 
XPath
XPathXPath
XPath
 
Arquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorArquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el Servidor
 
RDF Validation Future work and applications
RDF Validation Future work and applicationsRDF Validation Future work and applications
RDF Validation Future work and applications
 
ShEx vs SHACL
ShEx vs SHACLShEx vs SHACL
ShEx vs SHACL
 
Máster en Ingeniería Web
Máster en Ingeniería WebMáster en Ingeniería Web
Máster en Ingeniería Web
 
2016 temuco tecnologias_websemantica
2016 temuco tecnologias_websemantica2016 temuco tecnologias_websemantica
2016 temuco tecnologias_websemantica
 
2015 bogota datos_enlazados
2015 bogota datos_enlazados2015 bogota datos_enlazados
2015 bogota datos_enlazados
 
17 computacion servidor
17 computacion servidor17 computacion servidor
17 computacion servidor
 

Último

定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一Fs
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Sonam Pathan
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Lucknow
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Dana Luther
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Paul Calvano
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMartaLoveguard
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一Fs
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012rehmti665
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Excelmac1
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一Fs
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 

Último (20)

定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhi
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 Documentation
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptx
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
 
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
 

Validating RDF data: Challenges and perspectives

  • 1. Validating RDF data: Challenges and perspectives
      Jose Emilio Labra Gayo
      WESO Research Group (WEb Semantics Oviedo)
      University of Oviedo, Spain
  • 2. About me…
      Founded the WESO (Web Semantics Oviedo) research group
      Practical applications of semantic technologies since 2004
      Several domains: e-Government, e-Health
      Some books:
          "Web semántica" (in Spanish), 2012
          "Validating RDF data", 2017
              HTML version: http://book.validatingrdf.com
              Examples: https://github.com/labra/validatingRDFBookExamples
      …and software:
          SHaclEX (Scala library, implements ShEx & SHACL)
          RDFShape (RDF playground)
  • 3. Contents
      Short intro to the Semantic Web & knowledge graphs
      Short overview of RDF
      Understanding the RDF validation problem
      2 technologies: ShEx/SHACL
      Challenges & perspectives
  • 4. Semantic Web
      Vision of the Web as a web of data
      Not only pages, but also data: linked data
      Data understandable by machines
      Related to:
          Big Data
          Artificial intelligence
          Knowledge representation
      Example: Wikipedia vs Wikidata
      [Image: Tim Berners-Lee. Source: Wikipedia]
      [Image source: http://www.internetlivestats.com/total-number-of-websites/ Date: 05/04/2019 (19:05h)]
  • 5. Who consumes Web information?
      People
          We access it through some device
      ...and machines
          They show us web pages (browsers)
          ...but also manage information (bots)
              Filtering content
              Making suggestions
              ...
      "If Google doesn't understand your Web page, it almost doesn't exist"
  • 6. Web information
      Web information must be machine-processable
      Semantic web technologies focus on how to achieve this goal
      [Diagram labels: Reality, Mental model, User, Sensors, Generate, Publish, Web documents (posts, tweets, web pages, ...), Documents, Information retrieval, Content generation & filtering]
  • 7. People vs Machines
      Machines:
          Programmed for some tasks
          Predictable (no mistakes*)
          No problem with repetitive tasks
          Problems understanding the context
      People:
          Creativity, imagination
          Unpredictable (we make mistakes)
          We get tired of repetitive tasks
          Understanding based on context
      *when well programmed
  • 8. Information understandable by machines?
      Example: "Oviedo has a temperature of 36 degrees"
      Problem: ambiguity and context identification
      Oviedo? It can be...
          a city in Spain
          ...or a city in Florida, USA
          ...or a soccer player
      Identifiers (URIs) in Wikidata:
          http://www.wikidata.org/entity/Q325997
          http://wikidata.org/entity/Q14317
          http://www.wikidata.org/entity/Q1813449
      ...has a temperature of...
          https://www.wikidata.org/wiki/Property:P2076
      Machine representation:
          http://www.wikidata.org/entity/Q14317
          https://www.wikidata.org/prop/direct/P2076
          "36"^^<http://www.w3.org/2001/XMLSchema#integer>
  • 9. Knowledge representation for machines
      Simple statement (triple = 3 elements): subject, predicate, object
          subject:   http://www.wikidata.org/entity/Q14317
          predicate: https://www.wikidata.org/prop/direct/P2076
          object:    "36"^^<http://www.w3.org/2001/XMLSchema#integer>
      Prefix declarations help simplify long URIs:
          wd:  <http://www.wikidata.org/entity/>
          wdt: <https://www.wikidata.org/prop/direct/>
          xsd: <http://www.w3.org/2001/XMLSchema#>
      The same triple with prefixes:
          wd:Q14317 wdt:P2076 "36"^^xsd:integer
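As a rough illustration of the prefix mechanism on this slide, the sketch below expands prefixed names into full URIs and builds the triple as a plain 3-tuple. The `expand` helper and the tuple encoding of the literal are assumptions made for illustration, not part of any RDF library.

```python
# Prefix table copied from the slide.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "https://www.wikidata.org/prop/direct/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'wd:Q14317' into a full URI (hypothetical helper)."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


# A triple is (subject, predicate, object). The object is a typed literal,
# modelled here as a (lexical form, datatype URI) pair.
triple = (
    expand("wd:Q14317"),
    expand("wdt:P2076"),
    ("36", expand("xsd:integer")),
)
print(triple)
```

The tuple encoding is only one possible in-memory representation; real RDF libraries provide dedicated node and literal classes.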
  • 10. Knowledge graphs
      [Figure: graph with Oviedo (wd:Q14317), Asturias (wd:Q3934) and Fernando Alonso (wd:Q10514). Arcs: temperature (wdt:P2076) 36^^xsd:integer, population (wdt:P1082) 220020^^xsd:integer, capital (wdt:P1376), birth place (wdt:P19), birth date (wdt:P569) "1981-07-29"^^xsd:date, picture (wdt:P18) ...jpg]
      Wikidata is an example of a knowledge graph
      Wikidata statistics (17-07-2019):
          57,462,853 entities
          728,979,397 triples
          Source: https://www.wikidata.org/wiki/Wikidata:Statistics
      There are many more:
          Open: DBpedia, BabelNet, etc.
          Proprietary: Google, Facebook, Microsoft, etc. also have knowledge graphs
  • 11. Web information & knowledge graphs
      Knowledge graphs are a key part of the Web nowadays
      [Diagram labels: Reality, Mental model, User, Sensors, Generate, Publish, Web documents (posts, tweets, web pages, ...), Documents, Information retrieval, Knowledge graphs, Content generation & filtering]
  • 12. RDF
      RDF (Resource Description Framework)
      Allows representing knowledge graphs
      RDF data model = graph as a set of statements
      Predicates = URIs
      Several syntaxes: Turtle, N-Triples, JSON-LD, ...
      [Figure: the Oviedo / Asturias / Fernando Alonso graph]
  • 13. RDF syntaxes: Turtle
          prefix wd: <http://www.wikidata.org/entity/>
          prefix wdt: <https://www.wikidata.org/prop/direct/>
          prefix xsd: <http://www.w3.org/2001/XMLSchema#>

          wd:Q14317 wdt:P1082 220020 ;
                    wdt:P2076 36 ;
                    wdt:P1376 wd:Q3934 .
          wd:Q10514 wdt:P19 wd:Q14317 ;
                    wdt:P569 "1981-07-29"^^xsd:date ;
                    wdt:P18 <picture.jpg> .
      [Figure: the Oviedo / Asturias / Fernando Alonso graph]
  • 14. RDF syntaxes: N-Triples
          <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1082> "220020"^^<http://www.w3.org/2001/XMLSchema#integer> .
          <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1376> <http://www.wikidata.org/entity/Q3934> .
          <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P2076> "36"^^<http://www.w3.org/2001/XMLSchema#integer> .
          <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P18> <picture.jpg> .
          <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P19> <http://www.wikidata.org/entity/Q14317> .
          <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P569> "1981-07-29"^^<http://www.w3.org/2001/XMLSchema#date> .
      [Figure: the Oviedo / Asturias / Fernando Alonso graph]
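N-Triples is simple enough that a serializer fits in a few lines. The sketch below writes one statement per line with full URIs in angle brackets. The `term` and `to_ntriples` helpers are hypothetical, literals are modelled as (lexical form, datatype) pairs, and string escaping is deliberately not handled.

```python
WD = "http://www.wikidata.org/entity/"
WDT = "https://www.wikidata.org/prop/direct/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def term(t):
    """Render a URI (str) or a typed literal ((lexical, datatype) tuple)."""
    if isinstance(t, tuple):
        lexical, datatype = t
        return f'"{lexical}"^^<{datatype}>'
    return f"<{t}>"


def to_ntriples(triples):
    """One statement per line, each ending in ' .'; lines are sorted for stable output."""
    return "\n".join(sorted(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples))


triples = {
    (WD + "Q14317", WDT + "P1082", ("220020", XSD + "integer")),
    (WD + "Q14317", WDT + "P1376", WD + "Q3934"),
}
print(to_ntriples(triples))
```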
  • 15. RDF syntaxes: JSON-LD
          {
            "@context" : "http://...wikidata.json",
            "@graph" : [
              { "@id" : "wd:Q10514",
                "wdt:P18" : "picture.jpg",
                "wdt:P19" : "wd:Q14317",
                "wdt:P569" : "1981-07-29"
              },
              { "@id" : "wd:Q14317",
                "wdt:P1082" : 220020,
                "wdt:P1376" : "wd:Q3934",
                "wdt:P2076" : 36
              }
            ]
          }
      [Figure: the Oviedo / Asturias / Fernando Alonso graph]
  • 16. RDF syntaxes: RDF/XML
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:wdt="https://www.wikidata.org/prop/direct/"
                   xmlns:wd="http://www.wikidata.org/entity/"
                   xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
            <rdf:Description rdf:about="http://www.wikidata.org/entity/Q10514">
              <wdt:P18>picture.jpg</wdt:P18>
              <wdt:P19>
                <rdf:Description rdf:about="http://www.wikidata.org/entity/Q14317">
                  <wdt:P1082 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">220020</wdt:P1082>
                  <wdt:P1376 rdf:resource="http://www.wikidata.org/entity/Q3934"/>
                  <wdt:P2076 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">36</wdt:P2076>
                </rdf:Description>
              </wdt:P19>
              <wdt:P569 rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1981-07-29</wdt:P569>
            </rdf:Description>
          </rdf:RDF>
      [Figure: the Oviedo / Asturias / Fernando Alonso graph]
  • 17. RDF graphs are composable
      RDF helps information integration
      [Figure: the Wikidata graph (wd:Q14317, wd:Q3934, wd:Q10514) joined with a second graph in which :alice, :bob and :carol have schema:name values (Alice Cooper, Robert Smith, Carol King) and are linked by schema:knows and schema:birthPlace arcs]
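Because an RDF graph is a set of triples, composing two graphs that share URIs amounts to a set union. The sketch below illustrates this with prefixed names kept as plain strings. The exact arcs of the second graph are an assumption based on the figure labels, and merging graphs that contain blank nodes would need extra care that is not shown here.

```python
# Graph 1: a fragment of the Wikidata example.
wikidata = {
    ("wd:Q14317", "wdt:P1376", "wd:Q3934"),
    ("wd:Q10514", "wdt:P19", "wd:Q14317"),
}

# Graph 2: a hypothetical people graph that reuses the same URI for Oviedo.
people = {
    (":alice", "schema:birthPlace", "wd:Q14317"),
    (":alice", "schema:knows", ":bob"),
}

# Composition is plain set union: shared URIs become shared nodes.
merged = wikidata | people

# After merging, one pass answers a question that spans both sources.
birth_properties = {"wdt:P19", "schema:birthPlace"}
born_in_oviedo = {s for s, p, o in merged if p in birth_properties and o == "wd:Q14317"}
print(sorted(born_in_oviedo))
```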
  • 18. RDF data model
      Graph = set of statements (triples)
      Predicates (arcs) = URIs
      Subjects: URIs*
      Objects: URIs or literals*
      *Exception: blank nodes (next slide)
  • 19. RDF data model: Blank nodes
      A statement about something that is not named
      Example: "Alice knows someone who knows Bob"
      Notation (in Turtle):
          :alice schema:knows _:x .
          _:x schema:knows :bob .
      Abbreviated form:
          :alice schema:knows [ schema:knows :bob ] .
      Logical reading:
          ∃x ( :alice :knows x ∧ x :knows :bob )
  • 20. RDF data model: Literals
      Objects can also be literals
      Literals contain a lexical form and a datatype
      Common datatypes: XML Schema primitive datatypes
      If not specified, a literal has type xsd:string
      With explicit datatypes:
          :bob schema:name "Robert"^^xsd:string ;
               schema:age "18"^^xsd:integer ;
               schema:birthDate "2010-04-12"^^xsd:date .
      Equivalent abbreviated form:
          :bob schema:name "Robert" ;
               schema:age 18 ;
               schema:birthDate "2010-04-12"^^xsd:date .
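The two Turtle snippets on this slide denote the same literals. A minimal sketch of that equivalence follows, assuming a hypothetical `Literal` type and a deliberately tiny `from_turtle_shorthand` helper that covers only bare integers and quoted strings.

```python
from typing import NamedTuple

XSD = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    """A literal is a lexical form plus a datatype; xsd:string is the default."""
    lexical: str
    datatype: str = XSD + "string"


def from_turtle_shorthand(token: str) -> Literal:
    """Bare digits become xsd:integer; a quoted token becomes a plain xsd:string literal."""
    if token.lstrip("+-").isdigit():
        return Literal(token, XSD + "integer")
    return Literal(token.strip('"'))


print(from_turtle_shorthand("18"))
print(from_turtle_shorthand('"Robert"'))
```

Turtle has further shorthands (decimals, doubles, booleans) that this sketch ignores.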
  • 21. RDF data model: Language-tagged strings
      String literals can be qualified by a language tag
      They have datatype rdf:langString
          :spain rdfs:label "Spain"@en ;
                 rdfs:label "España"@es .
  • 22. ...and that's all
      Yes, the RDF data model is simple
  • 24. SPARQL
      SPARQL = SPARQL Protocol and RDF Query Language
      Similar to SQL, but for RDF
      Example: People born in Oviedo
          SELECT ?name ?birthDate ?picture WHERE {
            ?person wdt:P19 wd:Q14317 ;
                    rdfs:label ?name ;
                    wdt:P569 ?birthDate ;
                    wdt:P18 ?picture .
            FILTER (Lang(?name) = 'es')
          }
      [Figure: the query's graph pattern (?person with wdt:P19 wd:Q14317, rdfs:label ?name, wdt:P569 ?birthDate, wdt:P18 ?picture) matched against the data for wd:Q10514]
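To give an intuition for how the WHERE clause is evaluated, here is a minimal sketch of basic graph pattern matching over a set of triples. It is not a SPARQL engine: terms are plain strings, names starting with `?` are variables, and FILTER, language tags and datatypes are left out. The `match` and `query` functions are hypothetical names.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`; return None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def query(patterns, graph):
    """Evaluate a list of triple patterns against a set of triples, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in graph
            if (extended := match(pattern, t, b)) is not None
        ]
    return bindings


graph = {
    ("wd:Q10514", "wdt:P19", "wd:Q14317"),
    ("wd:Q10514", "wdt:P569", "1981-07-29"),
    ("wd:Q10514", "rdfs:label", "Fernando Alonso"),
    ("wd:Q14317", "wdt:P1376", "wd:Q3934"),
}
results = query(
    [
        ("?person", "wdt:P19", "wd:Q14317"),
        ("?person", "rdfs:label", "?name"),
        ("?person", "wdt:P569", "?birthDate"),
    ],
    graph,
)
print(results)
```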
  • 25. Shared entities and vocabularies
      URIs act as global identifiers, so they can be reused across datasets
      Lots of vocabularies: domain-specific or general-purpose
      Examples:
          schema.org: properties that major search engines can recognize
          Linked Open Vocabularies
  • 26. Inference and ontologies
      Some vocabularies define properties that allow new information to be inferred
      RDF Schema: classes, subclasses, etc.
      OWL: ontologies using description logics
      Trade-off between expressiveness and complexity
      [Figure: example graph with nodes :alice, :Student, :Person and :HumanBeing connected by rdf:type, rdfs:subClassOf and owl:sameAs arcs]
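A small sketch of what inferring new information can mean in practice: two RDF Schema rules (type propagation along rdfs:subClassOf, and subclass transitivity) applied until no new triple appears. The input triples are an assumption modelled on the figure labels, and owl:sameAs reasoning is intentionally left out.

```python
def rdfs_closure(graph):
    """Apply subclass transitivity and type propagation until nothing new is derived."""
    inferred = set(graph)
    while True:
        new = set()
        for sub, p, sup in inferred:
            if p != "rdfs:subClassOf":
                continue
            for s2, p2, o2 in inferred:
                if o2 != sub:
                    continue
                if p2 == "rdf:type":
                    new.add((s2, "rdf:type", sup))         # instance of sub is instance of sup
                elif p2 == "rdfs:subClassOf":
                    new.add((s2, "rdfs:subClassOf", sup))  # transitivity
        if new <= inferred:
            return inferred
        inferred |= new


graph = {
    (":alice", "rdf:type", ":Student"),
    (":Student", "rdfs:subClassOf", ":Person"),
    (":Person", "rdfs:subClassOf", ":Agent"),  # hypothetical extra class
}
closure = rdfs_closure(graph)
print(sorted(closure - graph))
```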
  • 27. RDF, the good parts...
      RDF as an integration language
      RDF as a lingua franca for the Semantic Web and linked data
      RDF flexibility & integration
          Data can be adapted to multiple environments
          Open and reusable data by default
      RDF for knowledge representation
      RDF data stores & SPARQL
  • 28. RDF, the other parts
      Consuming & producing RDF
          Multiple syntaxes: Turtle, RDF/XML, JSON-LD, ...
          Embedding RDF in HTML
          Describing and validating RDF content
      [Diagram: Producer, Consumer]
  • 29. Why describe & validate RDF?
      For producers
          Developers can understand the contents they are going to produce
          Ensure they produce the expected structure
          Advertise and document the structure
          Generate interfaces
      For consumers
          Understand the contents
          Verify the structure before processing it
          Query generation & optimization
      [Diagram: Producer, Shapes, Consumer]
  • 30. Similar technologies
          Technology              Schema
          Relational databases    DDL
          XML                     DTD, XML Schema, RelaxNG, Schematron
          JSON                    JSON Schema
          RDF                     ? (fill that gap)
  • 31. Understanding the problem: identifying the shape of graphs
      Shapes can describe the form of a node (node constraint)
      ...and the number of possible arcs incoming/outgoing from a node
      ...and the possible values associated with those arcs
      RDF node:
          :alice schema:name "Alice";
                 schema:knows :bob, :carol .
      Shape of an RDF node that represents a User:
          Node constraint: IRI
          Property        Value     Cardinality
          schema:name     string    1
          schema:knows    IRI       0, 1, ...
      ShEx:
          <UserShape> IRI {
            schema:name xsd:string ;
            schema:knows IRI *
          }
  • 32. Understanding the problem: repeated properties
      The same property can be used for different purposes in the same data
      Example: a product must have 2 codes with different structure
          :product schema:productID "isbn:123-456-789";
                   schema:productID "code456" .
      A practical example from FHIR
      See: http://hl7-fhir.github.io/observation-example-bloodpressure.ttl.html
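The repeated-property constraint is awkward because a validator has to assign each value to a different role. The sketch below checks the product example under the assumption that one value must look like `isbn:...` and the other like `code...`. Both regular expressions are hypothetical, since the slide does not define the two structures.

```python
import re

ISBN = re.compile(r"isbn:[0-9-]+")  # hypothetical pattern for the first kind of code
CODE = re.compile(r"code[0-9]+")    # hypothetical pattern for the second kind of code


def valid_product_ids(values):
    """True if there are exactly two values: one ISBN-style and one code-style, in any order."""
    if len(values) != 2:
        return False
    a, b = values
    return any(ISBN.fullmatch(x) and CODE.fullmatch(y) for x, y in ((a, b), (b, a)))


print(valid_product_ids(["isbn:123-456-789", "code456"]))
print(valid_product_ids(["code1", "code2"]))
```

Trying both assignments by hand works for two values. A schema language can hide this search from the author, which is part of the motivation given on this slide.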
  • 33. Understanding the problem: shapes ≠ types
      Nodes in RDF graphs can have 0, 1 or many rdf:type declarations
      A type can be used in multiple contexts, e.g. foaf:Person
      Nodes are not necessarily annotated with discriminating types
          Nodes with type :Person can represent friends, students, patients, ...
          Different meanings and different structure depending on context
      Specific validation constraints for different contexts
  • 34. Understanding the problem: RDF validation ≠ ontology definition ≠ instance data
      Ontologies are usually focused on domain entities
      RDF validation is focused on RDF graph features (lower level)
      Different levels:
      Ontology:
          schema:knows a owl:ObjectProperty ;
              rdfs:domain schema:Person ;
              rdfs:range schema:Person .
      Constraints (RDF validation):
          <User> IRI {
            schema:name xsd:string ;
            schema:knows IRI
          }
          A user must have only two properties: schema:name with an xsd:string value and schema:knows with an IRI value
      Instance data:
          :alice schema:name "Alice";
                 schema:knows :bob .
  • 35. Why not use SPARQL to validate?
      Pros:
          Expressive
          Ubiquitous
      Cons:
          Expressive
          Idiomatic: many ways to encode the same constraint
      Example: define a SPARQL query to check that there is exactly one schema:name, which must be an xsd:string, and exactly one schema:gender, which must be schema:Male or schema:Female
          ASK {
            { SELECT ?Person {
                ?Person schema:name ?o .
              } GROUP BY ?Person HAVING (COUNT(*)=1)
            }
            { SELECT ?Person {
                ?Person schema:name ?o .
                FILTER ( isLiteral(?o) && datatype(?o) = xsd:string )
              } GROUP BY ?Person HAVING (COUNT(*)=1)
            }
            { SELECT ?Person (COUNT(*) AS ?c1) {
                ?Person schema:gender ?o .
              } GROUP BY ?Person HAVING (COUNT(*)=1)
            }
            { SELECT ?Person (COUNT(*) AS ?c2) {
                ?Person schema:gender ?o .
                FILTER ((?o = schema:Female || ?o = schema:Male))
              } GROUP BY ?Person HAVING (COUNT(*)=1)
            }
            FILTER (?c1 = ?c2)
          }
  • 36. Previous/other approaches
      SPIN, by TopQuadrant: http://spinrdf.org/
          SPARQL templates; influenced SHACL
      Stardog ICV: http://docs.stardog.com/icv/icv-specification.html
          OWL with the unique name assumption (UNA) and the closed world assumption (CWA)
      OSLC resource shapes: https://www.w3.org/Submission/shapes/
          Vocabulary for RDF validation
      Dublin Core Application profiles (K. Coyle, T. Baker): http://dublincore.org/documents/dc-dsp/
      RDF Data Descriptions (Fischer et al.): http://ceur-ws.org/Vol-1330/paper-33.pdf
      RDFUnit (D. Kontokostas): http://aksw.org/Projects/RDFUnit.html
      ...
  • 37. ShEx and SHACL
      2013 RDF Validation Workshop
          Conclusion of the workshop: there is a need for a higher-level, concise language for RDF validation
          ShEx initially proposed (v1.0)
      2014 W3C Data Shapes WG chartered
      2017 SHACL accepted as W3C Recommendation
      2017 ShEx 2.0 released as Community Group draft
      2018 ShEx adopted by Wikidata
  • 38. Short intro to ShEx
      ShEx (Shape Expressions Language)
      Concise and human-readable language for RDF validation & description
      Syntax similar to SPARQL and Turtle
      Semantics inspired by regular expressions & RelaxNG
      2 syntaxes: compact and RDF/JSON-LD
      Official info: http://shex.io
      Semantics: http://shex.io/shex-semantics/
      Primer: http://shex.io/shex-primer
  • 39. ShEx implementations and playgrounds
      Libraries:
          shex.js: JavaScript
          SHaclEX: Scala (Jena/RDF4j)
          PyShEx: Python
          shex-java: Java
          Ruby-ShEx: Ruby
      Online demos & playgrounds:
          ShEx-simple
          RDFShape
          ShEx-Java
          ShExValidata
  • 40. Simple example
      Nodes conforming to the <User> shape must:
          • Be IRIs
          • Have exactly one schema:name with a value of type xsd:string
          • Have zero or more schema:knows whose values conform to <User>
      Prefix declarations as in Turtle/SPARQL:
          prefix schema: <http://schema.org/>
          prefix xsd: <http://www.w3.org/2001/XMLSchema#>

          <User> IRI {
            schema:name xsd:string ;
            schema:knows @<User> *
          }
  • 41. RDF Validation using ShEx
    Schema:
      <User> IRI {
        schema:name xsd:string ;
        schema:knows @<User> *
      }
    Data:
      :alice schema:name "Alice" ;
             schema:knows :alice .
      :bob schema:knows :alice ;
           schema:name "Robert" .
      :carol schema:name "Carol", "Carole" .
      :dave schema:name 234 .
      :emily foaf:name "Emily" .
      :frank schema:name "Frank" ;
             schema:email <mailto:frank@example.org> ;
             schema:knows :alice, :bob .
      :grace schema:name "Grace" ;
             schema:knows :alice, _:1 .
      _:1 schema:name "Unknown" .
    Shape map (with validation result):
      :alice@<User> ✓, :bob@<User> ✓, :carol@<User> ✗, :dave@<User> ✗, :emily@<User> ✗, :frank@<User> ✓, :grace@<User> ✗
    Try it (RDFShape): https://goo.gl/97bYdv
    Try it (ShExDemo): https://goo.gl/Y8hBsW
  • 42. Validation process
    Input: RDF data, ShEx schema, shape map. Output: result shape map.
    ShEx schema:
      :User {
        schema:name xsd:string ;
        schema:knows @:User *
      }
    RDF data:
      :alice schema:name "Alice" ;
             schema:knows :alice .
      :bob schema:knows :alice ;
           schema:name "Robert" .
      :carol schema:name "Carol", "Carole" .
    Shape map:
      :alice@:User, :bob@:User, :carol@:User
    ShEx validator → result shape map:
      :alice@:User, :bob@:User, :carol@!:User
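The validation process above (RDF data + schema + shape map in, result shape map out) can be illustrated with a small Python sketch. It is not a ShEx engine: the <User> shape is hard-coded in the hypothetical function `conforms_user`, terms are plain strings and tuples chosen only for this example, and the recursive reference `@<User>` is handled by assuming conformance for nodes that are already being checked.

```python
from collections import defaultdict

XSD_STRING = "xsd:string"
XSD_INTEGER = "xsd:integer"


def lit(value, datatype=XSD_STRING):
    """Represent an RDF literal as a (value, datatype) tuple (assumed encoding)."""
    return (value, datatype)


# Data from the "RDF Validation using ShEx" slide; IRIs and blank nodes are strings.
TRIPLES = [
    (":alice", "schema:name", lit("Alice")),
    (":alice", "schema:knows", ":alice"),
    (":bob", "schema:knows", ":alice"),
    (":bob", "schema:name", lit("Robert")),
    (":carol", "schema:name", lit("Carol")),
    (":carol", "schema:name", lit("Carole")),
    (":dave", "schema:name", lit(234, XSD_INTEGER)),
    (":emily", "foaf:name", lit("Emily")),
    (":frank", "schema:name", lit("Frank")),
    (":frank", "schema:email", "mailto:frank@example.org"),
    (":frank", "schema:knows", ":alice"),
    (":frank", "schema:knows", ":bob"),
    (":grace", "schema:name", lit("Grace")),
    (":grace", "schema:knows", ":alice"),
    (":grace", "schema:knows", "_:1"),
    ("_:1", "schema:name", lit("Unknown")),
]


def index(triples):
    """Group triples as subject -> predicate -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        graph[s][p].append(o)
    return graph


def is_iri(term):
    return isinstance(term, str) and not term.startswith("_:")


def is_string_literal(term):
    return isinstance(term, tuple) and term[1] == XSD_STRING


def conforms_user(node, graph, visiting=frozenset()):
    """Hand-coded check for: <User> IRI { schema:name xsd:string ; schema:knows @<User> * }"""
    if node in visiting:
        return True  # recursive reference: assume conformance while the node is being checked
    if not is_iri(node):
        return False
    props = graph.get(node, {})
    names = props.get("schema:name", [])
    if len(names) != 1 or not is_string_literal(names[0]):
        return False
    # The shape is not closed, so other properties (e.g. schema:email) are ignored.
    friends = props.get("schema:knows", [])
    return all(conforms_user(f, graph, visiting | {node}) for f in friends)


def validate(triples, shape_map):
    """Return a result shape map: node -> True (conforms) / False (does not conform)."""
    graph = index(triples)
    return {node: conforms_user(node, graph) for node in shape_map}


result = validate(
    TRIPLES, [":alice", ":bob", ":carol", ":dave", ":emily", ":frank", ":grace"]
)
for node, ok in result.items():
    print(f"{node}@{'' if ok else '!'}<User>")
```

The printed lines follow the result shape map notation of the slide, where `!` marks a node that does not conform.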
  • 43. Example with more ShEx features
    Schema:
      :AdultPerson EXTRA rdf:type {
        rdf:type [ schema:Person ] ;
        :name xsd:string ;
        :age MinInclusive 18 ;
        :gender [:Male :Female] OR xsd:string ;
        :address @:Address ? ;
        :worksFor @:Company + ;
      }
      :Address CLOSED {
        :addressLine xsd:string {1,3} ;
        :postalCode /[0-9]{5}/ ;
        :state @:State ;
        :city xsd:string
      }
      :Company {
        :name xsd:string ;
        :state @:State ;
        :employee @:AdultPerson * ;
      }
      :State /[A-Z]{2}/
    Data:
      :alice rdf:type :Student, schema:Person ;
        :name "Alice" ;
        :age 20 ;
        :gender :Male ;
        :address [
          :addressLine "Bancroft Way" ;
          :city "Berkeley" ;
          :postalCode "55123" ;
          :state "CA"
        ] ;
        :worksFor [
          :name "Company" ;
          :state "CA" ;
          :employee :alice
        ] .
    Try it: https://tinyurl.com/yd5hp9z4
  • 44. More info about ShEx See: ShEx by Example (slides): https://figshare.com/articles/ShExByExample_pptx/6291464 ShEx chapter from Validating RDF data book: http://book.validatingrdf.com/bookHtml010.html
  • 45. Short intro to SHACL. W3C Recommendation (July 2017): https://www.w3.org/TR/shacl/ Defined as an RDF vocabulary. 2 parts: SHACL-Core and SHACL-SPARQL
  • 46. SHACL implementations
    Name | Parts | Language (library) | Comments
    TopBraid SHACL API | SHACL Core, SPARQL | Java (Jena) | Used by TopBraid Composer
    SHACL playground | SHACL Core | JavaScript (rdflib.js) | http://shacl.org/playground/
    SHaclEX | SHACL Core | Scala (Jena, RDF4J) | http://rdfshape.weso.es
    pySHACL | SHACL Core, SPARQL | Python (rdflib) | https://github.com/RDFLib/pySHACL
    Corese SHACL | SHACL Core, SPARQL | Java (STTL) | http://wimmics.inria.fr/corese
    RDFUnit | SHACL Core, SPARQL | Java (Jena) | https://github.com/AKSW/RDFUnit
  • 47. Basic example
    Shapes graph:
      prefix : <http://example.org/>
      prefix sh: <http://www.w3.org/ns/shacl#>
      prefix xsd: <http://www.w3.org/2001/XMLSchema#>
      prefix schema: <http://schema.org/>
      :UserShape a sh:NodeShape ;
        sh:targetNode :alice, :bob, :carol ;
        sh:nodeKind sh:IRI ;
        sh:property :hasName, :hasEmail .
      :hasName sh:path schema:name ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string .
      :hasEmail sh:path schema:email ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:nodeKind sh:IRI .
    Data graph:
      :alice schema:name "Alice Cooper" ;
             schema:email <mailto:alice@mail.org> .
      :bob schema:firstName "Bob" ;
           schema:email <mailto:bob@mail.org> .
      :carol schema:name "Carol" ;
             schema:email "carol@mail.org" .
    Result: :bob ✗ (no schema:name), :carol ✗ (schema:email is not an IRI)
    Try it (RDFShape): https://goo.gl/T1uuzv
  • 48. Same example with blank nodes
    Shapes graph:
      prefix : <http://example.org/>
      prefix sh: <http://www.w3.org/ns/shacl#>
      prefix xsd: <http://www.w3.org/2001/XMLSchema#>
      prefix schema: <http://schema.org/>
      :UserShape a sh:NodeShape ;
        sh:targetNode :alice, :bob, :carol ;
        sh:nodeKind sh:IRI ;
        sh:property [
          sh:path schema:name ;
          sh:minCount 1 ;
          sh:maxCount 1 ;
          sh:datatype xsd:string ;
        ] ;
        sh:property [
          sh:path schema:email ;
          sh:minCount 1 ;
          sh:maxCount 1 ;
          sh:nodeKind sh:IRI ;
        ] .
    Data graph:
      :alice schema:name "Alice Cooper" ;
             schema:email <mailto:alice@mail.org> .
      :bob schema:firstName "Bob" ;
           schema:email <mailto:bob@mail.org> .
      :carol schema:name "Carol" ;
             schema:email "carol@mail.org" .
    Result: :bob ✗ (no schema:name), :carol ✗ (schema:email is not an IRI)
    Try it (RDFShape): https://goo.gl/ukY5vq
  • 49. Some definitions about SHACL
    Shape: collection of targets and constraint components
    Targets: specify which nodes in the data graph must conform to a shape
    Constraint components: determine how to validate a node
    Example (a shape with its target declarations and constraint components):
      :UserShape a sh:NodeShape ;
        sh:targetNode :alice, :bob, :carol ;   # target declaration
        sh:nodeKind sh:IRI ;                   # constraint component
        sh:property :hasName, :hasEmail .
      :hasName sh:path schema:name ;
        sh:minCount 1 ;                        # constraint components
        sh:maxCount 1 ;
        sh:datatype xsd:string .
      ...
  • 50. Validation Report
    The output of the validation process is a list of violation errors.
    No errors ⇒ the RDF data conforms to the shapes graph.
    Report with a violation:
      [ a sh:ValidationReport ;
        sh:conforms false ;
        sh:result [ a sh:ValidationResult ;
          sh:focusNode :bob ;
          sh:message "MinCount violation. Expected 1, obtained: 0" ;
          sh:resultPath schema:name ;
          sh:resultSeverity sh:Violation ;
          sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
          sh:sourceShape :hasName ] ;
        ...
      ] .
    Report without errors:
      [ a sh:ValidationReport ;
        sh:conforms true ] .
  • 51. SHACL processor
    Shapes graph:
      :UserShape a sh:NodeShape ;
        sh:targetNode :alice, :bob, :carol ;
        sh:nodeKind sh:IRI ;
        sh:property :hasName, :hasEmail .
      :hasName sh:path schema:name ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string .
      ...
    Data graph:
      :alice schema:name "Alice Cooper" ;
             schema:email <mailto:alice@mail.org> .
      :bob schema:name "Bob" ;
           schema:email <mailto:bob@mail.org> .
      :carol schema:name "Carol" ;
             schema:email <mailto:carol@mail.org> .
    SHACL processor → validation report:
      [ a sh:ValidationReport ;
        sh:conforms true ] .
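What a SHACL processor does with the :UserShape example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shape is written as a Python dictionary instead of being read from a shapes graph, only minCount, maxCount, datatype and nodeKind on property shapes are handled (the node-level sh:nodeKind is left out), and the report is a plain dictionary whose keys borrow the SHACL property names.

```python
from collections import defaultdict


def lit(value, datatype="xsd:string"):
    """Represent an RDF literal as a (value, datatype) tuple (assumed encoding)."""
    return (value, datatype)


# Property shapes of :UserShape from the slides, written as dictionaries.
USER_SHAPE = {
    "targets": [":alice", ":bob", ":carol"],
    "properties": [
        {"id": ":hasName", "path": "schema:name",
         "minCount": 1, "maxCount": 1, "datatype": "xsd:string"},
        {"id": ":hasEmail", "path": "schema:email",
         "minCount": 1, "maxCount": 1, "nodeKind": "sh:IRI"},
    ],
}


def is_iri(term):
    return isinstance(term, str) and not term.startswith("_:")


def validate(triples, shape):
    """Check every target node against every property shape and build a report."""
    values = defaultdict(list)
    for s, p, o in triples:
        values[(s, p)].append(o)

    results = []

    def violation(node, prop, component):
        results.append({
            "focusNode": node,
            "resultPath": prop["path"],
            "sourceShape": prop["id"],
            "sourceConstraintComponent": component,
        })

    for node in shape["targets"]:
        for prop in shape["properties"]:
            found = values.get((node, prop["path"]), [])
            if len(found) < prop.get("minCount", 0):
                violation(node, prop, "sh:MinCountConstraintComponent")
            if "maxCount" in prop and len(found) > prop["maxCount"]:
                violation(node, prop, "sh:MaxCountConstraintComponent")
            for value in found:
                if "datatype" in prop and not (
                    isinstance(value, tuple) and value[1] == prop["datatype"]
                ):
                    violation(node, prop, "sh:DatatypeConstraintComponent")
                if prop.get("nodeKind") == "sh:IRI" and not is_iri(value):
                    violation(node, prop, "sh:NodeKindConstraintComponent")

    return {"conforms": not results, "results": results}


# Data graph of the "Basic example" slide.
DATA_WITH_ERRORS = [
    (":alice", "schema:name", lit("Alice Cooper")),
    (":alice", "schema:email", "mailto:alice@mail.org"),
    (":bob", "schema:firstName", lit("Bob")),
    (":bob", "schema:email", "mailto:bob@mail.org"),
    (":carol", "schema:name", lit("Carol")),
    (":carol", "schema:email", lit("carol@mail.org")),
]

# Data graph of the "SHACL processor" slide.
DATA_OK = [
    (":alice", "schema:name", lit("Alice Cooper")),
    (":alice", "schema:email", "mailto:alice@mail.org"),
    (":bob", "schema:name", lit("Bob")),
    (":bob", "schema:email", "mailto:bob@mail.org"),
    (":carol", "schema:name", lit("Carol")),
    (":carol", "schema:email", "mailto:carol@mail.org"),
]

report_errors = validate(DATA_WITH_ERRORS, USER_SHAPE)
report_ok = validate(DATA_OK, USER_SHAPE)
print(report_errors["conforms"], len(report_errors["results"]))
print(report_ok["conforms"])
```

Unlike the ShEx sketch, nothing is selected by a shape map here: the nodes to check come from the target declarations of the shape itself.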
  • 52. SHACL Core built-in constraint components
    Type | Constraints
    Cardinality | minCount, maxCount
    Types of values | class, datatype, nodeKind
    Values | node, in, hasValue, property
    Range of values | minInclusive, maxInclusive, minExclusive, maxExclusive
    String based | minLength, maxLength, pattern
    Language based | languageIn, uniqueLang
    Logical constraints | not, and, or, xone
    Closed shapes | closed, ignoredProperties
    Property pair constraints | equals, disjoint, lessThan, lessThanOrEquals
    Non-validating constraints | name, description, order, group
    Qualified shapes | qualifiedValueShape, qualifiedValueShapesDisjoint, qualifiedMinCount, qualifiedMaxCount
  • 53. Longer example
    In ShEx:
      :AdultPerson EXTRA a {
        a [ schema:Person ] ;
        :name xsd:string ;
        :age MinInclusive 18 ;
        :gender [:Male :Female] OR xsd:string ;
        :address @:Address ? ;
        :worksFor @:Company + ;
      }
      :Address CLOSED {
        :addressLine xsd:string {1,3} ;
        :postalCode /[0-9]{5}/ ;
        :state @:State ;
        :city xsd:string
      }
      :Company {
        :name xsd:string ;
        :state @:State ;
        :employee @:AdultPerson * ;
      }
      :State /[A-Z]{2}/
    In SHACL:
      :AdultPerson a sh:NodeShape ;
        sh:targetNode :alice ;
        sh:property [ sh:path rdf:type ;
          sh:qualifiedValueShape [ sh:hasValue schema:Person ] ;
          sh:qualifiedMinCount 1 ;
          sh:qualifiedMaxCount 1 ;
        ] ;
        sh:property [ sh:path :name ;
          sh:minCount 1 ; sh:maxCount 1 ;
          sh:datatype xsd:string ;
        ] ;
        sh:property [ sh:path :gender ;
          sh:minCount 1 ; sh:maxCount 1 ;
          sh:or ( [ sh:in (:Male :Female) ] [ sh:datatype xsd:string ] ) ;
        ] ;
        sh:property [ sh:path :age ;
          sh:minCount 1 ; sh:maxCount 1 ;
          sh:minInclusive 18
        ] ;
        sh:property [ sh:path :address ;
          sh:node :Address ;
          sh:maxCount 1
        ] ;
        sh:property [ sh:path :worksFor ;
          sh:node :Company ;
          sh:minCount 1
        ] .
      :Address a sh:NodeShape ;
        sh:closed true ;
        sh:property [ sh:path :addressLine ;
          sh:datatype xsd:string ;
          sh:minCount 1 ; sh:maxCount 3
        ] ;
        sh:property [ sh:path :postalCode ;
          sh:pattern "[0-9]{5}" ;
          sh:minCount 1 ; sh:maxCount 1
        ] ;
        sh:property [ sh:path :city ;
          sh:datatype xsd:string ;
          sh:minCount 1 ; sh:maxCount 1
        ] ;
        sh:property [ sh:path :state ;
          sh:node :State ;
          sh:minCount 1 ; sh:maxCount 1
        ] .
      :Company a sh:NodeShape ;
        sh:property [ sh:path :name ;
          sh:datatype xsd:string ;
          sh:minCount 1 ; sh:maxCount 1
        ] ;
        sh:property [ sh:path :state ;
          sh:node :State ;
          sh:minCount 1 ; sh:maxCount 1
        ] ;
        sh:property [ sh:path :employee ;
          sh:node :AdultPerson ;
        ] .
      :State a sh:NodeShape ;
        sh:pattern "[A-Z]{2}" .
    It's recursive (:AdultPerson refers to :Company, which refers back to :AdultPerson): recursion is not well defined in SHACL and is an implementation-dependent feature.
    Try it: https://tinyurl.com/ycl3mkzr
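When reading the two versions side by side, most of the extra SHACL lines come from cardinalities: a ShEx triple constraint defaults to exactly one value, while a SHACL property shape defaults to any number of values. The mapping can be sketched as a small Python function; the function name `shex_cardinality_to_shacl` and the dictionary output are assumptions made for this illustration, not part of any library.

```python
import re


def shex_cardinality_to_shacl(cardinality=""):
    """Map a ShEx cardinality marker to sh:minCount / sh:maxCount values.

    An absent sh:minCount means 0 and an absent sh:maxCount means unbounded,
    so only the non-default counts are returned.
    """
    fixed = {"": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}
    if cardinality in fixed:
        low, high = fixed[cardinality]
    else:
        # Repeat ranges: {m}, {m,}, {m,*} and {m,n}
        match = re.fullmatch(r"\{(\d+)(,(\d+|\*)?)?\}", cardinality)
        if match is None:
            raise ValueError(f"Unsupported cardinality: {cardinality!r}")
        low = int(match.group(1))
        if match.group(2) is None:
            high = low                      # {m}: exactly m
        elif match.group(3) in (None, "*"):
            high = None                     # {m,} or {m,*}: at least m
        else:
            high = int(match.group(3))      # {m,n}
    counts = {}
    if low > 0:
        counts["sh:minCount"] = low
    if high is not None:
        counts["sh:maxCount"] = high
    return counts


for marker in ["", "?", "*", "+", "{1,3}"]:
    print(repr(marker), shex_cardinality_to_shacl(marker))
```

For example, `:address @:Address ?` only needs `sh:maxCount 1`, and `:worksFor @:Company +` only needs `sh:minCount 1`.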
  • 54. More info about SHACL. SHACL by example (slides): https://figshare.com/articles/SHACL_by_example/6449645 SHACL chapter of the Validating RDF data book: http://book.validatingrdf.com/bookHtml011.html
  • 55. Some challenges and perspectives: theoretical foundations of ShEx/SHACL; inferring shapes from data; validation usability; RDF stream validation; schema ecosystems (Wikidata, Solid)
  • 56. Theoretical foundations of ShEx/SHACL. Conversion between ShEx and SHACL: the SHaclEX library converts subsets of both. Challenges: recursion and negation; performance and algorithmic complexity; detecting useful subsets of the languages; conversion to SPARQL; schema/data mapping. [Diagram: shapes converted between ShEx and SHACL]
  • 57. Inference of shapes from data. Useful use case in practice: knowledge graph summarization. Some prototypes: ShExer, RDFShape, ShapeArchitect. Try it with RDFShape: https://tinyurl.com/y8pjcbyf (shape expression generated for wd:Q51613194). [Diagram: RDF data → infer → shapes]
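A very rough idea of what shape inference involves can be sketched in Python: collect, for a set of sample nodes, which predicates they use, how many values each node has, and what kind of values appear, then print the summary in ShEx-like syntax. This is not the algorithm used by ShExer, RDFShape or ShapeArchitect; the function names and the term encoding are assumptions made for the example, and references to other shapes are not inferred (only the node kind is recorded).

```python
from collections import defaultdict


def lit(value, datatype="xsd:string"):
    """Represent an RDF literal as a (value, datatype) tuple (assumed encoding)."""
    return (value, datatype)


def value_kind(term):
    """Datatype for literals, node kind for everything else."""
    if isinstance(term, tuple):
        return term[1]
    return "BNode" if term.startswith("_:") else "IRI"


def cardinality_marker(low, high):
    """Smallest ShEx marker that covers the observed counts."""
    if low == 1 and high == 1:
        return ""
    if low == 0 and high <= 1:
        return "?"
    if low == 0:
        return "*"
    return "+"


def infer_shape(triples, nodes):
    """Return predicate -> (value constraint, cardinality marker) for the sample nodes."""
    counts = defaultdict(lambda: defaultdict(int))  # predicate -> node -> number of values
    kinds = defaultdict(set)                        # predicate -> kinds of values seen
    for s, p, o in triples:
        if s in nodes:
            counts[p][s] += 1
            kinds[p].add(value_kind(o))
    shape = {}
    for p in sorted(counts):
        per_node = [counts[p].get(n, 0) for n in nodes]
        # "." (any value) when the sample nodes disagree on the kind of value
        constraint = next(iter(kinds[p])) if len(kinds[p]) == 1 else "."
        shape[p] = (constraint, cardinality_marker(min(per_node), max(per_node)))
    return shape


def render_shex(label, shape):
    lines = [" ".join(part for part in (p, c, m) if part) for p, (c, m) in shape.items()]
    body = " ;\n".join("  " + line for line in lines)
    return f"{label} {{\n{body}\n}}"


SAMPLE = [
    (":alice", "schema:name", lit("Alice")),
    (":alice", "schema:knows", ":alice"),
    (":bob", "schema:name", lit("Robert")),
    (":frank", "schema:name", lit("Frank")),
    (":frank", "schema:knows", ":alice"),
    (":frank", "schema:knows", ":bob"),
]

inferred = infer_shape(SAMPLE, [":alice", ":bob", ":frank"])
print(render_shex("<User>", inferred))
```

The inferred shape only summarizes the sample: a different selection of nodes can give different cardinalities, which is one reason the generated shapes are usually reviewed by hand.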
  • 58. Validation usability. Learning from users. Early adopters: WebIndex, HL7 FHIR, Eclipse Lyo, GenWiki, … Improve error information, visualization, navigation and repair. Authoring/visualization tools. Propose annotation sets: UI generation; error reporting/suggestion (SHOULD/MUST/…)
  • 59. RDF stream validation. Validation of RDF streams. Challenges: incremental validation; named graphs; addition/removal of triples. [Diagram: sensor → stream of RDF data; shapes + stream → validator → stream of validation results]
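The incremental validation challenge can be made concrete with a small Python sketch: instead of re-validating the whole graph after every change, only the node touched by an added or removed triple is re-checked. The class below is hypothetical and deliberately narrow; it handles a single cardinality constraint on one predicate and ignores named graphs and references between shapes.

```python
from collections import defaultdict


class IncrementalCardinalityValidator:
    """Keeps the status of tracked nodes up to date as triples are added or removed."""

    def __init__(self, path, min_count, max_count):
        self.path = path
        self.min_count = min_count
        self.max_count = max_count
        self.values = defaultdict(set)  # node -> distinct values on `path`
        self.status = {}                # tracked node -> conforms?

    def _recheck(self, node):
        count = len(self.values[node])
        self.status[node] = self.min_count <= count <= self.max_count
        return self.status[node]

    def track(self, node):
        """Start validating a node (plays the role of a shape map entry)."""
        return self._recheck(node)

    def add(self, s, p, o):
        return self._update(s, p, o, added=True)

    def remove(self, s, p, o):
        return self._update(s, p, o, added=False)

    def _update(self, s, p, o, added):
        if p != self.path:
            return None  # triple is irrelevant for this constraint: nothing to re-validate
        if added:
            self.values[s].add(o)
        else:
            self.values[s].discard(o)
        # Only the affected node is re-checked, and only if it is tracked.
        return self._recheck(s) if s in self.status else None


validator = IncrementalCardinalityValidator("schema:name", min_count=1, max_count=1)
print(validator.track(":carol"))                           # no name yet
print(validator.add(":carol", "schema:name", "Carol"))     # exactly one name
print(validator.add(":carol", "schema:name", "Carole"))    # two names
print(validator.remove(":carol", "schema:name", "Carole")) # back to one name
```

Constraints that refer to other shapes are harder, because a change to one node can affect the status of the nodes that point to it.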
  • 60. Schema ecosystems: Wikidata. In May 2019, Wikidata announced ShEx adoption. New namespace for schemas. Example: https://www.wikidata.org/wiki/EntitySchema:E2 It opens lots of opportunities and challenges: schema evolution and comparison
  • 61. Schema ecosystems: Solid project. SOLID (SOcial Linked Data): promoted by Tim Berners-Lee. Goal: re-decentralize the Web. Separate data from apps. Give users more control over their data. Internally uses linked data & RDF. Shapes needed for interoperability. See: https://ruben.verborgh.org/blog/2019/06/17/shaping-linked-data-apps/ [Diagram: a data pod exposing Shape A, Shape B and Shape C to App 1, App 2 and App 3]
  • 62. Conclusions. RDF as a basis for knowledge graphs. Explicit schemas can help improve data quality. 2 languages proposed: ShEx and SHACL. Lots of new challenges and opportunities

Editor's notes

  1. ShEx: 20 lines of code; SHACL: 60 lines of code