Inference and Serialization
of Latent Graph Schemata
Using ShEx
Speaker: Daniel Fernández-Álvarez
Category: Idea
Daniel Fernández-Álvarez* Jose Emilio Labra-Gayo* Herminio García-González*
danifdezalvarez@gmail.com labra@uniovi.es herminiogg@gmail.com
*Department of Computer Science
WESO Research Group
University of Oviedo
Oviedo, Spain
Motivational example
Motivation: Torimbia Beach
• Country: Spain
• Region: Asturias
• Council/city: Llanes
• Lat/long: 43.44, -4.85
• Length: 500 m
• Width: 100 m
• Naturist: True
Motivation: Torimbia Beach
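6 different random but relevant beaches in DBpedia*
Properties checked: Region, Lat/long, Width
[Table: beaches against the Region, Lat/long and Width properties, marked with X; cell positions lost in extraction]
The same happens with country, council/city, length and naturist
*Batu Ferringhi, Horseshoe Bay, Manly Beach, Marina Beach, Playa Arcadia, Red Beach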
Motivation
I would like to…
• check the concept of beach, not the instances
• make a single query/click to discover the usual schemata
• get results that are correct, coherent and exhaustive
Idea
Proposal
• Analysis of the neighborhood of nodes that satisfy a certain condition, in order to induce the usual schemata:
• Typical condition: rdf:type
• Serialization of inferred schemata with ShEx (Shape Expressions).
• Association to a type (class)
• Management of trustworthiness
• Handy for:
• Documentation
• Verification of quality
• Discovering “hidden” entities
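The neighborhood analysis proposed above can be sketched as follows. This is only a minimal illustration, assuming the graph is available as a list of (subject, predicate, object) string tuples and that the selection condition is rdf:type; the function name `property_frequencies` and the toy instances are hypothetical and not part of the proposal.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"


def property_frequencies(triples, target_class):
    """Return {property: fraction of target_class instances that use it}."""
    # Nodes that satisfy the selection condition (here: rdf:type == target_class).
    instances = {s for s, p, o in triples if p == RDF_TYPE and o == target_class}
    if not instances:
        return {}
    # Collect the distinct outgoing properties of each selected node.
    props_by_node = defaultdict(set)
    for s, p, o in triples:
        if s in instances and p != RDF_TYPE:
            props_by_node[s].add(p)
    # Count in how many instances each property appears.
    counts = Counter(p for props in props_by_node.values() for p in props)
    return {p: n / len(instances) for p, n in counts.items()}


# Toy graph with made-up instances (not real DBpedia data).
triples = [
    ("ex:beachA", "rdf:type", "dbo:Beach"),
    ("ex:beachA", "geo:lat", "43.44"),
    ("ex:beachA", "dbo:country", "ex:Spain"),
    ("ex:beachB", "rdf:type", "dbo:Beach"),
    ("ex:beachB", "geo:lat", "5.47"),
    ("ex:cityC", "rdf:type", "dbo:City"),
    ("ex:cityC", "dbo:country", "ex:Spain"),
]
freqs = property_frequencies(triples, "dbo:Beach")
print(freqs)
```

The resulting ratios are one possible input for the "management of trustworthiness" item: a property used by every instance is a stronger schema candidate than one used by half of them.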
How?
Workflow
Source graph (DBpedia, Wikidata…) → Inference → Abstract schemata representation → Serialization → Textual schemata representation with ShEx, e.g. <Person> { }
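The serialization step of the workflow can be illustrated with a short sketch. It assumes, purely for illustration, that the abstract schemata representation is a dictionary from property to a (value expression, cardinality) pair; the function `to_shex` and this data layout are hypothetical. The comma separator follows the syntax used on these slides, while newer ShEx compact syntax separates constraints with semicolons, so the separator is a parameter.

```python
def to_shex(shape_name, constraints, separator=","):
    """Serialize {property: (value_expr, cardinality)} as ShEx-like text."""
    # One line per constraint, e.g. "  dbo:isPartOf @<Place>*".
    lines = [f"  {prop} {value}{card}" for prop, (value, card) in constraints.items()]
    body = f"{separator}\n".join(lines)
    return f"<{shape_name}> {{\n{body}\n}}"


# Hypothetical abstract representation of an inferred shape.
abstract_schema = {
    "dbp:width": ("xsd:integer", ""),
    "dbo:isPartOf": ("@<Place>", "*"),
}
shex_text = to_shex("Beach", abstract_schema)
print(shex_text)
```

Keeping the abstract representation separate from the textual one means the same inferred schema could be serialized to another syntax without touching the inference step.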
Schemata Inference: current approaches
• Ontology integration to find shared core elements [Zhao,13]
• Association rule mining (Apriori)
• Rule-based classification (Decision Tables)
• Logical axioms at ontology level [Völker,11]
• Association rule mining (Apriori)
• Axioms represented with OWL 2 EL
• Graph schemata at class level [Christodoulou,15]
• Clusters of similar individuals (ideally, cluster=class).
• Results in an ad-hoc syntax.
Schemata Inference: our current status
Some promising ideas:
Instance clustering
Association rule mining
Some issues linked to the target graph:
Noise management
Adaptation to data model
Graph size & complexity
Completeness and coherence
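To make the association rule mining idea more concrete, here is a simplified sketch that only considers rules between pairs of properties ("an instance that has A also has B"). It is not a full Apriori implementation; the function name, the thresholds and the toy instances are assumptions made for the example.

```python
from itertools import permutations


def property_rules(props_by_instance, min_support=0.5, min_confidence=0.8):
    """Return (A, B, support, confidence) for rules 'has A -> has B'."""
    n = len(props_by_instance)
    all_props = sorted(set().union(*props_by_instance.values()))
    rules = []
    for a, b in permutations(all_props, 2):
        with_a = sum(1 for props in props_by_instance.values() if a in props)
        with_both = sum(
            1 for props in props_by_instance.values() if a in props and b in props
        )
        support = with_both / n  # share of all instances having A and B
        confidence = with_both / with_a if with_a else 0.0  # share of A-instances having B
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Made-up instances and the properties each one uses.
props_by_instance = {
    "ex:beachA": {"geo:lat", "geo:long", "dbp:width"},
    "ex:beachB": {"geo:lat", "geo:long"},
    "ex:beachC": {"geo:lat", "geo:long"},
    "ex:beachD": {"dbp:width"},
}
rules = property_rules(props_by_instance)
print(rules)
```

In this toy data only geo:lat and geo:long imply each other; dbp:width is too rare to pass the support threshold, which is one simple way of handling noise.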
Schemata Serialization I
Need: a standard syntax to express class-level constraints in RDF graphs, as other technologies already have:
• XML: RelaxNG, DTD, XML Schema
• Relational databases: DDL
• JSON: JSON Schema
RDF candidates:
• ShEx: grammar-oriented, recursion, human-friendly syntax
• SHACL: constraint-oriented, no recursion (for now), RDF syntax (for now)
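[Percentage annotations from the Schemata Serialization II slide: 19%, 59%, 83%, 83%, 87%, 69%, 32%; their pairing with individual properties was lost in extraction]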
Schemata Serialization II
Pure ShEx
<Beach> {
  dbp:width xsd:integer,
  dbp:length xsd:integer,
  geo:lat xsd:float,
  geo:long xsd:float,
  dbo:isPartOf @<Place>*
}
Annotated ShEx
<Beach> {
  dbp:width xsd:integer,
  dbp:length xsd:integer,
  geo:lat xsd:float,
  geo:long xsd:float,
  geo:geometry @<Point>,
  dbo:isPartOf @<Place>*,
  dbo:country @<Country>
}
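One way to relate the two variants is to annotate every constraint with the share of instances that support it and to apply a threshold when a stricter shape is wanted. The sketch below assumes that reading; the ratios are invented for the example (they are not the percentages from the slide), and `serialize_shape` is a hypothetical helper. Annotations are written as `#` comments.

```python
def serialize_shape(name, constraints, min_ratio=0.0):
    """constraints: list of (property, value_expr, ratio) tuples."""
    # Keep only constraints supported by at least min_ratio of the instances.
    kept = [(p, v, r) for p, v, r in constraints if r >= min_ratio]
    lines = []
    for i, (prop, value, ratio) in enumerate(kept):
        sep = "," if i < len(kept) - 1 else ""
        lines.append(f"  {prop} {value}{sep}  # {ratio:.0%} of instances")
    return "\n".join([f"<{name}> {{", *lines, "}"])


# Invented ratios, for illustration only.
beach = [
    ("geo:lat", "xsd:float", 0.9),
    ("dbo:country", "@<Country>", 0.4),
]
annotated = serialize_shape("Beach", beach)
strict = serialize_shape("Beach", beach, min_ratio=0.5)
print(annotated)
print(strict)
```

With the default threshold every observed property is kept and annotated; raising `min_ratio` drops the weakly supported ones.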
Use cases?
Context: Types of graphs
• Specific purpose, automatically built, managed by a single agent
• General purpose, manually built, managed by community
Reality
Context: Collaborative graphs
Key points:
• Schemata are not planned, they just emerge
• Schemata change over time
Possibilities:
• Schemata inference on user demand
• Describe what is actually associated with a type, rather than how a type should be
• Freedom: ShEx as guide, not dogma
To summarize…
Conclusions and Future Work
What we have done:
• Idea: inference of latent graph schemata, serialization through ShEx syntax
What we want to do:
• Prototype: selection of techniques, selection of target source/s
• Tests: usefulness in different domains, feasibility (trustworthiness reached), user acceptance
References
• [1] Zhao, L., & Ichise, R. (2013, May). Instance-based ontological knowledge acquisition. In Extended Semantic Web Conference (pp. 155-169). Springer Berlin Heidelberg.
• [2] Völker, J., & Niepert, M. (2011, May). Statistical schema induction. In Extended Semantic Web Conference (pp. 124-138). Springer Berlin Heidelberg.
• [3] Christodoulou, K., Paton, N. W., & Fernandes, A. A. (2015). Structure inference for linked data sources using clustering. In Transactions on Large-Scale Data- and Knowledge-Centered Systems XIX (pp. 1-25). Springer Berlin Heidelberg.
Extra information for Torimbia example I
Columns: Beach | Lat/long* | Naturist
• Batu Ferringhi | dbp:latd, dbp:longd, georss:point, geo:geometry, geo:lat, geo:long | X
• Horseshoe Bay | geo:geometry, geo:lat, geo:long | X
• Manly Beach | georss:point, geo:geometry, geo:lat, geo:long | X
• Marina Beach | georss:point, geo:geometry, geo:lat, geo:long | X
• Playa Arcadia | georss:point, geo:geometry, geo:lat, geo:long | X
• Red Beach | dbp:latDeg, dbp:longDeg, georss:point, geo:geometry, geo:lat, geo:long | X
*Some lat/long properties have been omitted. Some of them work together to give a precise coordinate (total degrees + orientation N/S E/W)
Extra information for Torimbia example II
Columns: Beach | Length | Width | Council | Region | Country
• Batu Ferringhi | X | X | shared entity | dbo:isPartOf | dbo:country
• Horseshoe Bay | X | X | description | description | rdf:type (BeachesOfBermuda)
• Manly Beach | X | X | description | dct:subject dbc:Beaches_of_New_South_Wales | description
• Marina Beach | dbp:height, description, dct:subject, dct:subject (four values for five columns; column alignment lost in extraction)
• Playa Arcadia | X | X | dct:subject | X | dct:subject
• Red Beach | X | dbp:width | dbp:city | is dbp:south of | description
Wikimedia Strategy: Templates and Mappings
• Mappings are designed to automatically import data from Wikipedia’s infoboxes and tables into DBpedia.
• Wikipedia templates define the expected properties for certain types. Mappings define which property should be used to create a triple when an occurrence of an expected property is found.
PROS
• Preserves Wikipedia’s quality.
• Handy as a guide for content represented in Wikipedia.
• It may enrich both Wikipedia and DBpedia.
• Templates can evolve, guided by the community.
CONS
• Depends on Wikipedia’s quality.
• It can only manage content represented in Wikipedia.
• Not transferable to standalone RDF graph projects.
• It assumes that the community is following the templates, so it may not reflect the real graph.
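The templates-and-mappings strategy can be pictured with a toy sketch: a mapping table decides which RDF property is used when an expected infobox key is found. The mapping table, the function name and the `ex:Torimbia` identifier are hypothetical, and this is not DBpedia's actual mapping language.

```python
# Hypothetical mapping from infobox keys to RDF properties.
MAPPING = {
    "length": "dbp:length",
    "width": "dbp:width",
    "country": "dbo:country",
}


def infobox_to_triples(subject, infobox, mapping=MAPPING):
    """Create one triple per infobox key that the mapping knows about."""
    return [
        (subject, mapping[key], value)
        for key, value in infobox.items()
        if key in mapping
    ]


# "nickname" is not an expected property, so it produces no triple.
infobox = {"length": "500", "width": "100", "nickname": "n/a"}
mapped = infobox_to_triples("ex:Torimbia", infobox)
print(mapped)
```

The dropped key illustrates the limitation listed under CONS: only content that the templates anticipate reaches the graph.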
ShEx vs SHACL
ShEx
<UserShape> {
  rdfs:label xsd:string,
  ex:role ( ex:User ) ?
}
SHACL
:UserShape
  a sh:Shape ;
  sh:property [
    sh:predicate rdfs:label ;
    sh:datatype xsd:string ;
    sh:minCount 1 ;
    sh:maxCount 1 ;
  ] ;
  sh:property [
    sh:predicate ex:role ;
    sh:hasValue ex:User ;
    sh:filterShape [
      sh:property [
        sh:predicate ex:role ;
        sh:minCount 1 ;
      ]
    ] ;
    sh:maxCount 1 ;
  ] .
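Both snippets aim at the same constraint: exactly one string label and an optional ex:role that, when present, must be ex:User. A rough sketch of that check in plain Python is given below, reading the label property as rdfs:label as in the SHACL version. The `Literal` and `IRI` classes, the triple layout and the sample nodes are assumptions made for the example; this is not how a ShEx or SHACL processor is implemented.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    value: str
    datatype: str = "xsd:string"


class IRI(NamedTuple):
    value: str


def conforms_to_user_shape(node, triples):
    """Check the UserShape constraint for one node."""
    labels = [o for s, p, o in triples if s == node and p == "rdfs:label"]
    roles = [o for s, p, o in triples if s == node and p == "ex:role"]
    # Exactly one label, and it must be an xsd:string literal.
    label_ok = (
        len(labels) == 1
        and isinstance(labels[0], Literal)
        and labels[0].datatype == "xsd:string"
    )
    # ex:role is optional; when present it must be ex:User, at most once.
    role_ok = roles == [] or roles == [IRI("ex:User")]
    return label_ok and role_ok


# Made-up nodes for illustration.
graph = [
    ("ex:alice", "rdfs:label", Literal("Alice")),
    ("ex:alice", "ex:role", IRI("ex:User")),
    ("ex:bob", "rdfs:label", Literal("Bob")),
    ("ex:carol", "ex:role", IRI("ex:User")),
    ("ex:dave", "rdfs:label", Literal("Dave")),
    ("ex:dave", "ex:role", IRI("ex:Admin")),
]
results = {n: conforms_to_user_shape(n, graph) for n in ("ex:alice", "ex:bob", "ex:carol", "ex:dave")}
print(results)
```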

Slides SEMAPRO 2016 University of Oviedo
