SlideShare uma empresa Scribd logo
1 de 16
Detecting Good Practices and
Pitfalls when Publishing
Vocabularies on the Web
María Poveda-Villalón1, Bernard Vatant2, Mari Carmen SuárezFigueroa1, Asunción Gómez-Pérez1 ,
1Ontology

Engineering Group. Universidad Politécnica de Madrid. Spain.
2Mondeca, Paris, France.

mpoveda@fi.upm.es, bernard.vatant@mondeca.com, {mcsuarez, asun}@fi.upm.es

Speaker: Asunción Gómez-Pérez
Contact author: María Poveda-Villalón: mpoveda@fi.upm.es

Date: 10/28/13
Table of Contents
•  Introduction
•  Good practices and pitfalls for publishing
vocabularies
•  Results and Analysis over LOV vocabularies
•  Conclusions and future work

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

2
Introduction
•  Different formats: RDFS, OWL, HTML
•  Different configurations
•  Do they ease or impede applications
consuming vocabularies?
Ø  Good practices & Pitfalls

Vocabularies bring semantics to data

“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.
http://lod-cloud.net/”

Along this work:
•  Detailed analysis of 355 vocabularies gathered in the
LOV registry (http://lov.okfn.org/)
•  Why LOV: complete information about each
vocabulary, namely URI, namespace and prefix
•  Results:
1.  a non exhaustive list of good practices and
pitfalls about publishing LD vocabularies
2.  specific methods for detecting such good
practices and pitfalls
3.  some metadata about ontology quality
4.  the inclusion of pitfalls in services such as
OOPS! (http://www.oeg-upm.net/oops) to help
eager vocabulary managers

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

3
Table of Contents
•  Introduction
•  Good practices and pitfalls for publishing
vocabularies
•  Previous work
•  Proposed good practices and pitfalls

•  Results and analysis over LOV vocabularies
•  Conclusions and future work

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

4
Previous work (I)
Linked Open Data 5 Star rating system (Tim Bernes-Lee) http://www.w3.org/DesignIssues/
LinkedData.html. 2006 (last change 2009).
LOD1. Available on the web (whatever format) but with an open licence, to be Open Data
LOD2. Available as machine-readable structured data (e.g. excel instead of image scan of a table)
LOD3. As (2) plus non-proprietary format (e.g. CSV instead of excel)
LOD4. All the above plus, Use open standards from W3C (RDF and SPARQL) to identify things,
so that people can point at your stuff
LOD5. All the above plus Link your data to other people’s data to provide context

Is your linked data vocabulary 5-star? (Bernard Vatant) http://bvatant.blogspot.fr/2012/02/is-yourlinked-data-vocabulary-5-star_9588.html. 2012.
LDV1. Publish your vocabulary on the Web at a stable URI
LDV2. Provide human-readable documentation and basic metadata such as creator, publisher,
date of creation, last modification, version number
LDV3. Provide labels and descriptions, if possible in several languages, to make your
vocabulary usable in multiple linguistic scopes
LDV4. Make your vocabulary available via its namespace URI, both as a formal file and humanreadable documentation, using content negotiation
LDV5. Link to other vocabularies by re-using elements rather than re-inventing

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

5
Previous work (II)
Archer, P., Goedertier, S., and Loutas, N. D7.1.3 – Study on persistent URIs, with identification of
best practices and recommendations on the topic for the MSs and the EC. Deliverable. December
17, 2012.

Heath, T., Bizer, C.: Linked data: Evolving the Web into a global data space (1st edition). Morgan &
Claypool. 2011.

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

6
Proposed good practices and pitfalls
Proposals

Inspired by

Previous work brief reminder
Linked Open Data 5 Star
LOD1. on the web. Open.
LOD2. machine-readable
LOD3. non-proprietary
LOD4. open standards
LOD5. Link

Good practices
GP1. Provide RDF description
GP2. Provide HTML documentation
GP3. Content negotiation for RDF
GP4. Content negotiation for HTML
GP5. Provide vann metadata
GP6. Well-established prefix

Is your linked data vocabulary 5-star?
LDV1. vocabulary on the Web
LDV2. human-readable and metadata
LDV3. labels and descriptions
LDV4. content negotiation
LDV5. Link

Pitfalls
P36. URI contains file extension
P37. Ontology not available
P38. No OWL ontology declaration
P39. Ambiguous namespace
P40. Namespace hijacking

10 rules for persistent URIs

✔
Linked data: Evolving the Web
into a global data space:
“Only define new terms in a
namespace that you control.”

✖


• Follow the pattern
• Re-use existing identifiers
• Multiple representations
• Implements 303 redirects
• Use a dedicated server

• Avoid stating ownership
• Avoid version numbers
• Avoid using auto-increment
• Avoid query strings
• Avoid file extensions

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

7
Table of Contents
•  Introduction
•  Good practices and pitfalls for publishing
vocabularies
•  Results and analysis over LOV vocabularies
•  Conclusions and future work

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

8
Results and analysis over LOV vocabularies (I)
Good practices and pitfalls frequency

355 vocabularies registered in LOV - 19th June, 2013

GP1. Provide RDF
description
GP2. Provide HTML
documentation
GP3. Content negotiation for
RDF
GP4. Content negotiation for
HTML
GP5. Provide vann metadata
GP6. Well-established prefix

Pitfalls distribution

Good practices distribution

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

9

P36. URI contains file
extension
P37. Ontology not
available
P38. No OWL ontology
declaration
P39. Ambiguous
namespace
P40. Namespace
hijacking
Results and analysis over LOV vocabularies (I)
Grid with vocabularies according to the number of good practices and pitfalls observed.
Available at http://goo.gl/zu9ZbW

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

10
Table of Contents
•  Introduction
•  Good practices and pitfalls for publishing
vocabularies
•  Results and analysis over LOV vocabularies
•  Conclusions and future work

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

11
Conclusions
•  6 good practices and 5 pitfalls proposed
•  Based on existing works
•  Implementation of the detection methods
•  Grid-based rating system proposed. Useful for:
•  Vocabulary registry maintainers
•  Vocabulary developers and creators
•  Execution over 355 vocabularies
•  All good practices and pitfalls are observed
•  Some of them surprisingly (e.g.: P40. Namespace hijacking)
•  LOV vocabularies seem to be well maintained and likely to be high quality . Due to
semi-handcrafted maintenance instead of crawlers?

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

12
Future work (I)

Linked Open Data 5 Star
LOD1. on the web. Open.
LOD2. machine-readable
LOD3. non-proprietary
LOD4. open standards
LOD5. Link

•  Take into account:
•  metadata about licences
•  other metadata, e.g., creators, authors,
dates, languages, etc.
•  linguistic information
•  reused terms from other vocabularies
•  Provide guidelines to solve pitfalls and to
follow good practices
•  Execute methods over LOV in regular basis
•  Observe evaluation of the ecosystem
•  Draw tends for vocabulary publication

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

Is your linked data vocabulary 5-star?
LDV1. vocabulary on the Web
LDV2. human-readable and metadata
LDV3. labels and descriptions
LDV4. content negotiation
LDV5. Link

13
Future work (II)
•  Integration with third party systems. E.g.
•  LOV search
•  OOPS! - OntOlogy Pitfall Scanner! (http://oeg-upm.net/oops/)
ü  Done for pitfalls
•  Assign importance levels for good practices and pitfalls
ü  Done for pitfalls

…
…
…
…
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

14
Questions?

Thanks!
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

15
Detecting Good Practices and
Pitfalls when Publishing
Vocabularies on the Web
María Poveda-Villalón1, Bernard Vatant2, Mari Carmen SuárezFigueroa1, Asunción Gómez-Pérez1 ,
1Ontology

Engineering Group. Universidad Politécnica de Madrid. Spain.
2Mondeca, Paris, France.

mpoveda@fi.upm.es, bernard.vatant@mondeca.com, {mcsuarez, asun}@fi.upm.es

Speaker: Asunción Gómez-Pérez
Contact author: María Poveda-Villalón: mpoveda@fi.upm.es

Date: 10/28/13

Mais conteúdo relacionado

Mais procurados

Towards Open Methods: Using Scientific Workflows in Linguistics
Towards Open Methods: Using Scientific Workflows in LinguisticsTowards Open Methods: Using Scientific Workflows in Linguistics
Towards Open Methods: Using Scientific Workflows in Linguistics
Richard Littauer
 

Mais procurados (16)

Efficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data StreamsEfficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data Streams
 
Introduction to Digital Humanities: Metadata standards and ontologies
Introduction to Digital Humanities: Metadata standards and ontologies Introduction to Digital Humanities: Metadata standards and ontologies
Introduction to Digital Humanities: Metadata standards and ontologies
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?
 
Building Blocks for Accessing Multilingual Data: CLDR
Building Blocks for Accessing Multilingual Data: CLDRBuilding Blocks for Accessing Multilingual Data: CLDR
Building Blocks for Accessing Multilingual Data: CLDR
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems
 
SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...
SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...
SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...
 
DisGeNET Tutorial SWAT4LS 2015-12-07
DisGeNET Tutorial SWAT4LS 2015-12-07DisGeNET Tutorial SWAT4LS 2015-12-07
DisGeNET Tutorial SWAT4LS 2015-12-07
 
RDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization dataRDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization data
 
Sharing an Open Methodology for Building Domain-specific Corpora for EAP
Sharing an Open Methodology for Building Domain-specific Corpora for EAP Sharing an Open Methodology for Building Domain-specific Corpora for EAP
Sharing an Open Methodology for Building Domain-specific Corpora for EAP
 
Research Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibilityResearch Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibility
 
Towards Open Methods: Using Scientific Workflows in Linguistics
Towards Open Methods: Using Scientific Workflows in LinguisticsTowards Open Methods: Using Scientific Workflows in Linguistics
Towards Open Methods: Using Scientific Workflows in Linguistics
 
Vocabularies - Managing Them
Vocabularies - Managing ThemVocabularies - Managing Them
Vocabularies - Managing Them
 
Contexts and Importing in RDF
Contexts and Importing in RDFContexts and Importing in RDF
Contexts and Importing in RDF
 
20140521 sem-tech-biz-guest-lecture
20140521 sem-tech-biz-guest-lecture20140521 sem-tech-biz-guest-lecture
20140521 sem-tech-biz-guest-lecture
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 
4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...
4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...
4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...
 

Destaque

Validating ontologies with OOPS! - EKAW2012
Validating ontologies with OOPS! - EKAW2012Validating ontologies with OOPS! - EKAW2012
Validating ontologies with OOPS! - EKAW2012
María Poveda Villalón
 
The Landscape of Ontology Reuse in Linked Data - OEDW2012
The Landscape of Ontology Reuse in Linked Data - OEDW2012The Landscape of Ontology Reuse in Linked Data - OEDW2012
The Landscape of Ontology Reuse in Linked Data - OEDW2012
María Poveda Villalón
 
About André T. (anno June '11)
About André T. (anno June '11)About André T. (anno June '11)
About André T. (anno June '11)
André Torkveen
 
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
María Poveda Villalón
 
Ontology Evaluation: a pitfall-based approach to ontology diagnosis
Ontology Evaluation: a pitfall-based approach to ontology diagnosisOntology Evaluation: a pitfall-based approach to ontology diagnosis
Ontology Evaluation: a pitfall-based approach to ontology diagnosis
María Poveda Villalón
 

Destaque (7)

Detrás de un gran dataset siempre hay un gran vocabulario
Detrás de un gran dataset siempre hay un gran vocabularioDetrás de un gran dataset siempre hay un gran vocabulario
Detrás de un gran dataset siempre hay un gran vocabulario
 
Ee bdm ws-v1
Ee bdm ws-v1Ee bdm ws-v1
Ee bdm ws-v1
 
Validating ontologies with OOPS! - EKAW2012
Validating ontologies with OOPS! - EKAW2012Validating ontologies with OOPS! - EKAW2012
Validating ontologies with OOPS! - EKAW2012
 
The Landscape of Ontology Reuse in Linked Data - OEDW2012
The Landscape of Ontology Reuse in Linked Data - OEDW2012The Landscape of Ontology Reuse in Linked Data - OEDW2012
The Landscape of Ontology Reuse in Linked Data - OEDW2012
 
About André T. (anno June '11)
About André T. (anno June '11)About André T. (anno June '11)
About André T. (anno June '11)
 
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
 
Ontology Evaluation: a pitfall-based approach to ontology diagnosis
Ontology Evaluation: a pitfall-based approach to ontology diagnosisOntology Evaluation: a pitfall-based approach to ontology diagnosis
Ontology Evaluation: a pitfall-based approach to ontology diagnosis
 

Semelhante a Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
National Information Standards Organization (NISO)
 

Semelhante a Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web (20)

Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
 
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
 
Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...
 
Coming to terms to FAIR semantics
Coming to terms to FAIR semanticsComing to terms to FAIR semantics
Coming to terms to FAIR semantics
 
Linked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareLinked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the Software
 
Collaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsCollaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna Workflows
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for Biopharma
 
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
 
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
 
Opening up MOOCs for OER management on the Web of linked data
Opening up MOOCs for OER management on the Web of linked dataOpening up MOOCs for OER management on the Web of linked data
Opening up MOOCs for OER management on the Web of linked data
 
Linked Data
Linked DataLinked Data
Linked Data
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
 
New member
New member New member
New member
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Approach to leverage Websites to APIs through Semantics
Approach to leverage Websites to APIs through SemanticsApproach to leverage Websites to APIs through Semantics
Approach to leverage Websites to APIs through Semantics
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
Publishing and Using Linked Open Data - Day 4
Publishing and Using Linked Open Data - Day 4Publishing and Using Linked Open Data - Day 4
Publishing and Using Linked Open Data - Day 4
 
New member webinar 052418
New member webinar 052418New member webinar 052418
New member webinar 052418
 
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
 

Mais de María Poveda Villalón

Mais de María Poveda Villalón (7)

Ontology development basic tools
Ontology development basic toolsOntology development basic tools
Ontology development basic tools
 
Chowlk notation
Chowlk notation Chowlk notation
Chowlk notation
 
New trends in ontological engineering, practices and tools
New trends in ontological engineering, practices and toolsNew trends in ontological engineering, practices and tools
New trends in ontological engineering, practices and tools
 
Publishing Linked Open Data on the Web & the Role of Ontologies
Publishing Linked Open Data on the Web & the Role of OntologiesPublishing Linked Open Data on the Web & the Role of Ontologies
Publishing Linked Open Data on the Web & the Role of Ontologies
 
Introducción a la web semántica
Introducción a la web semánticaIntroducción a la web semántica
Introducción a la web semántica
 
Semantic Discovery in the Web of Things
Semantic Discovery in the Web of ThingsSemantic Discovery in the Web of Things
Semantic Discovery in the Web of Things
 
Linked Open Vocabularies
Linked Open VocabulariesLinked Open Vocabularies
Linked Open Vocabularies
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

  • 1. Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web María Poveda-Villalón1, Bernard Vatant2, Mari Carmen SuárezFigueroa1, Asunción Gómez-Pérez1 , 1Ontology Engineering Group. Universidad Politécnica de Madrid. Spain. 2Mondeca, Paris, France. mpoveda@fi.upm.es, bernard.vatant@mondeca.com, {mcsuarez, asun}@fi.upm.es Speaker: Asunción Gómez-Pérez Contact author: María Poveda-Villalón: mpoveda@fi.upm.es Date: 10/28/13
  • 2. Table of Contents •  Introduction •  Good practices and pitfalls for publishing vocabularies •  Results and Analysis over LOV vocabularies •  Conclusions and future work Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 2
  • 3. Introduction •  Different formats: RDFS, OWL, HTML •  Different configurations •  Do they ease or impede applications consuming vocabularies? Ø  Good practices & Pitfalls Vocabularies bring semantics to data “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/” Along this work: •  Detailed analysis of 355 vocabularies gathered in the LOV registry (http://lov.okfn.org/) •  Why LOV: complete information about each vocabulary, namely URI, namespace and prefix •  Results: 1.  a non exhaustive list of good practices and pitfalls about publishing LD vocabularies 2.  specific methods for detecting such good practices and pitfalls 3.  some metadata about ontology quality 4.  the inclusion of pitfalls in services such as OOPS! (http://www.oeg-upm.net/oops) to help eager vocabulary managers Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 3
  • 4. Table of Contents •  Introduction •  Good practices and pitfalls for publishing vocabularies •  Previous work •  Proposed good practices and pitfalls •  Results and analysis over LOV vocabularies •  Conclusions and future work Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 4
  • 5. Previous work (I) Linked Open Data 5 Star rating system (Tim Bernes-Lee) http://www.w3.org/DesignIssues/ LinkedData.html. 2006 (last change 2009). LOD1. Available on the web (whatever format) but with an open licence, to be Open Data LOD2. Available as machine-readable structured data (e.g. excel instead of image scan of a table) LOD3. As (2) plus non-proprietary format (e.g. CSV instead of excel) LOD4. All the above plus, Use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff LOD5. All the above plus Link your data to other people’s data to provide context Is your linked data vocabulary 5-star? (Bernard Vatant) http://bvatant.blogspot.fr/2012/02/is-yourlinked-data-vocabulary-5-star_9588.html. 2012. LDV1. Publish your vocabulary on the Web at a stable URI LDV2. Provide human-readable documentation and basic metadata such as creator, publisher, date of creation, last modification, version number LDV3. Provide labels and descriptions, if possible in several languages, to make your vocabulary usable in multiple linguistic scopes LDV4. Make your vocabulary available via its namespace URI, both as a formal file and humanreadable documentation, using content negotiation LDV5. Link to other vocabularies by re-using elements rather than re-inventing Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 5
  • 6. Previous work (II) Archer, P., Goedertier, S., and Loutas, N. D7.1.3 – Study on persistent URIs, with identification of best practices and recommendations on the topic for the MSs and the EC. Deliverable. December 17, 2012. Heath, T., Bizer, C.: Linked data: Evolving the Web into a global data space (1st edition). Morgan & Claypool. 2011. Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 6
  • 7. Proposed good practices and pitfalls Proposals Inspired by Previous work brief reminder Linked Open Data 5 Star LOD1. on the web. Open. LOD2. machine-readable LOD3. non-proprietary LOD4. open standards LOD5. Link Good practices GP1. Provide RDF description GP2. Provide HTML documentation GP3. Content negotiation for RDF GP4. Content negotiation for HTML GP5. Provide vann metadata GP6. Well-established prefix Is your linked data vocabulary 5-star? LDV1. vocabulary on the Web LDV2. human-readable and metadata LDV3. labels and descriptions LDV4. content negotiation LDV5. Link Pitfalls P36. URI contains file extension P37. Ontology not available P38. No OWL ontology declaration P39. Ambiguous namespace P40. Namespace hijacking 10 rules for persistent URIs ✔ Linked data: Evolving the Web into a global data space: “Only define new terms in a namespace that you control.” ✖ • Follow the pattern • Re-use existing identifiers • Multiple representations • Implements 303 redirects • Use a dedicated server • Avoid stating ownership • Avoid version numbers • Avoid using auto-increment • Avoid query strings • Avoid file extensions Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 7
  • 8. Table of Contents •  Introduction •  Good practices and pitfalls for publishing vocabularies •  Results and analysis over LOV vocabularies •  Conclusions and future work Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 8
  • 9. Results and analysis over LOV vocabularies (I) Good practices and pitfalls frequency 355 vocabularies registered in LOV - 19th June, 2013 GP1. Provide RDF description GP2. Provide HTML documentation GP3. Content negotiation for RDF GP4. Content negotiation for HTML GP5. Provide vann metadata GP6. Well-established prefix Pitfalls distribution Good practices distribution Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 9 P36. URI contains file extension P37. Ontology not available P38. No OWL ontology declaration P39. Ambiguous namespace P40. Namespace hijacking
  • 10. Results and analysis over LOV vocabularies (I) Grid with vocabularies according to the number of good practices and pitfalls observed. Available at http://goo.gl/zu9ZbW Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 10
  • 11. Table of Contents •  Introduction •  Good practices and pitfalls for publishing vocabularies •  Results and analysis over LOV vocabularies •  Conclusions and future work Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 11
  • 12. Conclusions •  6 good practices and 5 pitfalls proposed •  Based on existing works •  Implementation of the detection methods •  Grid-based rating system proposed. Useful for: •  Vocabulary registry maintainers •  Vocabulary developers and creators •  Execution over 355 vocabularies •  All good practices and pitfalls are observed •  Some of them surprisingly (e.g.: P40. Namespace hijacking) •  LOV vocabularies seem to be well maintained and likely to be high quality . Due to semi-handcrafted maintenance instead of crawlers? Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 12
  • 13. Future work (I) Linked Open Data 5 Star LOD1. on the web. Open. LOD2. machine-readable LOD3. non-proprietary LOD4. open standards LOD5. Link •  Take into account: •  metadata about licences •  other metadata, e.g., creators, authors, dates, languages, etc. •  linguistic information •  reused terms from other vocabularies •  Provide guidelines to solve pitfalls and to follow good practices •  Execute methods over LOV in regular basis •  Observe evaluation of the ecosystem •  Draw tends for vocabulary publication Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web Is your linked data vocabulary 5-star? LDV1. vocabulary on the Web LDV2. human-readable and metadata LDV3. labels and descriptions LDV4. content negotiation LDV5. Link 13
  • 14. Future work (II) •  Integration with third party systems. E.g. •  LOV search •  OOPS! - OntOlogy Pitfall Scanner! (http://oeg-upm.net/oops/) ü  Done for pitfalls •  Assign importance levels for good practices and pitfalls ü  Done for pitfalls … … … … Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 14
  • 15. Questions? Thanks! Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web 15
  • 16. Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web María Poveda-Villalón1, Bernard Vatant2, Mari Carmen SuárezFigueroa1, Asunción Gómez-Pérez1 , 1Ontology Engineering Group. Universidad Politécnica de Madrid. Spain. 2Mondeca, Paris, France. mpoveda@fi.upm.es, bernard.vatant@mondeca.com, {mcsuarez, asun}@fi.upm.es Speaker: Asunción Gómez-Pérez Contact author: María Poveda-Villalón: mpoveda@fi.upm.es Date: 10/28/13