1.
A framework for managing vocabularies
Outcomes of the TDWG Vocabulary Management Task Group
TDWG 2013 Annual Conference, Florence, Italy
Dag Endresen*, Éamonn Ó Tuama, Gregor Hagedorn and Steve Baskauf
* GBIF-Norway, Natural History Museum of the University in Oslo
Global Biodiversity Information Facility (GBIF)
31 October 2013
2. VoMaG report
Three main sections
• Status of the TDWG ontology
• Semantic MediaWiki as a development platform
• Requirements for a framework for managing vocabularies
5. Vocabularies/ontologies
• Provide a shared understanding of what we mean when describing biodiversity entities.
• What kind of thing or property.
• A list of things we as a community can agree upon the meaning of.
• “Concept repository” with terms identified by URIs.
TDWG Technical Roadmap 2008. (TDWG TAG, convened by Roger Hyam).
6. Vocabulary governance
• Development, management and governance remain a challenge.
• Maintenance requires effort.
• Challenge: Lack of resources - No TDWG Ontology coordinator (since 2007).
• Experiment: Possible as a collaborative community approach coordinated using a Semantic Wiki…???
8. Concept vocabularies
• Collaborative development of terms at: http://terms.gbif.org/wiki/ [? terms.tdwg.org/wiki/ ?]
• Terms grouped into “concept vocabularies” each controlled by a TDWG task group.
• REUSE terms (and term URI) whenever possible!!
9. Data standards
Important principle: Re-use of terms from
standardized terminologies wherever possible.
The cartoon is from XKCD: http://xkcd.com/927/ CC-BY-NC
10. Term versus Concept
“The SKOS (simple knowledge organization system) format is designed to present
KOS data in a format that is suitable for machine inferencing and particularly for use
in the Semantic Web (….) concepts – units of thought – and distinguishes these
from the terms that are used to label these concepts.”
Will, L. (2012). The ISO 25964 Data Model for the Structure of an Information Retrieval Thesaurus. Bulletin of the American
Society for Information Science and Technology 38(4): 48-51.
Dextre Clarke, S.G. and L. Zeng (2012). From ISO 2788 to ISO 25964: the evolution of thesaurus Standards towards
Interoperability and data modeling. ISQ Information Standards Quarterly 24(1): 20-26.
11. Vocabularies/ontologies
• What kind of thing or property
• Concept (skos:Concept)
• Reused as:
  • Class (rdfs:Class) [?]
  • Property (rdf:Property)
  • Controlled element values
12. Why use a flat vocabulary?
• Maximize the reuse of terms, focus on the definition
and labels for basic terms.
• Low threshold for non-technical biologists and
biodiversity domain experts to access terms and
contribute (compared to richer ontologies).
• Preferred technology: RDF (resource description
framework) and SKOS (simple knowledge organization
system).
• Construction and maintenance of OWL ontologies are
demanding with respect to expertise, effort and costs.
• Maintaining SKOS vocabularies is less demanding.
• RDF resources are designed to be easily extended.
• Ontologies (OWL) can be based on (extend) terms
declared by an RDF/SKOS vocabulary.
• SKOS became a W3C recommendation in 2009.
13. Darwin Core – a glossary of terms
Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, De Giovanni R, Robertson T, and Vieglais D (2012)
Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7(1): e29715.
doi:10.1371/journal.pone.0029715
14. Vocabulary repository
• Establish a repository for “concept vocabularies”.
• Example at: http://rs.gbif.org/terms/ [? rs.tdwg.org/terms/ ?]
Photo CC-by-3.0 by Hannes Grobe/AWI. Palaeoclimate archives: Core repository of AWI,
http://commons.wikimedia.org/wiki/File:Core-repository_hg.jpg
15. Work-flow for Vocabulary management
1. Mint and maintain concepts and terms in domain-expert working groups,
using e.g. http://terms.gbif.org/wiki/ as a collaboration platform.
2. Release final version as a Concept Vocabulary.
E.g. publish vocabulary at the GBIF or a TDWG Resources Repository.
REUSE terms from published concept vocabularies and other ontologies
when designing new application schema such as DwC-A controlled term and
value vocabularies.
[Diagram: Term Wiki for vocabulary development (http://terms.gbif.org/wiki/); Concept Vocabulary (rdf, skos); Resources Repository (http://rs.gbif.org/terms/)]
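The "reuse before minting" step of this workflow can be sketched as a simple lookup. The following is an illustration under stated assumptions, not a description of the GBIF or TDWG tooling: the function names, the in-memory `published` dictionary, the alternative label "Collector" and the `http://example.org/draft/` namespace are all hypothetical. The sketch checks whether a proposed label already matches a published term and only proposes a new draft URI when it does not.

```python
def normalize(label: str) -> str:
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def find_or_mint(label: str, published: dict, draft_base: str):
    """Return (uri, reused).

    published maps term URIs to lists of known labels. If the label matches
    a published term, its URI is returned with reused=True. Otherwise a
    camelCase draft URI is proposed under draft_base with reused=False.
    """
    key = normalize(label)
    if not key:
        raise ValueError("label must not be empty")
    for uri, labels in published.items():
        if key in {normalize(known) for known in labels}:
            return uri, True
    words = key.split()
    local_name = words[0] + "".join(word.capitalize() for word in words[1:])
    return draft_base + local_name, False


# Illustrative registry: the URI is a Darwin Core term, the labels are assumed.
published = {
    "http://rs.tdwg.org/dwc/terms/recordedBy": ["Recorded By", "Collector"],
}

reused = find_or_mint("recorded  by", published, "http://example.org/draft/")
minted = find_or_mint("Seed weight", published, "http://example.org/draft/")
print(reused)
print(minted)
```

Exact label matching is deliberately naive. In practice the decision to reuse a term rests on comparing definitions, which is a task for the domain-expert working group rather than for string comparison.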
16. Example: master SKOS/RDF resource
[Figure: excerpt of the master SKOS/RDF resource with labels in en, es, zh and ja]
http://rs.gbif.org/terms/dwc/dwc_translations.rdf
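A master resource of this kind, with one label per language for each term, can be read with the Python standard library. The sketch below is illustrative only: the embedded `SAMPLE` document is an assumption about what such RDF/XML might look like (concepts as `skos:Concept` elements with `skos:prefLabel` children), not a copy of the actual file, and the translations are written for the example. The function name `labels_by_language` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
# ElementTree exposes namespaced attributes in {namespace}name form.
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed structure and illustrative translations, not the real file content.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://rs.tdwg.org/dwc/terms/scientificName">
    <skos:prefLabel xml:lang="en">Scientific Name</skos:prefLabel>
    <skos:prefLabel xml:lang="es">Nombre científico</skos:prefLabel>
    <skos:prefLabel xml:lang="zh">学名</skos:prefLabel>
    <skos:prefLabel xml:lang="ja">学名</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
"""


def labels_by_language(rdf_xml: str) -> dict:
    """Map each concept URI to a {language tag: preferred label} dictionary."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for concept in root.findall("skos:Concept", NS):
        uri = concept.get(RDF_ABOUT)
        result[uri] = {
            label.get(XML_LANG): label.text
            for label in concept.findall("skos:prefLabel", NS)
        }
    return result


labels = labels_by_language(SAMPLE)
for lang, text in sorted(labels["http://rs.tdwg.org/dwc/terms/scientificName"].items()):
    print(lang, text)
```

Reading RDF/XML as plain XML only works when the serialization follows the element layout assumed here; a dedicated RDF parser would be more robust for files whose structure is not known in advance.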
18. BioPortal ontology repository
A biodiversity “slice” was established at the NCBO
BioPortal.
• Loading biodiversity ontologies into the NCBO
BioPortal promotes mapping (and reuse of terms)
between bio-medical and biodiversity ontologies.
http://bis.bioportal.bioontology.org/ontologies?filter=BIS
19. Agenda
Time / Title / Presenter
09.00 – 09.05 / Introduction to VoMaG / Éamonn Ó Tuama
09.05 – 09.20 / The TDWG Ontologies / Steve Baskauf
09.20 – 09.35 / A framework for managing vocabularies / Dag Endresen
09.35 – 09.45 / Semantic MediaWiki - a community platform for vocabulary development / Gregor Hagedorn
09.45 – 09.55 / Developing a vocabulary: the SINP use case / Julie Chataigner & Laurent Poncet
09.55 – 10.05 / Modeling property values: Is SKOS part of the solution or part of the problem? / Robert A. Morris
10.05 – 10.30 / General discussion