This presentation evaluates metadata usage and distribution in a Linked Data environment. It analyzes datasets from the DM2E project that different institutions mapped from manuscript metadata to the DM2E model, a specialization of the Europeana Data Model (EDM). The evaluation aims to discover similarities and differences between datasets from different mapping institutions. It finds variations in the usage of classes, properties, and ontologies, and in structural metrics such as the predicate-object-equality-ratio. It concludes that Linked Data quality assurance is vital and that people strongly influence metadata mapping.
1. co-funded by the European Union
Linked Data Mapping Cultures
An Evaluation of Metadata Usage and Distribution in a Linked Data Environment
Konstantin Baierer, Evelyn Dröge, Vivien Petras, Violeta Trkulja
Berlin School of Library and Information Science, Humboldt-Universität zu Berlin
Presentation at the International Conference on Dublin Core and Metadata Applications, Austin, October 9, 2014
2. Outline
Linked Data Mapping Cultures
2
09.10.2014
1. Linked Data mapping cultures
2. Digitised Manuscripts to Europeana
3. EDM and DM2E model
4. Evaluation: aim, datasets, methods
5. Results of the evaluation
6. Conclusion
3. Linked Data mapping cultures
•Linked Data offers great expressivity
With great freedom comes great responsibility
•Data in DM2E:
–Different data formats
–Different data curation background = Different cultures in Linked Data
•Data providers ≠ data mapping institutions
•Mapping is influenced by policies, technology, best practices, personal preferences…
4. Digitised Manuscripts to Europeana (DM2E)
Heterogeneous object data in independent resources
5. EDM and DM2E model
EDM = Europeana Data Model
•Used to describe Cultural Heritage Objects (CHOs)
•Very generic but can be specialized
DM2E model: Specialization of EDM for manuscripts
dm2e: <http://onto.dm2e.eu/schemas/dm2e/1.0/>
dm2edata: <http://data.dm2e.eu/data/>
edm: <http://www.europeana.eu/schemas/edm/>
6. DM2E model: Example
[Figure: RDF graph of an example record; recoverable nodes, properties and labels:]
foaf:Person dm2edata:agent/uib/wab/Ludwig_Wittgenstein
ore:Aggregation dm2edata:aggregation/uib/wab/Ms-115/Ms-115-2
skos:prefLabel
“Ludwig Wittgenstein”@de
“remark Ms-115,1[2]et2[1] from Wittgenstein Nachlass MS 115”@en
edm:ProvidedCHO dm2edata:item/uib/wab/Ms-115/Ms-115-2
foaf:Organization dm2edata:agent/uib/wab/Wittgenstein_Archives
edm:WebResource http://wab.uib.no/cost-a32_fax/115/Ms-115%2c1.jpg
dm2e:Paragraph
dc:type
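The node types in this example can be written down as plain subject, predicate, object triples. The sketch below is a minimal illustration in Python using only tuples and a prefix table. The dm2e, dm2edata and edm namespaces come from the slide; the remaining namespace URIs are the standard ones for those vocabularies. Attaching the skos:prefLabel literal to the person node is an assumption read off the diagram layout, and the helper name expand is hypothetical.

```python
# Prefix table: dm2e, dm2edata and edm are taken from the slide;
# the others are the standard namespace URIs of those vocabularies.
PREFIXES = {
    "dm2e": "http://onto.dm2e.eu/schemas/dm2e/1.0/",
    "dm2edata": "http://data.dm2e.eu/data/",
    "edm": "http://www.europeana.eu/schemas/edm/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ore": "http://www.openarchives.org/ore/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(curie):
    """Expand a prefixed name such as 'edm:ProvidedCHO' to a full URI (hypothetical helper)."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


# Type statements for the nodes shown on the slide.
TRIPLES = [
    ("dm2edata:aggregation/uib/wab/Ms-115/Ms-115-2", "rdf:type", "ore:Aggregation"),
    ("dm2edata:item/uib/wab/Ms-115/Ms-115-2", "rdf:type", "edm:ProvidedCHO"),
    ("dm2edata:agent/uib/wab/Ludwig_Wittgenstein", "rdf:type", "foaf:Person"),
    ("dm2edata:agent/uib/wab/Wittgenstein_Archives", "rdf:type", "foaf:Organization"),
]

# Assumption: the skos:prefLabel literal belongs to the person node.
LABELS = [
    ("dm2edata:agent/uib/wab/Ludwig_Wittgenstein", "skos:prefLabel", ("Ludwig Wittgenstein", "de")),
]

if __name__ == "__main__":
    # Print every type statement with fully expanded URIs.
    for s, p, o in TRIPLES:
        print(expand(s), expand(p), expand(o))
```

Keeping the prefixed names in the data and expanding them only on output mirrors how the slide itself abbreviates URIs.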
7. Aim of the evaluation
•Evaluation of datasets from the DM2E project
–Based on mappings to the DM2E model
•Aim: discover similarities and differences between datasets from different mapping institutions
Do mapping preferences of individual institutions influence the resulting data from a mapping process?
8. Analyzed datasets
•Datasets as of May 1, 2014
•Analyzed datasets:
–Eight data providers (DP I – DP VIII)
–Ten datasets (Dataset 1 – Dataset 10)
–Six mapping institutions (MI A – MI F)
–Variety of metadata formats
DP        Dataset     Metadata format     MI
DP I      Dataset 1   proprietary format  MI A
DP I      Dataset 2   proprietary format  MI A
DP II     Dataset 3   MAB2                MI B
DP II     Dataset 4   MAB2                MI B
DP III    Dataset 5   METS/MODS           MI C
DP IV     Dataset 6   METS/MODS           MI C
DP V      Dataset 7   TEI P5              MI D
DP VI     Dataset 8   EAD                 MI D
DP VII    Dataset 9   TEI P5              MI E
DP VIII   Dataset 10  TEI P5              MI F
DP: Data provider
MI: Mapping institution
18. Average number of statements (ANOS)
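This transcript names the metric without giving its formula. As a rough sketch, assuming ANOS is the number of statements (triples) divided by the number of distinct subject resources, it could be computed in Python as follows. The function name and the sample triples are hypothetical and are not DM2E data.

```python
from collections import Counter


def average_number_of_statements(triples):
    """Assumed definition: total triples / number of distinct subjects."""
    per_subject = Counter(subject for subject, _, _ in triples)
    if not per_subject:
        # No statements at all: define the average as 0.0.
        return 0.0
    return sum(per_subject.values()) / len(per_subject)


# Hypothetical sample data, for illustration only.
sample = [
    ("item/1", "dc:title", "A"),
    ("item/1", "dc:type", "dm2e:Paragraph"),
    ("item/1", "dc:language", "de"),
    ("item/2", "dc:title", "B"),
]

if __name__ == "__main__":
    print(average_number_of_statements(sample))
```

Under this assumed definition, comparing the value across datasets would show how densely each mapping institution describes its resources.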
19. Conclusion
•Linked Data quality assurance is vital
•Structural metrics help everybody
•Ontology engineering as a cyclic process
•“Ontology pruning”
•People > data in metadata mapping
20. Thank you for your attention!
Konstantin Baierer
Evelyn Dröge
Berlin School of Library and Information Science
Humboldt-Universität zu Berlin
www.ibi.hu-berlin.de
Digitised Manuscripts to Europeana
www.dm2e.eu
konstantin.baierer@ibi.hu-berlin.de
evelyn.droege@ibi.hu-berlin.de