This workshop will introduce participants to Linked Data, a key semantic web technology, and its uses in the digital humanities. Through examples of Linked Data websites and applications, we will explore how Linked Data is being used by individual digital humanities scholars, by organisations such as the BBC and the Central Statistics Office, and by cultural heritage institutions worldwide. We will make comparisons to other approaches to structuring data (including markup and metadata approaches such as TEI and XML) and discuss best practices for creating and reusing Linked Data (such as the importance of identifiers and standard vocabularies). Participants will also be introduced to tools for creating and exploring Linked Data. The workshop will also include a hands-on exercise in creating Linked Data.
Linked Data in the Digital Humanities was a Skills Workshop (http://dri.ie/skills-workshops), part of Realising the Opportunities of Digital Humanities (http://dri.ie/realising-opportunities-digital-humanities).
Presenters: Jodi Schneider and Michael Hausenblas, with support from Stefan Decker, Nuno Lopes, and Bahareh Heravi, all of the Digital Enterprise Research Institute, National University of Ireland Galway.
Linked data in the digital humanities skills workshop for realising the opportunities of the digital humanities 2012.10.25
1. Digital Enterprise Research Institute www.deri.ie
Linked Data in the Digital Humanities
Jodi Schneider & Michael Hausenblas
with Stefan Decker & Nuno Lopes
Realising the Opportunities of Digital Humanities Thursday 25th October 2012
National University of Ireland, Maynooth
Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
Enabling Networked Knowledge
3. What is Linked Data? Why use it? What are some
examples?
How do Linked Data applications differ from
conventional ones?
How is Linked Data different from other structured
data used in digital humanities? (e.g. TEI, XML)
What are the best practices for creating Linked Data?
4. Using identifiers
to enable access
to add structure
to link to other stuff
7. A “Web” where
documents are available for download on the Internet
but there would be no hyperlinks among them
Slide credit: Ivan Herman
8.
9.
10. We need a proper infrastructure for a real Web of Data
data are available on the Web
• accessible via standard Web technologies
data are interlinked over the Web
i.e., data can be integrated over the Web
We need Linked Data
Slide credit: Ivan Herman
12. Mass Media
BBC
New York Times
Guardian
Scholarly Publishers
Nature
CrossRef
Data Publishers
USData.gov
Data.gov.uk
Central Statistics Office
Libraries
29. Using identifiers
to enable access
to add structure
to link to other stuff
30. Remember, everything must nest properly!
Document
Paragraph Paragraph
Sentence Sentence Sentence Sentence Sentence
We use family tree terms: parent, child, sibling, ancestor, and descendant.
Slide credit: Susan Schreibman
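The nesting rule can be checked mechanically. The following is a minimal sketch using Python's standard XML parser; the element names (document, paragraph, sentence) simply mirror the labels on the slide and are not taken from TEI or any other schema.

```python
import xml.etree.ElementTree as ET

# Properly nested: every element closes inside its parent.
well_formed = (
    "<document>"
    "<paragraph><sentence>One.</sentence><sentence>Two.</sentence></paragraph>"
    "<paragraph><sentence>Three.</sentence></paragraph>"
    "</document>"
)

# Overlapping tags: <sentence> is closed after its parent <paragraph>.
overlapping = (
    "<document><paragraph><sentence>One.</paragraph></sentence></document>"
)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(well_formed)
# The children of <document> are siblings of one another.
siblings = [child.tag for child in root]
# Every element below the root is a descendant of <document>.
descendants = [element.tag for element in root.iter()][1:]

print(is_well_formed(well_formed), is_well_formed(overlapping))
print(siblings, descendants)
```

The parser rejects the overlapping version outright, which is why markup that needs overlapping structures has to use workarounds rather than plain nesting.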
32. structural divisions within a text
title-page, chapter, scene, stanza, line, etc
typographical elements
changes in typeface, special characters, etc
other textual features
grammatical structures, location of illustrations, variant
forms, etc
Slide credit: Susan Schreibman
33. structural divisions within a text
title-page, chapter, scene, stanza, line, etc
typographical elements
changes in typeface, special characters, etc
other textual features
grammatical structures, location of illustrations, variant
forms, etc
Slide credit: Susan Schreibman
43. A Uniform Resource Identifier (URI) is a compact
sequence of characters that identifies an abstract
or physical resource. [RFC3986]
Syntax
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
Example
foo://example.com:8042/over/there?name=ferret#nose
\_/   \______________/\_________/ \_________/ \__/
 |           |            |            |        |
scheme   authority       path        query   fragment
Slide Credit: Michael Hausenblas
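As a quick illustration, the same example URI can be split into its five components with Python's standard library. This is only a sketch: `urlsplit` calls the authority component `netloc`, and the dictionary keys below are chosen to match the labels on the slide.

```python
from urllib.parse import urlsplit

# The example URI from RFC 3986, as shown on the slide.
parts = urlsplit("foo://example.com:8042/over/there?name=ferret#nose")

# Map the library's attribute names onto the RFC's component names.
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:<10}{value}")
```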
67. What is Linked Data? Why use it? What are some
examples?
How do Linked Data applications differ from
conventional ones?
68. [Diagram comparing two back-ends. Conventional back-end: an app with its own records. Linked Data back-end: an app with a core record connected to a DBpedia record, a Europeana record, a British Library record, and a data.gov.ie record.]
69. What is Linked Data? Why use it? What are some
examples?
How do Linked Data applications differ from
conventional ones?
How is Linked Data different from other
structured data used in digital humanities? (e.g.
TEI, XML)
71. What is Linked Data? Why use it? What are some
examples?
How do Linked Data applications differ from
conventional ones?
How is Linked Data different from other structured
data used in digital humanities? (e.g. TEI, XML)
What are the best practices for creating Linked
Data?
72. What is Linked Data? Why use it? What are some
examples?
How do Linked Data applications differ from
conventional ones?
How is Linked Data different from other structured
data used in digital humanities? (e.g. TEI, XML)
What are the best practices for creating Linked Data?
88. Digital Enterprise Research Institute www.deri.ie
Thanks to our funders!
99. AKA “An introduction to the Semantic Web (Through an Example)” by Ivan Herman
100. Map the various data onto an abstract data
representation
make the data independent of its internal representation…
Merge the resulting representations
Start making queries on the whole!
queries not possible on the individual data sets
101.
102.
ISBN        Author  Title             Publisher  Year
000651409X  id_xyz  The Glass Palace  id_qpr     2000

ID      Name           Homepage
id_xyz  Ghosh, Amitav  http://www.amitavghosh.com

ID      Publisher’s name  City
id_qpr  Harper Collins    London
103. [Graph of dataset “A”. Nodes: http://…isbn/000651409X, “The Glass Palace”, 2000, “London”, “Harper Collins”, “Ghosh, Amitav”, http://www.amitavghosh.com. Edge labels: a:author, a:name, a:homepage.]
104. Relations form a graph
the nodes refer to the “real” data or contain some literal
how the graph is represented in machine is immaterial for now
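One way to picture the mapping from the three tables to a graph is as a set of (subject, predicate, object) triples. The sketch below is plain Python rather than an RDF library. The `example.org` prefix is a hypothetical stand-in for the elided URI on the slide, and the predicate names other than a:author, a:name, and a:homepage (a:title, a:year, a:publisher, a:p_name, a:city) are assumptions made for illustration.

```python
# Hypothetical stand-in for the elided "http://…isbn/" prefix on the slides.
ISBN_BASE = "http://example.org/isbn/"

# The three tables of dataset "A", as rows keyed by column name.
books = [{"isbn": "000651409X", "author": "id_xyz", "title": "The Glass Palace",
          "publisher": "id_qpr", "year": "2000"}]
authors = {"id_xyz": {"name": "Ghosh, Amitav",
                      "homepage": "http://www.amitavghosh.com"}}
publishers = {"id_qpr": {"name": "Harper Collins", "city": "London"}}


def to_triples(books, authors, publishers):
    """Map the three tables onto (subject, predicate, object) triples."""
    triples = set()
    for book in books:
        subject = ISBN_BASE + book["isbn"]
        # Internal table IDs become local (unnamed) nodes in the graph.
        author_node = "_:" + book["author"]
        publisher_node = "_:" + book["publisher"]
        author = authors[book["author"]]
        publisher = publishers[book["publisher"]]
        triples |= {
            (subject, "a:title", book["title"]),
            (subject, "a:year", book["year"]),
            (subject, "a:author", author_node),
            (subject, "a:publisher", publisher_node),
            (author_node, "a:name", author["name"]),
            (author_node, "a:homepage", author["homepage"]),
            (publisher_node, "a:p_name", publisher["name"]),
            (publisher_node, "a:city", publisher["city"]),
        }
    return triples


graph_a = to_triples(books, authors, publishers)
for triple in sorted(graph_a):
    print(triple)
```

Once the data is in this form, the original table layout no longer matters: each triple is one edge of the graph.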
105.
106.
     A                    B                      C           D
1    ID                   Titre                  Traducteur  Original
2    ISBN 2020386682      Le Palais des Miroirs  $A12$       ISBN 0-00-651409-X
3
4
5
6    ID                   Auteur
7    ISBN 0-00-651409-X   $A11$
8
9
10   Nom
11   Ghosh, Amitav
12   Besse, Christianne
107. [Graph of dataset “F”. Nodes: http://…isbn/2020386682, “Le palais des miroirs”, http://…isbn/000651409X, “Ghosh, Amitav”, “Besse, Christianne”. Edge labels: f:auteur, f:traducteur, f:nom (twice).]
108. [The graphs of datasets “A” and “F” shown together, each with its own node for http://…isbn/000651409X. Edge labels: a:author, a:name, a:homepage, f:auteur, f:traducteur, f:nom (twice).]
109. [The same two graphs, with the two http://…isbn/000651409X nodes marked “Same URI!”]
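Because both datasets use the same URI for the original book, merging can be sketched as a plain set union of triples. This is an illustration in plain Python, not a definitive implementation: the `example.org` URIs are hypothetical stand-ins for the elided URIs on the slides, and a:title and f:titre are assumed predicate names.

```python
# Hypothetical stand-ins for the elided "http://…isbn/…" URIs on the slides.
ORIGINAL = "http://example.org/isbn/000651409X"
TRANSLATION = "http://example.org/isbn/2020386682"

graph_a = {
    (ORIGINAL, "a:title", "The Glass Palace"),
    (ORIGINAL, "a:author", "_:a_author"),
    ("_:a_author", "a:name", "Ghosh, Amitav"),
    ("_:a_author", "a:homepage", "http://www.amitavghosh.com"),
}

graph_f = {
    (TRANSLATION, "f:titre", "Le palais des miroirs"),
    (TRANSLATION, "f:original", ORIGINAL),
    (TRANSLATION, "f:traducteur", "_:f_translator"),
    (ORIGINAL, "f:auteur", "_:f_author"),
    ("_:f_author", "f:nom", "Ghosh, Amitav"),
    ("_:f_translator", "f:nom", "Besse, Christianne"),
}

# Merging is a set union: identical URIs collapse into a single node.
merged = graph_a | graph_f

# Statements from both datasets now hang off the one node for the original.
about_original = sorted(p for s, p, o in merged if s == ORIGINAL)
print(about_original)
```

No extra work is needed to connect the two graphs: the shared identifier does it.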
110. [Merged graph: the two http://…isbn/000651409X nodes become one, joining “The Glass Palace” (2000, London, Harper Collins) to “Le palais des miroirs” (http://…isbn/2020386682). Edge labels: a:author, a:name, a:homepage, f:original, f:auteur, f:traducteur, f:nom (twice).]
111. User of data “F” can now ask queries like:
“give me the title of the original”
• well, … « donne-moi le titre de l’original »
This information is not in the dataset “F”…
…but can be retrieved by merging with dataset “A”!
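The query can be sketched as a two-step walk over triples: follow f:original from the translation, then read the title of whatever it points to. The code below is a plain-Python illustration under assumptions: the `example.org` URIs are hypothetical stand-ins for the elided ones, a:title is an assumed predicate name, and the helper names are invented for this example.

```python
# Hypothetical stand-ins for the elided "http://…isbn/…" URIs on the slides.
ORIGINAL = "http://example.org/isbn/000651409X"
TRANSLATION = "http://example.org/isbn/2020386682"

graph_a = {(ORIGINAL, "a:title", "The Glass Palace")}
graph_f = {(TRANSLATION, "f:original", ORIGINAL)}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def title_of_original(graph, translation):
    """Follow f:original from the translation, then read a:title."""
    titles = set()
    for original in objects(graph, translation, "f:original"):
        titles |= objects(graph, original, "a:title")
    return titles


in_f_only = title_of_original(graph_f, TRANSLATION)
in_merged = title_of_original(graph_a | graph_f, TRANSLATION)
print(in_f_only, in_merged)
```

On dataset “F” alone the walk finds the original but no title for it; on the merged graph the title from dataset “A” becomes reachable.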
112. We “feel” that a:author and f:auteur should be the same
But an automatic merge does not know that!
Let us add some extra information to the merged data:
a:author same as f:auteur
both identify a “Person”
a term that a community may have already defined:
• a “Person” is uniquely identified by his/her name and, say, homepage
• it can be used as a “category” for certain types of resources
113. [Merged graph with the extra statements added: two r:type edges now point to http://…foaf/Person. Other edge labels as before: a:author, a:name, a:homepage, f:original, f:auteur, f:traducteur, f:nom (twice).]
114. User of dataset “F” can now query:
“donne-moi la page d’accueil de l’auteur de l’original”
• well… “give me the home page of the original’s ‘auteur’”
The information is not in datasets “F” or “A”…
…but was made available by:
merging datasets “A” and “F”
adding three simple extra statements as an extra “glue”
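To show how a “glue” statement changes what a query can reach, here is a minimal plain-Python sketch. It models only one of the three extra statements (a:author same as f:auteur). The `same_as` label, the helper names, and the `example.org` URIs are all assumptions made for illustration, not terms from a standard vocabulary or from the slides.

```python
# Hypothetical stand-ins for the elided "http://…isbn/…" URIs on the slides.
ORIGINAL = "http://example.org/isbn/000651409X"
TRANSLATION = "http://example.org/isbn/2020386682"

merged = {
    (ORIGINAL, "a:author", "_:a_author"),
    ("_:a_author", "a:name", "Ghosh, Amitav"),
    ("_:a_author", "a:homepage", "http://www.amitavghosh.com"),
    (TRANSLATION, "f:original", ORIGINAL),
    (ORIGINAL, "f:auteur", "_:f_author"),
    ("_:f_author", "f:nom", "Ghosh, Amitav"),
}

# One "glue" statement; "same_as" is an illustrative label only.
glue = {("a:author", "same_as", "f:auteur")}


def equivalents(predicate, glue):
    """The predicate plus every predicate the glue declares equal to it."""
    same = {predicate}
    for s, p, o in glue:
        if p == "same_as" and predicate in (s, o):
            same |= {s, o}
    return same


def objects(graph, subject, predicate, glue=frozenset()):
    """Objects reachable from subject via the predicate or an equivalent."""
    wanted = equivalents(predicate, glue)
    return {o for s, p, o in graph if s == subject and p in wanted}


def homepage_of_original_author(graph, translation, glue=frozenset()):
    """Walk f:original, then f:auteur, then a:homepage."""
    homepages = set()
    for original in objects(graph, translation, "f:original", glue):
        for person in objects(graph, original, "f:auteur", glue):
            homepages |= objects(graph, person, "a:homepage", glue)
    return homepages


without_glue = homepage_of_original_author(merged, TRANSLATION)
with_glue = homepage_of_original_author(merged, TRANSLATION, glue)
print(without_glue, with_glue)
```

Without the glue, f:auteur leads only to the node from dataset “F”, which has no homepage; with it, the same step also reaches the author node from dataset “A”.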
115. Using, e.g., the “Person”, the dataset can be
combined with other sources
For example, data in Wikipedia can be extracted
using dedicated tools
e.g., the “dbpedia” project can extract the “infobox” information
from Wikipedia already…
116. [The merged graph extended with DBpedia data: a new node http://dbpedia.org/../Amitav_Ghosh, new edge labels foaf:name and w:reference, and a third r:type edge. Other nodes and edge labels as before.]
117. [The same graph extended further: http://dbpedia.org/../Amitav_Ghosh has w:author_of edges to http://dbpedia.org/../The_Glass_Palace, http://dbpedia.org/../The_Hungry_Tide, and http://dbpedia.org/../The_Calcutta_Chromosome; a w:isbn edge also appears. Other nodes and edge labels as before.]
118. [The same graph extended again: a w:born_in edge to http://dbpedia.org/../Kolkata, which carries w:long and w:lat edges. Other nodes and edge labels as before.]
119. It may look like it but, in fact, it should not be…
What happened via automatic means is done every
day by Web users!
The difference: a bit of extra rigour so that machines
could do this, too
120. We could add extra knowledge to the merged
datasets
e.g., a full classification of various types of library data
geographical information
etc.
This is where ontologies, extra rules, etc, come in
ontologies/rule sets can be relatively simple and small, or
huge, or anything in between…
Even more powerful queries can be asked as a result
121. [Layer diagram, top to bottom: Applications; Manipulate, Query, …; Data represented in abstract format; Map, Expose, …; Data in various formats.]
We were inspired by Ivan Herman’s presentations: http://www.w3.org/People/Ivan/CorePresentations/
Several sections of this presentation are taken from “An introduction to the Semantic Web (Through an Example)”, available in PDF and at http://www.w3.org/People/Ivan/CorePresentations/IntroThroughExample/ and described as:
“The targeted audience are more the managers, whose technical background is not great and do not want to hear or see, for example, any XML code… But want to have an idea of what this beast is all about. It is done by showing how data integration works in very general terms and through an (artificial) example.”
FYI, those slides are also part (roughly the first 40 slides) of the “Tutorial on Semantic Web Technologies”, PDF and PPT available from http://www.w3.org/People/Ivan/CorePresentations/SWTutorial/ and described as:
“Full, cca. 3.5–4 hours’ worth of introductory tutorial to Semantic Web. The targeted audience is techies who have a good knowledge of the Web in general, have programming experience or at least minimal knowledge, and want to understand what this beast is all about…”
Using identifiers: to enable access, to add structure, to link to other stuff
identifiers for machines – display in any language
Identifiers: Exercises
For each of the examples here, answer the following questions:
What does it identify?
In what context is this identifier unique?
What would be needed to make this identifier unique on the Web?
http://kcoyle.net/metadata/3exercises.html
All 15 elements (http://dublincore.org/documents/dces/): Contributor, Coverage, Creator, Date, Description, Format, Identifier, Language, Publisher, Relation, Rights, Source, Subject, Title, Type
Slide credit: Ivan Herman
Slide by Michael Hausenblas
Based on a slide by Siegfried Handschuh