Linked Open Data Utrecht University Library
1. December 2021
dr. Ruben Schalk
Subject Specialist History & Digital Humanities
Utrecht University Library
r.schalk@uu.nl
https://www.uu.nl/staff/RSchalk
Linked (Open) Data for researchers and libraries:
intro & showcase
2. WHY THIS PRESENTATION
• (University) Libraries ideally positioned:
Metadata experts
Disclosure and accessibility of collections
Different collections & formats
Networks to facilitate use of metadata standards
Open Science & Open Access
FAIR research support/ workflow
Research output management
Links up with Digital Humanities support
• But requires additional skills
4. 1. What is Linked Data?
‘Linked Data is structured data which is interlinked with other data so it
becomes more useful through semantic queries’
Source: Wikipedia
I almost get it…?
5. 1. What is Linked Data: example
c_code  country_name          gdp_capita  year
634     Qatar                 156029      2015
578     Norway                82713       2015
784     United Arab Emirates  74746       2015
414     Kuwait                71354       2015
702     Singapore             65660       2015
756     Switzerland           59307       2015
442     Luxembourg            55972       2015
372     Ireland               54278       2015
840     United States         52591       2015
702 Singapore 65660 2015
?
6. 1. What is Linked Data: from rows to graphs
702 Singapore 65660 2015
[Graph: the row redrawn as labelled edges]
Singapore --rdf:type--> Country
Singapore --clio:hasGDP--> 65660
Singapore --schema:observationDate--> 2015
Singapore --schema:country_code--> 702
Annotations on the country code:
skos:label --> “code that represents a country”
prov:wasDerivedFrom --> DOI to paper about this code
Data are now semantically defined:
Codebook inherent to data
Human readable
Machine readable
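The step from a table row to graph edges can be sketched in a few lines of Python. This is only an illustration: the base URI is a hypothetical placeholder, and the column-to-predicate mapping simply reuses the prefixed names shown on the slide.

```python
import csv
import io

# Hypothetical namespace for this sketch; not a real published dataset.
BASE = "https://example.org/mydata/country/"

# Column -> predicate mapping, reusing the prefixed names from the slide.
PREDICATES = {
    "gdp_capita": "clio:hasGDP",
    "year": "schema:observationDate",
    "c_code": "schema:country_code",
}


def row_to_triples(row):
    """Turn one table row (a dict) into (subject, predicate, object) tuples."""
    subject = BASE + row["country_name"].replace(" ", "_")
    triples = [(subject, "rdf:type", "Country")]
    for column, predicate in PREDICATES.items():
        triples.append((subject, predicate, row[column]))
    return triples


# The Singapore row from the table, read as CSV.
table = io.StringIO(
    "c_code,country_name,gdp_capita,year\n"
    "702,Singapore,65660,2015\n"
)
triples = [t for row in csv.DictReader(table) for t in row_to_triples(row)]
for s, p, o in triples:
    print(s, p, o)
```

Each column becomes one edge leaving the country node, so the column header (the "codebook") travels with the value.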
7. 1. What is Linked Data: travel linked data graph
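[Graph: the Singapore node linked to data outside the dataset]
Singapore --rdf:type--> Country
Singapore --clio:hasGDP--> 65660
Singapore --schema:observationDate--> 2015
Singapore --schema:country_code--> 702
Singapore --owl:sameAs--> Wikipedia: Singapore
702 <--schema:country_code-- Some other economic indicators on Singapore in another dataset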
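Travelling the graph means following a shared identifier from one dataset into another. A minimal sketch, assuming two tiny in-memory graphs; every identifier below (`mydata:`, `other:`, `ex:someIndicator`) and the indicator value are hypothetical placeholders, not real data.

```python
# Two tiny graphs as sets of (subject, predicate, object) tuples.
# All identifiers and the indicator value are made up for illustration.
my_data = {
    ("mydata:Singapore", "rdf:type", "Country"),
    ("mydata:Singapore", "clio:hasGDP", "65660"),
    ("mydata:Singapore", "schema:country_code", "702"),
    ("mydata:Singapore", "owl:sameAs", "wikipedia:Singapore"),
}
other_data = {
    ("other:SGP", "schema:country_code", "702"),
    ("other:SGP", "ex:someIndicator", "placeholder value"),
    ("other:NOR", "schema:country_code", "578"),
}


def objects(graph, subject, predicate):
    """All objects of triples matching (subject, predicate, ?)."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def linked_facts(subject, local, remote, key="schema:country_code"):
    """Facts in `remote` about nodes sharing a `key` value with `subject`."""
    codes = objects(local, subject, key)
    matches = {s for s, p, o in remote if p == key and o in codes}
    return sorted(t for t in remote if t[0] in matches and t[1] != key)


facts = linked_facts("mydata:Singapore", my_data, other_data)
print(facts)
```

The country code 702 acts as the bridge: nothing in `other_data` mentions `mydata:Singapore`, yet its facts become reachable.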
8. 1. What is Linked Data: building blocks
SUBJECT PREDICATE OBJECT
Singapore owl:sameAs Wikipedia: Singapore
Singapore schema:observationDate 2015
Basically: a statement or a fact
Often written as N-Triples:
<https://uu.nl/datasets/mydata/country/Singapore> <http://schema.org/observationDate> "2015"^^<http://www.w3.org/2001/XMLSchema#gYear> .
<https://uu.nl/datasets/mydata/country/Singapore> <http://www.w3.org/2002/07/owl#sameAs> <https://en.wikipedia.org/wiki/Singapore> .
Or use prefixes (Turtle notation):
mydata:Singapore schema:observationDate "2015"^^xsd:gYear .
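To make the notation concrete, here is a small Python sketch that writes one statement as an N-Triples line. The helper names (`iri`, `literal`, `ntriple`) are invented for this example, and the escaping is deliberately simplified: it handles only backslashes and double quotes.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype=None):
    """Quote a literal; simplified escaping (backslash and double quote only)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    text = f'"{escaped}"'
    return f"{text}^^<{datatype}>" if datatype else text


def ntriple(subject, predicate, obj):
    """One statement per line, terminated by ' .'."""
    return f"{subject} {predicate} {obj} ."


line = ntriple(
    iri("https://uu.nl/datasets/mydata/country/Singapore"),
    iri("http://schema.org/observationDate"),
    literal("2015", XSD + "gYear"),
)
print(line)
```

Note that the datatype is attached to the quoted value with `^^`, and that only URIs go inside angle brackets.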
9. 1. What is Linked Data: building blocks
SUBJECT PREDICATE OBJECT
Basically: a statement or a fact
The elements of a triple are URI references, literals (or blank nodes):
• URI references: a standardized way to identify things (often online): ISBN, URL, DOI, email address, place, landmark, etc.
That does not work for things like numbers, which mean different things in different contexts…
• Literals: data values such as strings, dates, integers, decimals, etc.
The type of a literal is specified inside the triple, remember "2015"^^xsd:gYear?
10. 1. What is Linked Data?
Linked Data graph
=
Combination of triples
=
That point to information inside (dataset/collection) and outside it,
ideally defined using common standards
=
Internet as a global database where everything is
connected
11. 1. What is Linked Data: how to access it
• Many linked data services run in the background of websites
• Linked data browsers provide faceted browsing over graph patterns
• Specify the graph patterns you want to retrieve with SPARQL queries:
PREFIX dbo: <http://dbpedia.org/ontology/>
# Each line in the WHERE block is a triple pattern: subject predicate object
SELECT ?capital ?RESULT WHERE {
  ?RESULT dbo:capital ?capital .
  ?capital dbo:areaCode "030" .
}
ORDER BY DESC(?capital)
LIMIT 100
13. Can make it as complex as you like
Store queries online (e.g. Github) for sharing and replication!
Use API calls to present results to end user
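An API call on a SPARQL query can be sketched with the Python standard library alone. The sketch assumes DBpedia's public endpoint at https://dbpedia.org/sparql and the standard SPARQL JSON results format; the function names and the `SAMPLE` document (with example.org placeholder values) are invented for illustration and are not real query output.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # assumed: DBpedia's public endpoint

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?capital ?RESULT WHERE {
  ?RESULT dbo:capital ?capital .
  ?capital dbo:areaCode "030" .
}
ORDER BY DESC(?capital)
LIMIT 100
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks the endpoint for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def rows(results):
    """Flatten a SPARQL JSON results document into plain dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run(query):
    """Send the query over the network and return flattened rows."""
    with urllib.request.urlopen(build_request(query), timeout=30) as response:
        return rows(json.load(response))


# Offline placeholder in the SPARQL JSON results shape (not real output).
SAMPLE = {
    "head": {"vars": ["capital", "RESULT"]},
    "results": {"bindings": [{
        "capital": {"type": "uri", "value": "http://example.org/capital"},
        "RESULT": {"type": "uri", "value": "http://example.org/place"},
    }]},
}
print(rows(SAMPLE))
```

Calling `run(QUERY)` would contact the live endpoint; the flattened rows can then be rendered on a web page for the end user.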
14. 2. Why should I use it?
Researchers:
Linked Data = FAIR data!
5-star data:
★ make your stuff available on the Web (whatever format) under an open license
★★ make it available as structured data (e.g., Excel instead of image scan of a table)
★★★ make it available in a non-proprietary open format (e.g., CSV instead of Excel)
★★★★ use URIs to denote things, so that people can point at your stuff
★★★★★ link your data to other data to provide context, and benefit from the network effect
Source: https://5stardata.info/en/
15. 2. Why should I use it?
Researchers:
Linked Data = FAIR data!
But also enhances your workflow:
Annotate, code and harmonize data in one go, using community standards
Share dataset, script (SPARQL), and results live on the Web
Answer new questions & find novel patterns by interlinking datasets
Run analyses across multiple datasets at the same time
No need for codebooks or complex relational queries: it’s all in the data!
Graph data model suited to heterogeneous or sparse data
Replicable research
Easy collaboration
16. 2. Why should I use it?
Libraries:
Superior Search & Find:
• Execute very detailed searches by combining metadata
• Search across different formats: datasets, books, illustrations, maps, archives, music, etc.
• Connect different materials: maps to books, journal papers to related datasets and
code, etc.
• Easy to embed (part of) catalogue on website
• Prioritized by Google (schema.org vocabulary)
Link items/catalogues to all types of external data, yet keep them separate
Contextualize search results, enhance metadata, or recommend stuff
Concentrate on your own expertise
Use graph patterns to ascertain quality of the metadata
Generic tools instead of domain-specific software for cataloguing
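Checking metadata quality with graph patterns can be as simple as asking which records lack an expected property. A minimal sketch over an in-memory set of triples; the catalogue records are hypothetical, and in SPARQL the same check would typically use `FILTER NOT EXISTS`.

```python
# A hypothetical mini-catalogue as (subject, predicate, object) tuples.
catalogue = {
    ("ex:book1", "rdf:type", "schema:Book"),
    ("ex:book1", "schema:name", "A title"),
    ("ex:book1", "schema:author", "ex:person1"),
    ("ex:book2", "rdf:type", "schema:Book"),
    ("ex:book2", "schema:name", "Another title"),
}


def missing(graph, rdf_type, required):
    """Subjects of the given type that have no `required` predicate."""
    typed = {s for s, p, o in graph if p == "rdf:type" and o == rdf_type}
    has_it = {s for s, p, o in graph if p == required}
    return sorted(typed - has_it)


print(missing(catalogue, "schema:Book", "schema:author"))
```

The same function works for any type and property, which is the point of generic tools over domain-specific software.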
18. Research: what if we combined datasets on historical
stature as Linked Data?
• Initiated by Prof. Joerg Baten (University of Tuebingen)
• Shows added value of linking various small to large N datasets
centering around the same topic
• Possibilities with Linked Data:
Link to Clio-Infra LOD dataset to get GDP: correlate average
height and GDP before 1950; analyzing all 32 datasets, or
380,000 observations at once!
Link to C-shapes LOD for maps: average stature around the
world visualized.
• Available at:
https://druid.datalegend.net/dataLegend/microHeights
21. Research: use LOD to study excess mortality during the
Spanish Flu epidemic?
• CSV on deaths 1910-20 converted to
Linked Open Data (using COW)
• Harmonized using other LOD datasets
• GIS added using yet another LOD dataset
• Research output published live on the Web
• Downloadable results
[Graph: from a recorded occupation to a status score]
Deceased: Carpenter --jobhoard:occupation--> Carpenter
Carpenter --jobhoard:HISCO--> HISCO:95490
HISCO:95490 --jobhoard:HISCAM--> 52.50 (= indicator for socio-economic status)
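The harmonization chain in the figure (occupation, then HISCO code, then HISCAM score) can be read as a path through the graph. A hedged sketch: the values come from the slide, but the node identifier `ex:deceased1`, the helper `follow`, and the exact direction of each edge are assumptions made for illustration.

```python
# Edges as read from the slide; "ex:deceased1" is a hypothetical record id.
graph = [
    ("ex:deceased1", "jobhoard:occupation", "Carpenter"),
    ("Carpenter", "jobhoard:HISCO", "HISCO:95490"),
    ("HISCO:95490", "jobhoard:HISCAM", 52.50),
]


def follow(triples, start, path):
    """Walk a sequence of predicates from `start`; None if a step is missing."""
    node = start
    for predicate in path:
        node = next(
            (o for s, p, o in triples if s == node and p == predicate), None
        )
        if node is None:
            return None
    return node


status = follow(
    graph,
    "ex:deceased1",
    ["jobhoard:occupation", "jobhoard:HISCO", "jobhoard:HISCAM"],
)
print(status)
```

Because the occupation coding lives in other LOD datasets, the death records themselves never need to store a status score.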
24. Research: Infant mortality in 19th c. Amsterdam
• Project on infant mortality
(Radboud University, Prof.
Angelique Janssens)
• Street-level information on
births and deaths
• Neighborhoods retrieved from
Amsterdam Time Machine
Simply connect!
30. Run a SPARQL query in the background and use machine-readable semantic relations for automated suggestions:
If you are interested in this:
You might also like: {insert titles or authors from our earlier query here}
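One way such suggestions could work is to rank other works by the number of subjects they share with the current one. This is only a sketch under that assumption, not the method of any particular catalogue; the works, topics, and the function `also_like` are hypothetical.

```python
from collections import Counter

# Hypothetical catalogue: which work is about which topic.
works = {
    ("ex:work1", "schema:about", "ex:topicA"),
    ("ex:work2", "schema:about", "ex:topicA"),
    ("ex:work2", "schema:about", "ex:topicB"),
    ("ex:work3", "schema:about", "ex:topicB"),
    ("ex:work4", "schema:about", "ex:topicC"),
}


def also_like(graph, work, limit=5):
    """Other works sharing a schema:about topic, most overlap first."""
    topics = {o for s, p, o in graph if s == work and p == "schema:about"}
    shared = Counter(
        s for s, p, o in graph
        if p == "schema:about" and o in topics and s != work
    )
    # Sort by overlap (descending), then by identifier for a stable order.
    ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
    return [w for w, _ in ranked][:limit]


print(also_like(works, "ex:work2"))
```

In production the same pattern would be a stored SPARQL query whose results fill the "You might also like" box.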
32. Collections: give me the different types of work associated with Karl Marx,
using IISH knowledge graph:
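PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/>
SELECT ?type (COUNT(?work) AS ?n) WHERE
{
?topic a schema:Person .
?topic schema:name ?name .
?work schema:about ?topic .
?work rdf:type ?type .
FILTER(REGEX(?name, "Marx, Karl"))
}
GROUP BY ?type
ORDER BY DESC(?n)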
Connect anything you like
type n
http://purl.org/dc/dcmitype/Text 2114
http://purl.org/dc/dcmitype/StillImage 747
http://schema.org/Photograph 202
https://iisg.amsterdam/vocab/Poster 189
https://iisg.amsterdam/vocab/Print 142
https://iisg.amsterdam/vocab/Drawing 117
http://purl.org/dc/dcmitype/PhysicalObject 74
http://schema.org/CreativeWork 67
https://iisg.amsterdam/vocab/Postcard 61
http://purl.org/dc/dcmitype/Collection 7
http://purl.org/dc/dcmitype/Sound 7
http://schema.org/Collection 7
https://iisg.amsterdam/vocab/ImageCollection 1
http://schema.org/CreativeWorkSeries 1
http://schema.org/Game 1
33. Use URIs to connect information from different collections
Place and/or time:
Give me all information on Amsterdam in the year 1790:
Source: http://years.amsterdamtimemachine.nl/?year=1790
Special collections:
Give me all digitized versions of Blaeu’s Atlas Maior across libraries:
<http://dbpedia.org/resource/Atlas_Maior> dbo:wikiPageExternalLink ?url_to_work .
<https://utrechtuniversity.on.worldcat.org/oclc/901235386>
<https://www.erfgoedleiden.nl/schatkamer/bladeren-door-blaeu>
<http://digital2.library.ucla.edu/viewItem.do%3Fark=21198/zz0017r9p5>
<http://digital.ub.uni-duesseldorf.de/urn/urn:nbn:de:hbz:061:1-37297>
<http://maps.nls.uk/atlas/blaeu/>
34. And use URIs to put your collection in context
I have a picture of a railway station. How do I find out who’s the architect if that’s not in
the metadata…?
• Link station URI with another catalogue, and improve metadata!
Source:
https://api.data.netwerkdigitaalerfgoed.nl/s/QY9kX9nB
https://github.com/RubenSchalk/grlc-test/blob/master/hua_beeldbank_architects.rq
35. Ask any question you like
• Royal Dutch Library (KB): ‘which animals featured most in novels by Dutch women writers since the 1980s?’
Source: https://data.netwerkdigitaalerfgoed.nl/enno/-/queries/Dieren-en-vrouwen/4
animal                  count
cats                    314
birds                   240
dogs ; separate breeds  188
dogs                    180
aviary birds            80
cats ; separate breeds  63
• DBpedia: ‘soccer players who were born in a country with more than 10 million inhabitants, who played as goalkeeper for a club that has a stadium with more than 30,000 seats, and whose club country is different from their birth country’
soccerplayer            countryOfBirth  team                      countryOfTeam                 stadiumcapacity
Losseny_Doumbia         Niger           Daring_Club_Motema_Pembe  Democratic_Republic_of_Congo  80000
Arakaza_MacArthur       Burundi         Lusaka_Dynamos_F.C.       Zambia                        60000
Daniel_Ferreyra         Argentina       FBC_Melgar                Peru                          60000
Mohammed_M._Tagoe       Ghana           Lusaka_Dynamos_F.C.       Zambia                        60000
Sunday_Rotimi           Nigeria         Mekelle_70_Enderta_F.C.   Ethiopia                      60000
Anthony_Scribe          France          FC_Dinamo_Tbilisi         Georgia_(country)             54549
Zaur_Khapov             Russia          FC_Dinamo_Tbilisi         Georgia_(country)             54549
Jose_Carlos_Fernández   Bolivia         Deportivo_Cali            Colombia                      44000
Leonardo_Díaz           Argentina       Deportivo_Cali            Colombia                      44000
36. To conclude: some useful links
Utrecht University Library: https://www.uu.nl/en/university-library
UU Library Digital Humanities Support: https://www.uu.nl/en/university-library/advice-support-to/researchers/digital-humanities-support
Royal Dutch Library: https://www.kb.nl/bronnen-zoekwijzers/dataservices-en-apis/linked-data-van-de-kb
Netwerk Digitaal Erfgoed: https://netwerkdigitaalerfgoed.nl/activiteiten/linked-data/
Make your own Linked Data:
• https://ldwizard.netwerkdigitaalerfgoed.nl/
• https://github.com/CLARIAH/COW
• https://marcedit.reeset.net/
SPARQL 101: http://www.learningsparql.com/
Open source SPARQL interface: http://yasgui.triply.cc/
Generate API calls on SPARQL queries: http://grlc.io/