This document summarizes research into using IIIF and Sitemaps technologies for metadata aggregation at Europeana. It describes case studies conducted with the National Library of Wales and University College Dublin on crawling their IIIF services using IIIF collections and Sitemaps. The studies found these technologies provided simple, effective solutions for metadata aggregation with few technological obstacles. Future work includes additional case studies and monitoring new trends to improve aggregation workflows at Europeana.
2. Outline
CC BY-SA
● Motivation for rethinking metadata aggregation approaches
• Focus: technology adoption in Cultural Heritage/Europeana
● Investigated technologies: IIIF and Sitemaps
● Case studies
● Application of the results in aggregation at Europeana
● Ongoing and future work
Metadata aggregation of IIIF Resources at Europeana
3. Motivation in the context of Cultural Heritage
Image: Coloured etchings, Vojtech Preissig, 1887, Uměleckoprůmyslové museum v Praze, Czech Republic, public domain
4. Europeana
The Platform for Europe’s Digital Cultural Heritage
● Europeana aggregates (and makes available) metadata:
• From all EU countries
• From ~3,500 galleries, libraries, archives and museums
• Under a CC0 licence
• More than 54M objects
• In about 50 languages
“We transform the world with culture! We want to build on Europe’s rich heritage and make it easier for people to use, whether for work, for learning or just for fun.”
5. What kinds of technologies are we considering?
● Focus on technology adoption:
• Technologies that present low barriers for adoption by data providers
● Technologies used by Cultural Heritage institutions for other purposes
• Search engine optimization
• Linked data
• Social web technologies
• IIIF
● What are the successors of OAI-PMH?
6. Investigated technologies: IIIF
Image: Cristallisation ou Mouvement du temps, René Bord, 1987, Bibliothèque Municipale De Lyon, public domain
7. Brief introduction to the IIIF APIs
Europeana & IIIF
How can IIIF be used for metadata aggregation?
8. Object = Image + Presentation
Slide credit (IIIF.io, @iiif_io): Ben Albritton (@bla222), Mike Appleby (@mikeapps), Tom Cramer (@tcramer), Jon Stroop (@jpstroop), Rob Sanderson (@azaroth42), Stu Snydman (@stusnydman), Simeon Warner (@zimeon)
9. Object = Image + Presentation
Presentation API
• Descriptive: label, description
• Rights: license, attribution
(continued on the next slide)
Image API
• Image Data
10. Presentation API (continued)
● Structure
• Collections of objects
• Manifests organizing Items, Sequences, Parts together with their metadata
● Linking
• service: additional service endpoint
• related: resource to display to the user
• seeAlso: semantic metadata resource
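As an illustration of how the linking properties above could serve an aggregator, the following Python sketch reads the descriptive and rights properties of a manifest and collects its seeAlso links. It assumes IIIF Presentation API 2.x property names (`@id`, `label`, `license`, `seeAlso`); the function name `see_also_links` and the sample manifest are hypothetical and not taken from any provider.

```python
import json

def see_also_links(manifest):
    """Return (url, format) pairs found in a manifest's seeAlso property."""
    value = manifest.get("seeAlso", [])
    # seeAlso may hold a single value or a list of values.
    if not isinstance(value, list):
        value = [value]
    links = []
    for entry in value:
        if isinstance(entry, str):
            # A bare URI carries no format information.
            links.append((entry, None))
        elif isinstance(entry, dict) and entry.get("@id"):
            links.append((entry["@id"], entry.get("format")))
    return links

# Made-up manifest fragment, for illustration only.
sample_manifest = json.loads("""
{
  "@id": "https://example.org/iiif/book1/manifest",
  "@type": "sc:Manifest",
  "label": "Book 1",
  "license": "http://creativecommons.org/publicdomain/mark/1.0/",
  "seeAlso": {
    "@id": "https://example.org/records/book1.xml",
    "format": "application/xml"
  }
}
""")

links = see_also_links(sample_manifest)
print(sample_manifest["label"], sample_manifest.get("license"), links)
```

A harvester built along these lines would fetch each seeAlso URL to obtain the full metadata record, and fall back to the manifest's own label and license when no link is present.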
11. Investigated technologies: Sitemaps
Image: Cristallisation ou Mouvement du temps, René Bord, 1987, Bibliothèque Municipale De Lyon, public domain
12. Sitemaps
● Sitemaps allow webmasters to inform search engines about pages on their sites that are available for crawling
● Sitemaps are supported/used by:
• all major search engines
• many content management systems
• many Europeana data providers
● Sitemaps provide a simple technological solution with a very low implementation barrier
● Sitemaps can support a large range of resource types
• The Sitemaps protocol has extensions for images and videos (defined by Google)
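To make the low implementation barrier concrete, here is a minimal Python sketch that reads a sitemap with the standard library and returns each page URL, its optional modification date, and any image URLs declared through the image extension. The sample document and the function name `parse_sitemap` are illustrative assumptions; the two namespace URIs are those of the Sitemaps protocol and of Google's image extension.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

def parse_sitemap(xml_text):
    """Return one dict per <url> entry: loc, lastmod (or None), image URLs."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "images": [el.text for el in url.findall("image:image/image:loc", NS)],
        })
    return entries

# Made-up sitemap, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.org/object/1</loc>
    <lastmod>2018-05-01</lastmod>
    <image:image>
      <image:loc>https://example.org/images/1.jpg</image:loc>
    </image:image>
  </url>
  <url>
    <loc>https://example.org/object/2</loc>
  </url>
</urlset>"""

entries = parse_sitemap(SAMPLE)
for entry in entries:
    print(entry)
```

The optional `lastmod` value is what would let a crawler skip pages that have not changed since its previous visit.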
14. First case study: Crawling services across the IIIF universe
Questions addressed:
• Can Europeana find the available IIIF services through IIIF Service Registries?
• Is the output of IIIF crawlable? Can robots follow links in IIIF output and reach all resources?
• How mature and uniform are existing IIIF implementations?
• Is metadata available?
• Are machine-readable licenses available?
15. First case study: Crawling services across the IIIF universe
Main conclusions:
• Registries are available and machine-readable, but their coverage was only partial
• IIIF provides all that is necessary, but some features are optional (e.g. IIIF Collections)
• Only minor compliance problems were found, due to the immaturity of the implementations
• IIIF provides a way to link to metadata, but it is optional (and often not used, misused, or not fully informative)
• IIIF provides licensing information, but it is optional (and often not used)
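The crawlability question can be pictured with a short Python sketch that walks a IIIF Collection and gathers the identifiers of every manifest it can reach. It assumes Presentation API 2.x collections (`manifests`, `collections`, `members`, with `sc:Manifest` and `sc:Collection` types) and, as a simplification, re-fetches every sub-collection by its `@id`. The function names and the in-memory `fake_web` stand-in for a provider are hypothetical.

```python
import json
from urllib.request import urlopen

def fetch_json(url):
    """Download and parse a JSON document (used for real crawls)."""
    with urlopen(url) as response:
        return json.load(response)

def collect_manifest_ids(collection_url, fetch, seen=None):
    """Recursively list the manifest URLs reachable from a collection."""
    seen = set() if seen is None else seen
    if collection_url in seen:
        return []  # guard against collections that reference each other
    seen.add(collection_url)
    doc = fetch(collection_url)
    children = (doc.get("manifests", [])
                + doc.get("collections", [])
                + doc.get("members", []))
    found = []
    for child in children:
        if child.get("@type") == "sc:Manifest":
            found.append(child["@id"])
        elif child.get("@type") == "sc:Collection":
            found.extend(collect_manifest_ids(child["@id"], fetch, seen))
    return found

# Made-up provider content so the sketch runs without network access.
fake_web = {
    "https://example.org/iiif/top": {
        "@type": "sc:Collection",
        "collections": [
            {"@id": "https://example.org/iiif/maps", "@type": "sc:Collection"}
        ],
        "manifests": [
            {"@id": "https://example.org/iiif/book1/manifest", "@type": "sc:Manifest"}
        ],
    },
    "https://example.org/iiif/maps": {
        "@type": "sc:Collection",
        "manifests": [
            {"@id": "https://example.org/iiif/map1/manifest", "@type": "sc:Manifest"}
        ],
    },
}

manifests = collect_manifest_ids("https://example.org/iiif/top", fake_web.__getitem__)
print(manifests)
```

Against a live service, `fetch_json` would be passed in place of `fake_web.__getitem__`. Because IIIF Collections are optional, such a crawl only works for providers that publish one.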
16. Case studies with Europeana Partners
Image: Tak met vier magnolia’s, Anonymous, 1910-1925, Rijksmuseum, Netherlands, public domain
17. Case studies with partners
To study the feasibility of performing metadata aggregation via IIIF/Sitemaps, we have undertaken case studies with providers of the Europeana Network
• National Library of Wales
• Very active in the IIIF community
• Very advanced in IIIF implementation
• Expertise in full-text content (over IIIF)
• University College Dublin
• Very advanced in IIIF implementation
• Expertise in internet search engine optimization (Sitemaps and its media-specific extensions)
18. Case studies with National Library of Wales and University College Dublin
• Crawling IIIF services via IIIF Collections
• Crawling IIIF services via Sitemaps
• Standard Sitemaps
• Sitemaps extended with elements used in IIIF specifications
• Sitemaps extended with elements from the ResourceSync namespace
• Crawling IIIF services via IIIF Collections and HTTP cache headers
• HTTP cache headers allow crawlers to use resource modification timestamps
• Timestamps are essential for aggregating large collections
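The role of HTTP cache headers can be sketched in Python as a conditional request: the crawler sends the time of its previous harvest in an `If-Modified-Since` header and skips the resource when the server answers 304 Not Modified. This is a minimal sketch under the assumption that the provider's server honours that header; the function names and the example URL are hypothetical.

```python
from email.utils import formatdate
from urllib.error import HTTPError
from urllib.request import Request, urlopen

def conditional_request(url, last_harvest_epoch):
    """Build a GET request that asks only for content changed since the last harvest."""
    stamp = formatdate(last_harvest_epoch, usegmt=True)  # HTTP date format
    return Request(url, headers={"If-Modified-Since": stamp})

def fetch_if_modified(url, last_harvest_epoch):
    """Return (body, Last-Modified value), or (None, None) if unchanged."""
    try:
        with urlopen(conditional_request(url, last_harvest_epoch)) as response:
            return response.read(), response.headers.get("Last-Modified")
    except HTTPError as err:
        if err.code == 304:  # Not Modified: nothing to re-harvest
            return None, None
        raise

# Building the request needs no network access.
request = conditional_request("https://example.org/iiif/book1/manifest", 0)
print(request.get_header("If-modified-since"))
```

For a large collection, most manifests would be answered with 304 on a repeat crawl, so only the changed ones need to be downloaded and processed again.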
19. Main conclusions from the case studies
• Applying these technologies was straightforward for providers
• When providers have in-house knowledge of a technology, its adoption/adaptation is simplified
• None of the case studies presented serious technological obstacles
• Very simple technological solutions are available
• Only very large collections may require additional complexity
• The main challenge is choosing among the several possibilities and establishing a standard (or best practice) within the community or communities:
• Europeana is working with the IIIF community in the context of the IIIF Discovery Technical Specification group
• Europeana will prepare recommendations targeted at its own partner network.
20. Application of the results
Image: Cristallisation ou Mouvement du temps, René Bord, 1987, Bibliothèque Municipale De Lyon, public domain
21. Operational IIIF/Sitemaps harvests so far @Europeana
The outcomes of the case studies have resulted in real cases of IIIF/Sitemaps-based aggregation into Europeana:
• National Library of Wales
• Sitemaps + IIIF
• University College Dublin
• Sitemaps + IIIF + Sitemaps Video Extension
• Wellcome Library
• IIIF Collection + IIIF
22. Future work
Image: Chat "regardant" à travers une longue-vue et autre chat perché dessus, Agence Rol. Agence photographique, Bibliothèque nationale de France, France, public domain
23. R&D ongoing work
Crawling websites/LOD/IIIF in search of resources represented with Schema.org
• Research Question:
• Can metadata represented with Schema.org still comply with the requirements of Europeana/EDM? If so, with what level of quality?
• One IIIF case study is in progress at this time
• IIIF provider: North Carolina State University Libraries
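One way a crawler might look for Schema.org descriptions is to read the JSON-LD blocks embedded in a web page. The Python sketch below does this with the standard library only. It assumes the provider embeds Schema.org data in `<script type="application/ld+json">` elements, which is one common option among several; the class name, the function name and the sample page are hypothetical.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the parsed content of every JSON-LD script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.records.append(json.loads("".join(self._buffer)))

def extract_jsonld(html_text):
    """Return the list of JSON-LD records embedded in an HTML page."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    return parser.records

# Made-up page, for illustration only.
PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "VisualArtwork",
 "name": "Example etching",
 "license": "http://creativecommons.org/publicdomain/mark/1.0/"}
</script>
</head><body><p>Example etching</p></body></html>"""

records = extract_jsonld(PAGE)
print(records[0]["@type"], records[0]["name"])
```

The extracted records would then have to be mapped to EDM and checked against Europeana's requirements, which is the open research question stated above.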
24. Future work
• Research the implications of IIIF and Sitemaps harvesting for the internal workflows of aggregators
• ResourceSync: one case study in preparation with a collection of more than 600,000 resources
• Continue monitoring and investigating technology trends in our domain:
• Follow the outcomes from the IIIF Discovery Technical Specification Group [1]
• The Linked Data Platform [2]
• Use of notification frameworks for metadata aggregation: WebSub [3], Linked Data Notifications [4]
25. Thank you for your attention
nuno.freire@tecnico.ulisboa.pt
Image: Arrival of a Portuguese ship, Anonymous, 1660 - 1625, Rijksmuseum, Netherlands, public domain
Acknowledgments
Valentine Charles, Europeana Foundation
Fundação para a Ciência e a Tecnologia (FCT): UID/CEC/50021/2013
European Commission: grant agreement number CEF-TC-2015-1-01.