PACKED advocates for open data in the cultural heritage sector. The presentation covers infrastructure for publishing open data, including persistent URIs and platforms such as Wikidata. PACKED provides training on open data topics and helps cultural institutions publish their collections. The goal is to make more data available and reusable while addressing challenges such as inconsistent data formats across institutions. The Flemish Art Collection presents its work to aggregate collection data from different museums into a central Datahub and publish it through its Arthub portal. It aims to improve data quality and automate sharing in order to open up more collections.
Shifting minds open belgium 2019
1. Shifting minds in the cultural sector
Towards open data in practice
Open Belgium 04.03.2019
Sam Donvil Alina Saenko
sam@packed.be alina@packed.be
@PACKEDvzw
2. • Non-profit
• 2006 - 2010: Platform for Archiving and Conservation of Art on Electronic and Digital Media
• 2011 - 2018: Centre of Expertise Digital Heritage
• from 2019: A department in VIAA - Flemish Institute for Audiovisual Archiving
• Flemish, Belgian and European projects
• Support for ICT processes at heritage and arts organisations (e.g. creation, storage, cataloguing, online access, exchange and reuse) and the related policy
• Central concern: sustainability (digital heritage is vulnerable)
www.packed.be | www.viaa.be | www.projectcest.be | www.scart.be | www.projecttracks.be | www.scoremodel.org ...
3. Programme ‘Shifting minds’ session
- PACKED:
- Open GLAM community: International and in Belgium
- PACKED: open data projects: persistent identification and publishing of data
- PACKED: advocacy and training
- Conclusions and wishlist
- Use cases from the cultural heritage sector:
- Wikimedia publication of King Baudouin Foundation collections - Olivier Van D’huynslager
- Flemish Art Collection: Arthub/Datahub - Matthias Vandermaesen
6. Infrastructure for open data in the cultural sector
Data:
- Messy and not complete, but a lot of potential knowledge
- Captured on different carriers (systems, digital and analogue files)
- Often closed (and obsolete) software
- ‘4-star’ data is becoming reachable for the cultural heritage sector
- ‘5-star’ Linked Open Data: are persistent URIs the solution?
Publishing:
- Where? Own platforms vs existing Open Data Repositories
7. Infrastructure: Persistent URIs
- Nobody knows what PIDs (persistent identifiers) are and why you should use them
- Not that obvious in the cultural heritage sector
- Not a standard part of the collection management systems in use
- Not something you can just ask your IT department to configure (because there is no IT department)
- PID-projects (2013-2016) - Resolver-tool v1
- Open Summer of code 2018 - redevelopment
8. What is CultURIze?
CultURIze is a tool for museum administrators to share data about their collection using persistent URIs.
Who needs CultURIze?
Registrars, curators and managers of small or medium cultural heritage
collections.
https://github.com/PACKED-vzw/CultURIze
https://github.com/PACKED-vzw/CultURIze/wiki
- prototype developed during Open Summer of Code 2018 in Belgium
- inspired by w3id on GitHub
9. How does it work?
CultURIze is a four-step process to create a persistent URI for a collection item on the web:
- Record persistent URIs and the corresponding web resources in a spreadsheet
- Turn the spreadsheet into a server configuration file
- Upload the file to a code sharing platform
- Periodically update your web server to activate the persistent URIs.
CSV file -> CultURIze app -> GitHub repo -> Web server
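The second step (spreadsheet to server configuration file) can be illustrated with a minimal Python sketch. This is not the CultURIze implementation: the column names `pid` and `url`, the choice of Apache rewrite rules, and the redirect flags are all assumptions made for illustration.

```python
import csv
import io


def rows_to_rewrite_rules(csv_text):
    """Turn spreadsheet rows (persistent path, target URL) into Apache rewrite rules.

    Assumes hypothetical columns 'pid' and 'url', and that 'pid' contains only
    letters, digits, hyphens and slashes (so no regex escaping is needed).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = ["RewriteEngine On"]
    for row in reader:
        pid = (row.get("pid") or "").strip()
        url = (row.get("url") or "").strip()
        if not pid or not url:
            continue  # skip incomplete rows instead of writing a broken rule
        # Temporary redirect from the persistent path to the current web resource
        lines.append(f"RewriteRule ^{pid}$ {url} [R=302,NE,L]")
    return "\n".join(lines)


# Illustrative data only; example.org is a placeholder domain.
sample = "pid,url\nwork/0001,https://example.org/objects/1\nwork/0002,\n"
htaccess = rows_to_rewrite_rules(sample)
print(htaccess)
```

The generated text would then be committed to the shared repository and pulled by the web server, as in steps three and four.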
10. 2019: Governance and contribution project
- Research on a sustainable governance model for the CultURIze project.
- How to generate financial means and create a community of developers around CultURIze?
- Partners:
- Open Knowledge Belgium
- Flemish Art Collection
- Flemish Architecture Institute
> input is very welcome!
11. Infrastructure: Platforms for publishing open data
Use/Build your own platforms:
- Own website (download file)
- Own APIs, OAI-PMH and SPARQL endpoints
- Local/National/International cultural heritage aggregators and datahubs
16. Infrastructure: Platforms for publishing open data
Use already existing open data repositories:
- Government Open data platforms
- Github (download file)
- Wikidata/Wikimedia Commons
20. Infrastructure: Wikimedia platforms
Cultural heritage sector and living heritage on Wikidata and Wikimedia Commons:
- 2016: Groeningemuseum, KMSKA, MSKGent, SMAK, MuZEE, Museum M
Leuven
- 2017 - …: Centrum voor Agrarische Geschiedenis, Musea en Erfgoed
Antwerpen, Museum Plantin-Moretus, Rubenshuis, Gruuthuse Museum,
University of Antwerp library / Prentenkabinet, University of Ghent Library,
Royal Library Brussels, Letterenhuis, Horta museum, Fondation CIVA, King
Baudouin Foundation, Vlaams Architectuurinstituut VAi, MAS museum,
Iedereen leest, Kunstenpunt, De Witte Raaf, ...
- https://www.wikidata.org/wiki/Wikidata:Flemish_art_collections,_Wikidata_and_Linked_Open_Data
- https://nl.wikipedia.org/wiki/Wikipedia:Wikiproject/Procesbeschrijvingen_Belgisch-Nederlandse_podiumkunsten
22. Advocacy and Training: Open Data Bootcamp
Presentations:
● Rights clearance and rights statements
● Data cleaning and persistent identification
Hands-on workshops:
● Data cleaning (OpenRefine)
● Enrichment (OpenRefine)
● Persistent identification (Resolver tool -> CultURIze)
23. Advocacy and Training: Wikidata Birthday
Hands-on workshops:
● Manual Wikidata editing
● SPARQL-querying
Presentations:
● Upload projects
● LOD research methods
● New Wikidata features: federated
querying, …
● Data visualisations
24. Advocacy and Training: Public Domain Day
Annual Royal Library Public Domain Day edit-a-thon after image/data donation by institutions
Annual mini-conference
25. Advocacy and Training: Public Domain Day reuse activities by Constant vzw
- Remix Wonder Woman & Victor Horta by Plus-tôt Te laat
- Cinema Nova: performance music by Reynaldo Hahn & screening Ernst Lubitsch
- The Death of the Authors, 1946: Xavan & Jaluka by Peter Westenberg / Constant vzw
26. Advocacy and training: Wiki Loves Art / Heritage
Crowdsourcing
Donations of images and data by institutions
27. Challenges for LOD in cultural sector
digital movement <> no digital mindset
reducing costs <> out-of-control IT-budgets
new ‘digital’ audiences <> losing ‘traditional’ audiences
showing off with fancy tools <> locked up in obsolete technology
engaging with the ‘crowd’ <> abandoned web portals
28. Wishlist
● More IT profiles in the cultural heritage sector and less dependence on the providers
● More freedom to play around and test things out
● Better open infrastructures and tools that make better internal workflows and LOD
possible:
○ Back office tools: Collection Management Systems, DAMs, Datahubs…
○ Publishing online tools: Resolvers, APIs...
○ Reusing: Wikimedia art viewers, apps...
○ etc...
● Vocal and demanding audience:
○ curating/enriching data
○ (examples of) reuse:
■ e.g. unisex Hokusai kimono by Noir Noir
29. In 2019 PACKED wants to continue to push for open data in the heritage sector:
● Projects:
○ Projects which pool resources for infrastructure on a sector-wide scale:
■ Option 1: sharing ‘one big machine’
■ Option 2: ‘networked’ infrastructure - several machines talk to each other
○ Structured Data on Commons pilots
○ Public domain publication
○ Multilingual information museums of Bruges
○ ...
● More focus on training, institutions publish LOD (semi-)autonomously
Do you see yourself contributing to one of the projects or to the wishlist?
Ideas or feedback? Please contact us!
PACKED vzw 2019 -...
32. Infrastructure at the King Baudouin Foundation
- Messy and incomplete data
- Works in collection are dispersed over 80+ institutions (depot).
- Registered differently according to its location
- Huge gap! (7,000 records of approx. 26,000 pieces)
- Shared in different ways
- Analog x digital (closed x open(?))
- Webportals
- DAMS
- Arthub
- Overall → DISPERSED
33. Challenges (case: Collection Van Herck)
- Collect data from various
sources:
- Sculptures and drawings
- 4 institutions (3 Dutch / 1 French)
- Normalize data so the metadata
becomes linkable:
- inventory numbers from various
locations:
- CVH 11A(1)
- Inv 30A
- IB00.106
34. Challenges (case: Collection Van Herck)
- Collect data from sources
- Normalize data so the metadata
becomes linkable:
- inventory numbers from various
locations:
- Different titles:
- Aaron
- Aäron
- Aaron
- Different thesauri
- …
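One way to make title variants such as "Aaron" and "Aäron" linkable is to compare them on a normalized key while keeping the original spelling in the record. The sketch below is an illustration under that assumption, not the workflow used in the project (which relied on OpenRefine and manual reconciliation); the function name `match_key` is hypothetical.

```python
import unicodedata


def match_key(title):
    """Build a comparison key: strip diacritics, ignore case and extra whitespace."""
    # NFKD splits 'ä' into 'a' plus a combining diaeresis
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks, keep the base characters
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() for case-insensitive matching, split/join to collapse whitespace
    return " ".join(stripped.casefold().split())


titles = ["Aaron", "Aäron", " aaron "]
keys = {match_key(t) for t in titles}
print(keys)
```

Records that share a key are only candidates for the same work; inventory numbers from different locations follow different schemes and still need a manually checked mapping.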
35. [Workflow diagram] Elements shown: Arthub (VKC); DAMS Antwerpen (Plantin-Moretus & Rubenshuis); Webportal KBS; CMS (Adlib Museum); KMSKA; URIs; grouped XML; spreadsheet; manual reconciliation (not open); normalization / cleaning / linking with OpenRefine; processing with QuickStatements and Pattypan; Wikidata; Wikimedia Commons; Wikipedia.
38. Results
- Response and commitment → awareness raising
- Want to commit but find it hard to let go <> volunteers
- Multilingual and connected to authorities and identifiers:
- AAT / RKD
- Access anywhere
- Ingest DAMS (translations titles)
39. - Open web portal for collections that don’t own one
40. - Wikidata Query Service as a tool to distribute on
other Wikimedia projects:
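A query of the kind that could be sent to the Wikidata Query Service is sketched below. It only builds the SPARQL text; the helper name and the placeholder item ID are hypothetical, while P195 (collection) and P217 (inventory number) are existing Wikidata properties.

```python
def collection_query(collection_qid, limit=100):
    """Build a SPARQL query listing items whose 'collection' (P195) is the given item."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError("expected a Wikidata item ID such as 'Q123'")
    return f"""SELECT ?item ?itemLabel ?inventoryNumber WHERE {{
  ?item wdt:P195 wd:{collection_qid} .
  OPTIONAL {{ ?item wdt:P217 ?inventoryNumber . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "nl,fr,en". }}
}}
LIMIT {int(limit)}"""


# 'Q0' is a placeholder, not a real collection; replace it with the actual item ID.
query = collection_query("Q0", limit=10)
print(query)
```

The resulting text can be pasted into the query service interface; the label service returns titles in Dutch, French or English, whichever is available first.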
48. Leveraging museum data is challenging
● Getting registration data out of collection management systems
○ No APIs (manual exchange)
○ Proprietary vendor formats
○ Legacy systems
○ Different institutions, different contexts.
● Quality of the data
○ Inconsistent registration (decades of organic growth)
○ No normalisation (no or limited authorities: VIAF, AAT, ICONCLASS,...)
● Which data to use?
○ Context dependent: online browseable collections.
○ Currently: Basic registration (24 base fields)
○ SPECTRUM 5.0: 21 procedures (Acquisition, Loan in/out, Condition,...)
49. Goals
● Automate sharing museum data between applications
○ Less time between registration and publishing online
● Connect complementary collections and museums
○ Fashion, Industry, Art, Folklore,...
○ Location and time period
● Audit museum data
○ Quality assessment v. digital (re)usability
○ Enrich data with external authorities (linked data)
● Open up collections
○ Publish museum data under Creative Commons licenses.
52. Arthub Flanders
● Online catalogue of the Flemish museums of Fine Arts and Contemporary Arts.
● Currently disseminates collections of:
○ Groeningemuseum (Bruges)
○ Museum of Fine Arts Ghent
○ Royal Museum of Fine Arts Antwerp
● Usable discovery interface
○ Search should yield relevant search results
○ Fast delivery of search results
○ Presentation should be usable for humans
https://arthub.vlaamsekunstcollectie.be
57. The Datahub
● Aggregation
○ Central “hub” for collection records from different sources
● Persistent storage of a copy of the collection records
○ But NOT a collection management system.
● Publication of collection records via web services
○ Open protocols: HTTP REST API & OAI-PMH
○ Open formats: LIDO XML
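Harvesting records over OAI-PMH comes down to an HTTP GET request with a `verb` parameter. The sketch below only builds such a request URL; the endpoint address and the metadata prefix `oai_lido` are assumptions, so check the actual Datahub installation for the values it exposes.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and prefix, for illustration only.
url = list_records_url("https://datahub.example.org/oai", "oai_lido")
print(url)
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the returned `resumptionToken` until the list is complete.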
60. ETL Pipelining
● Extract Transform Load
○ Fetch data from a source (database, flat file, API,...)
○ Transform the data (different format, different structure)
○ Load transformed data to a destination (database, flat file, API,...)
● A pipeline is actually an automated ETL process on a server
○ Reliable
○ Modifiable
○ …
● Mappings between CMSs, The Datahub and Arthub Flanders
○ Based on context specific business rules
○ Only: Basic registration fields
○ Concerns: Security, Confidentiality, Privacy, Copyright.
http://librecat.org
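The three ETL stages can be sketched in a few lines of plain Python. This is an illustration only and not the LibreCat tooling referenced on the slide; the field names and the sample data are hypothetical.

```python
import csv
import io
import json

# Hypothetical subset of basic registration fields that may be published
BASIC_FIELDS = ("object_number", "title", "creator")


def extract(csv_text):
    """Extract: read rows from a flat file export."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: keep only the whitelisted fields and drop rows without an object number."""
    records = []
    for row in rows:
        record = {field: (row.get(field) or "").strip() for field in BASIC_FIELDS}
        if record["object_number"]:
            records.append(record)
    return records


def load(records):
    """Load: serialise to JSON; a real pipeline would write to an API or database."""
    return json.dumps(records, ensure_ascii=False, indent=2)


export = (
    "object_number,title,creator,insurance_value\n"
    "INV-1, Aäron ,Unknown,1000\n"
    ",No number,Someone,5\n"
)
records = transform(extract(export))
print(load(records))
```

Whitelisting fields in the transform step is one way to address the confidentiality concern: columns such as the hypothetical `insurance_value` never leave the source system.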
61. Data Quality
● What does “Quality” mean?
○ Does the data yield relevant answers?
○ Can I import the data in my application?
○ Can I combine the data with other datasets?
○ Can I present the data with low effort?
○ ...
● Quality is context dependent
○ Who uses the data? Museum workers, researchers, policy makers,...
○ Where is the data used? Exhibition hall, at home, at an office,...
○ When is the data used? Before a visit, during a meeting,...
https://dashboard.vlaamsekunstcollectie.be
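One simple, context-independent indicator is field completeness: the share of records in which a field is filled in. The sketch below shows that single measure under hypothetical field names; it is not a description of how the dashboard computes its figures.

```python
def completeness(records, fields):
    """Return, per field, the fraction of records with a non-empty value."""
    total = len(records)
    report = {}
    for field in fields:
        filled = sum(1 for record in records if str(record.get(field) or "").strip())
        report[field] = filled / total if total else 0.0
    return report


# Illustrative records only
sample_records = [
    {"title": "Portrait", "creator": "Unknown"},
    {"title": "Landscape", "creator": ""},
    {"title": "Still life"},
    {"title": "Study", "creator": "Anonymous"},
]
report = completeness(sample_records, ["title", "creator"])
print(report)
```

Which fields matter, and what threshold counts as good enough, depends on who uses the data and where.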
65. Next up
● Expanding the number of art collections on Arthub Flanders
○ Mu.ZEE, M HKA, Middelheim, S.M.A.K.
● Integrate IIIF support and offer improved image quality
○ International Image Interoperability Framework
○ https://iiif.io
● Expand towards other interested museums
○ Complementary collections in similar platforms
○ Requires further valorisation of collections out there
● Find new use cases for open museum data
○ Improve operations in museums themselves
○ Sharing knowledge with other organisations
○ Creative industries
○ Tourism & Marketing
○ ...
66. Find us on Github
https://github.com/thedatahub
https://github.com/vlaamsekunstcollectie