Presentation at the ENICPA Round Table on 22 October 2019 in Prague on Wikidata and performing arts. Author: Beat Estermann, Bern University of Applied Sciences.
Beat Estermann, Researcher at Bern University of Applied Sciences
Estermann ENICPA Wiki Loves Performing Arts 20191022
1. Wikidata & Performing Arts
Prof. Beat Estermann, Bern University of Applied Sciences
ENICPA Round Table, Prague, 22 October 2019
Unless otherwise noted, the content of this presentation is made available under the CC BY 4.0 license.
Photo: Phantom black light theatre (theatre group HILT), User:Black light theatre Prague, Wikimedia Commons, CC BY-SA 4.0.
2. On the Programme Today
• Short introduction to Wikidata
  • What is its purpose?
  • How does it work?
• Wikidata + Performing Arts
  • Aim & vision
  • Where do we stand?
• From “Sum of All GLAMs” to “Wiki Loves Performing Arts”
  • What is Sum of All GLAMs?
• The role of Wikidata in the wider context of the Linked Open Data Ecosystem for the Performing Arts
6. Purpose of Wikidata
• Centralized Interwiki-Links [Example: Prague]
• Centralized Data Management for Infoboxes [Example: Bern Theatre]
• Centralized Data Management for Lists [Example: Lista de pinturas de A. Norfini]
• Possibility of Querying the Data in a Standardized Format [Example: Histropedia]
«The Sum of All Human Knowledge» as Linked Open Data
Multilingual
With Sourced Statements
Freely usable by anyone (CC Zero)
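The “querying in a standardized format” point can be illustrated with a minimal Python sketch that builds a SPARQL query for the public Wikidata Query Service and the corresponding request URL. It makes no network call. The function names are hypothetical, the identifier Q25379 is assumed to stand for “play” and should be checked on Wikidata before use, and transitive class queries of this kind may time out for very large classes.

```python
import re
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_class_count_query(class_qid: str) -> str:
    """Return a SPARQL query counting items that are instances of a class.

    P31 is "instance of" and P279 is "subclass of"; the wd:/wdt: prefixes
    are predefined by the query service. class_qid is supplied by the caller.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", class_qid):
        raise ValueError(f"not a Wikidata item identifier: {class_qid!r}")
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {\n"
        f"  ?item wdt:P31/wdt:P279* wd:{class_qid} .\n"
        "}"
    )


def build_request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# Assumption: Q25379 is the item for "play"; verify before relying on it.
PLAY_QID = "Q25379"

if __name__ == "__main__":
    query = build_class_count_query(PLAY_QID)
    print(query)
    print(build_request_url(query))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the same query text can also be pasted into the query service's web interface.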
8. The Vision: International Database for the Performing Arts (Wikidata Project Performing Arts)
• Realize an international performing arts database on the basis of Wikidata
• Provide a powerful finding aid for performing arts related content on Wikimedia Commons
• Promote Wikidata-powered performing arts related information in the various language versions of Wikipedia
• Get heritage institutions to make their performing arts related data and content available through Wikidata & Wikimedia Commons
Role Model Projects for Inspiration:
• MusicBrainz (music recordings)
• IMDb (movies)
• IMSLP (music scores)
• Operabase (opera)
10. Wikidata & Performing Arts: Some Statistics
Class of items                 N items (Oct. 2019)   Δ since April 2019
musical work                   450’000               +13’000
play                           22’000                +650
choreographic work             900                   +7
character role                 23’000                +12’000
performing arts building       21’000                +1’000
musician                       270’000               +10’000
actor/actress                  270’000               +16’000
musical ensemble               93’000                +5’900
theatre troupe                 5’200                 +150
dance troupe                   370                   +28
performing arts production     21’000                +2’500
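To compare classes of very different size, the absolute increases in the table above can be turned into relative growth since April 2019. The following Python sketch does this with the figures from the slide. The function names are hypothetical, and since the slide's numbers are rounded, the percentages are indicative only.

```python
# (items in Oct. 2019, increase since April 2019), copied from the table above.
STATS = {
    "musical work": (450_000, 13_000),
    "play": (22_000, 650),
    "choreographic work": (900, 7),
    "character role": (23_000, 12_000),
    "performing arts building": (21_000, 1_000),
    "musician": (270_000, 10_000),
    "actor/actress": (270_000, 16_000),
    "musical ensemble": (93_000, 5_900),
    "theatre troupe": (5_200, 150),
    "dance troupe": (370, 28),
    "performing arts production": (21_000, 2_500),
}


def relative_growth(current: int, delta: int) -> float:
    """Growth relative to the April 2019 count (current minus delta)."""
    previous = current - delta
    if previous <= 0:
        raise ValueError("April count must be positive")
    return delta / previous


def rank_by_growth(stats: dict) -> list:
    """Return (class, growth) pairs, fastest-growing class first."""
    pairs = [(name, relative_growth(n, d)) for name, (n, d) in stats.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, growth in rank_by_growth(STATS):
        print(f"{name:28s} {growth:7.1%}")
```

On these figures, “character role” stands out: it roughly doubled in six months, while the other classes grew by a much smaller share of their April counts.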
11. Core Aspects of Linked Data Publication
Source: eCH-0205 – Linked Open Data
12. Current Challenges
Source of the graphic: eCH-0205 – Linked Open Data
[Graphic from eCH-0205, annotated with the following challenges; two items are marked ✔]
• Data scraping & cleansing
• Data ingest (data mapping & matching)
• Data modelling issues
• Manual data entry
• How to overcome the chicken-and-egg problem?
13. Status Quo – Wikipedia
• Current examples show great potential for the inclusion of Wikidata-powered content in the field of the performing arts.
• Current initiatives may benefit from improved coordination – also across linguistic borders.
• Examples...
14. Example: List of Productions of «Les Galas Karsenty» (French Wikipedia)
20. Example: List of Artistic Directors and Well-Known Artists at Stadttheater Bern (German Wikipedia)
21. Status Quo – Wikipedia/Wikidata
Despite many examples of how structured data is used, the data is usually not pulled from Wikidata.
Large parts of the structured data in Wikipedia related to the performing arts are not available on Wikidata.
22. From “Sum of All GLAMs” to “Wiki Loves Performing Arts”
• Presenting “Sum of All GLAMs”
• Wiki Loves Performing Arts
23. Various layers of information about heritage institutions
Sum of All GLAMs Project: Wiki Movement Brazil and OpenGLAM CH, with the support of the MY-D Foundation
FindingGLAMs: Wikimedia Sweden, UNESCO and WMF, with the support of the Swedish Postcode Foundation
Source: Fontenelle & Estermann (2019) An International Knowledge Base for All Heritage Institutions
27. Wiki Loves Performing Arts
• Start by describing performing arts venues and organizations with the goal of engaging people and organizations directly.
• Continue with further classes...
Phase 1
• Tackle data modelling issues
• Locate and ingest existing datasets
• Create infobox templates
• Start monitoring the data for quality and completeness
Phase 2
• Run crowdsourcing campaigns to complement the data, targeting both Wikipedians and arts organizations
28. Breakdown of Tasks / Possibilities for Contribution
Teams: International Coordination Group | Wikidata Team | Country-specific Teams | Language-specific Teams (Wikipedia) | Arts organizations; heritage institutions
Tackle data modelling issues on Wikidata: Contribute, Lead, Contribute
Track data quality & completeness: Contribute, Lead, Contribute, Contribute
Ingest data from existing databases: Contribute, Lead, Contribute
Run campaigns to enhance the data on Wikidata: Contribute, Lead, Contribute
Get heritage institutions to curate their own data: Lead, Contribute
Implement Infobox and Mbabel templates on Wikipedias (secure community buy-in): Lead
Promote the use of the templates: Contribute, Lead, Contribute
Promote other uses of the data: Contribute, Lead, Contribute
Write guidelines & reach out to people in further countries: Lead, Contribute, Contribute, Contribute, Contribute
29. Wikidata and classical LOD are complementary
Strengths
Wikidata:
• Fully-fledged crowdsourcing platform; further parties can easily be invited to contribute.
• Immediate integration with the worldwide linked data cloud (reconciliation at the moment of data ingest)
• Community-supported LOD service with a certain level of reliability
Classical LOD:
• Data owners keep control over their «graphs»; data quality and completeness remain under the control of the data provider.
• Data can be published in RDF format and is linkable; reconciliation against other databases can be done step by step.
Weaknesses
Wikidata:
• Monitoring data quality and completeness is a permanent and challenging task.
• Harmonization of data modelling practices is a challenge.
• Perceived «loss of control» by data owners
Classical LOD:
• Third parties cannot readily fix issues related to data quality or completeness that are not taken care of by the data provider.
• Harmonization of data modelling practices within one’s own «silo» is straightforward, but might be a great challenge to implement across «silos».
• Introducing collaborative data maintenance practices is difficult.
• Many current LOD services are of questionable reliability.
30. Wikidata in the wider context of the Linked Open Data Ecosystem for the Performing Arts
When it comes to publishing data on Wikidata, priority should be given to data:
• where it is unclear who would be the «natural» authority in the given area (on a global scale);
• where there is a high potential for enhancing data through crowdsourcing approaches (including community or expert sourcing);
• where data is likely to be reused in the context of Wikipedia;
• where international coordination to ensure semantic interoperability of the data is unlikely to take place elsewhere.
Focus on base registers / authority files and controlled vocabularies first; they facilitate further interlinking of datasets!
31. Questions / Feedback?
Contact
Prof. Beat Estermann
Bern University of Applied Sciences
Institute for Public Sector Transformation
beat.estermann@bfh.ch
+41 31 848 34 38
https://www.wikidata.org/wiki/Wikidata:WikiProject_Performing_arts