Multilinguality of Metadata
Measuring the Multilingual Degree of Europeana's Metadata
Juliane Stiller1, Péter Király2
1 Berlin School of Library and Information Science, Humboldt-Universität zu Berlin
2 Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
ISI 2017, March 14, 2017
Languages by eltpics
Agenda
1. Multilinguality in Europeana
2. Multilingual Score for Metadata
3. Implementation
4. Discussion & Future Work
Platform for Cultural Heritage Material
www.europeana.eu
Europeana - Facts
○ Books, newspapers, letters, paintings, photographs, radio shows, films, etc.
○ Text, images, video, audio, sounds, 3D
○ Over 54 million objects
○ > 50 languages
http://statistics.europeana.eu/europeana
Thumbnail
Metadata
Link to Provider
Metadata Multilinguality
+ 40 other languages...
The Multilingual Problem
○ Mona Lisa 456 results
○ La Gioconda 365 results
○ La Joconde 71 results
http://www.europeana.eu/portal/en/record/90402/RP_F_00_351.html
Metadata Enrichment
Quantify the Multilinguality of Data to
○ Take measures to improve multilinguality in data
○ Establish a sense of the multilingual reach of Europeana
○ Distribution of languages
○ Devise strategies for underrepresented languages
Multilingual Score for Metadata
Multilingual saturation of metadata
Text w/o language annotation (dc.subject: Germany)
Text w language annotation (dc.subject: Germany@en)
Text w several language annotations (dc.subject: Germany@en, Deutschland@de)
Link to (multilingual) vocabulary (http://www.geonames.org/2921044/federal-republic-of-germany)
Calculation
Missing field: NA
Text string without language tag (language not known): 0
Text string with language tag (language known): 1
Text string with 2-3 different language tags: 2
Text string with 4-9 different language tags: 2.3
Text string with more than 10 different language tags: 2.6
Link to (multilingual) vocabulary: 3
Example score
Text w/o language annotation (dc.subject: Germany): 0
Text w language annotation (dc.subject: Germany@en): 1
Text w several language annotations (dc.subject: Germany@en, Deutschland@de): 2
Link to (multilingual) vocabulary (http://www.geonames.org/2921044/federal-republic-of-germany): 3
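The scoring scheme from the Calculation and Example score slides can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `multilingual_score`, the `(text, language)` tuple representation, and the rule that a vocabulary link takes precedence over literals are all assumptions. The slide says "more than 10" for the 2.6 band, so treating exactly 10 tags the same way is also an assumption.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: a literal is (text, language tag or None).
Literal = Tuple[str, Optional[str]]


def multilingual_score(
    literals: Sequence[Literal] = (),
    vocabulary_links: Sequence[str] = (),
) -> Optional[float]:
    """Score one metadata field using the values on the Calculation slide."""
    if not literals and not vocabulary_links:
        return None  # missing field: NA
    if vocabulary_links:
        # Assumption: a link to a (multilingual) vocabulary outranks literals.
        return 3.0
    # Count the distinct language tags among the literals.
    languages = {lang for _, lang in literals if lang}
    count = len(languages)
    if count == 0:
        return 0.0  # text without language tag
    if count == 1:
        return 1.0  # text with one known language
    if count <= 3:
        return 2.0  # 2-3 different language tags
    if count <= 9:
        return 2.3  # 4-9 different language tags
    return 2.6  # assumption: 10 or more tags all fall into the top band


# The four dc:subject cases from the Example score slide.
examples = {
    "untagged": multilingual_score([("Germany", None)]),
    "tagged": multilingual_score([("Germany", "en")]),
    "several": multilingual_score([("Germany", "en"), ("Deutschland", "de")]),
    "link": multilingual_score(
        vocabulary_links=[
            "http://www.geonames.org/2921044/federal-republic-of-germany"
        ]
    ),
}
```

Returning `None` for a missing field keeps NA distinct from a score of 0, so absent fields can be excluded from later averages.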
Aggregation of property dc:subject
The Wittgenstein Archives at the University of Bergen: high saturation
National Library Portugal: low saturation
http://144.76.218.178/europeana-qa/saturation.php?collectionId=all&field=proxy_dc_subject&type=average
Good examples
Descriptive fields
dc:title
"Die Mauer muß weg!"@de
"Die Mauer muß weg! (The Wall must go!)"@en
dc:description
"Kommentiertes Fotorama mit Bildern von 1989-1990 in Berlin"@de
"Annotated images from 1989-1990 in Berlin"@en
Subject headings
Place/skos:prefLabel
"Brandenburger Tor"@de
"Brandenburg Gate"@en
"Grenzübergang Potsdamer Platz"@de
"Postdamer Platz border crossing"@en
"Reichstag"@de
"Reichstag building"@en
Implementation
source codes: http://pkiraly.github.io/about/#source-codes
data source: http://hdl.handle.net/21.11101/0000-0001-781F-7
(Europeana snapshot, December 2015)
Data processing workflow
ingestion → measuring → statistical analysis → web interface
ingestion (output: json)
★ OAI-PMH
★ Europeana API
★ Hadoop
★ NoSQL
measuring (output: csv)
★ Spark
★ Hadoop
★ Java
★ Apache Solr
statistical analysis (output: json, png)
★ Spark
★ R
web interface (output: html, svg)
★ PHP
★ D3.js
★ highchart.js
★ NoSQL
Visualization
APIs, abstraction, reusing
"Place/skos:altLabel": {
  "instances": [
    {"TRANSLATION": 2.0},
    {"TRANSLATION": 2.0},
    {"TRANSLATION": 2.0},
    ...
    {"TRANSLATION": 2.40},
    {"STRING": 0.0}
  ],
  "score": {
    "sum": 20.40,
    "average": 1.85454545,
    "normalized": 0.649681
  }
}
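The three figures in the `score` object are consistent with a simple aggregation over the instance scores, sketched below. Two things here are assumptions rather than facts from the slides: the elided instances are taken to be further `TRANSLATION` entries of 2.0 (nine in total, which is what makes the sum come out at 20.40), and the normalization formula `1 - 1 / (average + 1)` is inferred from the printed numbers, not quoted from the authors. The function name `aggregate` is hypothetical.

```python
def aggregate(instance_scores):
    """Aggregate per-instance scores of one property into sum, average, normalized."""
    total = sum(instance_scores)
    average = total / len(instance_scores)
    # Inferred normalization: maps an average of 0 to 0 and approaches 1 as it grows.
    normalized = 1 - 1 / (average + 1)
    return {"sum": total, "average": average, "normalized": normalized}


# Assumed instance list: nine TRANSLATION scores of 2.0, one of 2.40, one STRING of 0.0.
scores = [2.0] * 9 + [2.40, 0.0]
result = aggregate(scores)
print({key: round(value, 6) for key, value in result.items()})
```

Under these assumptions the result agrees with the slide's sum, average, and normalized values to the precision shown there.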
Discussion & Future Work
extension I. recalculation
The new metrics
★ Distinct languages per object
★ Language tags per object
★ Literals per language
★ Number of multilingual properties (a.k.a. fields)
★ Number of multilingual statements (a.k.a. field instances)
★ Average number of languages per property with language
★ Average number of languages per proxy
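A minimal sketch of how the first three metrics in this list might be computed for a single object follows. The slides do not define the metrics precisely, so the interpretation is an assumption: "language tags per object" is read as the number of language-tagged literals, and "literals per language" as a count of literals grouped by tag. The record layout, the function name `language_metrics`, and the untagged `dc:subject` value are hypothetical; the titles and descriptions are taken from the Good examples slide.

```python
from collections import Counter


def language_metrics(record):
    """Compute simple language statistics for one object.

    `record` maps a property name to a list of (text, language tag or None).
    """
    # Collect the tag of every literal that carries one.
    tags = [lang for literals in record.values() for _, lang in literals if lang]
    literals_per_language = Counter(tags)
    return {
        "distinct_languages": len(literals_per_language),
        "language_tags": len(tags),
        "literals_per_language": dict(literals_per_language),
    }


record = {
    "dc:title": [
        ("Die Mauer muß weg!", "de"),
        ("Die Mauer muß weg! (The Wall must go!)", "en"),
    ],
    "dc:description": [
        ("Kommentiertes Fotorama mit Bildern von 1989-1990 in Berlin", "de"),
        ("Annotated images from 1989-1990 in Berlin", "en"),
    ],
    "dc:subject": [("Berlin", None)],  # hypothetical untagged literal
}
metrics = language_metrics(record)
```

Untagged literals are skipped rather than counted under a placeholder language, so they lower none of the three figures.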
extension II. record views
ex:providerProxy
    dc:subject "special relativity"@en ;
    dc:creator <http://vocab.getty.edu/ulan/500240971> ;
    dc:type <http://udcdata.info/001684> .
ex:europeanaProxy
    dc:subject <http://dbpedia.org/resource/Physics> .
# standard vocabulary
<http://vocab.getty.edu/ulan/500240971>
    skos:prefLabel "Einstein, Albert"@de .
# standard vocabulary
<http://dbpedia.org/resource/Physics>
    skos:prefLabel "Physics"@en .
# non-standard vocabulary
<http://udcdata.info/001684>
    skos:prefLabel "Books in general"@en .
extension II. record views
source             field       link      value                     ① ② ③ ④
ex:providerProxy   dc:subject  literal   "special relativity"@en   ① ② ③ ④
ex:providerProxy   dc:creator  standard  "Einstein, Albert"@de     ① ② ③ ④
ex:providerProxy   dc:type     non-std   "Books in general"@en       ②   ④
ex:europeanaProxy  dc:subject  standard  "Physics"@en                  ③ ④
① data provider's proxy and standard enrichments
② data provider's proxy and enrichments
③ all proxies and standard enrichments
④ all proxies and enrichments
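The four record views can be read as two independent switches: whether only the data provider's proxy is considered, and whether only standard-vocabulary enrichments are considered. The sketch below encodes that reading and reproduces the view assignments in the table above. The class `Statement`, the `VIEWS` mapping, and the function `views_for` are hypothetical names introduced for illustration, not part of the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    source: str  # "ex:providerProxy" or "ex:europeanaProxy"
    field: str
    link: str    # "literal", "standard" or "non-std"
    value: str


# View number -> (provider proxy only?, standard enrichments only?)
VIEWS = {
    1: (True, True),    # data provider's proxy and standard enrichments
    2: (True, False),   # data provider's proxy and enrichments
    3: (False, True),   # all proxies and standard enrichments
    4: (False, False),  # all proxies and enrichments
}


def views_for(statement):
    """Return the numbers of the views in which a statement is counted."""
    result = []
    for number, (provider_only, standard_only) in VIEWS.items():
        if provider_only and statement.source != "ex:providerProxy":
            continue  # view excludes the Europeana proxy
        if standard_only and statement.link == "non-std":
            continue  # view excludes non-standard vocabularies
        result.append(number)
    return result


STATEMENTS = [
    Statement("ex:providerProxy", "dc:subject", "literal", '"special relativity"@en'),
    Statement("ex:providerProxy", "dc:creator", "standard", '"Einstein, Albert"@de'),
    Statement("ex:providerProxy", "dc:type", "non-std", '"Books in general"@en'),
    Statement("ex:europeanaProxy", "dc:subject", "standard", '"Physics"@en'),
]
assignments = [views_for(statement) for statement in STATEMENTS]
```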
Questions
○ contact
juliane.stiller@ibi.hu-berlin.de
peter.kiraly@gwdg.de
○ Metadata Quality Assurance Framework
http://144.76.218.178/europeana-qa
○ Europeana Data Quality Committee
http://pro.europeana.eu/page/data-quality-committee
Appendix
Europeana data structure in 30 sec
provider proxy
Europeana proxy
Agent
Concept
Place
Timespan
descriptive fields
subject headings
semantic web

Multilinguality of Metadata: Measuring the Multilingual Degree of Europeana’s Metadata

  • 1. Multilinguality of Metadata: Measuring the Multilingual Degree of Europeana’s Metadata. Juliane Stiller (Berlin School of Library and Information Science, Humboldt-Universität zu Berlin) and Péter Király (Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen). ISI 2017, March 14, 2017. Languages by eltpics
  • 2. Agenda: 1. Multilinguality in Europeana 2. Multilingual Score for Metadata 3. Implementation 4. Discussion & Future Work
  • 3. Platform for Cultural Heritage Material: www.europeana.eu
  • 4. Europeana - Facts: ○ Books, newspapers, letters, paintings, photographs, radio shows, films, etc. ○ Text, images, video, audio, sounds, 3D ○ Over 54 million objects ○ > 50 languages. Source: http://statistics.europeana.eu/europeana
  • 6. Metadata Multilinguality: + 40 other languages...
  • 7. The Multilingual Problem: ○ Mona Lisa 456 results ○ La Gioconda 365 results ○ La Joconde 71 results. http://www.europeana.eu/portal/en/record/90402/RP_F_00_351.html
  • 9. Quantify the Multilinguality of Data to ○ Take measures to improve multilinguality in data ○ Establish a sense of the multilingual reach of Europeana ○ Distribution of languages ○ Devise strategies for underrepresented languages
  • 10. Multilingual Score for Metadata
  • 11. Multilingual saturation of metadata: text without language annotation (dc.subject: Germany); text with language annotation (dc.subject: Germany@en); text with several language annotations (dc.subject: Germany@en, Deutschland@de); link to (multilingual) vocabulary (http://www.geonames.org/2921044/federal-republic-of-germany)
  • 12. Calculation: missing field: NA; text string without language tag (language not known): 0; text string with language tag (language known): 1; text string with 2-3 different language tags: 2; text string with 4-9 different language tags: 2.3; text string with more than 10 different language tags: 2.6; link to (multilingual) vocabulary: 3
  • 13. Example score: text without language annotation (dc.subject: Germany): 0; text with language annotation (dc.subject: Germany@en): 1; text with several language annotations (dc.subject: Germany@en, Deutschland@de): 2; link to (multilingual) vocabulary (http://www.geonames.org/2921044/federal-republic-of-germany): 3
  • 14. Aggregation of property dc:subject: The Wittgenstein Archives at the University of Bergen: high saturation; National Library Portugal: low saturation. http://144.76.218.178/europeana-qa/saturation.php?collectionId=all&field=proxy_dc_subject&type=average
  • 15. Good examples. Descriptive fields: dc:title "Die Mauer muß weg!"@de, "Die Mauer muß weg! (The Wall must go!)"@en; dc:description "Kommentiertes Fotorama mit Bildern von 1989-1990 in Berlin"@de, "Annotated images from 1989-1990 in Berlin"@en. Subject headings: Place/skos:prefLabel "Brandenburger Tor"@de, "Brandenburg Gate"@en; "Grenzübergang Potsdamer Platz"@de, "Postdamer Platz border crossing"@en; "Reichstag"@de, "Reichstag building"@en
  • 16. Implementation: source code: http://pkiraly.github.io/about/#source-codes; data source: http://hdl.handle.net/21.11101/0000-0001-781F-7 (Europeana snapshot, December 2015)
  • 17. Data processing workflow. Stages: ingestion, measuring, statistical analysis, web interface. Tools: ★ OAI-PMH ★ Europeana API ★ Hadoop ★ NoSQL ★ Spark ★ Hadoop ★ Java ★ Apache Solr ★ Spark ★ R ★ PHP ★ D3.js ★ highchart.js ★ NoSQL. Data formats: json; csv; json, png; html, svg
  • 19. APIs, abstraction, reusing "Place/skos:altLabel": { "instances": [ {"TRANSLATION": 2.0}, {"TRANSLATION": 2.0}, {"TRANSLATION": 2.0}, ... {"TRANSLATION": 2.40}, {"STRING": 0.0}, ], "score": { "sum": 20.40, "average": 1.85454545, "normalized": 0.649681 } }
  • 21. Extension I: recalculation. The new metrics: ★ Distinct languages per object ★ Language tags per object ★ Literals per language ★ Number of multilingual properties (a.k.a. fields) ★ Number of multilingual statements (a.k.a. field instances) ★ Average number of languages per property with language ★ Average number of languages per proxy
  • 22. Extension II: record views. ex:providerProxy dc:subject "special relativity"@en ; dc:creator <http://vocab.getty.edu/ulan/500240971> ; dc:type <http://udcdata.info/001684> . ex:europeanaProxy dc:subject <http://dbpedia.org/resource/Physics> . Standard vocabulary: <http://vocab.getty.edu/ulan/500240971> skos:prefLabel "Einstein, Albert"@de . Standard vocabulary: <http://dbpedia.org/resource/Physics> skos:prefLabel "Physics"@en . Non-standard vocabulary: <http://udcdata.info/001684> skos:prefLabel "Books in general"@en .
  • 23. Extension II: record views. Columns: source, field, link, value, views. ex:providerProxy, dc:subject, literal, "special relativity"@en: ① ② ③ ④; ex:providerProxy, dc:creator, standard, "Einstein, Albert"@de: ① ② ③ ④; ex:providerProxy, dc:type, non-std, "Books in general"@en: ② ④; ex:europeanaProxy, dc:subject, standard, "Physics"@en: ③ ④. Views: ① data provider's proxy and standard enrichments ② data provider's proxy and enrichments ③ all proxies and standard enrichments ④ all proxies and enrichments
  • 24. Questions ○ Contact: juliane.stiller@ibi.hu-berlin.de, peter.kiraly@gwdg.de ○ Metadata Quality Assurance Framework: http://144.76.218.178/europeana-qa ○ Europeana Data Quality Committee: http://pro.europeana.eu/page/data-quality-committee
  • 25. Appendix: Europeana data structure in 30 sec: provider proxy, Europeana proxy, Agent, Concept, Place, Timespan; descriptive fields; subject headings; semantic web
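
The scoring bands on slide 12 and the examples on slide 13 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function names `split_tag` and `saturation_score` are hypothetical, values are assumed to use the slides' `text@lang` notation, and any `http(s)://` value is assumed to be a vocabulary link. The pairing of bands to scores (1 for a single language tag, 2 for 2-3 tags, 2.3 for 4-9 tags, 2.6 above that) is read from the flattened slide text. The slide says "more than 10", so placing exactly 10 in the top band is also an assumption.

```python
def split_tag(value):
    """Split 'Germany@en' into ('Germany', 'en'); untagged text gets lang None."""
    text, sep, lang = value.rpartition("@")
    return (text, lang) if sep else (value, None)


def saturation_score(values):
    """Score one metadata field (hypothetical helper following slide 12).

    values: list of strings for the field, or None/empty if the field is missing.
    Returns None for a missing field (the slide's "NA").
    """
    if not values:
        return None  # missing field: NA
    # Assumption: any http(s) value counts as a link to a vocabulary.
    if any(v.startswith(("http://", "https://")) for v in values):
        return 3.0
    languages = {split_tag(v)[1] for v in values} - {None}
    count = len(languages)
    if count == 0:
        return 0.0  # text without language tag
    if count == 1:
        return 1.0  # text with one language tag
    if count <= 3:
        return 2.0  # 2-3 different language tags
    if count <= 9:
        return 2.3  # 4-9 different language tags
    return 2.6      # assumption: 10 or more tags fall in the top band


examples = {
    "untagged": saturation_score(["Germany"]),
    "one tag": saturation_score(["Germany@en"]),
    "two tags": saturation_score(["Germany@en", "Deutschland@de"]),
    "link": saturation_score(
        ["http://www.geonames.org/2921044/federal-republic-of-germany"]
    ),
}
print(examples)
```

The four example calls mirror the four cases on slide 13. Splitting on the last `@` is a simplification that would misread untagged text containing an `@`, so a real pipeline would work on parsed RDF literals instead.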

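The `sum`, `average` and `normalized` values in the JSON on slide 19 can be reproduced with a short Python sketch. Two things here are assumptions rather than facts from the slides: the elided instances are taken to be nine scores of 2.0 (the count that yields the stated sum of 20.40 together with the 2.40 and 0.0 instances), and the normalization is taken to be `1 - 1 / (1 + average)`, which matches the stated 0.649681 but is not spelled out in the deck. The function name `aggregate` is hypothetical.

```python
def aggregate(instance_scores):
    """Aggregate per-instance saturation scores for one property.

    Assumption: normalized = 1 - 1 / (1 + average), chosen because it
    reproduces the figures shown on slide 19.
    """
    total = sum(instance_scores)
    average = total / len(instance_scores)
    normalized = 1.0 - 1.0 / (1.0 + average)
    return {"sum": total, "average": average, "normalized": normalized}


# Assumed instance list: nine TRANSLATION scores of 2.0, one of 2.4, one STRING of 0.0.
scores = [2.0] * 9 + [2.4, 0.0]
result = aggregate(scores)
print(result)
```

Under this reading the normalized score stays between 0 and 1 and grows more slowly as the average rises, which keeps heavily translated properties from dominating a collection-level comparison.
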
Editor's Notes

  1. Redo
  2. Why does the link to a controlled vocabulary have the highest saturation? Because it gives us parallel variants in different languages that we can be sure are translations. NOTE (Péter): Antoine distinguished at least 2 categories: 1) link to a vocabulary that is dereferenceable by Europeana (such as Geonames, VIAF, GND etc.; I would call them standard vocabularies) 2) link to another vocabulary
  3. Why does the link to a controlled vocabulary have the highest saturation? Because it gives us parallel variants in different languages that we can be sure are translations.