Towards an extensible measurement of metadata quality
Péter Király
Way to DATeCH, 2017-05-22
Measuring metadata quality. The problem
2
there are “good” and “bad” metadata records
but we don’t have clear metrics like this:
functional requirements
good
acceptable
bad
Measuring metadata quality. Non-informative values
3
non-informative dc:title:
“photograph, framed”,
“group photograph”,
“photograph”
informative dc:title:
“Photograph of Sir Dugald Clerk”,
“Photograph of "Puffing Billy"”
Measuring metadata quality. Copy & paste cataloging
4
from a template?
more examples in Report and Recommendations from the Task Force on Metadata Quality (2015)
Measuring metadata quality. Why is data quality important?
5
“Fitness for purpose” (QA principle)
no metadata → no access to data → no data usage
more explanation:
Data on the Web Best Practices
W3C Working Draft, https://www.w3.org/TR/dwbp/
Measuring metadata quality. Hypothesis
6
by measuring structural elements we
can predict metadata record quality
≃ metadata smell
Measuring metadata quality. Purposes
7
▪ improve the metadata
▪ services: good data → reliable functions
▪ better metadata schema & documentation
▪ propagate “good practice”
Measuring metadata quality. What to measure?
8
▪ Structural and semantic features
Cardinality, uniqueness, length, dictionary entry, data type conformance,
multilinguality (schema-independent measurements)
▪ Functional requirement analysis / Discovery scenarios
Requirements of the most important functions
▪ Problem catalog
Known metadata problems
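A few of the schema-independent measurements named above can be sketched in Python. This is only an illustration under stated assumptions, not the project's code: the record layout (a dict mapping field names to lists of values), the function names, and the definition of uniqueness as the share of distinct values among all values of a field are hypothetical.

```python
from collections import Counter

# Hypothetical records: each maps a field name to a list of values.
RECORDS = [
    {"dc:title": ["photograph"]},
    {"dc:title": ["photograph"]},
    {"dc:title": ["Photograph of Sir Dugald Clerk"], "dc:subject": []},
]


def field_cardinality(record):
    """Number of values per field in a single record."""
    return {field: len(values) for field, values in record.items()}


def value_lengths(record, field):
    """Character length of every value of one field."""
    return [len(value) for value in record.get(field, [])]


def uniqueness(records, field):
    """Share of distinct values among all values of a field (assumed definition)."""
    counts = Counter(v for record in records for v in record.get(field, []))
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


if __name__ == "__main__":
    print(field_cardinality(RECORDS[2]))
    print(value_lengths(RECORDS[0], "dc:title"))
    print(uniqueness(RECORDS, "dc:title"))
```

A low uniqueness score for a field such as dc:title is one possible hint of template-based or non-informative values.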
Measuring metadata quality. Metadata requirements // User scenario
9
As a user I want to be able to filter by whether a person is the subject
of a book, or its author, engraver, printer etc.
Metadata analysis
In each case the underlying requirement is that the relevant EDM
fields for objects be populated with URIs rather than free text. These
URIs need to be related, at a minimum, to a label for each of the
supported languages.
Measurement rules
▪ the relevant field values should be resolvable URIs
▪ each URI should be associated with labels in multiple languages
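The two measurement rules could be checked along the following lines. This is a minimal sketch with assumptions: it only tests whether a value is syntactically an HTTP(S) URI (actually resolving it would need a network request), and the `labels_by_uri` lookup, the function names, and the list of required languages are hypothetical.

```python
from urllib.parse import urlparse


def is_http_uri(value):
    """True if the value looks like an absolute HTTP(S) URI (syntax check only)."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_field(values, labels_by_uri, required_languages):
    """Report, per value, whether it is a URI and which language labels are missing.

    labels_by_uri is a hypothetical lookup: URI -> {language tag: label}.
    """
    report = []
    for value in values:
        if not is_http_uri(value):
            report.append((value, "free text, not a URI"))
            continue
        available = set(labels_by_uri.get(value, {}))
        missing = sorted(set(required_languages) - available)
        status = "ok" if not missing else "missing labels: " + ", ".join(missing)
        report.append((value, status))
    return report


if __name__ == "__main__":
    uri = "http://data.europeana.eu/concept/base/264"
    labels = {uri: {"de": "Ballett", "pt": "Balé"}}
    for row in check_field(["Ballet", uri], labels, ["de", "en"]):
        print(row)
```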
Measuring metadata quality. Metadata requirements // Supported functions
10
#1 Resource Discovery
★ Search: search for a resource corresponding to stated criteria (i.e., to search either a single entity or a set of entities using an attribute or relationship of the entity as the search criteria)
★ Identify: confirm that the entity described or located corresponds to the entity sought
★ Select: choose an entity that meets the user’s requirements
★ Obtain: access a resource either physically or electronically
#2 Resource Use
★ Restrict
★ Manage
★ Operate
★ Interpret
#3 Data Management
★ Identify
★ Process
★ Sort
★ Display
Functional Analysis of the MARC 21 Bibliographic and Holdings Formats
http://www.loc.gov/marc/marc-functional-analysis/source/analysis.pdf
Measuring metadata quality. Metadata requirements // element—function map
11
Europeana sub-dimensions MARC Summary of Mapping to User Tasks
Measuring metadata quality. The data workflow (in Europeana)
12
data transformations Europeana Data Model (EDM)
Dublin Core,
LIDO, EAD,
MARC, EDM
custom, ...
Measuring metadata quality. Measurement
13
overall view collection view record view
Completeness
Field cardinality
Uniqueness
Multilinguality
Language specification
Problem catalog
etc.
links
measurements
aggregated statistics
metrics
Measuring metadata quality. Field frequency per collections
14
no record has alternative title
every record has alternative title
filters
Measuring metadata quality. Details of field cardinality
15
128 subjects in one record
median is 0, mean is close to 1
link to interesting records
Measuring metadata quality. Multilinguality
16
@resource is a URI
@ = language notation in RDF
no language specification
Measuring metadata quality. Language frequency
17
has language specification
has no language specification
Measuring metadata quality. Encoding problems
18
same language, different encodings
Measuring metadata quality. Multilinguality
19
★ Mona Lisa → 456 results
★ La Gioconda → 365 results
★ La Joconde → 71 results
http://www.europeana.eu/portal/en/record/90402/RP_F_00_351.html
Measuring metadata quality. What Could be Measured?
20
★ Number of (distinct) languages in the metadata
★ Number of tagged literals
★ Tagged literals per language
Requirement: language annotations / tags!
Measuring metadata quality. Distinct Languages
21
Number of distinct languages per case:
Text without language annotation (dc:subject: Germany): 0
Text with one language annotation (dc:subject: Germany@en): 1
Text with several language annotations (dc:subject: Germany@en, Deutschland@de): 2
Link to a (multilingual) vocabulary (http://www.geonames.org/2921044/federal-republic-of-germany): n
Measuring metadata quality. Record level
22
<#record> a ore:Proxy ;
  dc:subject "Ballet", "Opera" .
<#record> a ore:Proxy ; edm:europeanaProxy true ;
  dc:subject <http://data.europeana.eu/concept/base/264>
    , <http://data.europeana.eu/concept/base/247> .
<http://data.europeana.eu/concept/base/264> a skos:Concept ;
  skos:prefLabel "Ballett"@no, "बैले"@hi, "Ballett"@de, "Балет"@be, "Балет"@ru
    , "Balé"@pt, "Балет"@bg, "Baletas"@lt, "Balet"@hr, "Balets"@lv .
<http://data.europeana.eu/concept/base/247>
  skos:prefLabel "Opera"@no, "ओपेरा (गीतिनाटक)"@hi, "Oper"@de, "Ooppera"@fi
    , "Опера"@be, "Опера"@ru, "Ópera"@pt, "Опера"@bg, "Opera"@lt .
First record (plain literals): 0 distinct languages, 0 tagged literals
Second record (linked concepts): 11 distinct languages, 19 tagged literals, 1.7 literals per language
dereferencing
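The three record-level numbers can be reproduced with a short Python sketch. The input format (a list of text and language-tag pairs, with `None` for untagged literals) and the function name are assumptions made for illustration; the labels are the ones shown on this slide.

```python
def multilinguality(literals):
    """Return (distinct languages, tagged literals, literals per language).

    literals: list of (text, language tag or None) pairs.
    """
    tags = [lang for _text, lang in literals if lang]
    distinct = set(tags)
    per_language = len(tags) / len(distinct) if distinct else 0.0
    return len(distinct), len(tags), per_language


# Untagged literals of the first record.
PLAIN = [("Ballet", None), ("Opera", None)]

# skos:prefLabel values reached through the two concept URIs.
DEREFERENCED = [
    ("Ballett", "no"), ("बैले", "hi"), ("Ballett", "de"), ("Балет", "be"),
    ("Балет", "ru"), ("Balé", "pt"), ("Балет", "bg"), ("Baletas", "lt"),
    ("Balet", "hr"), ("Balets", "lv"),
    ("Opera", "no"), ("ओपेरा (गीतिनाटक)", "hi"), ("Oper", "de"), ("Ooppera", "fi"),
    ("Опера", "be"), ("Опера", "ru"), ("Ópera", "pt"), ("Опера", "bg"),
    ("Opera", "lt"),
]

if __name__ == "__main__":
    print(multilinguality(PLAIN))
    languages, tagged, ratio = multilinguality(DEREFERENCED)
    print(languages, tagged, round(ratio, 1))
```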
Measuring metadata quality. Processes Contributing to Multilinguality
23
Data
dc:creator dc:type
dc:subject
<http://dbpedia.org/aSubjectID>
dc:subject
Data from Provider
dc:creator
dereferenced
Quantifiable
Data added by Europeana
“subject”@en <http://vocab.getty.edu/aPersonNumber>
dc:subject
“Subject” <http://udcdata.info/rdf/065280>
Measuring metadata quality. Multilingual saturation I.
24
Measuring metadata quality. Good example
25
dc:description
dc:title
Place/skos:prefLabel
Descriptive fields Subject headings
"Brandenburger Tor"@de
"Brandenburg Gate"@en
"Grenzübergang Potsdamer Platz"@de
"Potsdamer Platz border crossing"@en
"Reichstag"@de
"Reichstag building"@en
"Die Mauer muß weg!"@de
"Die Mauer muß weg! (The Wall must go!)"@en
"Kommentiertes Fotorama mit Bildern von 1989-1990 in Berlin"@de
"Annotated images from 1989-1990 in Berlin"@en
Measuring metadata quality. Linked data - depth of iteration
26
Measuring metadata quality. Linked data - lost links
27
Measuring metadata quality. Outliers
28
bulk of records are close to zero
although 25% are between 0.05 and 1.25
Measuring metadata quality. Outliers
29
Measuring metadata quality. Outliers
30
zeros / lower outliers, “normal” values, high outliers
Measuring metadata quality. Layers
31
:provider
  dc:subject "special relativity"@en ;
  dc:creator <http://vocab.getty.edu/ulan/500240971> ;
  dc:type <http://udcdata.info/001684> .
:enhancement
  dc:subject <http://dbpedia.org/resource/Physics> .
dereferenceable vocabulary:
<http://vocab.getty.edu/ulan/500240971>
  skos:prefLabel "Einstein, Albert"@de .
dereferenceable vocabulary:
<http://dbpedia.org/resource/Physics>
  skos:prefLabel "Physics"@en .
non-dereferenceable vocabulary:
<http://udcdata.info/001684>
  skos:prefLabel "Books in general"@en .
Measuring metadata quality. Layers
32
source        field       link      value                     ① ② ③ ④
:provider     dc:subject  literal   "special relativity"@en   ① ② ③ ④
:provider     dc:creator  standard  "Einstein, Albert"@de     ① ② ③ ④
:provider     dc:type     non-std   "Books in general"@en     -  ② -  ④
:enhancement  dc:subject  standard  "Physics"@en              -  -  ③ ④
① data provider's proxy and dereferenceable enrichments
② data provider's proxy and all enrichments
③ all proxies and dereferenceable enrichments
④ all proxies and all enrichments
credit: Antoine Isaac
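The four layers ① to ④ amount to two independent switches: which proxies are taken into account, and whether values from non-standard (non-dereferenceable) vocabularies are included. A small Python sketch of that selection follows; the `Value` class, the `select` function, and the flag names are hypothetical and only restate the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    source: str  # "provider" or "enhancement"
    field: str
    link: str    # "literal", "standard" or "non-std"
    label: str


VALUES = [
    Value("provider", "dc:subject", "literal", "special relativity"),
    Value("provider", "dc:creator", "standard", "Einstein, Albert"),
    Value("provider", "dc:type", "non-std", "Books in general"),
    Value("enhancement", "dc:subject", "standard", "Physics"),
]


def select(values, all_proxies, all_enrichments):
    """Return the labels that count towards a layer.

    all_proxies=False keeps only the data provider's proxy;
    all_enrichments=False drops values linked to non-standard vocabularies.
    """
    selected = []
    for value in values:
        if value.source != "provider" and not all_proxies:
            continue
        if value.link == "non-std" and not all_enrichments:
            continue
        selected.append(value.label)
    return selected


# Layers 1 to 4 as (all_proxies, all_enrichments) pairs.
LAYERS = {1: (False, False), 2: (False, True), 3: (True, False), 4: (True, True)}

if __name__ == "__main__":
    for number, (proxies, enrichments) in LAYERS.items():
        print(number, select(VALUES, proxies, enrichments))
```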
Measuring metadata quality. Data processing workflow
33
★ OAI-PMH
★ Europeana API
★ Hadoop
★ NoSQL
★ Spark
★ Hadoop
★ Java
★ Apache Solr
★ Spark
★ R
★ PHP
★ D3.js
★ highchart.js
★ NoSQL
ingest → json → measure → csv → statistical analysis → json, png → web interface → html, svg
Measuring metadata quality. Data processing workflow
34
http://pkiraly.github.io/cheatsheet/
Measuring metadata quality. Modules
35
metadata-qa-api
europeana-qa-api
europeana-qa-spark europeana-qa-rest
marc-qa-api* ddb-qa-api*
★ Metadata schema
abstraction
★ Metrics definition
★ Iteration
★ Result data structure
★ ...
<dependencies>
  <dependency>
    <groupId>de.gwdg.metadataqa</groupId>
    <artifactId>metadata-qa-api</artifactId>
    <version>0.4</version>
  </dependency>
  <dependency>
    <groupId>de.gwdg.metadataqa</groupId>
    <artifactId>europeana-qa-api</artifactId>
    <version>0.4</version>
  </dependency>
  ...
</dependencies>
Measuring metadata quality. Batch API
36
client → Metadata QA (request → response)
measurement:
/batch/measuring/start → sessionID
/batch/[recordId] → csv (for each record)
/batch/measuring/stop → “success” | “failure”
analysis:
/batch/analyzing/start → “success” | “failure”
/batch/analyzing/status → “in progress” | “ready” (polled periodically)
/batch/analyzing/retrieve → compressed package
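The call sequence of the Batch API can be sketched as a small Python client. The slide lists only paths and responses, so everything else here is an assumption: the `send(path, payload)` transport function is a hypothetical stand-in for the real HTTP calls, and the way the session ID and record bodies are passed is guessed for illustration.

```python
import time


def run_batch(send, records, poll_seconds=0.0, max_polls=100):
    """Drive one measuring and analysing session.

    send(path, payload) -> response text is an injected, hypothetical transport.
    records is an iterable of (record_id, record_body) pairs.
    Returns the CSV rows and the compressed package.
    """
    # Measurement phase: open a session, send every record, close the session.
    session_id = send("/batch/measuring/start", None)
    csv_rows = [send(f"/batch/{record_id}", body) for record_id, body in records]
    if send("/batch/measuring/stop", session_id) != "success":
        raise RuntimeError("measuring could not be stopped cleanly")

    # Analysis phase: start it, then poll until the result is ready.
    if send("/batch/analyzing/start", session_id) != "success":
        raise RuntimeError("analysis could not be started")
    for _ in range(max_polls):
        if send("/batch/analyzing/status", session_id) == "ready":
            return csv_rows, send("/batch/analyzing/retrieve", session_id)
        time.sleep(poll_seconds)
    raise TimeoutError("analysis did not finish in time")
```

Injecting the transport keeps the sequence testable without a running service; a real client would wrap an HTTP library in `send`.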
Measuring metadata quality. Formal issue definition I. RDFUnit
37
pattern:
SELECT ?s WHERE {
  ?s %%P1%% ?v1 .
  ?s %%P2%% ?v2 .
  FILTER ( ?v1 %%OP%% ?v2 )
}
parameters:
P1 => dbo:birthDate
P2 => dbo:deathDate
OP => >
SPARQL:
SELECT ?s WHERE {
  ?s dbo:birthDate ?v1 .
  ?s dbo:deathDate ?v2 .
  FILTER ( ?v1 > ?v2 )
}
Kontokostas et al. (2014), Test-driven Evaluation of Linked Data Quality
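Turning the pattern into a concrete query is plain placeholder substitution. The following Python sketch illustrates the idea only; it is not RDFUnit's implementation, and the `instantiate` function and its regular expression are assumptions.

```python
import re

PATTERN = """SELECT ?s WHERE {
  ?s %%P1%% ?v1 .
  ?s %%P2%% ?v2 .
  FILTER ( ?v1 %%OP%% ?v2 )
}"""


def instantiate(pattern, bindings):
    """Replace every %%NAME%% placeholder with its bound value."""
    def substitute(match):
        name = match.group(1)
        if name not in bindings:
            raise KeyError(f"no binding for placeholder {name}")
        return bindings[name]

    # Whitespace inside the markers is tolerated, e.g. "%% P1 %%".
    return re.sub(r"%%\s*(\w+)\s*%%", substitute, pattern)


if __name__ == "__main__":
    print(instantiate(PATTERN, {
        "P1": "dbo:birthDate",
        "P2": "dbo:deathDate",
        "OP": ">",
    }))
```

The resulting query selects resources whose birth date is later than their death date, which is the issue this test case is meant to detect.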
Measuring metadata quality. Formal issue definition II. SHACL
38
shape:
<IssueShape> sh:property [
  sh:predicate ex:submittedBy ;
  sh:minLength 20
] .
RDF triples:
<issue1> ex:submittedBy <http://a.example/bob> .
<issue2> ex:submittedBy "Bob" .
result:
<IssueShape> <issue1> pass
<IssueShape> <issue2> fail: ex:submittedBy expected to be >= 20 characters, 3 characters found.
SHACL Core Abstract Syntax and Semantics
W3C First Public Working Draft 25 August 2016
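The minimum-length constraint of this example can be imitated in a few lines of Python. This is not a SHACL engine, just a sketch of the single check shown on the slide; the triple representation (plain tuples of strings) and the function name are assumptions.

```python
TRIPLES = [
    ("issue1", "ex:submittedBy", "http://a.example/bob"),
    ("issue2", "ex:submittedBy", "Bob"),
]


def check_min_length(triples, predicate, min_length):
    """Check the string length of every value of one predicate."""
    results = {}
    for subject, pred, value in triples:
        if pred != predicate:
            continue
        length = len(value)
        if length >= min_length:
            results[subject] = "pass"
        else:
            results[subject] = (
                f"fail: {predicate} expected to be >= {min_length} "
                f"characters, {length} characters found"
            )
    return results


if __name__ == "__main__":
    for subject, outcome in check_min_length(TRIPLES, "ex:submittedBy", 20).items():
        print(subject, outcome)
```

The URI of the first issue is exactly 20 characters long, so it passes, while the literal "Bob" has 3 characters and fails.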
Measuring metadata quality. Community bibliography
39
zotero.org/groups/metadata_assessment
dlfmetadataassessment.github.io
Measuring metadata quality. Cooperations and project proposals
40
★Europeana Network’s Data Quality Committee
★Digital Library Federation Metadata Assessment Group
★Deutsche Digitale Bibliothek
Measuring metadata quality. Further steps
41
▪ Translate the results into documentation, recommendations
▪ Communication with data providers
▪ Human evaluation of metadata quality
▪ Cooperation with other projects
▪ Incorporating into ingestion process
▪ Shapes Constraint Language (SHACL) for defining patterns
▪ Process usage statistics
▪ Measuring changes of scores
▪ Machine learning based classification & clustering
human analysis technical
Measuring metadata quality. Links
42
★Europeana Data Quality Committee // http://pro.europeana.eu/europeana-tech/data-quality-committee
★site // http://144.76.218.178/europeana-qa/
★source codes (GPL v3.0) // http://pkiraly.github.io/about/#source-codes
★Europeana data (CC0) // http://hdl.handle.net/21.11101/0000-0001-781F-7
★Library of Congress data (OA) // http://www.loc.gov/cds/products/marcDist.php

Mais conteúdo relacionado

Semelhante a Towards an extensible measurement of metadata quality (DATeCH 2017)

Resume_Md ZakirHussain
Resume_Md ZakirHussainResume_Md ZakirHussain
Resume_Md ZakirHussain
zakir hussain
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
Provectus
 
JeffRichardsonResume2016
JeffRichardsonResume2016JeffRichardsonResume2016
JeffRichardsonResume2016
Jeff Richardson
 
CV_JMorilloEN-LinkedIn
CV_JMorilloEN-LinkedInCV_JMorilloEN-LinkedIn
CV_JMorilloEN-LinkedIn
Jos Morillo
 

Semelhante a Towards an extensible measurement of metadata quality (DATeCH 2017) (20)

FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014:  Social Network Benchmark (SNB) Graph GeneratorFOSDEM 2014:  Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
 
Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...
Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...
Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...
 
Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...
Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...
Technical Challenges and Approaches to Build an Open Ecosystem of Heterogeneo...
 
Metadata Quality assessment tool for Open Access
Metadata Quality assessment tool for Open AccessMetadata Quality assessment tool for Open Access
Metadata Quality assessment tool for Open Access
 
Metadata Quality assessment tool for Open Access Cultural Heritage institutio...
Metadata Quality assessment tool for Open Access Cultural Heritage institutio...Metadata Quality assessment tool for Open Access Cultural Heritage institutio...
Metadata Quality assessment tool for Open Access Cultural Heritage institutio...
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
SPARQL and Linked Data Benchmarking
SPARQL and Linked Data BenchmarkingSPARQL and Linked Data Benchmarking
SPARQL and Linked Data Benchmarking
 
Resume_VipinKP
Resume_VipinKPResume_VipinKP
Resume_VipinKP
 
ZakirHussain
ZakirHussainZakirHussain
ZakirHussain
 
Resume_Md ZakirHussain
Resume_Md ZakirHussainResume_Md ZakirHussain
Resume_Md ZakirHussain
 
ISO 15926 Reference Data Engineering Methodology
ISO 15926 Reference Data Engineering MethodologyISO 15926 Reference Data Engineering Methodology
ISO 15926 Reference Data Engineering Methodology
 
Enterprise Data Science
Enterprise Data ScienceEnterprise Data Science
Enterprise Data Science
 
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
 
"Different software evolutions from Start till Release in PHP product" Oleksa...
"Different software evolutions from Start till Release in PHP product" Oleksa..."Different software evolutions from Start till Release in PHP product" Oleksa...
"Different software evolutions from Start till Release in PHP product" Oleksa...
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
JeffRichardsonResume2016
JeffRichardsonResume2016JeffRichardsonResume2016
JeffRichardsonResume2016
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
 
CV_JMorilloEN-LinkedIn
CV_JMorilloEN-LinkedInCV_JMorilloEN-LinkedIn
CV_JMorilloEN-LinkedIn
 
Zakir_Hussain_cv
Zakir_Hussain_cvZakir_Hussain_cv
Zakir_Hussain_cv
 

Mais de Péter Király

Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)
Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)
Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)
Péter Király
 

Mais de Péter Király (20)

Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
 
Validating 126 million MARC records (DATeCH 2019)
Validating 126 million MARC records (DATeCH 2019)Validating 126 million MARC records (DATeCH 2019)
Validating 126 million MARC records (DATeCH 2019)
 
Measuring Metadata Quality (doctoral defense 2019)
Measuring Metadata Quality (doctoral defense 2019)Measuring Metadata Quality (doctoral defense 2019)
Measuring Metadata Quality (doctoral defense 2019)
 
Empirical evaluation of library catalogues (SWIB 2019)
Empirical evaluation of library catalogues (SWIB 2019)Empirical evaluation of library catalogues (SWIB 2019)
Empirical evaluation of library catalogues (SWIB 2019)
 
GRO.data - Dataverse in Göttingen (Dataverse Europe 2020)
GRO.data - Dataverse in Göttingen (Dataverse Europe 2020)GRO.data - Dataverse in Göttingen (Dataverse Europe 2020)
GRO.data - Dataverse in Göttingen (Dataverse Europe 2020)
 
Data element constraints for DDB (DDB 2021)
Data element constraints for DDB (DDB 2021)Data element constraints for DDB (DDB 2021)
Data element constraints for DDB (DDB 2021)
 
Incubating Göttingen Cultural Analytics Alliance (SUB 2021)
Incubating Göttingen Cultural Analytics Alliance (SUB 2021)Incubating Göttingen Cultural Analytics Alliance (SUB 2021)
Incubating Göttingen Cultural Analytics Alliance (SUB 2021)
 
Continuous quality assessment for MARC21 catalogues (MINI ELAG 2021)
Continuous quality assessment for MARC21 catalogues (MINI ELAG 2021)Continuous quality assessment for MARC21 catalogues (MINI ELAG 2021)
Continuous quality assessment for MARC21 catalogues (MINI ELAG 2021)
 
Introduction to data quality management (BVB KVB FDM-KompetenzPool, 2021)
Introduction to data quality management (BVB KVB FDM-KompetenzPool, 2021)Introduction to data quality management (BVB KVB FDM-KompetenzPool, 2021)
Introduction to data quality management (BVB KVB FDM-KompetenzPool, 2021)
 
Magyar irodalom idegen nyelven (BTK ITI 2021)
Magyar irodalom idegen nyelven (BTK ITI 2021)Magyar irodalom idegen nyelven (BTK ITI 2021)
Magyar irodalom idegen nyelven (BTK ITI 2021)
 
Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)
Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)
Validating JSON, XML and CSV data with SHACL-like constraints (DINI-KIM 2022)
 
FRBR a book history perspective (Bibliodata WG 2022)
FRBR a book history perspective (Bibliodata WG 2022)FRBR a book history perspective (Bibliodata WG 2022)
FRBR a book history perspective (Bibliodata WG 2022)
 
GRO.data - Dataverse in Göttingen (Magdeburg Coffee Lecture, 2022)
GRO.data - Dataverse in Göttingen (Magdeburg Coffee Lecture, 2022)GRO.data - Dataverse in Göttingen (Magdeburg Coffee Lecture, 2022)
GRO.data - Dataverse in Göttingen (Magdeburg Coffee Lecture, 2022)
 
Understanding, extracting and enhancing catalogue data (CE Book history works...
Understanding, extracting and enhancing catalogue data (CE Book history works...Understanding, extracting and enhancing catalogue data (CE Book history works...
Understanding, extracting and enhancing catalogue data (CE Book history works...
 
Measuring library catalogs (ADOCHS 2017)
Measuring library catalogs (ADOCHS 2017)Measuring library catalogs (ADOCHS 2017)
Measuring library catalogs (ADOCHS 2017)
 
Evaluating Data Quality in Europeana: Metrics for Multilinguality (MTSR 2018)
Evaluating Data Quality in Europeana: Metrics for Multilinguality (MTSR 2018)Evaluating Data Quality in Europeana: Metrics for Multilinguality (MTSR 2018)
Evaluating Data Quality in Europeana: Metrics for Multilinguality (MTSR 2018)
 
Researching metadata quality (ORKG 2018)
Researching metadata quality (ORKG 2018)Researching metadata quality (ORKG 2018)
Researching metadata quality (ORKG 2018)
 
Metadata quality in cultural heritage institutions (ReIRes-FAIR 2018)
Metadata quality in cultural heritage institutions (ReIRes-FAIR 2018)Metadata quality in cultural heritage institutions (ReIRes-FAIR 2018)
Metadata quality in cultural heritage institutions (ReIRes-FAIR 2018)
 
Measuring Completeness as Metadata Quality Metric in Europeana (CAS 2018)
Measuring Completeness as Metadata Quality Metric in Europeana (CAS 2018)Measuring Completeness as Metadata Quality Metric in Europeana (CAS 2018)
Measuring Completeness as Metadata Quality Metric in Europeana (CAS 2018)
 
Measuring MARC (ELAG 2018)
Measuring MARC (ELAG 2018)Measuring MARC (ELAG 2018)
Measuring MARC (ELAG 2018)
 

Último

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
gajnagarg
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
amitlee9823
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
gajnagarg
 
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
amitlee9823
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
gajnagarg
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 

Último (20)

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
Towards an extensible measurement of metadata quality (DATeCH 2017)

  • 1. Towards an extensible measurement of metadata quality Péter Király Way to DATeCH, 2017-05-22
  • 2. Measuring metadata quality. The problem: there are “good” and “bad” metadata records, but we don’t have clear metrics like this: [figure: functional requirements rated good, acceptable, bad]
  • 3. Non-informative values. Non-informative dc:title: “photograph, framed”, “group photograph”, “photograph”. Informative dc:title: “Photograph of Sir Dugald Clerk”, “Photograph of "Puffing Billy"”.
  • 4. Copy & paste cataloging: from a template? More examples in Report and Recommendations from the Task Force on Metadata Quality (2015).
  • 5. Why is data quality important? “Fitness for purpose” (QA principle): no metadata → no access to data → no data usage. More explanation: Data on the Web Best Practices, W3C Working Draft, https://www.w3.org/TR/dwbp/
  • 6. Hypothesis: by measuring structural elements we can predict metadata record quality ≃ metadata smell
  • 7. Purposes: ▪ improve the metadata ▪ services: good data → reliable functions ▪ better metadata schema & documentation ▪ propagate “good practice”
  • 8. What to measure? ▪ Structural and semantic features: cardinality, uniqueness, length, dictionary entry, data type conformance, multilinguality (schema-independent measurements) ▪ Functional requirement analysis / discovery scenarios: requirements of the most important functions ▪ Problem catalog: known metadata problems
  • 9. Metadata requirements // User scenario. As a user I want to be able to filter by whether a person is the subject of a book, or its author, engraver, printer etc. Metadata analysis: in each case the underlying requirement is that the relevant EDM fields for objects be populated with URIs rather than free text. These URIs need to be related, at a minimum, to a label for each of the supported languages. Measurement rules: ▪ the relevant field values should be resolvable URIs ▪ each URI should be associated with labels in multiple languages
  • 10. Metadata requirements // Supported functions. #1 Resource Discovery ★ Search: search for a resource corresponding to stated criteria (i.e., to search either a single entity or a set of entities using an attribute or relationship of the entity as the search criteria) ★ Identify: confirm that the entity described or located corresponds to the entity sought ★ Select: choose an entity that meets the user’s requirements ★ Obtain: access a resource either physically or electronically. #2 Resource Use ★ Restrict ★ Manage ★ Operate ★ Interpret. #3 Data Management ★ Identify ★ Process ★ Sort ★ Display. Source: Functional Analysis of the MARC 21 Bibliographic and Holdings Formats, http://www.loc.gov/marc/marc-functional-analysis/source/analysis.pdf
  • 11. Metadata requirements // element-function map: Europeana sub-dimensions; MARC Summary of Mapping to User Tasks
  • 12. The data workflow (in Europeana): Dublin Core, LIDO, EAD, MARC, EDM custom, ... → data transformations → Europeana Data Model (EDM)
  • 13. Measurement. Views: overall view, collection view, record view. Measures: completeness, field cardinality, uniqueness, multilinguality, language specification, problem catalog etc. [Diagram labels: links, measurements, aggregated statistics, metrics]
  • 14. Field frequency per collection. [Screenshot annotations: no record has alternative title; every record has alternative title; filters]
  • 15. Details of field cardinality: 128 subjects in one record; median is 0, mean is close to 1; link to interesting records
  • 16. Multilinguality: @resource is a URI; @ = language notation in RDF; no language specification
  • 17. Language frequency: has language specification; has no language specification
  • 18. Encoding problems: same language, different encodings
  • 19. Multilinguality: ★ Mona Lisa → 456 results ★ La Gioconda → 365 results ★ La Joconde → 71 results. http://www.europeana.eu/portal/en/record/90402/RP_F_00_351.html
  • 20. What could be measured? ★ Number of (distinct) languages in the metadata ★ Number of tagged literals ★ Tagged literals per language. Requirement: language annotations / tags!
  • 21. Distinct languages: ▪ text without language annotation (dc.subject: Germany) → 0 ▪ text with a language annotation (dc.subject: Germany@en) → 1 ▪ text with several language annotations (dc.subject: Germany@en, Deutschland@de) → 2 ▪ link to a (multilingual) vocabulary (http://www.geonames.org/2921044/federal-republic-of-germany) → n
  • 22. Record level. Record with plain literals: <#record> a ore:Proxy ; dc:subject "Ballet", "Opera" . Result: 0 distinct languages, 0 tagged literals. Record with links, after dereferencing: <#record> a ore:Proxy ; edm:europeanaProxy true ; dc:subject <http://data.europeana.eu/concept/base/264> , <http://data.europeana.eu/concept/base/247> . <http://data.europeana.eu/concept/base/264> a skos:Concept ; skos:prefLabel "Ballett"@no, "बैले"@hi, "Ballett"@de, "Балет"@be, "Балет"@ru, "Balé"@pt, "Балет"@bg, "Baletas"@lt, "Balet"@hr, "Balets"@lv . <http://data.europeana.eu/concept/base/247> skos:prefLabel "Opera"@no, "ओपेरा (गीतिनाटक)"@hi, "Oper"@de, "Ooppera"@fi, "Опера"@be, "Опера"@ru, "Ópera"@pt, "Опера"@bg, "Opera"@lt . Result: 11 distinct languages, 19 tagged literals, 1.7 literals per language.
  • 23. Processes contributing to multilinguality. [Diagram labels: Data from Provider; Data added by Europeana; dereferenced; Quantifiable Data. Fields: dc:creator, dc:type, dc:subject. Values: “Subject”, “subject”@en, <http://dbpedia.org/aSubjectID>, <http://vocab.getty.edu/aPersonNumber>, <http://udcdata.info/rdf/065280>]
  • 24. Multilingual saturation I.
  • 25. Good example. Descriptive fields: dc:title "Die Mauer muß weg!"@de, "Die Mauer muß weg! (The Wall must go!)"@en; dc:description "Kommentiertes Fotorama mit Bildern von 1989-1990 in Berlin"@de, "Annotated images from 1989-1990 in Berlin"@en. Subject headings (Place/skos:prefLabel): "Brandenburger Tor"@de, "Brandenburg Gate"@en; "Grenzübergang Potsdamer Platz"@de, "Potsdamer Platz border crossing"@en; "Reichstag"@de, "Reichstag building"@en
  • 26. Linked data - depth of iteration
  • 27. Linked data - lost links
  • 28. Outliers: the bulk of records are close to zero, although 25% are between 0.05 and 1.25
  • 30. Outliers: zeros / lower outliers; high outliers; “normal” values
  • 31. Layers. :provider dc:subject "special relativity"@en ; dc:creator <http://vocab.getty.edu/ulan/500240971> ; dc:type <http://udcdata.info/001684> . :enhancement dc:subject <http://dbpedia.org/resource/Physics> . Dereferenceable vocabularies: <http://vocab.getty.edu/ulan/500240971> skos:prefLabel "Einstein, Albert"@de . <http://dbpedia.org/resource/Physics> skos:prefLabel "Physics"@en . Non-dereferenceable vocabulary: <http://udcdata.info/001684> skos:prefLabel "Books in general"@en .
  • 32. Layers. Rows (source, field, link, value: scopes): ▪ :provider, dc:subject, literal, "special relativity"@en: ① ② ③ ④ ▪ :provider, dc:creator, standard, "Einstein, Albert"@de: ① ② ③ ④ ▪ :provider, dc:type, non-std, "Books in general"@en: ② ④ ▪ :enhancement, dc:subject, standard, "Physics"@en: ③ ④. Scopes: ① data provider's proxy and dereferenceable enrichments ② data provider's proxy and all enrichments ③ all proxies and dereferenceable enrichments ④ all proxies and all enrichments. Credit: Antoine Isaac
  • 33. Data processing workflow. Stages: ingest, measure, statistical analysis, web interface. Tools: ★ OAI-PMH ★ Europeana API ★ Hadoop ★ NoSQL ★ Spark ★ Hadoop ★ Java ★ Apache Solr ★ Spark ★ R ★ PHP ★ D3.js ★ highchart.js ★ NoSQL. Data formats: json; csv; json, png; html, svg
  • 34. Data processing workflow: http://pkiraly.github.io/cheatsheet/
  • 35. Modules: metadata-qa-api, europeana-qa-api, europeana-qa-spark, europeana-qa-rest, marc-qa-api*, ddb-qa-api*. ★ Metadata schema abstraction ★ Metrics definition ★ Iteration ★ Result data structure ★ ... Maven dependencies: <dependencies> <dependency> <groupId>de.gwdg.metadataqa</groupId> <artifactId>metadata-qa-api</artifactId> <version>0.4</version> </dependency> <dependency> <groupId>de.gwdg.metadataqa</groupId> <artifactId>europeana-qa-api</artifactId> <version>0.4</version> </dependency> ... </dependencies>
  • 36. Batch API. Calls from the client to Metadata QA, with responses. Measurement: /batch/measuring/start → sessionID; /batch/[recordId] (for each record) → csv; /batch/measuring/stop → “success” | “failure”. Analysis: /batch/analyzing/start → “success” | “failure”; /batch/analyzing/status (called periodically) → “in progress” | “ready”; /batch/analyzing/retrieve → compressed package
  • 37. Formal issue definition I. RDFUnit. Pattern: SELECT ?s WHERE { ?s %%P1%% ?v1 . ?s %%P2%% ?v2 . FILTER ( ?v1 %%OP%% ?v2 ) } Parameters: P1 => dbo:birthDate, P2 => dbo:deathDate, OP => >. Resulting SPARQL: SELECT ?s WHERE { ?s dbo:birthDate ?v1 . ?s dbo:deathDate ?v2 . FILTER ( ?v1 > ?v2 ) } Source: Kontokostas et al. (2014), Test-driven Evaluation of Linked Data Quality
  • 38. Formal issue definition II. SHACL. Shape: <IssueShape> sh:property [ sh:predicate ex:submittedBy ; sh:minLength 20 ] . RDF triples: <issue1> ex:submittedBy <http://a.example/bob> . <issue2> ex:submittedBy "Bob" . Result: <IssueShape> <issue1> pass; <IssueShape> <issue2> fail (ex:submittedBy expected to be >= 20 characters, 3 characters found). Source: SHACL Core Abstract Syntax and Semantics, W3C First Public Working Draft, 25 August 2016
  • 39. Community bibliography: zotero.org/groups/metadata_assessment, dlfmetadataassessment.github.io
  • 40. Cooperations and project proposals: ★ Europeana Network’s Data Quality Committee ★ Digital Library Federation Metadata Assessment Group ★ Deutsche Digitale Bibliothek
  • 41. Further steps: ▪ Translate the results into documentation, recommendations ▪ Communication with data providers ▪ Human evaluation of metadata quality ▪ Cooperation with other projects ▪ Incorporating into ingestion process ▪ Shape Constraint Language (SHACL) for defining patterns ▪ Process usage statistics ▪ Measuring changes of scores ▪ Machine learning based classification & clustering. [Side labels: human analysis, technical]
  • 42. Links: ★ Europeana Data Quality Committee // http://pro.europeana.eu/europeana-tech/data-quality-committee ★ site // http://144.76.218.178/europeana-qa/ ★ source code (GPL v3.0) // http://pkiraly.github.io/about/#source-codes ★ Europeana data (CC0) // http://hdl.handle.net/21.11101/0000-0001-781F-7 ★ Library of Congress data (OA) // http://www.loc.gov/cds/products/marcDist.php
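
The field frequency and field cardinality views from slides 14 and 15 can be illustrated with a few lines of Python. This is only a sketch and not the metadata-qa-api implementation: it assumes records are plain dictionaries mapping a field name to a list of values, and the function names and the sample records (including the "Theatre" subject and the alternative title) are made up for the example.

```python
from statistics import mean, median

def field_frequency(records, field):
    """Share of records in which the field has at least one value."""
    if not records:
        return 0.0
    present = sum(1 for record in records if record.get(field))
    return present / len(records)

def field_cardinalities(records, field):
    """Number of values of the field in each record (0 if missing)."""
    return [len(record.get(field, [])) for record in records]

# Hypothetical sample collection; the titles are the ones quoted on slide 3.
records = [
    {"dc:title": ["Photograph of Sir Dugald Clerk"], "dc:subject": []},
    {"dc:title": ["photograph"]},
    {"dc:title": ["group photograph"],
     "dc:subject": ["Ballet", "Opera", "Theatre"]},
    {"dc:title": ['Photograph of "Puffing Billy"']},
    {"dc:title": ["photograph, framed"],
     "dcterms:alternative": ["framed photograph"]},
]

counts = field_cardinalities(records, "dc:subject")
print("dc:subject cardinalities:", counts)
print("median:", median(counts), "mean:", mean(counts), "max:", max(counts))
print("dcterms:alternative frequency:",
      field_frequency(records, "dcterms:alternative"))
```

As on slide 15, the median can be 0 while the mean is pulled up by a few records with many values, which is why both numbers and the maximum are reported together.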
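
The three multilinguality measures named on slide 20 (distinct languages, tagged literals, tagged literals per language) can be reproduced for the record on slide 22. The sketch below assumes literals are represented as (text, language tag) pairs with `None` for an untagged literal; that representation and the function name `multilinguality` are assumptions of this example, not the project's API. The labels are copied from slide 22.

```python
from collections import Counter

# skos:prefLabel values of the two dereferenced concepts on slide 22.
BALLET = [("Ballett", "no"), ("बैले", "hi"), ("Ballett", "de"),
          ("Балет", "be"), ("Балет", "ru"), ("Balé", "pt"),
          ("Балет", "bg"), ("Baletas", "lt"), ("Balet", "hr"),
          ("Balets", "lv")]
OPERA = [("Opera", "no"), ("ओपेरा (गीतिनाटक)", "hi"), ("Oper", "de"),
         ("Ooppera", "fi"), ("Опера", "be"), ("Опера", "ru"),
         ("Ópera", "pt"), ("Опера", "bg"), ("Opera", "lt")]

def multilinguality(literals):
    """Count language tags over (text, language) pairs; None = untagged."""
    per_language = Counter(lang for _, lang in literals if lang)
    tagged = sum(per_language.values())
    distinct = len(per_language)
    return {
        "distinct_languages": distinct,
        "tagged_literals": tagged,
        "literals_per_language": tagged / distinct if distinct else 0.0,
    }

# Plain literals without language tags, as in the first record.
plain = multilinguality([("Ballet", None), ("Opera", None)])
# Labels reached by dereferencing the two concept URIs.
dereferenced = multilinguality(BALLET + OPERA)

print(plain)
print(dereferenced)
```

With these inputs the untagged record scores zero on every measure, while the dereferenced labels give 11 distinct languages and 19 tagged literals, matching the figures on the slide.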
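
The four measurement scopes on slide 32 can be read as two independent switches: whether proxies other than the data provider's are included, and whether values from non-dereferenceable vocabularies are included. The following sketch encodes that reading and recomputes the scope column of the table. The `Statement` class, the `SCOPES` mapping, and the interpretation of "standard" as dereferenceable and "non-std" as non-dereferenceable are assumptions of this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    source: str  # "provider" or "enhancement"
    field: str
    link: str    # "literal", "standard" or "non-std"
    value: str

# The four rows of the table on slide 32.
STATEMENTS = [
    Statement("provider", "dc:subject", "literal", "special relativity"),
    Statement("provider", "dc:creator", "standard", "Einstein, Albert"),
    Statement("provider", "dc:type", "non-std", "Books in general"),
    Statement("enhancement", "dc:subject", "standard", "Physics"),
]

# Scope number -> (include all proxies?, include all enrichments?)
SCOPES = {
    1: (False, False),  # provider's proxy, dereferenceable enrichments
    2: (False, True),   # provider's proxy, all enrichments
    3: (True, False),   # all proxies, dereferenceable enrichments
    4: (True, True),    # all proxies, all enrichments
}

def in_scope(statement, all_proxies, all_enrichments):
    """Decide whether a statement is counted under the given switches."""
    if statement.source != "provider" and not all_proxies:
        return False
    if statement.link == "non-std" and not all_enrichments:
        return False
    return True

def scopes_of(statement):
    """List the scope numbers in which the statement is counted."""
    return [number for number, (proxies, enrichments) in SCOPES.items()
            if in_scope(statement, proxies, enrichments)]

for statement in STATEMENTS:
    print(statement.source, statement.field, statement.link,
          scopes_of(statement))
```

Under these assumptions the dc:type value from the non-dereferenceable vocabulary is counted only in scopes 2 and 4, and the enhancement's dc:subject only in scopes 3 and 4, as in the table.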
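
The pattern-plus-parameters idea from slide 37 amounts to filling named placeholders in a SPARQL template. The sketch below shows that substitution with plain string handling; it is not RDFUnit's own code, and the `instantiate` function is a hypothetical helper written for this example. The pattern and the bindings are the ones shown on the slide.

```python
import re

# Test pattern from slide 37; %%NAME%% marks a placeholder.
PATTERN = ("SELECT ?s WHERE { ?s %%P1%% ?v1 . ?s %%P2%% ?v2 . "
           "FILTER ( ?v1 %%OP%% ?v2 ) }")

def instantiate(pattern, bindings):
    """Replace every %%NAME%% placeholder with its bound value."""
    def substitute(match):
        name = match.group(1)
        if name not in bindings:
            raise KeyError(f"no binding for placeholder {name}")
        return bindings[name]
    # Whitespace inside the markers is tolerated, e.g. "%% P1 %%".
    return re.sub(r"%%\s*(\w+)\s*%%", substitute, pattern)

# Bindings from the slide: a birth date later than the death date is an issue.
query = instantiate(PATTERN, {
    "P1": "dbo:birthDate",
    "P2": "dbo:deathDate",
    "OP": ">",
})
print(query)
```

Keeping the pattern separate from its bindings means the same template can be reused for other property pairs by changing only the dictionary.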