Andreas Blumauer
CEO & Managing Partner
Semantic Web Company &
PoolParty Semantic Suite
TAXONOMY QUALITY
ASSESSMENT:
TOOLS & TECHNIQUES
Taxonomy
Boot Camp 2016
Washington, DC
1
INTRODUCTION
2
[Diagram: knowledge graph introducing the speaker and his company. Recoverable nodes: Andreas Blumauer, Semantic Web Company, Vienna, 2004, 5.5, >200 customers, Taxonomy, Knowledge Graph, Ontology. Recoverable edge labels: founder & CEO of, developer and vendor of, founded, current version, located, serves, active at, based on, standard for, manages, part of, is a.]
Aspects of
Taxonomy Quality
Types of taxonomy quality metrics,
and for which scenarios they are relevant
3
Why is taxonomy
quality important?
Some examples for
quality issues and
their possible
consequences
4
▸ Missing labels
▹ AGROVOC (FAO) defines concepts in 25 different languages. While most concepts have
English labels attached, only 38% have German labels.
▹ This can be a problem for multilingual applications that rely on label translations.
▸ Orphan concepts
▹ An orphan concept is a concept that has no semantic relation with any other concept.
Although it might have attached lexical labels, it lacks valuable context information.
▹ This can be crucial for retrieval tasks such as search query expansion.
▸ Mismatch between content and taxonomy
▹ There are only minor overlaps between the scope of the documents (or data) to be
indexed and the scope of the controlled vocabulary in use.
▹ This leads to a sparse enrichment of the document index by semantic information.
See also: Finding quality issues in SKOS vocabularies
(Christian Mader, Bernhard Haslhofer, Antoine Isaac)
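The first two issues above, missing labels and orphan concepts, can be sketched as simple checks over a set of triples. This is a minimal illustration only, assuming a toy vocabulary held as Python tuples; the identifiers (`ex:maize` and so on), the helper names and the data are all hypothetical and not taken from AGROVOC or from any tool named in this deck.

```python
# Hypothetical toy vocabulary: (subject, predicate, object) triples.
# Labels are (text, language) pairs; all identifiers are invented for illustration.
TRIPLES = [
    ("ex:maize", "skos:prefLabel", ("maize", "en")),
    ("ex:maize", "skos:prefLabel", ("Mais", "de")),
    ("ex:cereals", "skos:prefLabel", ("cereals", "en")),
    ("ex:maize", "skos:broader", "ex:cereals"),
    ("ex:tractor", "skos:prefLabel", ("tractor", "en")),
]

SEMANTIC_RELATIONS = {"skos:broader", "skos:narrower", "skos:related"}


def concepts(triples):
    """All resources that appear as a subject, or as the object of a semantic relation."""
    found = set()
    for s, p, o in triples:
        found.add(s)
        if p in SEMANTIC_RELATIONS:
            found.add(o)
    return found


def label_coverage(triples, lang):
    """Share of concepts that have a preferred label in the given language."""
    all_concepts = concepts(triples)
    labelled = {s for s, p, o in triples if p == "skos:prefLabel" and o[1] == lang}
    return len(labelled) / len(all_concepts)


def orphan_concepts(triples):
    """Concepts that take part in no semantic relation at all."""
    connected = set()
    for s, p, o in triples:
        if p in SEMANTIC_RELATIONS:
            connected.update((s, o))
    return concepts(triples) - connected
```

Calling `label_coverage(TRIPLES, "de")` gives the kind of per-language figure quoted for German labels above, and `orphan_concepts(TRIPLES)` lists the concepts that would contribute nothing to query expansion.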
Taxonomy quality
issues are more
frequently
observed than
some might expect
5
See also: Finding quality issues in SKOS vocabularies
Taxonomy quality
criteria and issues
at different levels
6
1. Formal integrity conditions based on SKOS
▹ Construction of well-formed and consistent data to promote interoperability
▹ Example: No two concepts may be connected by both skos:related and skos:broaderTransitive
▹ Read more: SKOS: A Guide for Information Professionals (Jane Frazier)
2. Labeling and documentation issues
▹ Construction of taxonomies that support complex retrieval tasks
▹ Example: No two concepts of a concept scheme may have the same preferred label
▹ Read more: SKOS Primer (Antoine Isaac / Ed Summers)
3. Structural issues
▹ Logic-based processing of taxonomies
▹ Example: Avoidance of hierarchical cycles
▹ Read more: Key choices in the design of SKOS (Thomas Baker et al)
4. Content coverage
▹ Development of taxonomies that reflect the scope of the represented content well
▹ Example: Avoid maintaining subtrees that only have limited occurrences in a representative
document corpus
▹ Read more: Corpus management with PoolParty
5. Network topological issues (experimental)
▹ (Co-)occurrences of concepts in a corpus should be reflected in the network topology of a
knowledge graph
▹ Example: Nodes/concepts with high betweenness centrality should occur correspondingly
in a reference document corpus
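The examples given for levels 1 to 3 can be sketched as three small checks. This is a simplified illustration under stated assumptions: a single concept scheme, one preferred label per concept with language tags ignored, and case-insensitive label comparison. The data and function names are hypothetical, and the code is not the implementation used by qSKOS or PoolParty.

```python
# Hypothetical toy data; concept identifiers and labels are invented.
BROADER = {            # concept -> set of direct broader concepts
    "ex:a": {"ex:b"},
    "ex:b": {"ex:c"},
    "ex:c": {"ex:a"},  # closes a hierarchical cycle a -> b -> c -> a
    "ex:d": {"ex:e"},
}
RELATED = {("ex:d", "ex:e")}   # clashes with d's broader path to e
PREF_LABELS = {"ex:a": "Energy", "ex:b": "Oil", "ex:d": "Energy"}


def ancestors(concept, broader):
    """All concepts reachable via one or more broader steps (transitive closure)."""
    seen, stack = set(), list(broader.get(concept, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(broader.get(node, ()))
    return seen


def related_hierarchy_clashes(related, broader):
    """Level 1: pairs that are related AND connected through the broader hierarchy."""
    return {
        (x, y) for x, y in related
        if y in ancestors(x, broader) or x in ancestors(y, broader)
    }


def duplicate_pref_labels(pref_labels):
    """Level 2: preferred labels (case-insensitive) shared by more than one concept."""
    by_label = {}
    for concept, label in pref_labels.items():
        by_label.setdefault(label.lower(), set()).add(concept)
    return {label: cs for label, cs in by_label.items() if len(cs) > 1}


def concepts_in_cycles(broader):
    """Level 3: concepts that are their own ancestor, i.e. lie on a hierarchical cycle."""
    return {c for c in broader if c in ancestors(c, broader)}
```

All three checks reuse the same transitive closure over broader links, which is one reason a formally defined model such as SKOS makes automated checking straightforward.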
Why are
standards-based
technologies and
tools so important
when it comes to
taxonomy quality
management?
7
Spreadsheet editors are still the most common type of software application
being used for taxonomy management. They cannot measure quality automatically.
‘Good’ quality
depends on the
usage scenario
8
Example: Google Product Taxonomy has no synonyms at all, only hierarchical relations
How to pick the
most relevant
quality criteria for a
taxonomy project
9
PoolParty supports various application scenarios. Quality checks can be enforced,
reported, or ignored.
How to pick the
most relevant
quality criteria for a
taxonomy project
10
▸ General purpose thesaurus vs. custom enterprise taxonomy
▹ Custom enterprise taxonomies can be developed specifically on top of reference corpora
▹ General purpose thesauri are frequently used in the context of linked data environments
→ Linked data specific issues become more important
■ Missing In-Links
■ Missing Out-Links
■ Broken Links
■ Undefined SKOS Resources
■ HTTP URI Scheme Violation
See also: PoolParty SKOS Quality Checker based on qSKOS
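Two of the linked data issues listed above can be approximated offline. The sketch below flags concept URIs that do not use the HTTP(S) scheme and concepts that have no link to a resource on another host. It is a rough simplification, not the qSKOS definition of these issues: the mapping `LINKS`, the example URIs and the function names are assumptions, and checking for broken links is left out because it would need network access.

```python
from urllib.parse import urlparse

# Hypothetical vocabulary: concept URI -> URIs it links to (e.g. via mapping properties).
LINKS = {
    "http://example.org/voc/oil": ["http://dbpedia.org/resource/Petroleum"],
    "http://example.org/voc/gas": ["http://example.org/voc/oil"],
    "urn:example:coal": [],
}


def uri_scheme_violations(uris):
    """URIs that are not HTTP(S) URIs and so cannot be dereferenced on the web."""
    return sorted(u for u in uris if urlparse(u).scheme not in ("http", "https"))


def missing_out_links(links):
    """Concepts without any link to a resource on a different host."""
    missing = []
    for concept, targets in links.items():
        own_host = urlparse(concept).netloc
        if not any(urlparse(t).netloc != own_host for t in targets):
            missing.append(concept)
    return sorted(missing)
```

Comparing hosts is only one possible way to decide what counts as an external link; a namespace-prefix comparison would work equally well for vocabularies that share a host.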
Taxonomy
Quality Metrics
How quality issues can be unveiled
and how insights can be used for further improvements
11
Repair label issues
12
Repair structural
issues
13
Unveil mismatch
between taxonomy
and document
corpus
14
[Diagram: roles and system components. Recoverable nodes: Content Manager, Integrator, Taxonomist/Ontologist, Thesaurus Server, Extractor, PowerTagging, Index, Corpus Learning/Semantic Analysis, CMS. Recoverable edge labels: uses API, is user of, is basis of, annotates, enriches, extends, analyzes.]
Unveil mismatch
between taxonomy
and document
corpus
15
PoolParty extracts concepts that are not used in a reference corpus at all and provides
suggestions on how those concepts could be reworked or extended to become relevant.
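The idea of finding concepts that never occur in a reference corpus can be illustrated with plain label matching. This is a deliberately naive sketch, assuming whole-word, case-insensitive string matching on concept labels; it does not reproduce PoolParty's extraction or corpus analysis, and the taxonomy, corpus and function names are invented.

```python
import re

# Hypothetical taxonomy (concept -> labels) and reference corpus; all invented.
LABELS = {
    "ex:crude_oil": ["crude oil", "petroleum"],
    "ex:opec": ["OPEC"],
    "ex:wind_power": ["wind power", "wind energy"],
}
CORPUS = [
    "OPEC agreed to cut crude oil output.",
    "Petroleum prices rose after the OPEC meeting.",
]


def document_frequency(labels, corpus):
    """Number of documents in which at least one label of each concept occurs."""
    freq = {}
    for concept, names in labels.items():
        # Whole-word, case-insensitive match on any of the concept's labels.
        pattern = re.compile(
            r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b", re.IGNORECASE
        )
        freq[concept] = sum(1 for doc in corpus if pattern.search(doc))
    return freq


def unused_concepts(labels, corpus):
    """Concepts whose labels never occur in the corpus."""
    return sorted(c for c, n in document_frequency(labels, corpus).items() if n == 0)
```

Synonyms matter here: `ex:crude_oil` is found in the second document only through its alternative label, so a taxonomy without synonyms would look less well covered than it is.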
Unveil mismatch
between taxonomy
and document
corpus
16
PoolParty extracts relevant candidate concepts based on a deep corpus analysis.
Unveil mismatch
between taxonomy
and document
corpus
17
PoolParty suggests possible ‘right places’ for the candidate concepts within the approved
taxonomy.
Unveil network
topological issues
18
Example: STW Thesaurus for Economics
Unveil network
topological issues
19
Example: STW Thesaurus for Economics - Top 10 thesaurus concepts (betweenness)
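A betweenness ranking like the one above can be computed with Brandes' algorithm. The sketch below is a minimal pure-Python version for an unweighted, undirected graph. It assumes that broader, narrower and related links have been merged into one undirected edge list; the edge list shown is made up and is not STW data, and the slides do not say which implementation was used for the original analysis.

```python
from collections import deque

# Hypothetical undirected concept graph (hierarchical and related links merged).
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]


def betweenness(edges):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted graph)."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search: count shortest paths from source to every node.
        order, preds = [], {n: [] for n in graph}
        paths = dict.fromkeys(graph, 0)
        dist = dict.fromkeys(graph, -1)
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in graph[node]:
                if dist[nb] < 0:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    paths[nb] += paths[node]
                    preds[nb].append(node)
        # Back-propagate dependencies from the farthest nodes to the source.
        dep = dict.fromkeys(graph, 0.0)
        for node in reversed(order):
            for p in preds[node]:
                dep[p] += paths[p] / paths[node] * (1 + dep[node])
            if node != source:
                score[node] += dep[node]
    # Each undirected shortest path was counted once from each endpoint.
    return {n: s / 2 for n, s in score.items()}
```

A top-10 list then follows from sorting the result, for example `sorted(scores, key=scores.get, reverse=True)[:10]`.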
Combined analysis
over network
topology and
reference corpus
20
Example: STW Thesaurus for Economics and reference corpus about ‘Crude Oil Market’
Combined analysis
over network
topology and
reference corpus
21
Example: STW Thesaurus for Economics and reference corpus about ‘Crude Oil Market’
Combined analysis
over network
topology and
reference corpus:
Correlation
Betweenness &
Document
Frequency
22
Example: STW Thesaurus for Economics and reference corpus about ‘Crude Oil Market’
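One way to quantify how well betweenness and document frequency agree is a rank correlation. The slides do not state which coefficient was used, so Spearman's rank correlation is an assumption here, and the per-concept figures below are invented rather than taken from STW or the 'Crude Oil Market' corpus.

```python
from math import sqrt

# Hypothetical per-concept figures; in practice they would come from a
# betweenness computation and from counting concept occurrences in the corpus.
BETWEENNESS = {"ex:oil": 5.0, "ex:opec": 3.0, "ex:gas": 1.0, "ex:wind": 0.0}
DOC_FREQUENCY = {"ex:oil": 40, "ex:opec": 25, "ex:gas": 30, "ex:wind": 0}


def ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def spearman(a, b):
    """Spearman rank correlation over the concepts present in both mappings."""
    keys = sorted(a.keys() & b.keys())
    return pearson(ranks([a[k] for k in keys]), ranks([b[k] for k in keys]))
```

A coefficient near 1 would suggest that structurally central concepts are also the ones that occur most in the corpus; individual concepts with high betweenness but low document frequency are candidates for closer review.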
Techniques and Tools
How they help to assess
Taxonomy Quality
23
BARTOC.org
Basel Register of
Thesauri,
Ontologies &
Classifications
▸ Unveil Taxonomy Quality by the Wisdom of the Crowd
24
qSKOS
▸ qSKOS is a tool for finding quality issues in SKOS vocabularies
▸ Available as free online service at http://qskos.poolparty.biz/
▸ A SKOS taxonomy is analyzed with regard to 24 quality issues
25
PoolParty Import
Validator
26
▸ RDF Validation to go beyond SKOS
▸ Checks are defined in RDF, repair strategies also defined as RDF
▸ 15 checks have been integrated
Shapes Constraint
Language (SHACL)
▸ “Do for RDF what XML Schema does for XML”
▸ Language for validating RDF graphs against a set of conditions
▸ SHACL shape graphs are used to validate that data graphs satisfy a set of
conditions
▸ Current status: W3C Working Draft (14 August 2016)
See also: Towards maintainable constraint validation and repair for taxonomies:
The PoolParty approach (Christian Mader and Monika Solanki)
27
GET YOUR
TEST ACCOUNT
GET CERTIFIED
28
Get your test account at
www.poolparty.biz/demo
Get certified at
www.poolparty.biz/academy/
CONNECT
Andreas Blumauer
CEO, Semantic Web Company
▸ a.blumauer@semantic-web.at
▸ http://at.linkedin.com/in/andreasblumauer
▸ https://twitter.com/semwebcompany
▸ https://www.poolparty.biz
▸ https://www.semantic-web.at
29
© Semantic Web Company - http://www.semantic-web.at/ and http://www.poolparty.biz/

Mais conteúdo relacionado

Mais procurados

Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextRafał Kuć
 
Querying Druid in SQL with Superset
Querying Druid in SQL with SupersetQuerying Druid in SQL with Superset
Querying Druid in SQL with SupersetDataWorks Summit
 
Sharding
ShardingSharding
ShardingMongoDB
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsLynn Langit
 
What it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesWhat it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesDataWorks Summit
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...
AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...
AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...Amazon Web Services Korea
 
[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천
[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천
[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천BOAZ Bigdata
 
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.jsHeeJung Hwang
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jWilliam Lyon
 
Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Milind Bhandarkar
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsSteven Francia
 
Data Science - Part VII - Cluster Analysis
Data Science - Part VII -  Cluster AnalysisData Science - Part VII -  Cluster Analysis
Data Science - Part VII - Cluster AnalysisDerek Kane
 
[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?
[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?
[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?Amazon Web Services Korea
 
Ml3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metricsMl3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metricsankit_ppt
 

Mais procurados (20)

Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
 
Querying Druid in SQL with Superset
Querying Druid in SQL with SupersetQuerying Druid in SQL with Superset
Querying Druid in SQL with Superset
 
Sharding
ShardingSharding
Sharding
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
 
What it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesWhat it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! Perspectives
 
03 hive query language (hql)
03 hive query language (hql)03 hive query language (hql)
03 hive query language (hql)
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...
AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...
AWS Fargate와 Amazon ECS를 사용한 CI/CD 베스트 프랙티스 - 유재석, AWS 솔루션즈 아키텍트 :: AWS Build...
 
Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
 
[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천
[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천
[분석]워드임베딩과 인공신경망을 이용한 개인 맞춤형 레시피 추천
 
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4j
 
HBase
HBaseHBase
HBase
 
Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
 
Data Science - Part VII - Cluster Analysis
Data Science - Part VII -  Cluster AnalysisData Science - Part VII -  Cluster Analysis
Data Science - Part VII - Cluster Analysis
 
[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?
[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?
[AWS Builders] 클라우드 비용, 어떻게 줄일 수 있을까?
 
모두싸인의 AWS 성장기
모두싸인의 AWS 성장기모두싸인의 AWS 성장기
모두싸인의 AWS 성장기
 
Ml3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metricsMl3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metrics
 

Destaque

Taming taxonomy—a practical intro
Taming taxonomy—a practical introTaming taxonomy—a practical intro
Taming taxonomy—a practical introAlberta Soranzo
 
Interactions South America 2015 Keynote
Interactions South America 2015 KeynoteInteractions South America 2015 Keynote
Interactions South America 2015 KeynoteAbby Covert
 
Achim Steinacker: Technical Documentation in the age of Industry 4.0
Achim Steinacker: Technical Documentation in the age of Industry 4.0Achim Steinacker: Technical Documentation in the age of Industry 4.0
Achim Steinacker: Technical Documentation in the age of Industry 4.0Semantic Web Company
 
Taxonomies for E-commerce
Taxonomies for E-commerceTaxonomies for E-commerce
Taxonomies for E-commerceHeather Hedden
 
Understanding Website Taxonomy
Understanding Website TaxonomyUnderstanding Website Taxonomy
Understanding Website TaxonomyIksula
 
PROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataPROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataSemantic Web Company
 
Taxonomy Is User Experience
Taxonomy Is User ExperienceTaxonomy Is User Experience
Taxonomy Is User ExperienceDave Cooksey
 
Taxonomies and Ontologies – The Yin and Yang of Knowledge Modelling
Taxonomies and Ontologies – The Yin and Yang of Knowledge ModellingTaxonomies and Ontologies – The Yin and Yang of Knowledge Modelling
Taxonomies and Ontologies – The Yin and Yang of Knowledge ModellingSemantic Web Company
 
Financing options for brouwersdam in zuid holland
Financing options for brouwersdam in zuid hollandFinancing options for brouwersdam in zuid holland
Financing options for brouwersdam in zuid hollandEIP Water
 
Pivot Conference '13 - Snackable Take Home Lessons
Pivot Conference '13 - Snackable Take Home LessonsPivot Conference '13 - Snackable Take Home Lessons
Pivot Conference '13 - Snackable Take Home LessonsMichoel Ogince
 
The A-to-Z Guide to SlideShare
The A-to-Z Guide to SlideShareThe A-to-Z Guide to SlideShare
The A-to-Z Guide to SlideShareBarry Feldman
 

Destaque (13)

Taxonomy-Driven UX
Taxonomy-Driven UXTaxonomy-Driven UX
Taxonomy-Driven UX
 
Taming taxonomy—a practical intro
Taming taxonomy—a practical introTaming taxonomy—a practical intro
Taming taxonomy—a practical intro
 
Interactions South America 2015 Keynote
Interactions South America 2015 KeynoteInteractions South America 2015 Keynote
Interactions South America 2015 Keynote
 
Achim Steinacker: Technical Documentation in the age of Industry 4.0
Achim Steinacker: Technical Documentation in the age of Industry 4.0Achim Steinacker: Technical Documentation in the age of Industry 4.0
Achim Steinacker: Technical Documentation in the age of Industry 4.0
 
Taxonomies for E-commerce
Taxonomies for E-commerceTaxonomies for E-commerce
Taxonomies for E-commerce
 
Understanding Website Taxonomy
Understanding Website TaxonomyUnderstanding Website Taxonomy
Understanding Website Taxonomy
 
PROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataPROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked Data
 
Taxonomy Is User Experience
Taxonomy Is User ExperienceTaxonomy Is User Experience
Taxonomy Is User Experience
 
Taxonomies and Ontologies – The Yin and Yang of Knowledge Modelling
Taxonomies and Ontologies – The Yin and Yang of Knowledge ModellingTaxonomies and Ontologies – The Yin and Yang of Knowledge Modelling
Taxonomies and Ontologies – The Yin and Yang of Knowledge Modelling
 
Financing options for brouwersdam in zuid holland
Financing options for brouwersdam in zuid hollandFinancing options for brouwersdam in zuid holland
Financing options for brouwersdam in zuid holland
 
Pivot Conference '13 - Snackable Take Home Lessons
Pivot Conference '13 - Snackable Take Home LessonsPivot Conference '13 - Snackable Take Home Lessons
Pivot Conference '13 - Snackable Take Home Lessons
 
Blooms taxonomy powerpoint
Blooms taxonomy powerpointBlooms taxonomy powerpoint
Blooms taxonomy powerpoint
 
The A-to-Z Guide to SlideShare
The A-to-Z Guide to SlideShareThe A-to-Z Guide to SlideShare
The A-to-Z Guide to SlideShare
 

Semelhante a Taxonomy Quality Assessment

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
PoolParty Semantic Suite - Functional Overview
PoolParty Semantic Suite - Functional OverviewPoolParty Semantic Suite - Functional Overview
PoolParty Semantic Suite - Functional OverviewSemantic Web Company
 
Transforming knowledge management for climate action
Transforming knowledge management for climate action  Transforming knowledge management for climate action
Transforming knowledge management for climate action weADAPT
 
Low Hanging Fruit Breakout Discussion #2
Low Hanging Fruit Breakout Discussion #2 Low Hanging Fruit Breakout Discussion #2
Low Hanging Fruit Breakout Discussion #2 Pistoia Alliance
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Jenn Riley
 
Expressing Concept Schemes & Competency Frameworks in CTDL
Expressing Concept Schemes & Competency Frameworks in CTDLExpressing Concept Schemes & Competency Frameworks in CTDL
Expressing Concept Schemes & Competency Frameworks in CTDLCredential Engine
 
SKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategiesSKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategiesSemantic Web Company
 
PoolParty Semantic Suite: Management Briefing and Functional Overview
PoolParty Semantic Suite: Management Briefing and Functional Overview PoolParty Semantic Suite: Management Briefing and Functional Overview
PoolParty Semantic Suite: Management Briefing and Functional Overview Martin Kaltenböck
 
Metadata: Digital Humanties
Metadata: Digital HumantiesMetadata: Digital Humanties
Metadata: Digital HumantiesMatthew Miguez
 
SWT Lecture Session 7 - Advanced uses of RDFS
SWT Lecture Session 7 - Advanced uses of RDFSSWT Lecture Session 7 - Advanced uses of RDFS
SWT Lecture Session 7 - Advanced uses of RDFSMariano Rodriguez-Muro
 
Building Bridges with Taxonomy: Enabling Semantic Integration
Building Bridges with Taxonomy: Enabling Semantic IntegrationBuilding Bridges with Taxonomy: Enabling Semantic Integration
Building Bridges with Taxonomy: Enabling Semantic IntegrationDesign for Context
 
Content Analysis Keys Reuse
Content Analysis Keys ReuseContent Analysis Keys Reuse
Content Analysis Keys ReuseClearPath, LLC
 
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web María Poveda Villalón
 
Cataloging roundtable discussion questions
Cataloging roundtable discussion questionsCataloging roundtable discussion questions
Cataloging roundtable discussion questionsrobin fay
 
S doherty counting_dragons_dita-reuse
S doherty counting_dragons_dita-reuseS doherty counting_dragons_dita-reuse
S doherty counting_dragons_dita-reuseStan Doherty
 
Text Analytics for Non-Experts
Text Analytics for Non-ExpertsText Analytics for Non-Experts
Text Analytics for Non-ExpertsSynaptica, LLC
 
IWMW 2002: The Value of Metadata and How to Realise It
IWMW 2002: The Value of Metadata and How to Realise ItIWMW 2002: The Value of Metadata and How to Realise It
IWMW 2002: The Value of Metadata and How to Realise ItIWMW
 

Semelhante a Taxonomy Quality Assessment (20)

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
PoolParty Semantic Suite - Functional Overview
PoolParty Semantic Suite - Functional OverviewPoolParty Semantic Suite - Functional Overview
PoolParty Semantic Suite - Functional Overview
 
Transforming knowledge management for climate action
Transforming knowledge management for climate action  Transforming knowledge management for climate action
Transforming knowledge management for climate action
 
Low Hanging Fruit Breakout Discussion #2
Low Hanging Fruit Breakout Discussion #2 Low Hanging Fruit Breakout Discussion #2
Low Hanging Fruit Breakout Discussion #2
 
Aiim motorola-taxo-integration-03-15-10-cg
Aiim motorola-taxo-integration-03-15-10-cgAiim motorola-taxo-integration-03-15-10-cg
Aiim motorola-taxo-integration-03-15-10-cg
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
 
Expressing Concept Schemes & Competency Frameworks in CTDL
Expressing Concept Schemes & Competency Frameworks in CTDLExpressing Concept Schemes & Competency Frameworks in CTDL
Expressing Concept Schemes & Competency Frameworks in CTDL
 
SKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategiesSKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategies
 
PoolParty Semantic Suite: Management Briefing and Functional Overview
PoolParty Semantic Suite: Management Briefing and Functional Overview PoolParty Semantic Suite: Management Briefing and Functional Overview
PoolParty Semantic Suite: Management Briefing and Functional Overview
 
Metadata: Digital Humanties
Metadata: Digital HumantiesMetadata: Digital Humanties
Metadata: Digital Humanties
 
SWT Lecture Session 7 - Advanced uses of RDFS
SWT Lecture Session 7 - Advanced uses of RDFSSWT Lecture Session 7 - Advanced uses of RDFS
SWT Lecture Session 7 - Advanced uses of RDFS
 
Building Bridges with Taxonomy: Enabling Semantic Integration
Building Bridges with Taxonomy: Enabling Semantic IntegrationBuilding Bridges with Taxonomy: Enabling Semantic Integration
Building Bridges with Taxonomy: Enabling Semantic Integration
 
Taxonomies and Metadata
Taxonomies and MetadataTaxonomies and Metadata
Taxonomies and Metadata
 
Content Analysis Keys Reuse
Content Analysis Keys ReuseContent Analysis Keys Reuse
Content Analysis Keys Reuse
 
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web
 
Taxonomy Governance and Iteration
Taxonomy Governance and IterationTaxonomy Governance and Iteration
Taxonomy Governance and Iteration
 
Cataloging roundtable discussion questions
Cataloging roundtable discussion questionsCataloging roundtable discussion questions
Cataloging roundtable discussion questions
 
S doherty counting_dragons_dita-reuse
S doherty counting_dragons_dita-reuseS doherty counting_dragons_dita-reuse
S doherty counting_dragons_dita-reuse
 
Text Analytics for Non-Experts
Text Analytics for Non-ExpertsText Analytics for Non-Experts
Text Analytics for Non-Experts
 
IWMW 2002: The Value of Metadata and How to Realise It
IWMW 2002: The Value of Metadata and How to Realise ItIWMW 2002: The Value of Metadata and How to Realise It
IWMW 2002: The Value of Metadata and How to Realise It
 

Mais de Semantic Web Company

How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...
How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...
How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...Semantic Web Company
 
Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AISemantic Web Company
 
Deep Text Analytics - How to extract hidden information and aboutness from text
Deep Text Analytics - How to extract hidden information and aboutness from textDeep Text Analytics - How to extract hidden information and aboutness from text
Deep Text Analytics - How to extract hidden information and aboutness from textSemantic Web Company
 
Leveraging Knowledge Graphs in your Enterprise Knowledge Management System
Leveraging Knowledge Graphs in your Enterprise Knowledge Management SystemLeveraging Knowledge Graphs in your Enterprise Knowledge Management System
Leveraging Knowledge Graphs in your Enterprise Knowledge Management SystemSemantic Web Company
 
Linking SharePoint Documents with Structured Data
Linking SharePoint Documents with Structured DataLinking SharePoint Documents with Structured Data
Linking SharePoint Documents with Structured DataSemantic Web Company
 
The Fast Track to Knowledge Engineering
The Fast Track to Knowledge EngineeringThe Fast Track to Knowledge Engineering
The Fast Track to Knowledge EngineeringSemantic Web Company
 
Leveraging Taxonomy Management with Machine Learning
Leveraging Taxonomy Management with Machine LearningLeveraging Taxonomy Management with Machine Learning
Leveraging Taxonomy Management with Machine LearningSemantic Web Company
 
PoolParty GraphSearch - The Fusion of Search, Recommendation and Analytics
PoolParty GraphSearch - The Fusion of Search, Recommendation and AnalyticsPoolParty GraphSearch - The Fusion of Search, Recommendation and Analytics
PoolParty GraphSearch - The Fusion of Search, Recommendation and AnalyticsSemantic Web Company
 
Semantics as the Basis of Advanced Cognitive Computing
Semantics as the Basis of Advanced Cognitive ComputingSemantics as the Basis of Advanced Cognitive Computing
Semantics as the Basis of Advanced Cognitive ComputingSemantic Web Company
 
PoolParty 6.0 - Climbing the Semantic Ladder
PoolParty 6.0 - Climbing the Semantic LadderPoolParty 6.0 - Climbing the Semantic Ladder
PoolParty 6.0 - Climbing the Semantic LadderSemantic Web Company
 
PoolParty Semantic Suite - Release 6.0 (Technical Overview)
PoolParty Semantic Suite - Release 6.0 (Technical Overview)PoolParty Semantic Suite - Release 6.0 (Technical Overview)
PoolParty Semantic Suite - Release 6.0 (Technical Overview)Semantic Web Company
 
PoolParty Semantic Suite - Release 5.5
PoolParty Semantic Suite - Release 5.5PoolParty Semantic Suite - Release 5.5
PoolParty Semantic Suite - Release 5.5Semantic Web Company
 
PowerTagging for Sharepoint and Office 365
PowerTagging for Sharepoint and Office 365PowerTagging for Sharepoint and Office 365
PowerTagging for Sharepoint and Office 365Semantic Web Company
 
From SKOS over SKOS-XL to Custom Ontologies
From SKOS over SKOS-XL to Custom OntologiesFrom SKOS over SKOS-XL to Custom Ontologies
From SKOS over SKOS-XL to Custom OntologiesSemantic Web Company
 
PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...
PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...
PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...Semantic Web Company
 

Mais de Semantic Web Company (20)

How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...
How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...
How Enterprise Architecture & Knowledge Graph Technologies Can Scale Business...
 
Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AI
 
Deep Text Analytics - How to extract hidden information and aboutness from text
Taxonomy Quality Assessment

  • 1. Taxonomy Quality Assessment: Tools & Techniques. Andreas Blumauer, CEO & Managing Partner, Semantic Web Company & PoolParty Semantic Suite. Taxonomy Boot Camp 2016, Washington, DC.
  • 2. Introduction (knowledge graph diagram). Andreas Blumauer: founder & CEO of Semantic Web Company. Semantic Web Company: founded 2004, located in Vienna, serves >200 customers, developer and vendor of a product at current version 5.5. Further nodes: Taxonomy, Knowledge Graph, Ontology. Further edge labels: active at, based on, part of, is a, standard for, manages.
  • 3. Aspects of Taxonomy Quality: types of taxonomy quality metrics, and the scenarios for which they are relevant.
  • 4. Why is taxonomy quality important? Some examples of quality issues and their possible consequences:
      ▸ Missing labels
          ▹ AGROVOC (FAO) defines concepts in 25 different languages. While most concepts have English labels attached, only 38% have German labels.
          ▹ This can be a problem for multilingual applications that rely on label translations.
      ▸ Orphan concepts
          ▹ An orphan concept is a concept that has no semantic relation with any other concept. Although it might have lexical labels attached, it lacks valuable context information.
          ▹ That missing context can be crucial for retrieval tasks such as search query expansion.
      ▸ Mismatch between content and taxonomy
          ▹ There are only minor overlaps between the scope of the documents (or data) to be indexed and the scope of the controlled vocabulary in use.
          ▹ This leads to a sparse enrichment of the document index with semantic information.
      See also: Finding quality issues in SKOS vocabularies (Christian Mader, Bernhard Haslhofer, Antoine Isaac)
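
The first two issues on slide 4 can be measured mechanically. Below is a minimal sketch, assuming the vocabulary has already been loaded into a plain list of triples. The `ex:` concept IDs, the toy data, and the function names are hypothetical and not taken from AGROVOC or any tool named in the deck.

```python
# Hypothetical toy vocabulary as (subject, predicate, object) triples.
# Label objects are (text, language) pairs; relation objects are concept IDs.
TRIPLES = [
    ("ex:maize", "skos:prefLabel", ("maize", "en")),
    ("ex:maize", "skos:prefLabel", ("Mais", "de")),
    ("ex:cereals", "skos:prefLabel", ("cereals", "en")),
    ("ex:maize", "skos:broader", "ex:cereals"),
    ("ex:soil", "skos:prefLabel", ("soil", "en")),
]

SEMANTIC_RELATIONS = {"skos:broader", "skos:narrower", "skos:related"}


def label_coverage(triples, lang):
    """Share of concepts that carry a preferred label in the given language."""
    concepts = {s for s, _, _ in triples}
    labelled = {
        s for s, p, o in triples if p == "skos:prefLabel" and o[1] == lang
    }
    return len(labelled) / len(concepts) if concepts else 0.0


def orphan_concepts(triples):
    """Concepts that take part in no semantic relation, as subject or object."""
    concepts = {s for s, _, _ in triples}
    connected = set()
    for s, p, o in triples:
        if p in SEMANTIC_RELATIONS:
            connected.update((s, o))
    return sorted(concepts - connected)


if __name__ == "__main__":
    print("German label coverage:", label_coverage(TRIPLES, "de"))
    print("Orphan concepts:", orphan_concepts(TRIPLES))
```

In this toy data only one of three concepts has a German label, and `ex:soil` is an orphan because it has labels but no relations.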
  • 5. Taxonomy quality issues are more frequently observed than some might expect. See also: Finding quality issues in SKOS vocabularies
  • 6. Taxonomy quality criteria and issues at different levels
      1. Formal integrity conditions based on SKOS
          ▹ Construction of well-formed and consistent data to promote interoperability
          ▹ Example: No two concepts may be connected by both related and broader transitive
          ▹ Read more: SKOS: A Guide for Information Professionals (Jane Frazier)
      2. Labeling and documentation issues
          ▹ Construction of taxonomies that support complex retrieval tasks
          ▹ Example: No two concepts of a concept scheme may have the same preferred label
          ▹ Read more: SKOS Primer (Antoine Isaac / Ed Summers)
      3. Structural issues
          ▹ Logic-based processing of taxonomies
          ▹ Example: Avoidance of hierarchical cycles
          ▹ Read more: Key choices in the design of SKOS (Thomas Baker et al.)
      4. Content coverage
          ▹ Development of taxonomies that reflect the scope of the represented content well
          ▹ Example: Avoid maintaining subtrees that have only limited occurrences in a representative document corpus
          ▹ Read more: Corpus management with PoolParty
      5. Network topological issues (experimental)
          ▹ (Co-)occurrences of concepts in a corpus should be reflected in the network topology of a knowledge graph
          ▹ Example: Nodes/concepts with high betweenness centrality should occur correspondingly in a reference document corpus
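
The example checks for levels 1 to 3 on slide 6 can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation of any tool named in the deck: the hierarchy is a hypothetical dict mapping each concept to its direct `skos:broader` concepts, labels are compared case-insensitively within a single language, and all names are made up.

```python
def ancestors(broader, start):
    """All concepts reachable from `start` by following skos:broader links."""
    stack, seen = list(broader.get(start, ())), set()
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(broader.get(node, ()))
    return seen


def hierarchy_cycles(broader):
    """Level 3: concepts that are their own (transitive) broader concept."""
    return sorted(c for c in broader if c in ancestors(broader, c))


def related_hierarchy_clashes(broader, related):
    """Level 1: pairs linked by skos:related and also by broader transitive."""
    return [
        (a, b)
        for a, b in related
        if b in ancestors(broader, a) or a in ancestors(broader, b)
    ]


def duplicate_pref_labels(pref_labels):
    """Level 2: preferred labels shared by more than one concept.

    Assumption: one language, case-insensitive comparison.
    """
    by_label = {}
    for concept, label in pref_labels.items():
        by_label.setdefault(label.lower(), []).append(concept)
    return {lbl: sorted(cs) for lbl, cs in by_label.items() if len(cs) > 1}


# Hypothetical toy data.
BROADER = {"ex:a": ["ex:b"], "ex:b": ["ex:c"], "ex:c": ["ex:a"], "ex:d": ["ex:e"]}
RELATED = [("ex:d", "ex:e"), ("ex:d", "ex:a")]
PREF_LABELS = {"ex:a": "Bank", "ex:b": "bank", "ex:c": "River"}

if __name__ == "__main__":
    print(hierarchy_cycles(BROADER))
    print(related_hierarchy_clashes(BROADER, RELATED))
    print(duplicate_pref_labels(PREF_LABELS))
```

Each function returns the offending concepts rather than a pass/fail flag, so the result can be reported to a taxonomist for repair.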
  • 7. Why are standards-based technologies and tools so important when it comes to taxonomy quality management? Spreadsheet editors are still the most common type of software application being used for taxonomy management. They cannot measure quality automatically.
  • 8. ‘Good’ quality depends on the usage scenario. Example: Google Product Taxonomy has no synonyms at all, only hierarchical relations.
  • 9. How to pick the most relevant quality criteria for a taxonomy project: PoolParty supports various application scenarios. Quality checks can be enforced, reported, or ignored.
  • 10. How to pick the most relevant quality criteria for a taxonomy project
      ▸ General purpose thesaurus vs. custom enterprise taxonomy
          ▹ Custom enterprise taxonomies can be developed specifically on top of reference corpora
          ▹ General purpose thesauri are frequently used in the context of linked data environments → linked data specific issues become more important
              ■ Missing In-Links
              ■ Missing Out-Links
              ■ Broken Links
              ■ Undefined SKOS Resources
              ■ HTTP URI Scheme Violation
      See also: PoolParty SKOS Quality Checker based on qSKOS
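
Two of the linked data issues listed on slide 10 can be approximated without network access. The sketch below is a simplification under stated assumptions: it treats both `http` and `https` as acceptable schemes and counts an out-link as any link whose target sits on a different host than the concept itself. qSKOS may define these checks differently, and all URIs and function names here are hypothetical.

```python
from urllib.parse import urlparse


def http_uri_scheme_violations(concept_uris):
    """Concept identifiers that are not HTTP(S) URIs."""
    return [u for u in concept_uris if urlparse(u).scheme not in ("http", "https")]


def missing_out_links(links_by_concept):
    """Concepts without any link to a resource on another host.

    `links_by_concept` maps a concept URI to the URIs it links to,
    for example via skos:exactMatch or skos:closeMatch.
    """
    missing = []
    for concept, targets in links_by_concept.items():
        own_host = urlparse(concept).netloc
        if not any(urlparse(t).netloc not in ("", own_host) for t in targets):
            missing.append(concept)
    return missing


# Hypothetical example data.
CONCEPTS = ["http://example.org/c/oil", "urn:example:gas", "https://example.org/c/coal"]
LINKS = {
    "http://example.org/c/oil": ["http://other.example.com/petroleum"],
    "https://example.org/c/coal": ["https://example.org/c/fuel"],
}

if __name__ == "__main__":
    print(http_uri_scheme_violations(CONCEPTS))
    print(missing_out_links(LINKS))
```

Checks such as Broken Links would additionally require dereferencing each URI, which is left out here on purpose.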
  • 11. Taxonomy Quality Metrics: how quality issues can be unveiled and how insights can be used for further improvements
  • 14. Unveil mismatch between taxonomy and document corpus (architecture diagram). Node labels: Content Manager, Integrator, Taxonomist/Ontologist, Thesaurus Server, Extractor, PowerTagging, Index, Corpus Learning/Semantic Analysis, CMS. Edge labels: uses API, is user of, is basis of, annotates, enriches, extends, analyzes.
  • 15. Unveil mismatch between taxonomy and document corpus: PoolParty extracts concepts that are not used in a reference corpus at all and suggests how those concepts could be reworked or extended to become relevant.
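
The idea on slide 15 can be illustrated with a deliberately naive sketch: a concept counts as unused if none of its labels occurs as a whole word or phrase in any document. This is plain case-insensitive string matching, an assumption made for illustration only. It says nothing about how PoolParty's corpus analysis works internally, and the concept IDs, labels, and documents are hypothetical.

```python
import re


def unused_concepts(labels_by_concept, documents):
    """Concepts for which no label appears in the reference corpus."""
    text = " ".join(documents).lower()
    unused = []
    for concept, labels in labels_by_concept.items():
        found = any(
            re.search(r"\b" + re.escape(label.lower()) + r"\b", text)
            for label in labels
        )
        if not found:
            unused.append(concept)
    return sorted(unused)


# Hypothetical toy taxonomy and corpus.
LABELS = {
    "ex:crude_oil": ["crude oil", "petroleum"],
    "ex:opec": ["OPEC"],
    "ex:coal": ["coal"],
}
DOCUMENTS = [
    "OPEC cut crude oil output.",
    "Petroleum prices rose.",
]

if __name__ == "__main__":
    print(unused_concepts(LABELS, DOCUMENTS))
```

A production setup would normally add lemmatization and disambiguation. Exact matching misses inflected forms and can over-count ambiguous labels.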
  • 16. Unveil mismatch between taxonomy and document corpus: PoolParty extracts relevant candidate concepts based on a deep corpus analysis.
  • 17. Unveil mismatch between taxonomy and document corpus: PoolParty suggests possible ‘right places’ for the candidate concepts within the approved taxonomy.
  • 18. Unveil network topological issues. Example: STW Thesaurus for Economics
  • 19. Unveil network topological issues. Example: STW Thesaurus for Economics, top 10 thesaurus concepts (betweenness)
  • 20. Combined analysis over network topology and reference corpus. Example: STW Thesaurus for Economics and reference corpus about ‘Crude Oil Market’
  • 21. Combined analysis over network topology and reference corpus. Example: STW Thesaurus for Economics and reference corpus about ‘Crude Oil Market’
  • 22. Combined analysis over network topology and reference corpus: correlation of betweenness and document frequency. Example: STW Thesaurus for Economics and reference corpus about ‘Crude Oil Market’
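
The combined analysis on slides 18 to 22 compares betweenness centrality with document frequency. The sketch below shows one way to do this: Brandes' algorithm on an unweighted, undirected concept graph, followed by a Pearson correlation. The deck does not say which correlation measure or graph model was used, so both are assumptions. The tiny graph and the document frequencies are invented and are not taken from the STW Thesaurus.

```python
from collections import deque


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected adjacency dict."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                     # accumulate in order of decreasing distance
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical concept graph and document frequencies.
GRAPH = {
    "oil": ["price", "opec", "energy"],
    "price": ["oil"],
    "opec": ["oil"],
    "energy": ["oil", "coal"],
    "coal": ["energy"],
}
DOC_FREQ = {"oil": 50, "price": 5, "opec": 4, "energy": 30, "coal": 1}

if __name__ == "__main__":
    bc = betweenness(GRAPH)
    nodes = list(GRAPH)
    r = pearson([bc[n] for n in nodes], [DOC_FREQ[n] for n in nodes])
    print(bc)
    print("correlation:", round(r, 3))
```

A high positive correlation suggests that the structurally central concepts are also the ones the corpus talks about. Concepts that are central but rarely mentioned, or the reverse, are candidates for review.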
  • 23. Techniques and Tools: how they help to assess taxonomy quality
  • 24. BARTOC.org: Basel Register of Thesauri, Ontologies & Classifications
      ▸ Unveil taxonomy quality by the wisdom of the crowd
  • 25. qSKOS
      ▸ qSKOS is a tool for finding quality issues in SKOS vocabularies
      ▸ Available as a free online service at http://qskos.poolparty.biz/
      ▸ A SKOS taxonomy is analyzed with regard to 24 issues
  • 26. PoolParty Import Validator
      ▸ RDF validation to go beyond SKOS
      ▸ Checks are defined in RDF; repair strategies are also defined in RDF
      ▸ 15 checks have been integrated
  • 27. Shapes Constraint Language (SHACL)
      ▸ “Do for RDF what XML Schema does for XML”
      ▸ Language for validating RDF graphs against a set of conditions
      ▸ SHACL shape graphs are used to validate that data graphs satisfy a set of conditions
      ▸ Current status: W3C Working Draft (14 August 2016)
      See also: Towards maintainable constraint validation and repair for taxonomies: The PoolParty approach (Christian Mader and Monika Solanki)
  • 28. Get your test account at www.poolparty.biz/demo. Get certified at www.poolparty.biz/academy/
  • 29. Connect: Andreas Blumauer, CEO, Semantic Web Company
      ▸ a.blumauer@semantic-web.at
      ▸ http://at.linkedin.com/in/andreasblumauer
      ▸ https://twitter.com/semwebcompany
      ▸ https://www.poolparty.biz
      ▸ https://www.semantic-web.at
      © Semantic Web Company - http://www.semantic-web.at/ and http://www.poolparty.biz/