Jim Sweeney
Product Manager
Synaptica, LLC

Successfully
Managing
Multilingual
Taxonomies
Truly a “Global” World
• We live in societies that require us to
communicate across geography, culture, and
language.
• Being able to arrive at the same concept,
regardless of geography, culture, or language, is a
necessity in commerce and communication.
• Taxonomies and thesauri are the ways that we
organize and describe the world that we live in,
whether we are consciously aware of them or
not!
Building Multilingual Taxonomies
We will look at 3 approaches to building
and managing multilingual
taxonomies/thesauri in this presentation
and the pros and cons of each:
1. Single Vocabulary Method
2. Asymmetric Multilingual Vocabularies
3. Symmetric Multilingual Vocabularies
Single Vocabulary Method
• With this method, one builds the taxonomy
around a single primary (dominant) language,
and all structure is assigned based on that
language.
• All translations and associated translated
metadata are assigned as attributes of the
primary language term.
Single Vocabulary Method

The primary language as well
as each translation for the
term and associated metadata
are stored as attributes.
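As a rough illustration of this record layout, here is a minimal Python sketch. The class name, field names, and sample terms are all assumptions made for the example; they are not taken from any particular taxonomy tool.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TermRecord:
    """One term in a single-vocabulary taxonomy (hypothetical field names)."""
    primary_label: str                 # descriptor in the dominant language
    broader: Optional[str] = None      # hierarchy is expressed only in the primary language
    # language code -> translated label, stored as plain attributes of the term
    translations: Dict[str, str] = field(default_factory=dict)
    # language code -> translated scope note (associated translated metadata)
    translated_notes: Dict[str, str] = field(default_factory=dict)

    def label_for(self, lang: str) -> str:
        """Return the translation for lang, falling back to the primary label."""
        return self.translations.get(lang, self.primary_label)


apple = TermRecord(
    primary_label="Apples",
    broader="Fruit",
    translations={"fr": "Pommes", "de": "Äpfel"},
)
```

Because the translations hang off the primary term, a language without a translation simply falls back to the primary label, and no language other than the primary one can carry its own structure.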
Single Vocabulary Method
• Hierarchical structure is determined by the primary
language.
• Consequently, that language also dictates cultural
and localization values.
Pros and Cons of the Method
Pros:
• Simplest of the three methods we will discuss
to design and maintain
• Least resource intensive to manage
Cons:
• Most limiting of the methods
• One language is dominant
• Synonyms cannot vary across languages
Asymmetric Multilingual Vocabularies
• This method uses wholly independent, fully
structured taxonomies for each language with
concepts joined using equivalency (LE or EQ)
relationships.
• A single language may be selected as the
exchange language through which all languages
are linked
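The exchange-language idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the vocabularies, the sample terms, the `to_exchange` mapping, and the `equivalent` helper are hypothetical, and English is arbitrarily chosen as the exchange language.

```python
# Each language keeps its own vocabulary with its own hierarchy (term -> broader term).
vocab_en = {"Fruit": None, "Apples": "Fruit", "Cider apples": "Apples"}
vocab_fr = {"Fruits": None, "Pommes": "Fruits"}  # no concept for cider apples here
vocab_de = {"Obst": None, "Äpfel": "Obst"}

# Equivalence links are stored only against the exchange language (English here).
# Key: (language, term) -> equivalent term in the exchange language.
to_exchange = {
    ("fr", "Fruits"): "Fruit",
    ("fr", "Pommes"): "Apples",
    ("de", "Obst"): "Fruit",
    ("de", "Äpfel"): "Apples",
}


def equivalent(term, source_lang, target_lang, exchange_lang="en"):
    """Find the target-language equivalent of term by pivoting through the exchange language."""
    if source_lang == exchange_lang:
        pivot = term
    else:
        pivot = to_exchange.get((source_lang, term))
    if pivot is None:
        return None
    if target_lang == exchange_lang:
        return pivot
    for (lang, local_term), exchange_term in to_exchange.items():
        if lang == target_lang and exchange_term == pivot:
            return local_term
    return None  # the target vocabulary has no equivalent concept
```

With this layout, `equivalent("Pommes", "fr", "de")` pivots through "Apples" to reach "Äpfel", while "Cider apples" has no French equivalent because the French vocabulary does not define that concept.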
Asymmetric Multilingual Vocabularies
Though not always recommended, each
vocabulary may be built with its own
structure and its own number of concepts
to achieve full localization.
Pros and Cons of the Method
Pros:
• Provides for the most complete localization
• Each language may have a unique set of
attributes
• No one language is dominant
• New languages may be readily added
• Synonyms may vary across languages
Cons:
• Most resource intensive method to manage
• Less harmonized than the symmetric model
Symmetric Multilingual Vocabularies
• This model is strongly encouraged by the former and
current ISO standards (5964 and 25964-1)
• Every concept should have a Preferred Term (PT) in
each language
• All languages should share a common hierarchical
and associative structure
• Each language supports independent synonym sets
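The requirements above lend themselves to a simple completeness check. The sketch below assumes a hypothetical in-memory layout (language-neutral concept IDs, one shared `broader` map, per-language preferred terms and synonyms); none of these names come from the ISO standards themselves.

```python
LANGUAGES = ("en", "fr", "de")

# Shared structure is defined once, on language-neutral concept IDs.
broader = {"c2": "c1"}  # concept c2 is narrower than c1 in every language

# Preferred terms are kept per language.
preferred = {
    "c1": {"en": "Fruit", "fr": "Fruits", "de": "Obst"},
    "c2": {"en": "Apples", "fr": "Pommes"},  # German preferred term is missing
}

# Synonym sets are independent and may differ by language.
synonyms = {
    "c2": {"en": ["Eating apples"], "fr": []},
}


def missing_preferred_terms(preferred, languages):
    """Return (concept_id, language) pairs that lack a preferred term."""
    return [
        (concept_id, lang)
        for concept_id, labels in sorted(preferred.items())
        for lang in languages
        if not labels.get(lang)
    ]
```

Running `missing_preferred_terms(preferred, LANGUAGES)` on this sample data flags concept `c2` for German, which is the kind of gap a symmetric vocabulary is expected to close.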
Symmetric Multilingual Vocabularies

• There should be an instance of every preferred
term in all languages.
• These terms may then be related via an
equivalency (LE or EQ) relationship or by making
them preferred labels to be applied to abstract
SKOS concepts.
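To make the SKOS option concrete, the following Python sketch renders one abstract concept as a Turtle snippet with a `skos:prefLabel` per language. The `ex:` prefix, the concept IDs, and the `concept_to_turtle` helper are assumptions for illustration; the snippet also assumes the `ex:` and `skos:` prefixes are declared elsewhere in the file and that labels contain no characters needing escaping.

```python
def concept_to_turtle(concept_id, pref_labels, broader_id=None):
    """Render one concept as a Turtle snippet with one skos:prefLabel per language."""
    lines = [f"ex:{concept_id} a skos:Concept"]
    # One preferred label per language, all attached to the same abstract concept.
    for lang, label in sorted(pref_labels.items()):
        lines.append(f'    skos:prefLabel "{label}"@{lang}')
    # The hierarchy is stated once on the concept, so it is shared by all languages.
    if broader_id:
        lines.append(f"    skos:broader ex:{broader_id}")
    return " ;\n".join(lines) + " ."


snippet = concept_to_turtle(
    "c2", {"en": "Apples", "fr": "Pommes", "de": "Äpfel"}, broader_id="c1"
)
print(snippet)
```

The design point is that the labels are bound to the concept rather than to each other, so adding a language means adding labels, not restructuring the vocabulary.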
Pros and Cons of the Method
Pros:
• Allows for management of unique attributes for each
language
• No one language is dominant
• Synonyms may vary across languages
• Much less intensive to manage because
all languages share a common structure
Cons:
• May not allow for subtle differences of language and
culture to be expressed through variations in concepts
and relational structure
Conclusions
• There are several options for managing multilingual vocabularies and each method possesses
some advantages and disadvantages.
• The ISO standard (25964-1) strongly recommends a
symmetric approach whenever possible.
• SKOS-XL provides an effective format that
supports the ISO symmetric model.
• One may employ an asymmetric method when
necessary, but beware the extra costs!
Thank You!

Jim Sweeney
Product Manager, Synaptica
Jim.sweeney@synaptica.com
www.synaptica.com

Editor's Notes

  1. ISO 25964 (4.1) says the aim of a thesaurus ‘is to guide the indexer and the searcher to choose the same term for the same concept…’ This is the key reason we create standards for managing multilingual vocabularies; the next few slides explore some different approaches to achieving this.
  2. You might want to verbalise that, although there are slots for each language, only the dominant language is used as the descriptor for the term record.
  3. Multiple monolingual vocabularies which are mapped to one another.
  4. Work on diagram.
  5. Every preferred term (concept) in one language should have an equivalent preferred term (concept) in all other languages. Binding language labels to an abstract SKOS concept.