Introduction
Techniques
Overview and Summary
Multilingual Text Classification
Gerard de Melo, Stefan Siersdorfer
Max Planck Institute for Computer Science
Saarbrücken, Germany
2007-04-04
G. de Melo, S. Siersdorfer, Max-Planck-Institut Informatik Multilingual Text Classification
Text Classification
task: automatically assign text
documents to classes (e.g.
thematically, geographically)
machine learning algorithms, e.g.
SVM, can learn from pre-classified
training documents
multilingual case: documents in
multiple languages
applications: news wire filtering,
library management, e-mail, etc.
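As a minimal sketch of learning from pre-classified training documents, the example below trains a perceptron on bag-of-words term counts. The perceptron is only a simple stand-in for a linear classifier such as an SVM, and the training sentences, labels, and function names are made up for illustration.

```python
from collections import Counter

def features(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())

def train_perceptron(docs, labels, epochs=10):
    """Learn one weight per term; labels are +1 or -1."""
    weights = Counter()
    for _ in range(epochs):
        for text, label in zip(docs, labels):
            feats = features(text)
            score = sum(weights[t] * c for t, c in feats.items())
            if label * score <= 0:  # misclassified or undecided: update
                for t, c in feats.items():
                    weights[t] += label * c
    return weights

def predict(weights, text):
    """Return +1 or -1 depending on the sign of the linear score."""
    score = sum(weights[t] * c for t, c in features(text).items())
    return 1 if score > 0 else -1

# Hypothetical pre-classified documents: +1 = finance, -1 = sports.
train_docs = [
    "stocks fell on the market",
    "the team won the match",
    "market shares rose",
    "the striker scored in the match",
]
train_labels = [1, -1, 1, -1]
weights = train_perceptron(train_docs, train_labels)

print(predict(weights, "market stocks"))
print(predict(weights, "match team"))
```

A model like this only has weights for terms it saw during training, so documents in another language share almost no features with the training set. That is the gap the multilingual techniques address.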
Machine Translation
Mapping to Semantic Concept
Weight Propagation
Machine Translation for Multilingual TC
idea: simply translate all documents into a single language LI
(prior work by Jalam 2002, Rigutini et al. 2005)
shortcomings of this approach
lexical variety in LI (English: huge vocabulary, many synonyms)
variety of expression in source languages
lexical ambiguity in LI (unnecessary introduction of additional
ambiguity)
Spanish coche → car
French voiture → automobile
Semantic Concepts
Idea
map all words to semantic concepts (e.g. WordNet synsets),
thus distinguishing different senses of a word while identifying
synonyms
disambiguate using context information
construct feature vectors by counting occurrences of concepts
rather than terms
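The idea can be sketched as follows, assuming a tiny hand-made lexicon in place of WordNet. The concept IDs and the `TERM_TO_CONCEPT` table are hypothetical, and context-based disambiguation is left out (each term has exactly one concept here). The sketch compares term counts with concept counts for two documents that use the synonyms "car" and "automobile".

```python
from collections import Counter

# Hypothetical toy lexicon: each term maps to one made-up concept ID.
# A real system would look up WordNet synsets and disambiguate by context.
TERM_TO_CONCEPT = {
    "car": "concept:motor_vehicle",
    "automobile": "concept:motor_vehicle",
    "road": "concept:road",
}

def term_vector(text):
    """Count occurrences of surface terms."""
    return Counter(text.lower().split())

def concept_vector(text):
    """Count occurrences of concepts; terms missing from the lexicon are skipped."""
    return Counter(
        TERM_TO_CONCEPT[t] for t in text.lower().split() if t in TERM_TO_CONCEPT
    )

def shared(a, b):
    """Size of the multiset intersection of two count vectors."""
    return sum((a & b).values())

doc_a = "car road"
doc_b = "automobile road"

print(shared(term_vector(doc_a), term_vector(doc_b)))        # 1: only "road" matches
print(shared(concept_vector(doc_a), concept_vector(doc_b)))  # 2: the synonyms now match too
```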
Problems
understemming
polysemy: highly related senses are treated as distinct
incongruent concepts between languages
variety of expression
lexical lacunae
Language  Sentence              Literal gloss
English   I have a headache     I have a headache
Spanish   Me duele la cabeza    *It hurts the head to me
French    J’ai mal à la tête    *I have pain at the head
Weight Propagation
propagate weight from original
concepts to related concepts
choose path to c maximizing
its weight
Dijkstra-like algorithm in order
to assign maximal possible
weight to a concept
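One way to read the Dijkstra-like step is as a best-first search that gives every reachable concept the largest weight any path from the original concept can deliver. The sketch below makes two assumptions that are not stated on the slide: each relation carries a strength in (0, 1], and a path's weight is the product of the strengths along it. The graph, the concept names, and the `propagate` function are hypothetical.

```python
import heapq

def propagate(graph, source, initial_weight=1.0):
    """Return the maximal weight reaching each concept from `source`.

    graph: {concept: [(neighbour, strength), ...]} with 0 < strength <= 1.
    Because strengths never exceed 1, weights only shrink along a path,
    so popping the currently heaviest concept first is safe (as in Dijkstra).
    """
    best = {source: initial_weight}
    heap = [(-initial_weight, source)]  # max-heap via negated weights
    while heap:
        neg_weight, node = heapq.heappop(heap)
        weight = -neg_weight
        if weight < best.get(node, 0.0):
            continue  # stale entry: a heavier path was already found
        for neighbour, strength in graph.get(node, []):
            candidate = weight * strength
            if candidate > best.get(neighbour, 0.0):
                best[neighbour] = candidate
                heapq.heappush(heap, (-candidate, neighbour))
    return best

# Hypothetical concept graph with made-up relation strengths.
GRAPH = {
    "car": [("vehicle", 0.8), ("wheel", 0.5)],
    "vehicle": [("wheel", 0.9)],
}

result = propagate(GRAPH, "car")
print(result)
```

In this toy graph the direct link from "car" to "wheel" would give 0.5, but the path through "vehicle" gives 0.8 × 0.9 = 0.72, so the latter is kept.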
Overview and Summary
Ontology Region Mapping
1 optionally translate the documents – or use a multilingual
lexical resource (aligned wordnets)
2 map terms to concepts
3 search for highly related concepts
entire regions of concepts are
relevant, so propagate a part
of the concept’s weight to
related concepts
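The three steps can be sketched end to end, assuming the second option from step 1: a multilingual lexical resource instead of translation. Everything named here is hypothetical: the `LEXICON` and `RELATED` tables, the concept IDs, and the strengths. For brevity the sketch spreads weight only one hop and adds the contributions up, which is a simplification of the maximal-path propagation described earlier.

```python
from collections import defaultdict

# Hypothetical aligned multilingual lexicon (stand-in for aligned wordnets).
LEXICON = {
    ("es", "coche"): "c:car",
    ("fr", "voiture"): "c:car",
    ("en", "truck"): "c:truck",
}

# Hypothetical relatedness links between concepts, with made-up strengths.
RELATED = {
    "c:car": [("c:motor_vehicle", 0.8)],
    "c:truck": [("c:motor_vehicle", 0.8)],
}

def region_vector(tokens, lang):
    """Map terms to concepts, then pass part of each weight to related concepts."""
    vec = defaultdict(float)
    for tok in tokens:
        concept = LEXICON.get((lang, tok))
        if concept is None:
            continue  # term not covered by the lexicon
        vec[concept] += 1.0
        for neighbour, strength in RELATED.get(concept, []):
            vec[neighbour] += strength
    return dict(vec)

spanish = region_vector(["coche"], "es")
french = region_vector(["voiture"], "fr")
english = region_vector(["truck"], "en")

print(spanish)
print(french)
print(english)
```

The Spanish and French documents end up with identical vectors, and the English one overlaps with them on the related concept even though its own concept differs.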
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...HyderabadDolls
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...vershagrag
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowgargpaaro
 

Último (20)

Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 

Multilingual Text Classification using Ontologies

  • 1. Introduction Techniques Overview and Summary Multilingual Text Classification Gerard de Melo, Stefan Siersdorfer Max Planck Institute for Computer Science Saarbrücken, Germany 2007-04-04 G. de Melo, S. Siersdorfer, Max-Planck-Institut Informatik Multilingual Text Classification
  • 2. Text Classification. Task: automatically assign text documents to classes (e.g. thematically, geographically). Machine learning algorithms, e.g. SVM, can learn from pre-classified training documents. Multilingual case: documents in multiple languages. Applications: news wire filtering, library management, e-mail, etc.
  • 6. Techniques: Machine Translation, Mapping to Semantic Concept, Weight Propagation. Machine Translation for Multilingual TC. Idea: simply translate all documents into a single language LI (prior work by Jalam 2002, Rigutini et al. 2005). Shortcomings of this approach: lexical variety in LI (English: huge vocabulary, many synonyms); variety of expression in source languages; lexical ambiguity in LI (unnecessary introduction of additional ambiguity). Example: Spanish coche → car, French voiture → automobile.
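The coche/voiture example can be illustrated with a small sketch. The word-level table below is a hypothetical stand-in for a machine translation system (it is not a real MT API), and the two toy documents are made up for illustration; the point is only that two documents about the same thing may share few term features after translation.

```python
from collections import Counter

# Hypothetical word-level "translation" table standing in for an MT system.
TOY_TRANSLATIONS = {
    "es": {"coche": "car", "rojo": "red"},
    "fr": {"voiture": "automobile", "rouge": "red"},
}

def translate(tokens, lang):
    """Translate token by token; unknown tokens pass through unchanged."""
    table = TOY_TRANSLATIONS[lang]
    return [table.get(tok, tok) for tok in tokens]

def term_features(tokens):
    """Bag-of-words feature vector as a term -> count mapping."""
    return Counter(tokens)

spanish_doc = term_features(translate(["coche", "rojo"], "es"))
french_doc = term_features(translate(["voiture", "rouge"], "fr"))

# Terms shared by both translated documents: the vehicle words do not match,
# although both source documents talk about the same kind of object.
shared_terms = set(spanish_doc) & set(french_doc)
print(shared_terms)
```

Because "car" and "automobile" are distinct terms, a term-based classifier sees no overlap on the word that matters most here.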
  • 9. Semantic Concepts: Idea. Map all words to semantic concepts (e.g. WordNet synsets), thus distinguishing different senses of a word while identifying synonyms. Disambiguate using context information. Construct feature vectors by counting occurrences of concepts rather than terms.
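A minimal sketch of concept-based feature vectors, under stated assumptions: the lexicon, the concept identifiers (`c:automobile`, `c:railcar`, ...) and the context hints are all hypothetical toy data rather than real WordNet content, and the disambiguation rule (prefer the sense that shares the most hints with unambiguous words in the document) is a simple placeholder, not necessarily the method used by the authors.

```python
from collections import Counter

# Hypothetical multilingual lexicon: (language, word) -> candidate concept IDs.
TOY_LEXICON = {
    ("en", "car"): ["c:automobile", "c:railcar"],
    ("en", "automobile"): ["c:automobile"],
    ("es", "coche"): ["c:automobile"],
    ("fr", "voiture"): ["c:automobile", "c:railcar"],
    ("en", "road"): ["c:road"],
    ("en", "train"): ["c:train"],
}

# Hypothetical relatedness hints used only for disambiguation.
CONTEXT_HINTS = {
    "c:automobile": {"c:road"},
    "c:railcar": {"c:train"},
}

def disambiguate(candidates, context_concepts):
    """Pick the sense sharing the most hints with the context (ties: first listed)."""
    return max(
        candidates,
        key=lambda c: len(CONTEXT_HINTS.get(c, set()) & context_concepts),
    )

def concept_features(tokens, lang):
    """Count concept occurrences instead of term occurrences."""
    candidate_lists = [TOY_LEXICON.get((lang, tok), []) for tok in tokens]
    # Context information: concepts of the unambiguous words in the document.
    context = {cands[0] for cands in candidate_lists if len(cands) == 1}
    counts = Counter()
    for cands in candidate_lists:
        if cands:  # words missing from the lexicon are skipped
            counts[disambiguate(cands, context)] += 1
    return counts

english = concept_features(["car", "road", "automobile"], "en")
spanish = concept_features(["coche"], "es")
rail = concept_features(["car", "train"], "en")
```

Here the synonyms "car" and "automobile" collapse onto one concept that the Spanish document also uses, while "car" next to "train" resolves to a different sense.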
  • 12. Semantic Concepts: Problems. Understemming; polysemy (highly related senses are treated as distinct); incongruent concepts between languages; variety of expression; lexical lacunae. Example (sentence, literal gloss): English: I have a headache (I have a headache); Spanish: Me duele la cabeza (*It hurts the head to me); French: J'ai mal à la tête (*I have pain at the head).
  • 13. Weight Propagation. Propagate weight from original concepts to related concepts. Choose the path to c maximizing its weight. Dijkstra-like algorithm in order to assign the maximal possible weight to a concept.
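One way to read the Dijkstra-like propagation is sketched below. The slides do not specify how weight decays along a path, so this sketch assumes that each relation carries a factor in (0, 1] and that a path's weight is the product of the source weight and the factors along it; the function name `propagate_weights`, the `min_weight` cutoff, and the toy graph are all hypothetical.

```python
import heapq

def propagate_weights(graph, initial_weights, min_weight=0.0):
    """Assign each reachable concept the maximal weight over all paths.

    graph: concept -> list of (related_concept, factor), factor in (0, 1].
    initial_weights: concept -> weight of the concepts found in the document.
    Assumption: the weight of a path is the source weight times the product
    of its factors, so weights never increase along a path.
    """
    best = dict(initial_weights)
    # Max-heap via negated weights: the heaviest unsettled concept comes first.
    heap = [(-w, c) for c, w in initial_weights.items()]
    heapq.heapify(heap)
    settled = set()
    while heap:
        neg_weight, concept = heapq.heappop(heap)
        if concept in settled:
            continue  # stale entry; a heavier path was already settled
        settled.add(concept)
        weight = -neg_weight
        for neighbor, factor in graph.get(concept, []):
            candidate = weight * factor
            if candidate > min_weight and candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return best

# Toy concept graph (hypothetical relations and factors).
toy_graph = {
    "car": [("vehicle", 0.8), ("wheel", 0.5)],
    "vehicle": [("wheel", 0.9), ("transport", 0.5)],
}
weights = propagate_weights(toy_graph, {"car": 1.0})
```

In the toy graph, "wheel" is reachable directly from "car" and via "vehicle"; the sketch keeps whichever path yields the larger weight. Because factors never exceed 1, the first time a concept is popped from the heap its weight is already maximal, which is the same argument that makes Dijkstra's algorithm correct for non-negative edge lengths.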
  • 16. Overview and Summary: Ontology Region Mapping. (1) Optionally translate the documents, or use a multilingual lexical resource (aligned wordnets). (2) Map terms to concepts. (3) Search for highly related concepts: entire regions of concepts are relevant, so propagate a part of the concept's weight to related concepts.