The Problem Pattern-Based Measure Comparison Hybrid Measures Applications
Similarity Measures for
Semantic Relation Extraction
Montclair State University, Brown Bag Seminar (USA)
Alexander Panchenko
Université catholique de Louvain &
Digital Society Laboratory LLC
alexander.panchenko@uclouvain.be
May 2, 2014
Alexander Panchenko 1/52
Plan
1 The Context and the Problem
2 Pattern-Based Semantic Similarity Measure
3 Comparison of Similarity Measures
4 Hybrid Semantic Similarity Measures
5 Applications of Semantic Similarity Measures
Lexico-Semantic Search Engine “Serelex”
Filename Categorization System “iCOP”
Computational Lexical Semantics
* Picture is adapted from Computational Linguistics LINGI2263 course
http://www.uclouvain.be/en-cours-2013-LINGI2263.html
Introduction
Motivation
1 Synonyms, hypernyms and co-hyponyms are useful for:
text similarity (Šarić et al., 2012);
query expansion (Hsu et al., 2006);
question answering (Sun et al., 2005);
2 Manual resource construction is prohibitively expensive.
3 Automatic extractors do not yet match the quality of handcrafted resources.
Focus
Similarity-based semantic relation extraction.
Research Question
How to improve precision and coverage of such measures?
Semantic Resources
Definition
A semantic resource is an undirected graph (C, R):
nodes C represent terms;
edges R represent untyped semantic relations.
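The definition above can be sketched in a few lines of Python. This is only an illustration: the class name `SemanticResource`, its methods and the example terms are hypothetical and do not come from the slides.

```python
from collections import defaultdict


class SemanticResource:
    """Undirected graph (C, R): terms are nodes, untyped relations are edges."""

    def __init__(self):
        # term -> set of directly related terms
        self._neighbors = defaultdict(set)

    def add_relation(self, term_a, term_b):
        # The graph is undirected, so the edge is stored in both directions.
        self._neighbors[term_a].add(term_b)
        self._neighbors[term_b].add(term_a)

    def terms(self):
        """The node set C."""
        return set(self._neighbors)

    def relations(self):
        """The edge set R, each edge as an unordered pair of terms."""
        return {frozenset((a, b)) for a, nbrs in self._neighbors.items() for b in nbrs}

    def related(self, term):
        """Terms connected to `term` by an edge (empty set if unknown)."""
        return set(self._neighbors.get(term, ()))


resource = SemanticResource()
resource.add_relation("doctor", "physician")
resource.add_relation("doctor", "nurse")
resource.add_relation("physician", "doctor")  # same edge, stored once
```

Storing edges as unordered pairs reflects the fact that the relations are untyped and have no direction.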
Semantic Relation Extractors
We study extractors based on two components:
1 semantic similarity measures;
2 nearest neighbors procedures.
Figure : Semantic relation extractor (diagram labels: text-based data, feature extractor, features F, terms C, similarity measure, normalizer, similarity matrix S, kNN procedure, semantic relations R).
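The second component, the nearest neighbors procedure, might look roughly like the sketch below. It assumes a precomputed, normalized similarity matrix. The function name `knn_relations`, the tie-breaking rule and the toy scores are illustrative assumptions, not the procedure used in the talk.

```python
def knn_relations(terms, sim, k):
    """Link every term to its k most similar terms (zero scores are ignored)."""
    relations = set()
    for i, term in enumerate(terms):
        # Candidate neighbors: every other term with a positive score.
        scored = [
            (sim[i][j], other)
            for j, other in enumerate(terms)
            if j != i and sim[i][j] > 0
        ]
        # Highest score first; ties are broken alphabetically (an assumption).
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        for _, other in scored[:k]:
            relations.add(frozenset((term, other)))
    return relations


terms = ["doctor", "physician", "nurse", "statute"]
S = [
    [1.0, 0.9, 0.6, 0.0],
    [0.9, 1.0, 0.5, 0.0],
    [0.6, 0.5, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]
R = knn_relations(terms, S, k=1)
```

With `k=1` each term contributes at most one edge, so the resulting set of relations stays small even for a large vocabulary.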
Semantic Similarity Measures
Definition
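A semantic similarity measure quantifies the semantic relatedness of input
terms $c_i, c_j$ with the similarity score $s_{ij} = sim(c_i, c_j)$:
$$s_{ij} = \begin{cases} \text{high} & \text{if } \langle c_i, c_j \rangle \text{ is a pair of syn, hyper, cohypo} \\ 0 & \text{otherwise} \end{cases}$$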
Properties
Nonnegativity: $0 \le s_{ij} \le 1$;
Reflexivity: $s_{ij} = 1 \Leftrightarrow c_i = c_j$;
Symmetry: $s_{ij} = s_{ji}$;
Triangle inequality: $s_{ij} \le s_{ik} + s_{kj}$.
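As a rough illustration, the first three properties can be checked mechanically on a small similarity matrix. The function name `check_properties`, the tolerance and the example values are assumptions made for this sketch. The triangle inequality is not checked here.

```python
from itertools import product


def check_properties(S, tol=1e-9):
    """Check boundedness, reflexivity and symmetry of a square similarity matrix."""
    idx = range(len(S))
    pairs = list(product(idx, idx))
    return {
        # 0 <= s_ij <= 1 for every pair
        "bounded": all(-tol <= S[i][j] <= 1 + tol for i, j in pairs),
        # s_ij = 1 exactly when i = j
        "reflexive": all(
            (abs(S[i][j] - 1.0) <= tol) == (i == j) for i, j in pairs
        ),
        # s_ij = s_ji
        "symmetric": all(abs(S[i][j] - S[j][i]) <= tol for i, j in pairs),
    }


S = [
    [1.0, 0.9, 0.6],
    [0.9, 1.0, 0.5],
    [0.6, 0.5, 1.0],
]
report = check_properties(S)
```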
Semantic Similarity Measures
Many dissimilar pairs, few similar pairs: $s_{ij} \sim \exp(\lambda)$.
Figure : Similarity distribution of the term “doctor”.
Evaluation of Semantic Similarity Measures
1 Correlations with human judgments:
Criterion: Pearson correlation (ρ) and Spearman correlation (r).
Datasets: MC, RG, WordSim.
2 Semantic relation ranking:
Criterion: Precision, Recall, F-measure.
Dataset: BLESS, SN.
3 Semantic relation extraction:
Criterion: Precision@k.
Data: annotation and/or dictionaries.
4 Application-based evaluation:
short text classification system (iCOP);
lexico-semantic search engine (Serelex).
Panchenko A., Similarity Measures for Semantic Relation Extraction. PhD thesis. Université catholique de Louvain. 197 pages, 2013 (Chapter 1).
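The first evaluation protocol, correlation with human judgments, can be sketched with the standard library alone. The scores below are made-up placeholders, not values from MC, RG or WordSim, and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            result[i] = average
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: human ratings and measure scores for four word pairs.
human = [3.9, 3.5, 1.2, 0.4]
measure = [0.80, 0.90, 0.10, 0.05]
rank_correlation = spearman(human, measure)
```

The rank-based coefficient only looks at the ordering of the pairs, so it is unaffected by the very different scales of human ratings and similarity scores.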
Correlations with human judgments
Semantic Relation Ranking
Precision P(k = 50%) = 6/7 ≈ 0.86 (6 of the top 7 of the 14 ranked pairs are correct)
word c_i     word c_j      relation type   s_ij
aficionado   enthusiast    syn             0.07197
aficionado   fan           syn             0.05195
aficionado   admirer       syn             0.01964
aficionado   addict        syn             0.01326
aficionado   devotee       syn             0.01163
aficionado   foundling     random          0.00777
aficionado   fanatic       syn             0.00414
aficionado   adherent      syn             0.00353
aficionado   capital       random          0.00232
aficionado   statute       random          0.00029
aficionado   blot          random          0.00025
aficionado   meddler       random          0.00005
aficionado   enlargement   random          0.00003
aficionado   bawdyhouse    random          0.00000
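The precision figure for this ranking can be reproduced with a short sketch. It assumes that k denotes the top 50% of the ranked list, which is consistent with the reported value of about 0.86. The function name `precision_at_fraction` is hypothetical. The data are the rows of the table above.

```python
# (related word, relation type, similarity score) for the target "aficionado"
ranking = [
    ("enthusiast", "syn", 0.07197),
    ("fan", "syn", 0.05195),
    ("admirer", "syn", 0.01964),
    ("addict", "syn", 0.01326),
    ("devotee", "syn", 0.01163),
    ("foundling", "random", 0.00777),
    ("fanatic", "syn", 0.00414),
    ("adherent", "syn", 0.00353),
    ("capital", "random", 0.00232),
    ("statute", "random", 0.00029),
    ("blot", "random", 0.00025),
    ("meddler", "random", 0.00005),
    ("enlargement", "random", 0.00003),
    ("bawdyhouse", "random", 0.00000),
]


def precision_at_fraction(ranked, fraction):
    """Share of non-random relations among the top `fraction` of the ranking."""
    k = max(1, int(len(ranked) * fraction))
    top = sorted(ranked, key=lambda row: -row[2])[:k]
    correct = sum(1 for _, relation, _ in top if relation != "random")
    return correct / k


precision = precision_at_fraction(ranking, 0.5)  # top 7 of 14 pairs
```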
2 Pattern-Based Semantic Similarity Measure
Related publications
This work stems from Hearst, M. A. Automatic acquisition of
hyponyms from large text corpora. In ACL, pages 539–545,
1992.
Selected publications:
Panchenko A., Morozova O., Naets H. A Semantic
Similarity Measure Based on Lexico-Syntactic Patterns.
In Proceedings of KONVENS 2012, pp.174–178, Vienna
(Austria), 2012
Panchenko A., Romanov P., Morozova O., Naets H.,
Philippovich A., Fairon C. Serelex: Search and
Visualization of Semantically Related Words. In
Proceedings of the 35th European Conference on Information
Retrieval (ECIR 2013), Moscow (Russia), 2013.
A live demo
http://serelex.cental.be/
Lexico-syntactic patterns
18 patterns that extract hypernyms, co-hyponyms and
synonyms
Patterns are encoded as FSTs
Finite State Transducers (FSTs)
Open source corpus processing tool Unitex:
http://igm.univ-mlv.fr/~unitex/
A pattern encoded as an FST
Takes linguistic variation into account, unlike string-based patterns (Bollegala et al., 2007).
Patterns extract concordances
such diverse {[occupations]} as {[doctors]}, {[engineers]} and {[scientists]}[PATTERN=1]
such {non-alcoholic [sodas]} as {[root beer]} and {[cream soda]}[PATTERN=1]
{traditional [food]}, such as {[sandwich]}, {[burger]}, and {[fry]}[PATTERN=2]
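A concordance in this markup can be parsed with a few regular expressions. The sketch assumes that curly braces delimit noun phrases, that square brackets inside them mark the head term, and that the trailing `[PATTERN=n]` tag gives the pattern number. This reading of the markup and the function name `parse_concordance` are assumptions, not a description of the Unitex output format.

```python
import re

NOUN_PHRASE_RE = re.compile(r"\{([^{}]*)\}")      # {...} noun phrase
HEAD_RE = re.compile(r"\[([^\[\]]+)\]")           # [...] head term inside a phrase
PATTERN_RE = re.compile(r"\[PATTERN=(\d+)\]\s*$")  # trailing pattern tag


def parse_concordance(line):
    """Return (head terms in order of appearance, pattern id or None)."""
    tag = PATTERN_RE.search(line)
    pattern_id = int(tag.group(1)) if tag else None
    heads = []
    for phrase in NOUN_PHRASE_RE.findall(line):
        head = HEAD_RE.search(phrase)
        if head:
            heads.append(head.group(1))
    return heads, pattern_id


example = (
    "such diverse {[occupations]} as {[doctors]}, "
    "{[engineers]} and {[scientists]}[PATTERN=1]"
)
heads, pattern_id = parse_concordance(example)
```

Counting how often two head terms appear in the same concordance would then give pairwise extraction frequencies.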
Corpus
Corpus Wikipedia+ukWaC: $2.9 \cdot 10^{12}$ tokens
Extracted concordances
Wikipedia – 1,196,468
ukWaC – 2,227,025
WaCypedia+ukWaC – 3,423,493
Reranking formula Efreq-Rnum-Cfreq-Pnum
$$s_{ij} = \sqrt{p_{ij}} \cdot \frac{2 \cdot \mu_b}{b_{i*} + b_{*j}} \cdot \frac{P(c_i, c_j)}{P(c_i) P(c_j)}.$$
$P(c_i, c_j) = \frac{e_{ij}}{\sum_{ij} e_{ij}}$ – extraction probability of the pair $\langle c_i, c_j \rangle$,
$e_{ij}$ – frequency of co-occurrence of $c_i$ and $c_j$ in concordances $K$
$P(c_i) = \frac{f_i}{\sum_i f_i}$ – probability of the term $c_i$, $f_i$ – frequency of $c_i$
$b_{i*} = \sum_{j : e_{ij} \ge \beta} 1$ – the number of extractions for term $c_i$ with the frequency $\ge \beta$, $\mu_b = \frac{1}{|C|} \sum_{i=1}^{|C|} b_{i*}$ – the average number of extractions per term
$p_{ij} \in [1; 18]$ – number of distinct patterns which extracted the relation $\langle c_i, c_j \rangle$
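The reranking formula can be sketched as follows. The sketch assumes that the extraction frequencies are symmetric and keyed by unordered pairs, so that $b_{*j}$ is computed like $b_{i*}$, and that the term frequency table covers the whole vocabulary $C$. The function name `rerank` and all counts are invented for illustration.

```python
import math
from collections import Counter


def rerank(e, f, p, beta=1):
    """Efreq-Rnum-Cfreq-Pnum score for every extracted pair.

    e: {frozenset({ci, cj}): extraction frequency e_ij}
    f: {term: corpus frequency f_i}
    p: {frozenset({ci, cj}): number of distinct patterns p_ij}
    """
    total_e = sum(e.values())
    total_f = sum(f.values())
    # b[term]: number of extractions of the term with frequency >= beta
    b = Counter()
    for pair, freq in e.items():
        if freq >= beta:
            for term in pair:
                b[term] += 1
    mu_b = sum(b[term] for term in f) / len(f)  # average extractions per term
    scores = {}
    for pair, freq in e.items():
        ci, cj = tuple(pair)
        denom = b[ci] + b[cj]
        if denom == 0:  # both terms fall below the frequency threshold
            scores[pair] = 0.0
            continue
        p_joint = freq / total_e
        p_i, p_j = f[ci] / total_f, f[cj] / total_f
        scores[pair] = math.sqrt(p[pair]) * (2 * mu_b / denom) * p_joint / (p_i * p_j)
    return scores


doc_phys = frozenset(("doctor", "physician"))
doc_nurse = frozenset(("doctor", "nurse"))
e = {doc_phys: 6, doc_nurse: 4}
f = {"doctor": 50, "physician": 20, "nurse": 30}
p = {doc_phys: 3, doc_nurse: 1}
scores = rerank(e, f, p)
```

In this toy example the pair extracted by three different patterns ends up ranked above the pair extracted by a single pattern.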
Semantic Relation Ranking
Precision is comparable to or better than the baselines;
Recall is lower than the baselines.
Figure : Precision-Recall graphs (the BLESS dataset).
Semantic Relation Extraction
Precision@1 ≈ 0.80;
“Good” coverage:
3 Comparison of Similarity Measures
Related publications
Panchenko A. A Study of Heterogeneous Similarity Measures for Semantic Relation Extraction. In JEP-TALN-RECITAL 2012, Grenoble (France), 2012.
Panchenko A., Similarity Measures for Semantic Relation Extraction. PhD thesis. Université catholique de Louvain. 197 pages, 2013: Chapters 2.1, 3.1.
Compared Semantic Similarity Measures
37 distinct measures;
Q1: Are the measures complementary?
Q2: If yes, in which respects?
The Best Single Measures (MC, RG, WordSim, BLESS, SN)
Each one extracts many co-hyponyms, e.g.:
⟨Canon, Nikon⟩,
⟨Lamborghini, Ferrari⟩,
⟨Obama, Romney⟩.
Further Results
Most dissimilar measures
Figure : 21 measures grouped according to their relation distributions.
Measures are complementary w.r.t.:
lexical coverage;
performance;
types of semantic relations they extract.
Implementation of the baseline measures
Semantic Vectors: https://code.google.com/p/semanticvectors/
S-Space Package: https://code.google.com/p/airhead-research/
WordNet::Similarity: http://wn-similarity.sourceforge.net
NLTK: http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html
WikiRelate!
PatternSim / Serelex: http://serelex.cental.be
Web-based metrics: http://cwl-projects.cogsci.rpi.edu/msr
LSA: http://lsa.colorado.edu
4 Hybrid Semantic Similarity Measures
Related publications
Panchenko A., Morozova O. A Study of Hybrid Similarity Measures for Semantic Relation Extraction. Innovative Hybrid Approaches to the Processing of Textual Data Workshop, EACL 2012, Avignon (France), 2012, pp. 10–18.
Panchenko A., Similarity Measures for Semantic Relation Extraction. PhD thesis. Université catholique de Louvain. 197 pages, 2013 (Chapter 4).
Panchenko A. A Study of Heterogeneous Similarity Measures for Semantic Relation Extraction. In JEP-TALN-RECITAL 2012, Grenoble (France), 2012, pp. 29–42.
Hybrid vs Single Measures
Figure : Semantic relation extractor based on:
(a) a single similarity measure (diagram labels: terms C, features, sim_i, S_i, norm, knn, relations R);
(b) a hybrid similarity measure (diagram labels: terms C, sim_1 ... sim_N, S_1 ... S_N, norm, combination method, S_cmb, knn, relations R).
16 Features = 16 Single Similarity Measures
5 network-based measures :
1 WuPalmer;
2 Leacock and Chodorow;
3 Resnik;
4 Jiang and Conrath;
5 Lin.
3 web-based measures (NGD-Yahoo/Bing/Google);
5 corpus-based measures:
2 distributional (BDA, SDA)
1 lexico-syntactic patterns (PatternSim)
2 other co-occurrence based (LSA, NGD-Factiva)
3 definition-based measures
1 ExtendedLesk;
2 GlossVectors;
3 DefVectors-WktWiki.
Unsupervised Combination Methods
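1 Mean: $s_{ij}^{cmb} = \frac{1}{K} \sum_{k=1}^{K} s_{ij}^{k}$;
2 Mean-Nnz: $s_{ij}^{cmb} = \frac{1}{|\{k : s_{ij}^{k} > 0,\ k = 1, \ldots, K\}|} \sum_{k=1}^{K} s_{ij}^{k}$;
3 Mean-Zscore: $S^{cmb} = \frac{1}{K} \sum_{k=1}^{K} \frac{S^{k} - \mu_k}{\sigma_k}$;
4 Median: $s_{ij}^{cmb} = \mathrm{median}(s_{ij}^{1}, \ldots, s_{ij}^{K})$;
5 Max: $s_{ij}^{cmb} = \max(s_{ij}^{1}, \ldots, s_{ij}^{K})$;
6 RankFusion: $s_{ij}^{cmb} = \frac{1}{K} \sum_{k=1}^{K} r_{ij}^{k}$;
7 RelationFusion (Panchenko and Morozova, 2012).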
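Methods 1 to 6 can be sketched with the standard library. Several details are assumptions of this sketch rather than statements from the slides: the population standard deviation is used for the z-score, ranks start at 1 for the most similar pair, and ties keep their input order. RelationFusion is not covered.

```python
import statistics


def combine_mean(scores):
    """Mean of the K scores of one pair."""
    return sum(scores) / len(scores)


def combine_mean_nnz(scores):
    """Sum of the scores divided by the number of non-zero scores."""
    nonzero = [s for s in scores if s > 0]
    return sum(scores) / len(nonzero) if nonzero else 0.0


def combine_median(scores):
    return statistics.median(scores)


def combine_max(scores):
    return max(scores)


def combine_mean_zscore(measures):
    """measures: K lists, each holding one measure's scores over the same pairs."""
    standardized = []
    for scores in measures:
        mu = statistics.mean(scores)
        sigma = statistics.pstdev(scores)
        standardized.append([(s - mu) / sigma if sigma > 0 else 0.0 for s in scores])
    return [sum(column) / len(measures) for column in zip(*standardized)]


def combine_rank_fusion(measures):
    """Mean rank of each pair across the K measures (rank 1 = most similar)."""
    all_ranks = []
    for scores in measures:
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        pair_ranks = [0] * len(scores)
        for rank, i in enumerate(order, start=1):
            pair_ranks[i] = rank
        all_ranks.append(pair_ranks)
    return [sum(column) / len(measures) for column in zip(*all_ranks)]


# Hypothetical scores of K = 2 measures over three term pairs.
measures = [[0.9, 0.1, 0.0], [0.6, 0.0, 0.2]]
per_pair = list(zip(*measures))  # [(0.9, 0.6), (0.1, 0.0), (0.0, 0.2)]
mean_scores = [combine_mean(s) for s in per_pair]
nnz_scores = [combine_mean_nnz(s) for s in per_pair]
fused_ranks = combine_rank_fusion(measures)
```

Mean-Nnz differs from Mean only for pairs that some measures do not cover, which is why it is of interest when the measures have different lexical coverage.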
Supervised Combination Methods
8 Logit, Logit-L1, Logit-L2.
A binary logistic regression;
Positive examples – synonyms, hyponyms, co-hyponyms from BLESS/SN;
Negative examples – random relations from BLESS/SN;
A relation $\langle c_i, t, c_j \rangle \in R$ is represented with a vector of pairwise similarities: $x = (s_{ij}^{1}, \ldots, s_{ij}^{N})$, $N = 2, \ldots, 16$;
Category $y_{ij}$:
$$y_{ij} = \begin{cases} 0 & \text{if } \langle c_i, t, c_j \rangle \text{ is a random relation} \\ 1 & \text{otherwise} \end{cases}$$
Using the model $(w_1, \ldots, w_K)$ for combination:
$$s_{ij}^{cmb} = \frac{1}{1 + e^{-z}}, \quad z = \sum_{k=1}^{K} w_k s_{ij}^{k} + w_0.$$
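A minimal sketch of the logistic combination is given below. The slides do not say how the model is fitted, so plain batch gradient descent without the L1 or L2 penalty is used here as an assumption. The training vectors are invented and stand in for the BLESS/SN examples.

```python
import math


def logit_score(similarities, weights, bias):
    """s_cmb = 1 / (1 + exp(-z)) with z = sum_k w_k * s_k + w_0."""
    z = bias + sum(w * s for w, s in zip(weights, similarities))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, dim = len(X), len(X[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            error = logit_score(x, weights, bias) - label
            grad_b += error
            for k in range(dim):
                grad_w[k] += error * x[k]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Hypothetical training data: two single measures per pair,
# label 1 for a semantic relation and 0 for a random relation.
X = [[0.9, 0.8], [0.7, 0.9], [0.1, 0.0], [0.0, 0.2]]
y = [1, 1, 0, 0]
weights, bias = fit_logit(X, y)
related = logit_score([0.8, 0.7], weights, bias)
unrelated = logit_score([0.05, 0.1], weights, bias)
```

The combined score stays in the interval (0, 1), so it can be passed to the kNN procedure like any single normalized measure.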
Supervised Combination Methods
9 SVM.
The weights $w$ and the support vectors $SV$:
$$w = \sum_{x_i \in SV} \alpha_i y_i x_i.$$
Using the model:
$$s_{ij}^{cmb} = w^T x + b = \sum_{k=1}^{K} w_k s_{ij}^{k} + b.$$
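The two SVM formulas translate directly into code. The sketch assumes a linear kernel and labels $y_i \in \{-1, +1\}$, which is the usual SVM convention. The support vectors, multipliers and bias are invented values, not the output of a trained model.

```python
def weights_from_support_vectors(support_vectors, alphas, labels):
    """w = sum over support vectors of alpha_i * y_i * x_i."""
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for x, alpha, label in zip(support_vectors, alphas, labels):
        for k in range(dim):
            w[k] += alpha * label * x[k]
    return w


def svm_score(similarities, w, b):
    """s_cmb = w^T x + b for one pair's vector of single-measure scores."""
    return sum(w_k * s_k for w_k, s_k in zip(w, similarities)) + b


# Hypothetical model with one positive and one negative support vector.
support_vectors = [[0.9, 0.8], [0.1, 0.2]]
alphas = [2.0, 2.0]
labels = [+1, -1]
w = weights_from_support_vectors(support_vectors, alphas, labels)
b = -1.4
positive = svm_score([0.9, 0.8], w, b)
negative = svm_score([0.1, 0.2], w, b)
```

Unlike the logistic score, this value is an unbounded signed distance from the separating hyperplane, so it may need normalization before it is compared with other measures.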
Hybrid Similarity Measures
Precision-Recall graphs calculated on the BLESS dataset:
(a) 16 single measures and the best hybrid measure Logit-E15;
(b) 8 hybrid measures.
Hybrid Similarity Measure Logit-E15
Figure : Similarity scores between 74 words related to the word “acacia”.
Supervised Hybrid Similarity Measures
Supervised Hybrid Similarity Measures (cont.)
Figure : Meta-parameter optimization with the grid search of the
C-SVM-radial-E15 measure.
5 Applications of Semantic Similarity Measures
Lexico-Semantic Search Engine “Serelex”
Related publications
Panchenko A., Romanov P., Morozova O., Naets H.,
Philippovich A., Fairon C. Serelex: Search and
Visualization of Semantically Related Words. In
Proceedings of the 35th European Conference on Information
Retrieval (ECIR 2013), Moscow (Russia), 2013.
Panchenko A., Naets H., Brouwers L., Romanov P., Fairon C.,
Recherche et visualisation de mots sémantiquement liés.
Actes de la 20e conférence sur le Traitement Automatique des
Langues Naturelles (TALN’2013). Les Sables d’Olonne,
France. pp.747–754, 2013.
Search for Related Words: the List and the Graph
http://serelex.cental.be/
Search for Related Words: the Images
Evaluation of Serelex
Figure : Users’ satisfaction with the top 20 results.
Filename Categorization System “iCOP”
Related publications
Panchenko A., Naets H., Beaufort R., Fairon C. Towards
Detection of Child Sexual Abuse Media: Classification of
the Associated Filenames. In Proceedings of the 35th
European Conference on Information Retrieval (ECIR 2013).
LNCS 7814, pp. 776-779. Springer-Verlag Berlin Heidelberg
2013.
Panchenko A., Beaufort R., Fairon C. Detection of Child
Sexual Abuse Media on P2P Networks: Normalization
and Classification of Associated Filenames. In
Proceedings of Workshop on Language Resources for Public
Security Applications of the 8th International Conference on
Language Resources and Evaluation (LREC), 2012
Short text classification with Vocabulary Projection
Evaluation of the Vocabulary Projection
Training Dataset             Test Dataset                 Accuracy   Accuracy (voc. projection)
Gallery (train)              Gallery                      96.41      96.83 (+0.42)
PirateBay Title+Desc+Tags    PirateBay Title+Desc+Tags    98.92      98.86 (–0.06)
PirateBay Title+Tags         PirateBay Title+Tags         97.73      97.63 (–0.10)
Gallery                      PirateBay Title+Desc+Tags    90.57      91.48 (+0.91)
Gallery                      PirateBay Title+Tags         84.23      88.89 (+4.66)
PirateBay Title+Desc+Tags    Gallery                      88.83      89.04 (+0.21)
PirateBay Title+Tags         Gallery                      91.16      91.30 (+0.14)
Table : Performance of a C-SVM linear classifier (10-fold cross validation).
Thank you! Questions?
Comparison Intelligent Electronic Assessment with Traditional Assessment for ...CSEIJJournal
 
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...cseij
 
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...cseij
 

Semelhante a Similarity Measures for Semantic Relation Extraction (20)

Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...
 
Correlation research design presentation 2015
Correlation research design presentation 2015Correlation research design presentation 2015
Correlation research design presentation 2015
 
Recommender system
Recommender systemRecommender system
Recommender system
 
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
 
An Efficient Semantic Relation Extraction Method For Arabic Texts Based On Si...
An Efficient Semantic Relation Extraction Method For Arabic Texts Based On Si...An Efficient Semantic Relation Extraction Method For Arabic Texts Based On Si...
An Efficient Semantic Relation Extraction Method For Arabic Texts Based On Si...
 
ppt research method 1.ppt
ppt research method 1.pptppt research method 1.ppt
ppt research method 1.ppt
 
Chapter 3
Chapter 3Chapter 3
Chapter 3
 
Correlation research
Correlation researchCorrelation research
Correlation research
 
Mixed methods-research -design-and-procedures
Mixed methods-research -design-and-proceduresMixed methods-research -design-and-procedures
Mixed methods-research -design-and-procedures
 
Measure Term Similarity Using a Semantic Network Approach
Measure Term Similarity Using a Semantic Network ApproachMeasure Term Similarity Using a Semantic Network Approach
Measure Term Similarity Using a Semantic Network Approach
 
Systematic Reviews, Tech Mining, and Other Knowledge Synthesis Beasts of Burden
Systematic Reviews, Tech Mining, and Other Knowledge Synthesis Beasts of BurdenSystematic Reviews, Tech Mining, and Other Knowledge Synthesis Beasts of Burden
Systematic Reviews, Tech Mining, and Other Knowledge Synthesis Beasts of Burden
 
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESNAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
 
SPSS statistics - get help using SPSS
SPSS statistics - get help using SPSSSPSS statistics - get help using SPSS
SPSS statistics - get help using SPSS
 
A Survey on Unsupervised Graph-based Word Sense Disambiguation
A Survey on Unsupervised Graph-based Word Sense DisambiguationA Survey on Unsupervised Graph-based Word Sense Disambiguation
A Survey on Unsupervised Graph-based Word Sense Disambiguation
 
Probit analysis in toxicological studies
Probit analysis in toxicological studies Probit analysis in toxicological studies
Probit analysis in toxicological studies
 
Correlational research
Correlational researchCorrelational research
Correlational research
 
Context Sensitive Relatedness Measure of Word Pairs
Context Sensitive Relatedness Measure of Word PairsContext Sensitive Relatedness Measure of Word Pairs
Context Sensitive Relatedness Measure of Word Pairs
 
Comparison Intelligent Electronic Assessment with Traditional Assessment for ...
Comparison Intelligent Electronic Assessment with Traditional Assessment for ...Comparison Intelligent Electronic Assessment with Traditional Assessment for ...
Comparison Intelligent Electronic Assessment with Traditional Assessment for ...
 
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
 
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
COMPARISON INTELLIGENT ELECTRONIC ASSESSMENT WITH TRADITIONAL ASSESSMENT FOR ...
 

Mais de Alexander Panchenko

Graph's not dead: from unsupervised induction of linguistic structures from t...
Graph's not dead: from unsupervised induction of linguistic structures from t...Graph's not dead: from unsupervised induction of linguistic structures from t...
Graph's not dead: from unsupervised induction of linguistic structures from t...Alexander Panchenko
 
Building a Web-Scale Dependency-Parsed Corpus from Common Crawl
Building a Web-Scale Dependency-Parsed Corpus from Common CrawlBuilding a Web-Scale Dependency-Parsed Corpus from Common Crawl
Building a Web-Scale Dependency-Parsed Corpus from Common CrawlAlexander Panchenko
 
Improving Hypernymy Extraction with Distributional Semantic Classes
Improving Hypernymy Extraction with Distributional Semantic ClassesImproving Hypernymy Extraction with Distributional Semantic Classes
Improving Hypernymy Extraction with Distributional Semantic ClassesAlexander Panchenko
 
Inducing Interpretable Word Senses for WSD and Enrichment of Lexical Resources
Inducing Interpretable Word Senses for WSD and Enrichment of Lexical ResourcesInducing Interpretable Word Senses for WSD and Enrichment of Lexical Resources
Inducing Interpretable Word Senses for WSD and Enrichment of Lexical ResourcesAlexander Panchenko
 
IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...
IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...
IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...Alexander Panchenko
 
Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...
Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...
Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...Alexander Panchenko
 
The 6th Conference on Analysis of Images, Social Networks, and Texts (AIST 2...
The 6th Conference on Analysis of Images, Social Networks, and Texts  (AIST 2...The 6th Conference on Analysis of Images, Social Networks, and Texts  (AIST 2...
The 6th Conference on Analysis of Images, Social Networks, and Texts (AIST 2...Alexander Panchenko
 
Using Linked Disambiguated Distributional Networks for Word Sense Disambiguation
Using Linked Disambiguated Distributional Networks for Word Sense DisambiguationUsing Linked Disambiguated Distributional Networks for Word Sense Disambiguation
Using Linked Disambiguated Distributional Networks for Word Sense DisambiguationAlexander Panchenko
 
Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...
Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...
Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...Alexander Panchenko
 
Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...
Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...
Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...Alexander Panchenko
 
Getting started in Apache Spark and Flink (with Scala) - Part II
Getting started in Apache Spark and Flink (with Scala) - Part IIGetting started in Apache Spark and Flink (with Scala) - Part II
Getting started in Apache Spark and Flink (with Scala) - Part IIAlexander Panchenko
 
IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...
IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...
IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...Alexander Panchenko
 

Mais de Alexander Panchenko (12)

Graph's not dead: from unsupervised induction of linguistic structures from t...
Graph's not dead: from unsupervised induction of linguistic structures from t...Graph's not dead: from unsupervised induction of linguistic structures from t...
Graph's not dead: from unsupervised induction of linguistic structures from t...
 
Building a Web-Scale Dependency-Parsed Corpus from Common Crawl
Building a Web-Scale Dependency-Parsed Corpus from Common CrawlBuilding a Web-Scale Dependency-Parsed Corpus from Common Crawl
Building a Web-Scale Dependency-Parsed Corpus from Common Crawl
 
Improving Hypernymy Extraction with Distributional Semantic Classes
Improving Hypernymy Extraction with Distributional Semantic ClassesImproving Hypernymy Extraction with Distributional Semantic Classes
Improving Hypernymy Extraction with Distributional Semantic Classes
 
Inducing Interpretable Word Senses for WSD and Enrichment of Lexical Resources
Inducing Interpretable Word Senses for WSD and Enrichment of Lexical ResourcesInducing Interpretable Word Senses for WSD and Enrichment of Lexical Resources
Inducing Interpretable Word Senses for WSD and Enrichment of Lexical Resources
 
IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...
IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...
IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Que...
 
Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...
Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...
Fighting with Sparsity of the Synonymy Dictionaries for Automatic Synset Indu...
 
The 6th Conference on Analysis of Images, Social Networks, and Texts (AIST 2...
The 6th Conference on Analysis of Images, Social Networks, and Texts  (AIST 2...The 6th Conference on Analysis of Images, Social Networks, and Texts  (AIST 2...
The 6th Conference on Analysis of Images, Social Networks, and Texts (AIST 2...
 
Using Linked Disambiguated Distributional Networks for Word Sense Disambiguation
Using Linked Disambiguated Distributional Networks for Word Sense DisambiguationUsing Linked Disambiguated Distributional Networks for Word Sense Disambiguation
Using Linked Disambiguated Distributional Networks for Word Sense Disambiguation
 
Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...
Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...
Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction...
 
Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...
Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...
Noun Sense Induction and Disambiguation using Graph-Based Distributional Sema...
 
Getting started in Apache Spark and Flink (with Scala) - Part II
Getting started in Apache Spark and Flink (with Scala) - Part IIGetting started in Apache Spark and Flink (with Scala) - Part II
Getting started in Apache Spark and Flink (with Scala) - Part II
 
IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...
IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...
IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain ...
 

Último

pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flyPRADYUMMAURYA1
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformationAreesha Ahmad
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsOrtegaSyrineMay
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICEayushi9330
 
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Youngkajalvid75
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Monika Rani
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLkantirani197
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 

Último (20)

pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 

Similarity Measures for Semantic Relation Extraction

  • 1. Similarity Measures for Semantic Relation Extraction. Montclair State University, Brown Bag Seminar (USA). Alexander Panchenko, Université catholique de Louvain & Digital Society Laboratory LLC, alexander.panchenko@uclouvain.be. May 2, 2014.
  • 2. Plan: 1 The Context and the Problem; 2 Pattern-Based Semantic Similarity Measure; 3 Comparison of Similarity Measures; 4 Hybrid Semantic Similarity Measures; 5 Applications of Semantic Similarity Measures.
  • 3. Plan: 1 The Context and the Problem; 2 Pattern-Based Semantic Similarity Measure; 3 Comparison of Similarity Measures; 4 Hybrid Semantic Similarity Measures; 5 Applications of Semantic Similarity Measures (Lexico-Semantic Search Engine “Serelex”; Filename Categorization System “iCOP”).
  • 4. Computational Lexical Semantics. * Picture is adapted from the Computational Linguistics LINGI2263 course: http://www.uclouvain.be/en-cours-2013-LINGI2263.html
  • 5. Introduction.
      Motivation:
      1. Synonyms, hypernyms and co-hyponyms are useful for text similarity (Šarić et al., 2012), query expansion (Hsu et al., 2006) and question answering (Sun et al., 2005).
      2. Manual resource construction is prohibitively expensive.
      3. Extractors do not match the quality of handcrafted resources.
      Focus: similarity-based semantic relation extraction.
      Research question: how can the precision and coverage of such measures be improved?
  • 6. Semantic Resources. Definition: a semantic resource is an undirected graph (C, R), where nodes C represent terms and edges R represent untyped semantic relations.
  • 7. Semantic Relation Extractors. We study extractors based on two components: (1) semantic similarity measures; (2) nearest neighbors procedures. [Figure: block diagram of a semantic relation extractor. Labelled elements: text-based data, terms C, feature extractor (F), similarity measure and normalizer (S), kNN procedure, semantic relations R.]
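The two components named on slide 7 (a similarity measure followed by a nearest-neighbors procedure) can be sketched as below. This is only an illustration, assuming a precomputed symmetric similarity matrix; the function name `knn_relations`, the toy terms and the scores are hypothetical and do not come from the talk.

```python
from typing import List, Set, Tuple

def knn_relations(terms: List[str],
                  sim: List[List[float]],
                  k: int = 1) -> Set[Tuple[str, str]]:
    """Link each term to its k most similar terms (untyped, undirected relations)."""
    relations = set()
    for i, ci in enumerate(terms):
        # Rank all other terms by decreasing similarity to ci.
        neighbours = sorted(
            (j for j in range(len(terms)) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )
        for j in neighbours[:k]:
            if sim[i][j] > 0:  # skip pairs with a zero score
                # Store the pair in sorted order because the graph is undirected.
                relations.add(tuple(sorted((ci, terms[j]))))
    return relations

# Toy data (hypothetical scores, for illustration only).
terms = ["doctor", "physician", "nurse", "statute"]
sim = [
    [1.0, 0.9, 0.6, 0.0],
    [0.9, 1.0, 0.5, 0.0],
    [0.6, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
relations = knn_relations(terms, sim, k=1)
print(sorted(relations))
```

With k = 1 the term "statute" yields no relation because all of its scores are zero, which mirrors the idea that most pairs are dissimilar.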
  • 8. Semantic Similarity Measures.
      Definition: a semantic similarity measure quantifies the semantic relatedness of input terms $c_i, c_j$ with the similarity score $s_{ij} = sim(c_i, c_j)$, where $s_{ij}$ is high if $\langle c_i, c_j \rangle$ is a pair of synonyms, hypernyms or co-hyponyms, and 0 otherwise.
      Properties:
      Nonnegativity: $0 \le s_{ij} \le 1$;
      Reflexivity: $s_{ij} = 1 \Leftrightarrow c_i = c_j$;
      Symmetry: $s_{ij} = s_{ji}$;
      Triangle inequality: $s_{ij} \le s_{ik} + s_{kj}$.
  • 9. Semantic Similarity Measures. Many dissimilar pairs, few similar pairs: $s_{ij} \sim \exp(\lambda)$. [Figure: similarity distribution of the term “doctor”.]
  • 10. Evaluation of Semantic Similarity Measures:
      1. Correlations with human judgments. Criterion: Pearson correlation (ρ) and Spearman correlation (r). Datasets: MC, RG, WordSim.
      2. Semantic relation ranking. Criterion: Precision, Recall, F-measure. Datasets: BLESS, SN.
      3. Semantic relation extraction. Criterion: Precision@k. Data: annotation and/or dictionaries.
      4. Application-based evaluation: short text classification system (iCOP); lexico-semantic search engine (Serelex).
      Reference: Panchenko A. Similarity Measures for Semantic Relation Extraction. PhD thesis, Université catholique de Louvain, 197 pages, 2013 (Chapter 1).
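The first evaluation criterion on slide 10 compares a measure's scores with human judgments using Pearson and Spearman correlation. A minimal pure-Python sketch follows; the helper names and the four toy score pairs are assumptions made for illustration, not data from MC, RG or WordSim.

```python
from math import sqrt
from typing import List, Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            result[idx] = average_rank
        pos = end + 1
    return result

def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical human judgments and measure scores for four word pairs.
human = [3.9, 3.5, 2.0, 0.4]
measure = [0.80, 0.30, 0.35, 0.01]
print(pearson(human, measure), spearman(human, measure))
```

Spearman correlation depends only on the ordering of the pairs, so it is less sensitive than Pearson correlation to the skewed score distribution of a similarity measure.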
  • 11. Correlations with human judgments.
  • 12. Semantic Relation Ranking. Precision: P(k = 50%) = 6/7 ≈ 0.86 (six of the seven top-ranked pairs, the top half of the 14 candidates, are correct).
      word $c_i$ | word $c_j$  | relation type | $s_{ij}$
      aficionado | enthusiast  | syn           | 0.07197
      aficionado | fan         | syn           | 0.05195
      aficionado | admirer     | syn           | 0.01964
      aficionado | addict      | syn           | 0.01326
      aficionado | devotee     | syn           | 0.01163
      aficionado | foundling   | random        | 0.00777
      aficionado | fanatic     | syn           | 0.00414
      aficionado | adherent    | syn           | 0.00353
      aficionado | capital     | random        | 0.00232
      aficionado | statute     | random        | 0.00029
      aficionado | blot        | random        | 0.00025
      aficionado | meddler     | random        | 0.00005
      aficionado | enlargement | random        | 0.00003
      aficionado | bawdyhouse  | random        | 0.00000
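The precision value on slide 12 can be recomputed from the table itself. The sketch below assumes that "k = 50" means the top 50% of the ranked candidates and that any non-random relation type counts as correct; the function name `precision_at_fraction` is hypothetical.

```python
from typing import List, Tuple

# (candidate word, relation type, similarity score) for the target "aficionado",
# copied from the table on slide 12.
candidates: List[Tuple[str, str, float]] = [
    ("enthusiast", "syn", 0.07197), ("fan", "syn", 0.05195),
    ("admirer", "syn", 0.01964), ("addict", "syn", 0.01326),
    ("devotee", "syn", 0.01163), ("foundling", "random", 0.00777),
    ("fanatic", "syn", 0.00414), ("adherent", "syn", 0.00353),
    ("capital", "random", 0.00232), ("statute", "random", 0.00029),
    ("blot", "random", 0.00025), ("meddler", "random", 0.00005),
    ("enlargement", "random", 0.00003), ("bawdyhouse", "random", 0.00000),
]

def precision_at_fraction(items: List[Tuple[str, str, float]],
                          fraction: float) -> float:
    """Precision among the top `fraction` of items ranked by decreasing score."""
    ranked = sorted(items, key=lambda item: item[2], reverse=True)
    k = int(len(ranked) * fraction)  # number of pairs kept
    kept = ranked[:k]
    correct = sum(1 for _, relation, _ in kept if relation != "random")
    return correct / k

p50 = precision_at_fraction(candidates, 0.5)
print(round(p50, 2))
```

Under these assumptions the top seven pairs contain six synonyms and one random pair ("foundling"), which gives 6/7.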
  • 14. Related publications.
      This work stems from: Hearst M. A. Automatic acquisition of hyponyms from large text corpora. In ACL, pp. 539–545, 1992.
      Selected publications:
      Panchenko A., Morozova O., Naets H. A Semantic Similarity Measure Based on Lexico-Syntactic Patterns. In Proceedings of KONVENS 2012, pp. 174–178, Vienna (Austria), 2012.
      Panchenko A., Romanov P., Morozova O., Naets H., Philippovich A., Fairon C. Serelex: Search and Visualization of Semantically Related Words. In Proceedings of the 35th European Conference on Information Retrieval (ECIR 2013), Moscow (Russia), 2013.
  • 15. A live demo: http://serelex.cental.be/
  • 16. Lexico-syntactic patterns: 18 patterns that extract hypernyms, co-hyponyms and synonyms.
  • 17. Patterns are encoded as FSTs: Finite State Transducers (FSTs), built with the open source corpus processing tool Unitex: http://igm.univ-mlv.fr/~unitex/
  • 18. A pattern encoded as an FST. FST patterns take linguistic variation into account, unlike string-based patterns (Bollegala et al., 2007).
  • 19. Patterns extract concordances:
      such diverse {[occupations]} as {[doctors]}, {[engineers]} and {[scientists]}[PATTERN=1]
      such {non-alcoholic [sodas]} as {[root beer]} and {[cream soda]}[PATTERN=1]
      {traditional[food]}, such as {[sandwich]},{[burger]}, and {[fry]}[PATTERN=2]
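The concordances on slide 19 carry enough markup to recover candidate term pairs. The sketch below assumes that curly braces delimit a noun phrase, that square brackets inside it mark the term to keep, and that the trailing `[PATTERN=n]` tag identifies the pattern; this reading of the markup and the function name `parse_concordance` are assumptions, and the code is not the Unitex-based implementation used in the talk.

```python
import re
from collections import Counter
from itertools import combinations
from typing import List, Optional, Tuple

NOUN_PHRASE = re.compile(r"\{([^{}]*)\}")          # text between curly braces
TERM = re.compile(r"\[([^\[\]]*)\]")               # bracketed term inside a phrase
PATTERN_ID = re.compile(r"\[PATTERN=(\d+)\]\s*$")  # trailing pattern tag

def parse_concordance(line: str) -> Tuple[Optional[int], List[str]]:
    """Return the pattern number and the bracketed terms of one concordance."""
    tag = PATTERN_ID.search(line)
    pattern_id = int(tag.group(1)) if tag else None
    found = []
    for phrase in NOUN_PHRASE.findall(line):
        term = TERM.search(phrase)
        if term:
            found.append(term.group(1))
    return pattern_id, found

concordances = [
    "such diverse {[occupations]} as {[doctors]}, {[engineers]} and {[scientists]}[PATTERN=1]",
    "such {non-alcoholic [sodas]} as {[root beer]} and {[cream soda]}[PATTERN=1]",
    "{traditional[food]}, such as {[sandwich]},{[burger]}, and {[fry]}[PATTERN=2]",
]

# Count how often each unordered pair of terms co-occurs in a concordance.
pair_counts: Counter = Counter()
for line in concordances:
    _, found_terms = parse_concordance(line)
    for a, b in combinations(found_terms, 2):
        pair_counts[tuple(sorted((a, b)))] += 1

print(parse_concordance(concordances[1]))
```

The resulting pair counts play the role of a co-occurrence frequency per term pair; no lemmatization is applied in this sketch.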
  • 20. Corpus. Wikipedia+ukWaC: $2.9 \cdot 10^{12}$ tokens. Extracted concordances: Wikipedia, 1,196,468; ukWaC, 2,227,025; WaCypedia+ukWaC, 3,423,493.
  • 21. Reranking formula (Efreq-Rnum-Cfreq-Pnum):
      $s_{ij} = \sqrt{p_{ij}} \cdot \frac{2 \cdot \mu_b}{b_{i*} + b_{*j}} \cdot \frac{P(c_i, c_j)}{P(c_i) P(c_j)}$, where
      $P(c_i, c_j) = \frac{e_{ij}}{\sum_{ij} e_{ij}}$ is the extraction probability of the pair $\langle c_i, c_j \rangle$, and $e_{ij}$ is the frequency of co-occurrence of $c_i$ and $c_j$ in the concordances $K$;
      $P(c_i) = \frac{f_i}{\sum_i f_i}$ is the probability of the term $c_i$, and $f_i$ is the frequency of $c_i$;
      $b_{i*} = \sum_{j : e_{ij} \ge \beta} 1$ is the number of extractions for the term $c_i$ with frequency $\ge \beta$;
      $\mu_b = \frac{1}{|C|} \sum_{i=1}^{|C|} b_{i*}$ is the average number of extractions per term;
      $p_{ij} \in [1; 18]$ is the number of distinct patterns which extracted the relation $\langle c_i, c_j \rangle$.
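The reranking formula on slide 21 can be written out as a short function. This is a sketch under stated assumptions: each unordered pair is stored once, $b_{*j}$ is taken to be the extraction count of $c_j$ (the slide defines only $b_{i*}$), $|C|$ is the number of terms with a known frequency, and the sum in $P(c_i, c_j)$ runs over the stored pairs. The function name `rerank` and all toy counts are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, Tuple

Pair = Tuple[str, str]

def rerank(e: Dict[Pair, int], p: Dict[Pair, int], f: Dict[str, int],
           beta: int = 1) -> Dict[Pair, float]:
    """Score pairs with sqrt(p_ij) * 2*mu_b / (b_i + b_j) * P(ci,cj) / (P(ci) P(cj))."""
    total_e = sum(e.values())  # normalizer for the pair probability
    total_f = sum(f.values())  # normalizer for the term probability

    # b[c]: number of extractions of term c with frequency >= beta.
    b: Dict[str, int] = defaultdict(int)
    for (ci, cj), freq in e.items():
        if freq >= beta:
            b[ci] += 1
            b[cj] += 1
    mu_b = sum(b.values()) / len(f)  # average number of extractions per term

    scores = {}
    for (ci, cj), freq in e.items():
        if b[ci] + b[cj] == 0:
            continue  # pair filtered out by the beta threshold
        pair_prob = freq / total_e
        independent_prob = (f[ci] / total_f) * (f[cj] / total_f)
        scores[(ci, cj)] = (sqrt(p[(ci, cj)])
                            * (2 * mu_b / (b[ci] + b[cj]))
                            * (pair_prob / independent_prob))
    return scores

# Hypothetical counts for three terms.
e = {("doctor", "physician"): 4, ("doctor", "nurse"): 2}  # co-occurrence frequency
p = {("doctor", "physician"): 2, ("doctor", "nurse"): 1}  # distinct patterns
f = {"doctor": 10, "physician": 5, "nurse": 5}            # term frequency
scores = rerank(e, p, f)
print(scores)
```

The raw scores are not confined to [0, 1] in this sketch; a separate normalization step would be needed to obtain that range.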
  • 22. The Problem Pattern-Based Measure Comparison Hybrid Measures Applications Semantic Relation Ranking Precision is comparable or better w.r.t. the baselines; Recall is lower w.r.t. the baselines. Figure : Precision-Recall graphs (the BLESS dataset). Alexander Panchenko 22/52
  • 23. Semantic Relation Extraction. Precision@1 ≈ 0.80; “good” coverage.
  • 25. Related publications:
      Panchenko A. A Study of Heterogeneous Similarity Measures for Semantic Relation Extraction. In JEP-TALN-RECITAL 2012, Grenoble (France), 2012.
      Panchenko A. Similarity Measures for Semantic Relation Extraction. PhD thesis, Université catholique de Louvain, 197 pages, 2013 (Chapters 2.1, 3.1).
  • 26. Compared Semantic Similarity Measures. 37 distinct measures. Q1: Are the measures complementary? Q2: If yes, in which respects?
  • 27. The Best Single Measures (MC, RG, WordSim, BLESS, SN). Each one extracts many co-hyponyms, e.g.: ⟨Canon, Nikon⟩, ⟨Lamborghini, Ferrari⟩, ⟨Obama, Romney⟩.
  • 28. Further Results. Most dissimilar measures. Figure: 21 measures grouped according to their relation distributions. Measures are complementary w.r.t.: lexical coverage; performance; types of semantic relations they extract.
  • 29. Implementation of the baseline measures:
      Semantic Vectors: https://code.google.com/p/semanticvectors/
      S-Space Package: https://code.google.com/p/airhead-research/
      WordNet::Similarity: http://wn-similarity.sourceforge.net
      NLTK: http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html
      WikiRelate!
      PatternSim / Serelex: http://serelex.cental.be
      Web-based metrics: http://cwl-projects.cogsci.rpi.edu/msr
      LSA: http://lsa.colorado.edu
  • 31. Related publications:
      Panchenko A., Morozova O. A Study of Hybrid Similarity Measures for Semantic Relation Extraction. Innovative Hybrid Approaches to the Processing of Textual Data Workshop, EACL 2012, Avignon (France), 2012, pp. 10–18.
      Panchenko A. Similarity Measures for Semantic Relation Extraction. PhD thesis, Université catholique de Louvain, 197 pages, 2013 (Chapter 4).
      Panchenko A. A Study of Heterogeneous Similarity Measures for Semantic Relation Extraction. In JEP-TALN-RECITAL 2012, Grenoble (France), 2012, pp. 29–42.
  • 32. Hybrid vs Single Measures. Figure (blocks shown: terms C, features, similarity measures sim_1 ... sim_N, norm, combination method, knn, relations R): Semantic relation extractor based on: (a) a single similarity measure; (b) a hybrid similarity measure.
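The last two stages of the extractor in the figure, normalization and kNN, might look as follows for a single measure. This is a sketch under assumptions and not the author's code: the slide does not say which normalization is used, so min-max rescaling is swapped in here, and "knn" is read as keeping the k most similar neighbours of each term. The names `min_max_normalize` and `knn_relations` and the toy scores are hypothetical.

```python
def min_max_normalize(sim):
    """Rescale the scores of a {(a, b): score} dict to the range [0, 1]."""
    lowest, highest = min(sim.values()), max(sim.values())
    span = highest - lowest
    return {pair: ((s - lowest) / span if span else 0.0) for pair, s in sim.items()}


def knn_relations(sim, k):
    """Keep, for every term, its k most similar neighbours as relations."""
    neighbours = {}
    for (a, b), score in sim.items():
        neighbours.setdefault(a, []).append((score, b))
        neighbours.setdefault(b, []).append((score, a))

    relations = set()
    for term, candidates in neighbours.items():
        for _, other in sorted(candidates, reverse=True)[:k]:
            # Relations are undirected, so store each pair in sorted order.
            relations.add(tuple(sorted((term, other))))
    return relations


if __name__ == "__main__":
    sim = {("cat", "dog"): 0.9, ("cat", "car"): 0.1, ("dog", "car"): 0.2}
    print(knn_relations(min_max_normalize(sim), k=1))
```

For a hybrid measure the same two functions would be applied to the combined scores instead of the scores of one measure.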
  • 33. 16 Features = 16 Single Similarity Measures:
      5 network-based measures: (1) WuPalmer; (2) Leacock and Chodorow; (3) Resnik; (4) Jiang and Conrath; (5) Lin.
      3 web-based measures (NGD-Yahoo/Bing/Google).
      5 corpus-based measures: 2 distributional (BDA, SDA); 1 based on lexico-syntactic patterns (PatternSim); 2 other co-occurrence based (LSA, NGD-Factiva).
      3 definition-based measures: (1) ExtendedLesk; (2) GlossVectors; (3) DefVectors-WktWiki.
  • 34. Unsupervised Combination Methods:
      1 Mean: $s^{cmb}_{ij} = \frac{1}{K} \sum_{k=1}^{K} s^{k}_{ij}$;
      2 Mean-Nnz: $s^{cmb}_{ij} = \frac{1}{|\{k : s^{k}_{ij} > 0,\ k = 1, \ldots, K\}|} \sum_{k=1}^{K} s^{k}_{ij}$;
      3 Mean-Zscore: $S^{cmb} = \frac{1}{K} \sum_{k=1}^{K} \frac{S_k - \mu_k}{\sigma_k}$;
      4 Median: $s^{cmb}_{ij} = \mathrm{median}(s^{1}_{ij}, \ldots, s^{K}_{ij})$;
      5 Max: $s^{cmb}_{ij} = \max(s^{1}_{ij}, \ldots, s^{K}_{ij})$;
      6 RankFusion: $s^{cmb}_{ij} = \frac{1}{K} \sum_{k=1}^{K} r^{k}_{ij}$;
      7 RelationFusion (Panchenko and Morozova, 2012).
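Methods 1 to 6 are simple enough to try on a toy example. The sketch below is illustrative only: it assumes each of the K measures is given as a list of scores over the same ordered list of term pairs, that rank 1 denotes the most similar pair (so a lower RankFusion value means more similar), and that ties in ranking are broken by position. The function names are hypothetical, and RelationFusion is not covered because the slide gives no formula for it.

```python
import statistics


def mean_nnz(column):
    """Average over the measures that returned a non-zero score."""
    nonzero = sum(1 for s in column if s > 0)
    return sum(column) / nonzero if nonzero else 0.0


def zscores(measure):
    """Standardize one measure: (S_k - mu_k) / sigma_k."""
    mu = statistics.mean(measure)
    sigma = statistics.pstdev(measure)
    return [((s - mu) / sigma if sigma else 0.0) for s in measure]


def ranks(measure):
    """Rank 1 is the most similar pair (assumption); ties keep input order."""
    order = sorted(range(len(measure)), key=lambda i: -measure[i])
    result = [0] * len(measure)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def combine(measures, method):
    """measures: K lists of scores, all over the same ordered list of pairs."""
    if method == "mean-zscore":
        measures = [zscores(m) for m in measures]
    elif method == "rankfusion":
        measures = [ranks(m) for m in measures]
    per_pair = {
        "mean": statistics.mean,
        "mean-nnz": mean_nnz,
        "mean-zscore": statistics.mean,
        "median": statistics.median,
        "max": max,
        "rankfusion": statistics.mean,
    }[method]
    # zip(*measures) yields, for each pair, its K scores.
    return [per_pair(column) for column in zip(*measures)]


if __name__ == "__main__":
    measures = [[0.9, 0.0, 0.3], [0.5, 0.4, 0.0]]
    for name in ("mean", "mean-nnz", "mean-zscore", "median", "max", "rankfusion"):
        print(name, combine(measures, name))
```

Mean-Nnz differs from Mean only for pairs that some measure does not cover, which is where the lexical coverage of the single measures matters.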
  • 35. Supervised Combination Methods:
      8 Logit, Logit-L1, Logit-L2: a binary logistic regression.
      Positive examples: synonyms, hyponyms, co-hyponyms from BLESS/SN.
      Negative examples: random relations from BLESS/SN.
      A relation $\langle c_i, t, c_j \rangle \in R$ is represented with a vector of pairwise similarities: $x = (s^{1}_{ij}, \ldots, s^{N}_{ij})$, $N = \overline{2, 16}$.
      Category $y_{ij}$: $y_{ij} = \begin{cases} 0 & \text{if } \langle c_i, t, c_j \rangle \text{ is a random relation} \\ 1 & \text{otherwise} \end{cases}$
      Using the model $(w_1, \ldots, w_K)$ for combination: $s^{cmb}_{ij} = \frac{1}{1 + e^{-z}}$, $z = \sum_{k=1}^{K} w_k s^{k}_{ij} + w_0$.
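The logistic combination can be illustrated without any external library. This is a minimal sketch, assuming plain batch gradient descent without the L1/L2 regularization of the Logit-L1 and Logit-L2 variants; the function names, the learning rate, the number of epochs, and the four toy training vectors are all hypothetical and do not come from BLESS or SN.

```python
import math


def logit_combine(x, w, w0):
    """s_cmb = 1 / (1 + exp(-z)), with z = sum_k w_k * s_k + w_0."""
    z = sum(wk * sk for wk, sk in zip(w, x)) + w0
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, y, lr=0.5, epochs=2000):
    """Unregularized logistic regression trained by batch gradient descent."""
    n_features = len(X[0])
    w, w0 = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_w0 = [0.0] * n_features, 0.0
        for x, target in zip(X, y):
            error = logit_combine(x, w, w0) - target
            grad_w0 += error
            for k in range(n_features):
                grad_w[k] += error * x[k]
        w0 -= lr * grad_w0 / len(X)
        w = [wk - lr * g / len(X) for wk, g in zip(w, grad_w)]
    return w, w0


if __name__ == "__main__":
    # Toy data: two similarity scores per pair; 1 = semantic relation, 0 = random.
    X = [[0.9, 0.8], [0.7, 0.9], [0.1, 0.2], [0.2, 0.0]]
    y = [1, 1, 0, 0]
    w, w0 = fit_logit(X, y)
    print(logit_combine([0.8, 0.8], w, w0), logit_combine([0.1, 0.1], w, w0))
```

The combined score is a probability-like value in (0, 1), so it can be thresholded or passed to the kNN step like any single measure.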
  • 36. Supervised Combination Methods:
      9 SVM. The weights $w$ and the support vectors $SV$: $w = \sum_{x_i \in SV} \alpha_i y_i x_i$.
      Using the model: $s^{cmb}_{ij} = w^{T} x + b = \sum_{k=1}^{K} w_k s^{k}_{ij} + b$.
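The two SVM formulas amount to a weighted sum of support vectors followed by a dot product. The sketch below only evaluates them; it does not train an SVM. It assumes the usual linear-SVM convention of labels in {+1, -1}, and the support vectors, dual coefficients, and bias in the example are made up for illustration.

```python
def svm_weights(support_vectors, alphas, labels):
    """w = sum over support vectors of alpha_i * y_i * x_i (labels assumed +1/-1)."""
    n_features = len(support_vectors[0])
    w = [0.0] * n_features
    for x, alpha, label in zip(support_vectors, alphas, labels):
        for k in range(n_features):
            w[k] += alpha * label * x[k]
    return w


def svm_combine(x, w, b):
    """s_cmb = w^T x + b, where x holds the K single-measure scores of a pair."""
    return sum(wk * sk for wk, sk in zip(w, x)) + b


if __name__ == "__main__":
    # Hypothetical support vectors and dual coefficients, for illustration only.
    w = svm_weights([[1.0, 1.0], [0.0, 0.0]], alphas=[1.0, 1.0], labels=[1, -1])
    print(w, svm_combine([0.9, 0.8], w, b=-1.0), svm_combine([0.1, 0.1], w, b=-1.0))
```

Unlike the logistic score, this value is an unbounded signed distance from the separating hyperplane, so it would need normalization before being compared with other measures.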
  • 37. Hybrid Similarity Measures. Precision-Recall graphs calculated on the BLESS dataset: (a) 16 single measures and the best hybrid measure Logit-E15; (b) 8 hybrid measures.
  • 38. Hybrid Similarity Measure Logit-E15. Figure: Similarity scores between 74 words related to the word “acacia”.
  • 39. Supervised Hybrid Similarity Measures.
  • 40. Supervised Hybrid Similarity Measures (cont.). Figure: Meta-parameter optimization with the grid search of the C-SVM-radial-E15 measure.
  • 43. Lexico-Semantic Search Engine “Serelex”. Related publications:
      Panchenko A., Romanov P., Morozova O., Naets H., Philippovich A., Fairon C. Serelex: Search and Visualization of Semantically Related Words. In Proceedings of the 35th European Conference on Information Retrieval (ECIR 2013), Moscow (Russia), 2013.
      Panchenko A., Naets H., Brouwers L., Romanov P., Fairon C. Recherche et visualisation de mots sémantiquement liés. Actes de la 20e conférence sur le Traitement Automatique des Langues Naturelles (TALN’2013), Les Sables d’Olonne, France, pp. 747–754, 2013.
  • 44. Lexico-Semantic Search Engine “Serelex”. Search for Related Words: the List and the Graph. http://serelex.cental.be/
  • 45. Lexico-Semantic Search Engine “Serelex”. Search for Related Words: the List and the Graph.
  • 46. Lexico-Semantic Search Engine “Serelex”. Search for Related Words: the Images.
  • 47. Lexico-Semantic Search Engine “Serelex”. Evaluation of Serelex. Figure: Users’ satisfaction with the top 20 results.
  • 49. Filename Categorization System “iCOP”. Related publications:
      Panchenko A., Naets H., Beaufort R., Fairon C. Towards Detection of Child Sexual Abuse Media: Classification of the Associated Filenames. In Proceedings of the 35th European Conference on Information Retrieval (ECIR 2013). LNCS 7814, pp. 776–779. Springer-Verlag Berlin Heidelberg, 2013.
      Panchenko A., Beaufort R., Fairon C. Detection of Child Sexual Abuse Media on P2P Networks: Normalization and Classification of Associated Filenames. In Proceedings of the Workshop on Language Resources for Public Security Applications of the 8th International Conference on Language Resources and Evaluation (LREC), 2012.
  • 50. Filename Categorization System “iCOP”. Short text classification with Vocabulary Projection.
  • 51. Filename Categorization System “iCOP”. Evaluation of the Vocabulary Projection.
      Training Dataset | Test Dataset | Accuracy | Accuracy (voc. projection)
      Gallery (train) | Gallery | 96.41 | 96.83 (+0.42)
      PirateBay Title+Desc+Tags | PirateBay Title+Desc+Tags | 98.92 | 98.86 (-0.06)
      PirateBay Title+Tags | PirateBay Title+Tags | 97.73 | 97.63 (-0.10)
      Gallery | PirateBay Title+Desc+Tags | 90.57 | 91.48 (+0.91)
      Gallery | PirateBay Title+Tags | 84.23 | 88.89 (+4.66)
      PirateBay Title+Desc+Tags | Gallery | 88.83 | 89.04 (+0.21)
      PirateBay Title+Tags | Gallery | 91.16 | 91.30 (+0.14)
      Table: Performance of a C-SVM linear classifier (10-fold cross validation).
  • 52. Thank you! Questions?