A Knowledge-Based Approach to
Word Sense Disambiguation
Submitted by:
Pradeep Sachdeva – 10104678
Surbhi Verma – 10104686
Supervisor:
Dr. Sandeep Kumar Singh
• Words in the English language often correspond to different meanings in different contexts. Such words are referred to as polysemous words (words having more than one sense).
• This project presents a knowledge-based algorithm for disambiguating polysemous words in any given sentence using the computational linguistics tool WordNet.
Problem Statement
Consider the following sentences:
• The album includes a few instrumental pieces.
• His efforts have been instrumental in solving the problem.
The word instrumental carries a different sense in each sentence.
The solution to the problem of WSD impacts other language-processing tasks such as:
• improving the relevance of search engines,
• anaphora resolution,
• coherence and inference.
WSD is an intermediate language-engineering technology which could improve applications such as information retrieval (IR).
Relevance of WSD
• Supervised Methods
• Unsupervised Methods
• Dictionary or Knowledge-Based Methods
Different Approaches
• Supervised methods are based on the assumption that
the context can provide enough evidence on its own to
disambiguate words. However, they are subject to a
new knowledge acquisition bottleneck since they rely
on substantial amounts of manually sense-tagged
corpora for training, which are laborious and expensive
to create.
• They depend crucially on the existence of manually
annotated examples for every word sense, a requisite
that can so far be met only for a handful of words for
testing purposes.
Supervised Methods
• In this approach the underlying assumption is that similar senses occur in similar contexts, and thus senses can be induced from text by clustering word occurrences using some measure of similarity of context (a toy sketch follows this slide). New occurrences of the word can then be classified into the closest induced clusters/senses.
• The performance of unsupervised methods is generally lower than that of the other approaches.
Unsupervised Methods
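To make the clustering idea above concrete, here is a toy sketch that is not part of this project: it induces two "senses" of the word bank by clustering bag-of-words context vectors, assuming scikit-learn is available.

# Toy illustration of sense induction by clustering context vectors
# (not part of this project; assumes scikit-learn).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import KMeans

contexts = [
    "the bank approved the loan and the mortgage",
    "deposit the cheque at the bank branch",
    "we sat on the grassy bank of the river",
    "the river bank was muddy after the rain",
]
X = CountVectorizer(stop_words="english").fit_transform(contexts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # occurrences of "bank" grouped into two induced senses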
• Knowledge-based methods rely primarily on dictionaries, thesauri, and lexical knowledge bases, without using any corpus evidence. Therefore, these methods do not require any kind of training corpus.
• These methods perform well and do not face the knowledge acquisition bottleneck, since no training data is required.
Knowledge-Based Methods
• WordNet is a lexical database for the English language which groups English words into sets of synonyms called synsets, provides short, general definitions, and records the various semantic relations between these synonym sets.
About WordNet
• Every synset contains a group of synonymous words or collocations; different senses of a word are in different synsets.
• The meaning of each synset is further clarified with a short defining gloss (a definition and/or example sentences).
• Most synonym sets are connected to other synsets via a number of semantic relations. A few of them include:
– hypernyms: Y is a hypernym of X if every X is a (kind of) Y (bird is a hypernym of parrot)
– hyponyms: Y is a hyponym of X if every Y is a (kind of) X (parrot is a hyponym of bird)
– meronyms: Y is a meronym of X if Y is a part of X (window is a meronym of building)
– holonyms: Y is a holonym of X if X is a part of Y (building is a holonym of window)
The synsets of the word sea are:
1. sea (synonyms): a division of an ocean or a large body of salt water partially enclosed by land
– hypernyms: body of water, water
– hyponyms: south sea
– meronyms: bay, inlet, recess, embayment, gulf
– holonyms: hydrosphere
2. sea, ocean (synonyms): anything apparently limitless in quantity or volume
– hypernyms: large indefinite amount, large indefinite quantity
3. sea (synonyms): turbulent water with swells of considerable size
– hypernyms: turbulent flow
– hyponyms: head sea
An example
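The listing above can be reproduced programmatically. Below is a minimal sketch assuming NLTK's WordNet interface; the slides only name WordNet, not a particular API.

# Minimal sketch: print the synsets of "sea" and their relations,
# assuming NLTK's WordNet interface (run nltk.download('wordnet') once).
from nltk.corpus import wordnet as wn

for synset in wn.synsets('sea'):
    print(synset.name(), '-', synset.definition())
    print('  synonyms :', synset.lemma_names())
    print('  hypernyms:', [s.name() for s in synset.hypernyms()])
    print('  hyponyms :', [s.name() for s in synset.hyponyms()])
    print('  meronyms :', [s.name() for s in synset.part_meronyms()])
    print('  holonyms :', [s.name() for s in synset.part_holonyms()])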
The algorithm computes an overall impact of
the following parameters on the similarity of
two words:
• Intersection
• Hierarchical Level
• Distance
Algorithm
[Figure: Venn diagram of the word families S1, S2 (target-word senses) and N (nearby word) at Level 1]
Intersection is computed as the number of overlapping words between the word families of the senses of the target word and those of the nearby word, at various levels of the hierarchy.
At Level 1:
Let us assume there are two senses of the target word, and let their word families be S1 and S2. Also let the word families of all the senses of a nearby word be represented by a single set N.
Intersection at Level 1
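A rough sketch of the Level 1 intersection is shown below, assuming NLTK's WordNet interface. The construction of a "word family" (here taken as a synset's lemma names plus its gloss words) is an assumption; the slides do not spell it out.

# Sketch of the Level 1 intersection (assumes NLTK's WordNet interface).
# "Word family" is taken as lemma names plus gloss words; this is an assumption.
from nltk.corpus import wordnet as wn

def word_family(synset):
    words = {w.lower() for w in synset.lemma_names()}
    words.update(synset.definition().lower().split())
    return words

target_senses = wn.synsets('sea')                                     # S1, S2, ...
nearby = set().union(*(word_family(s) for s in wn.synsets('ocean')))  # the set N

for sense in target_senses:
    overlap = len(word_family(sense) & nearby)   # intersection at Level 1
    print(sense.name(), overlap)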
[Figure: the Level 1 sets S1, S2 and N extended with their hypernym sets PS1, PS2 and PN]
Including the hypernyms at Level 2:
PS1, PS2 and PN are the parents (hypernyms) of S1, S2 and N respectively.
Intersection at Level 2
[Figure: the sets further extended with the second-level hypernyms P2S1, P2S2 and P2N]
Including the successive hypernyms at Level 3:
Intersection at Level 3
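The level-wise expansion can be sketched the same way: at each additional level, the word families of the current hypernyms (PS1, P2S1, and so on) are pooled in. Again, the exact construction is an assumption.

# Sketch of expanding a sense's word family up to a given level by pooling the
# word families of successive hypernyms (assumes NLTK; "word family" is an assumption).
from nltk.corpus import wordnet as wn

def word_family(synset):
    return {w.lower() for w in synset.lemma_names()} | set(synset.definition().lower().split())

def expanded_family(synsets, level):
    frontier, family = set(synsets), set()
    for _ in range(level):                 # level 1: the senses themselves,
        for s in frontier:                 # level 2: + parents, level 3: + grandparents
            family |= word_family(s)
        frontier = {h for s in frontier for h in s.hypernyms()}
    return family

s1 = wn.synsets('sea')[0]
print([len(expanded_family([s1], level)) for level in (1, 2, 3)])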
Score
We compute the overall impact of intersection, hierarchical level and distance on the degree of similarity between the target and nearby words.
We have devised the score formula as follows:

Score = Intersection^(1/k1) / (Level^k2 × Distance^(1/k3))

The values of k1, k2 and k3 have been experimentally determined as:
k1 = 3, k2 = 3, k3 = 3
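Reading the slide layout as a fraction (intersection in the numerator; level and distance in the denominator), which is a reconstruction rather than something the transcript states explicitly, the score can be written as a small function:

# Score formula as reconstructed from the slide layout (an assumption):
# score = intersection^(1/k1) / (level^k2 * distance^(1/k3)), with k1 = k2 = k3 = 3.

def score(intersection, level, distance, k1=3, k2=3, k3=3):
    if intersection == 0:
        return 0.0                         # no overlap contributes nothing
    return intersection ** (1 / k1) / (level ** k2 * distance ** (1 / k3))

# Example: an overlap of 8 words at Level 1 with a nearby word at distance 2
print(score(8, 1, 2))                      # 8^(1/3) / (1 * 2^(1/3)) ≈ 1.587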
Evaluation - SemCor
The algorithm has been evaluated on the SemCor dataset, the largest publicly available sense-tagged corpus, created at Princeton University.
SemCor has been automatically mapped to various versions of WordNet.
For every polysemous word in a sentence, SemCor provides the WordNet sense it corresponds to.
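For reference, SemCor's sense tags can be read with NLTK's SemCor corpus reader. This is a sketch under that assumption; the slides do not say how the corpus was accessed.

# Sketch: reading sense tags from SemCor via NLTK's corpus reader
# (an assumption; run nltk.download('semcor') once).
from nltk.corpus import semcor

first_sentence = semcor.tagged_sents(tag='sem')[0]
for chunk in first_sentence:
    if hasattr(chunk, 'label'):            # sense-tagged chunks carry a WordNet lemma label
        print(' '.join(chunk.leaves()), '->', chunk.label())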
The algorithm has been evaluated in the following three ways (a minimal accuracy sketch follows this slide):
Top 1 – the correct sense, i.e. the sense specified by SemCor, has been given the highest score by the algorithm and is ranked first.
Top 2 – the correct sense specified by SemCor is one of the top 2 scoring senses given by the algorithm.
Top 3 – the correct sense specified by SemCor is one of the top 3 scoring senses given by the algorithm.
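These three measures are just top-k accuracies over the ranked senses. A minimal bookkeeping sketch follows; the example data here is hypothetical.

# Minimal sketch of Top-1/Top-2/Top-3 accuracy over ranked sense predictions.
# `examples` is a hypothetical list of (ranked_senses, correct_sense) pairs,
# where ranked_senses is the algorithm's output ordered by decreasing score.

def top_k_accuracy(examples, k):
    hits = sum(1 for ranked, correct in examples if correct in ranked[:k])
    return hits / len(examples)

examples = [
    (['sea.n.01', 'sea.n.03', 'sea.n.02'], 'sea.n.01'),  # correct sense ranked 1st
    (['sea.n.03', 'sea.n.01', 'sea.n.02'], 'sea.n.01'),  # correct sense ranked 2nd
]
for k in (1, 2, 3):
    print('Top', k, ':', top_k_accuracy(examples, k))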
Comparison of results
Therefore, the algorithm performs better than the existing approaches in this area.