SlideShare uma empresa Scribd logo
1 de 14
Baixar para ler offline
Leveraging Sentiment to Compute
Word Similarity
Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak
Bhattacharyya
Dept. of Computer Science and Engineering, IIT Bombay
6th International Global Wordnet Conference
GWC 2011,
Matsue, Japan, Jan, 2012
Motivation
2










Motivation
3
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?







Motivation
4
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set






Motivation
5
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set
 Given a word in a sentence, create its Similarity Vector
 Use Word Sense Disambiguation on context to find its Synset-id
 Create a Gloss Vector (sparse) using its gloss
 Extend gloss using relevant WordNet Relations
 Learn the relations to use for different POS tags and the depth in WordNet
hierarchy
 Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
Motivation
6
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set
 Given a word in a sentence, create its Similarity Vector
 Use Word Sense Disambiguation on context to find its Synset-id
 Create a Gloss Vector (sparse) using its gloss
 Extend gloss using relevant WordNet Relations
 Learn the relations to use for different POS tags and the depth in WordNet
hierarchy
 Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
Sentiment-Semantic Correlation
7/23/2013Department of Computer Science and Engineering, IIT Bombay
7
Annotation Strategy Overall NOUN VERB ADJECTIVES ADVERBS
Meaning 0.768 0.803 0.750 0.527 0.759
Meaning + Sentiment 0.799 0.750 0.889 0.720 0.844
WordNet Relations used for
Expansion
7/23/2013Department of Computer Science and Engineering, IIT Bombay
8
POS WordNet relations used for expansion
Nouns hypernym, hyponym, nominalization
Verbs nominalization, hypernym, hyponym
Adjectives also see, nominalization, attribute
Adverbs derived
Scoring Formula
7/23/2013Department of Computer Science and Engineering, IIT Bombay
9
 ScoreSD(A) = SWNpos(A)- SWNneg(A)
 ScoreSM(A)= max(SWNpos(A), SWNneg(A))
 ScoreTM(A) =
sign(max(SWNpos(A), SWNneg(A)))∗ (1+abs(max(SWNpos(A),
SWNneg(A)))
SenSimx(A, B) = cosine (glossvec (sense(A)), glossvec (sense(B)))
Where,
glossvec =1:scorex(1) 2:scorex(2)… n:scorex(n)
scorex(Y) = Sentiment score of word Y using scoring function x
x = Scoring function of type SD/SM/TD/TM
Evaluation on Gold Standard Data:
Word Pair Similarity10
Evaluation on Gold Standard Data:
Word Pair Similarity11
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Evaluation on Gold Standard Data:
Word Pair Similarity12
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Agreement metric: Pearson correlation coefficient
Evaluation on Gold Standard Data:
Word Pair Similarity13
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Agreement metric: Pearson correlation coefficient
Metric Used Overall NOUN VERB ADJECTIVES ADVERBS
LESK (Banerjee et al., 2003) 0.22 0.51 -0.91 0.19 0.37
LIN (Lin, 1998) 0.27 0.24 0.00 NA Na
LCH (Leacock et al., 1998) 0.36 0.34 0.44 NA NA
SenSim (SD) 0.46 0.73 0.55 0.08 0.76
SenSim (SM) 0.50 0.62 0.48 0.06 0.54
SenSim (TD) 0.45 0.73 0.55 0.08 0.59
SenSim (TM) 0.48 0.62 0.48 0.06 0.78
Evaluation on Travel Review Data:
Feature Replacement
7/23/2013Department of Computer Science and Engineering, IIT Bombay
14
Metric Used Accuracy
(%)
PP NP PR NR
Baseline 89.10 91.50 87.07 85.18 91.24
LESK
(Banerjee et al., 2003)
89.36 91.57 87.46 85.68 91.25
LIN (Lin, 1998) 89.27 91.24 87.61 85.85 90.90
LCH
(Leacock et al., 1998)
89.64 90.48 88.86 86.47 89.63
SenSim (SD) 89.95 91.39 88.65 87.11 90.93
SenSim (SM) 90.06 92.01 88.38 86.67 91.58
SenSim (TD) 90.11 91.68 88.69 86.97 91.23

Mais conteúdo relacionado

Semelhante a Leveraging Sentiment to Compute Word Similarity

Sentiment+Analysis.ppt
Sentiment+Analysis.pptSentiment+Analysis.ppt
Sentiment+Analysis.pptvisheshs4
 
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...Enrico Santus Aversano
 
A Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment AnalysisA Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment AnalysisRichard Hogue
 
Analyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in PythonAnalyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in PythonAbhinav Gupta
 
An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.IJSRD
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skillsguesta59df6
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions SkillsCaroline Chua
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESJournal For Research
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysisishan0019
 
Word Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology ClassesWord Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology ClassesAndre Freitas
 
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnL6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnRwanEnan
 
DETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENTDETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENTWarNik Chow
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional SemanticsAndre Freitas
 
Word sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy wordsWord sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy wordsijnlc
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.pptBereketAraya
 

Semelhante a Leveraging Sentiment to Compute Word Similarity (20)

Sentiment+Analysis.ppt
Sentiment+Analysis.pptSentiment+Analysis.ppt
Sentiment+Analysis.ppt
 
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
 
Lac presentation
Lac presentationLac presentation
Lac presentation
 
A Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment AnalysisA Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment Analysis
 
Analyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in PythonAnalyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in Python
 
Detecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi LanguageDetecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi Language
 
An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Fyp ca2
Fyp ca2Fyp ca2
Fyp ca2
 
Word Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology ClassesWord Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology Classes
 
NLP
NLPNLP
NLP
 
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnL6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
 
DETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENTDETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENT
 
N01741100102
N01741100102N01741100102
N01741100102
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
 
Word sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy wordsWord sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy words
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt
 

Mais de Subhabrata Mukherjee

XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual ModelsXtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual ModelsSubhabrata Mukherjee
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Subhabrata Mukherjee
 
OpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product ProfilesOpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product ProfilesSubhabrata Mukherjee
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Subhabrata Mukherjee
 
Continuous Experience-aware Language Model
Continuous Experience-aware Language ModelContinuous Experience-aware Language Model
Continuous Experience-aware Language ModelSubhabrata Mukherjee
 
Experience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review CommunitiesExperience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review CommunitiesSubhabrata Mukherjee
 
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...Subhabrata Mukherjee
 
Leveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News CommunitiesLeveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News CommunitiesSubhabrata Mukherjee
 
People on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health ForumsPeople on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health ForumsSubhabrata Mukherjee
 
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...Subhabrata Mukherjee
 
Joint Author Sentiment Topic Model
Joint Author Sentiment Topic ModelJoint Author Sentiment Topic Model
Joint Author Sentiment Topic ModelSubhabrata Mukherjee
 
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in TwitterTwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in TwitterSubhabrata Mukherjee
 
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...Subhabrata Mukherjee
 
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...Subhabrata Mukherjee
 
Feature specific analysis of reviews
Feature specific analysis of reviewsFeature specific analysis of reviews
Feature specific analysis of reviewsSubhabrata Mukherjee
 
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...Subhabrata Mukherjee
 
Sentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse AnalysisSentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse AnalysisSubhabrata Mukherjee
 

Mais de Subhabrata Mukherjee (18)

XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual ModelsXtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
 
Fact Checking from Text
Fact Checking from TextFact Checking from Text
Fact Checking from Text
 
OpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product ProfilesOpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product Profiles
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
 
Continuous Experience-aware Language Model
Continuous Experience-aware Language ModelContinuous Experience-aware Language Model
Continuous Experience-aware Language Model
 
Experience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review CommunitiesExperience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review Communities
 
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
 
Leveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News CommunitiesLeveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News Communities
 
People on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health ForumsPeople on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health Forums
 
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
 
Joint Author Sentiment Topic Model
Joint Author Sentiment Topic ModelJoint Author Sentiment Topic Model
Joint Author Sentiment Topic Model
 
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in TwitterTwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
 
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
 
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
 
Feature specific analysis of reviews
Feature specific analysis of reviewsFeature specific analysis of reviews
Feature specific analysis of reviews
 
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
 
Sentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse AnalysisSentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse Analysis
 

Último

How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Último (20)

How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

Leveraging Sentiment to Compute Word Similarity

  • 1. Leveraging Sentiment to Compute Word Similarity Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya Dept. of Computer Science and Engineering, IIT Bombay 6th International Global Wordnet Conference GWC 2011, Matsue, Japan, Jan, 2012
  • 3. Motivation 3  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?       
  • 4. Motivation 4  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set      
  • 5. Motivation 5  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set  Given a word in a sentence, create its Similarity Vector  Use Word Sense Disambiguation on context to find its Synset-id  Create a Gloss Vector (sparse) using its gloss  Extend gloss using relevant WordNet Relations  Learn the relations to use for different POS tags and the depth in WordNet hierarchy  Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
  • 6. Motivation 6  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set  Given a word in a sentence, create its Similarity Vector  Use Word Sense Disambiguation on context to find its Synset-id  Create a Gloss Vector (sparse) using its gloss  Extend gloss using relevant WordNet Relations  Learn the relations to use for different POS tags and the depth in WordNet hierarchy  Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
  • 7. Sentiment-Semantic Correlation 7/23/2013Department of Computer Science and Engineering, IIT Bombay 7 Annotation Strategy Overall NOUN VERB ADJECTIVES ADVERBS Meaning 0.768 0.803 0.750 0.527 0.759 Meaning + Sentiment 0.799 0.750 0.889 0.720 0.844
  • 8. WordNet Relations used for Expansion 7/23/2013Department of Computer Science and Engineering, IIT Bombay 8 POS WordNet relations used for expansion Nouns hypernym, hyponym, nominalization Verbs nominalization, hypernym, hyponym Adjectives also see, nominalization, attribute Adverbs derived
  • 9. Scoring Formula 7/23/2013Department of Computer Science and Engineering, IIT Bombay 9  ScoreSD(A) = SWNpos(A)- SWNneg(A)  ScoreSM(A)= max(SWNpos(A), SWNneg(A))  ScoreTM(A) = sign(max(SWNpos(A), SWNneg(A)))∗ (1+abs(max(SWNpos(A), SWNneg(A))) SenSimx(A, B) = cosine (glossvec (sense(A)), glossvec (sense(B))) Where, glossvec =1:scorex(1) 2:scorex(2)… n:scorex(n) scorex(Y) = Sentiment score of word Y using scoring function x x = Scoring function of type SD/SM/TD/TM
  • 10. Evaluation on Gold Standard Data: Word Pair Similarity10
  • 11. Evaluation on Gold Standard Data: Word Pair Similarity11 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment
  • 12. Evaluation on Gold Standard Data: Word Pair Similarity12 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment Agreement metric: Pearson correlation coefficient
  • 13. Evaluation on Gold Standard Data: Word Pair Similarity13 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment Agreement metric: Pearson correlation coefficient Metric Used Overall NOUN VERB ADJECTIVES ADVERBS LESK (Banerjee et al., 2003) 0.22 0.51 -0.91 0.19 0.37 LIN (Lin, 1998) 0.27 0.24 0.00 NA Na LCH (Leacock et al., 1998) 0.36 0.34 0.44 NA NA SenSim (SD) 0.46 0.73 0.55 0.08 0.76 SenSim (SM) 0.50 0.62 0.48 0.06 0.54 SenSim (TD) 0.45 0.73 0.55 0.08 0.59 SenSim (TM) 0.48 0.62 0.48 0.06 0.78
  • 14. Evaluation on Travel Review Data: Feature Replacement 7/23/2013Department of Computer Science and Engineering, IIT Bombay 14 Metric Used Accuracy (%) PP NP PR NR Baseline 89.10 91.50 87.07 85.18 91.24 LESK (Banerjee et al., 2003) 89.36 91.57 87.46 85.68 91.25 LIN (Lin, 1998) 89.27 91.24 87.61 85.85 90.90 LCH (Leacock et al., 1998) 89.64 90.48 88.86 86.47 89.63 SenSim (SD) 89.95 91.39 88.65 87.11 90.93 SenSim (SM) 90.06 92.01 88.38 86.67 91.58 SenSim (TD) 90.11 91.68 88.69 86.97 91.23