Sentiment Analysis in Twitter with
Lightweight Discourse Analysis
Subhabrata Mukherjee, Pushpak Bhattacharyya
IBM Research Lab, India
Dept. of Computer Science and Engineering,
Indian Institute of Technology, Bombay
24th International Conference on Computational Linguistics
COLING 2012,
IIT Bombay, Mumbai, Dec 8 - Dec 15, 2012
Discourse
 An important component of language comprehension
in most natural language contexts involves
connecting clauses and phrases together in order to
establish a coherent discourse (Wolf et al., 2004).
 Presence of a discourse marker can alter the overall
sentiment of a sentence
 In most of the bag-of-words models, the discourse
markers are ignored as stop words during feature
vector creation
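A minimal sketch of the effect described above, assuming a hypothetical stop-word list (`STOP_WORDS`) and a hypothetical helper `bag_of_words`; neither comes from the slides. It shows how a plain bag-of-words model discards "but", and how an exception list keeps it.

```python
# Hypothetical stop-word list for illustration; it includes conjunctions
# such as "but" and "despite", as many general-purpose lists do.
STOP_WORDS = {"the", "a", "an", "it", "to", "in", "but", "despite", "if", "not"}

def bag_of_words(text, keep=frozenset()):
    """Count tokens, dropping stop words unless they are listed in `keep`."""
    counts = {}
    for token in text.lower().split():
        token = token.strip(".,!?")
        if not token or (token in STOP_WORDS and token not in keep):
            continue
        counts[token] = counts.get(token, 0) + 1
    return counts

tweet = "The movie looked promising, but it failed to make an impact"
plain = bag_of_words(tweet)                      # "but" is thrown away
with_markers = bag_of_words(tweet, keep={"but"})  # "but" survives as a feature
```

With the marker removed, nothing in the feature vector records that "promising" and "failed" sit on opposite sides of a contrast.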
Motivation
 i'm quite excited about Tintin, despite
not really liking original comics -
probably because Joe Cornish had a
hand in
 Think i'll stay with the whole 'sci-fi' shit
but this time...a classic movie.
Motivation Contd…
 Traditional works in discourse analysis use parsing of
some form like a discourse parser or a dependency
parser
 Most of these theories are well-founded for
structured text, and structured discourse annotated
corpora are available to train the models
 However, using these methods for micro-blog
discourse analysis poses some fundamental difficulties
 Micro-blogs, like Twitter, do not have any restriction
on the form and content of the user posts
 Users do not use formal language to communicate in
the micro-blogs. As a result, there are abundant
spelling mistakes, abbreviations, slang,
discontinuities and grammatical errors
 The errors cause natural language processing tools
like parsers and taggers to fail frequently
 Increased processing time adds an overhead to real-
time applications
Discourse Relation
 A coherently structured discourse is a
collection of sentences having some relation
with each other
 A coherent relation reflects how different
discourse segments interact
 Discourse segments are non-overlapping
spans of text
Discourse Coherent Relations
Coherence Relations     Conjunctions
Cause-effect            because; and so
Violated Expectations   although; but; while
Condition               if…(then); as long as; while
Similarity              and; (and) similarly
Contrast                by contrast; but
Temporal Sequence       (and) then; first, second, … before; after; while
Attribution             according to …; …said; claim that …; maintain that …; stated
Example                 for example; for instance
Elaboration             also; furthermore; in addition; note (furthermore) that;
                        (for, in, with) which; who; (for, in, on, against, with) whom
Generalization          in general
Contentful Conjunctions used to illustrate Coherence Relations (Wolf et al. 2005)
Discourse Coherent Relations Examples
1. Cause-effect: (YES! I hope she goes with Chris) so (I can freak out like I did with Emmy Awards.)
2. Violated Expectations: (i'm quite excited about Tintin), despite (not really liking original comics.)
3. Condition: If (MicroMax improved its battery life), (it wud hv been a gr8 product).
4. Similarity: (I lyk Nokia) and (Samsung as well).
5. Contrast: (my daughter is off school very poorly), but (brightened up when we saw you on gmtv today).
6. Temporal Sequence: (The film got boring) after a while.
7. Attribution: (Parliament is a sausage-machine: the world) according to (Kenneth Clarke).
8. Example: (Dhoni made so many mistakes…) for instance, (he shud’ve let Ishant bowl wn he was peaking).
9. Elaboration: In addition (to the worthless direction), (the story lacked depth too).
10. Generalization: In general, (movies made under the RGV banner) (are not worth a penny).
Discourse Relations and
Sentiment Analysis
 Not all discourse relations are significant for sentiment analysis
 Discourse relations essential for sentiment analysis
   Those that connect segments having contrasting information
     Violated Expectations
   Those that place higher importance on certain discourse segments
     Inferential Conjunctions
   Those that incorporate a hypothetical situation in the context
     Conditionals
 Semantic operators influencing discourse relations in sentiment analysis
   Those that incorporate a hypothetical situation in the context
     Modals
   Those that negate the information in the discourse segment
     Negation
Violated Expectations and
Contrast
 Violating expectation conjunctions oppose or refute
the neighboring discourse segment
 We categorize them into Conj_Fol and Conj_Prev
 Conj_Fol is the set of conjunctions that give more
importance to the discourse segment that follows them
 Conj_Prev is the set of conjunctions that give more
importance to the previous discourse segment
(i'm quite excited about Tintin), despite (not really liking
original comics.)
(my daughter is off school very poorly), but (brightened up when we
saw you on gmtv today).
Conclusive or Inferential
Conjunctions
 These are the set of conjunctions that
tend to draw a conclusion or inference
 Hence, the discourse segment following
them should be given more weight
@User I was not much satisfied with ur so-called good phone and
subsequently decided to reject it.
Conditionals
 Conditionals introduce a hypothetical situation in the
context
 The if…then…else constructs depict situations which
may or may not happen subject to certain conditions.
 In our work, the polarity of the discourse segment in
a conditional statement is toned down, in lexicon-
based classification
 In supervised classifiers, the conditionals are marked
as features
Modals
 Events that have happened, events that are happening or events that
are certain to occur are called realis events. Events that have possibly
occurred or have some probability to occur in the distant future are
called irrealis events. Modals depict irrealis events
 We divide the modals into two sub-categories: Strong_Mod and
Weak_Mod.
 Strong_Mod is the set of modals that express a higher degree of
uncertainty in any situation
 Weak_Mod is the set of modals that express a lesser degree of uncertainty
and more emphasis on certain events or situations
 In our work, the polarity of the discourse segment neighboring a strong
modal is toned down in lexicon-based classification
 In supervised classifiers, the modals are marked as features.
(Strong Modals): Unless I missed the announcement their God is
now featured on postage stamps, it might be a hard sell.
(Weak Modals): G.E 12 must be the most deadly General Election
for politicians ever.
Negation
 The negation operator inverts the sentiment of the word
following it
 The usual way of handling negation in SA is to consider a
window of size n (typically 3-5) and reverse the polarity of all
the words in the window
 (Negation): I do not like Nokia but I like Samsung
 We consider a negation window of size 5 and reverse the polarity of all the
words in the window, until either the window size is exceeded or a
violating expectation (or a contrast) conjunction is encountered
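The negation rule above can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: the function name `negation_flags` and the set `SCOPE_BREAKERS` are hypothetical, and only single-word conjunctions are handled.

```python
NEGATIONS = {"not", "neither", "never", "no", "nor"}
# Single-word conjunctions assumed to end the negation scope early.
SCOPE_BREAKERS = {"but", "however", "nevertheless", "otherwise", "yet", "still",
                  "nonetheless", "till", "until", "despite", "though", "although"}

def negation_flags(tokens, window=5):
    """Return one boolean per token: True if its polarity should be reversed."""
    flags = [False] * len(tokens)
    remaining = 0  # words still inside the current negation window
    for i, tok in enumerate(tokens):
        word = tok.lower()
        if word in NEGATIONS:
            remaining = window      # open a new window after the negation
        elif word in SCOPE_BREAKERS:
            remaining = 0           # a conjunction closes the window
        elif remaining > 0:
            flags[i] = True
            remaining -= 1
    return flags

tokens = "I do not like Nokia but I like Samsung".split()
flags = negation_flags(tokens)
flipped = [t for t, f in zip(tokens, flags) if f]
```

In the example tweet only the first "like" (and "Nokia") falls inside the window; the second "like" follows "but" and keeps its polarity.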
Features
Discourse Relations   Attributes
Conj_Fol              but, however, nevertheless, otherwise, yet, still, nonetheless
Conj_Prev             till, until, despite, in spite, though, although
Conj_Infer            therefore, furthermore, consequently, thus, as a result,
                      subsequently, eventually, hence
Conditionals          if
Strong_Mod            might, could, can, would, may
Weak_Mod              should, ought to, need not, shall, will, must
Neg                   not, neither, never, no, nor
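One way to hold the feature table in code is a dictionary of sets plus a longest-match lookup, so that multi-word markers such as "as a result" or "need not" are found before their single-word parts. The names `DISCOURSE_FEATURES` and `find_markers` are assumptions made for this sketch.

```python
DISCOURSE_FEATURES = {
    "Conj_Fol": {"but", "however", "nevertheless", "otherwise", "yet", "still", "nonetheless"},
    "Conj_Prev": {"till", "until", "despite", "in spite", "though", "although"},
    "Conj_Infer": {"therefore", "furthermore", "consequently", "thus", "as a result",
                   "subsequently", "eventually", "hence"},
    "Conditionals": {"if"},
    "Strong_Mod": {"might", "could", "can", "would", "may"},
    "Weak_Mod": {"should", "ought to", "need not", "shall", "will", "must"},
    "Neg": {"not", "neither", "never", "no", "nor"},
}

def find_markers(text):
    """Return (position, marker, category) triples, preferring the longest match."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    found, i = [], 0
    while i < len(tokens):
        step = 1
        for n in (3, 2, 1):  # the longest marker in the table has three words
            if i + n > len(tokens):
                continue
            phrase = " ".join(tokens[i:i + n])
            category = next((c for c, m in DISCOURSE_FEATURES.items() if phrase in m), None)
            if category is not None:
                found.append((i, phrase, category))
                step = n     # skip past the words consumed by the marker
                break
        i += step
    return found

markers = find_markers("I was not satisfied and, as a result, decided to reject it.")
```

Matching the longest phrase first matters for "need not", which would otherwise be read as the negation "not".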
Algorithm
 Conj_Fol, Conj_Infer: but, however, nevertheless, otherwise, yet,
still, nonetheless, therefore, furthermore,
consequently, thus, as a result, subsequently, eventually, hence
  • Words after them are given more weightage
  • Frequency count of those words is incremented by 1
  • The movie looked promising(+1), but it failed(-2) to make an impact in the box office
 Conj_Prev: till, until, despite, in spite, though, although
  • Words before them are given more weightage
  • Frequency count of those words is incremented by 1
  • India staged a marvelous victory(+2) down under despite all odds(-1).
 All sentences containing if are marked, in supervised classifiers
  • In lexicon-based classifiers, their weights are decreased
 All sentences containing strong modals are marked, in supervised
classifiers
  • In lexicon-based classifiers, their weights are decreased
 Negation
  • A window of 5 is considered
  • Polarity of all words in the window is reversed until another violating
    expectation conjunction is encountered
  • The polarity reversals are specially marked
  • I do not like(-1) Nokia but I like(+2) Samsung.
Algorithm Contd…
 Let a user post R consist of m sentences s_i (i = 1…m), where each s_i consists of n_i
words w_ij (i = 1…m, j = 1…n_i)
 Let f_ij be the weight of the word w_ij in sentence s_i, initialized to 1
 The weight of a word w_ij is adjusted according to the presence of a discourse
marker or a semantic operator
 Let flip_ij be a variable which indicates whether the polarity of w_ij should be
flipped or not
 Let hyp_ij be a variable which indicates the presence of a conditional or a strong
modal in s_i
 Input: Review R
 Output: w_ij, f_ij, flip_ij, hyp_ij
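The input/output contract above could be realised roughly as below. This is a sketch under stated assumptions, not the authors' implementation: the function name `annotate`, the naive sentence splitter, and the restriction to single-word markers are all choices made for illustration, and flip is encoded as 1 (keep) or -1 (reverse).

```python
import re

CONJ_AFTER = {"but", "however", "nevertheless", "otherwise", "yet", "still", "nonetheless",
              "therefore", "furthermore", "consequently", "thus", "subsequently",
              "eventually", "hence"}                      # Conj_Fol and Conj_Infer
CONJ_PREV = {"till", "until", "despite", "though", "although"}
HYPOTHETICAL = {"if", "might", "could", "can", "would", "may"}  # conditional, Strong_Mod
NEGATIONS = {"not", "neither", "never", "no", "nor"}

def annotate(post, window=5):
    """Return a list of (word, f, flip, hyp) tuples for every word of a user post."""
    annotated = []
    for sentence in re.split(r"[.!?]+", post):
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        if not words:
            continue
        f = [1] * len(words)                   # weights start at 1
        flip = [1] * len(words)                # 1 = keep polarity, -1 = reverse it
        hyp = int(any(w in HYPOTHETICAL for w in words))
        remaining = 0                          # words left in the negation window
        for k, w in enumerate(words):
            if w in CONJ_AFTER:                # boost every word after the marker
                for j in range(k + 1, len(words)):
                    f[j] += 1
            elif w in CONJ_PREV:               # boost every word before the marker
                for j in range(k):
                    f[j] += 1
            if w in NEGATIONS:
                remaining = window
            elif w in CONJ_AFTER or w in CONJ_PREV:
                remaining = 0                  # a conjunction closes the window
            elif remaining > 0:
                flip[k] = -1
                remaining -= 1
        annotated.extend(zip(words, f, flip, [hyp] * len(words)))
    return annotated

result = annotate("I do not like Nokia but I like Samsung.")
```

For the Nokia/Samsung tweet, the first "like" comes out with weight 1 and a reversed polarity, while the second "like" gets weight 2 and keeps its polarity.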
Lexical Classification
Bing Liu sentiment lexicon (Hu et al., 2004) is used to find the polarity pol(w_ij)
of a word w_ij
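The scoring formula itself is not preserved in this transcript, so the following is only a plausible sketch: each word contributes weight × flip × lexicon polarity, and words in hypothetical sentences are damped. The tiny `POLARITY` dictionary stands in for the Bing Liu lexicon, and the damping factor `hyp_factor=0.5` and the "neutral" label for a zero score are assumptions.

```python
# Tiny hypothetical polarity lexicon standing in for the Bing Liu lexicon.
POLARITY = {"like": 1, "excited": 1, "marvelous": 1, "promising": 1,
            "failed": -1, "boring": -1, "worthless": -1}

def post_polarity(annotated, hyp_factor=0.5):
    """Sum weighted word polarities over (word, f, flip, hyp) tuples.

    hyp_factor is an assumed damping factor for words in sentences that
    contain a conditional or a strong modal.
    """
    score = 0.0
    for word, f, flip, hyp in annotated:
        pol = POLARITY.get(word, 0)   # words missing from the lexicon count as 0
        if hyp:
            pol *= hyp_factor
        score += f * flip * pol
    return score

def classify(annotated):
    """Map the summed score to a label; 'neutral' for zero is an assumption."""
    score = post_polarity(annotated)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# "I do not like Nokia but I like Samsung": first "like" flipped, second boosted.
nokia = [("like", 1, -1, 0), ("nokia", 1, -1, 0), ("like", 2, 1, 0), ("samsung", 2, 1, 0)]
# "The movie looked promising, but it failed ...": "failed" follows "but".
movie = [("promising", 1, 1, 0), ("failed", 2, 1, 0)]
```

Here `classify(nokia)` gives "positive" and `classify(movie)` gives "negative", matching the annotated examples on the Algorithm slide.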
Supervised Classification
 Support Vector Machines are used with the following features:
   N-grams (N = 1, 2)
   Stop Word Removal (except discourse markers)
   Discourse Weight of Features: f_ij
   Modal and Conditional Indicators: hyp_ij
   Stemming
   Negation: flip_ij
   Emoticons
   Part-of-Speech Information
 Feature Space
   Lexeme: w_ij
   Sense-Space: Synset-id(w_ij)
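A rough sketch of how some of the listed features might be turned into a sparse vector for an SVM. The encoding is an assumption: the `NOT_` prefix for flipped words, the `HYP` indicator name, and the function `build_features` are hypothetical, and stemming, emoticons, part-of-speech tags and the sense space are left out.

```python
from collections import Counter

def build_features(annotated):
    """Turn (word, f, flip, hyp) tuples for one post into a sparse feature dict."""
    features = Counter()
    tokens = []
    for word, f, flip, hyp in annotated:
        token = "NOT_" + word if flip == -1 else word  # mark reversed polarity
        tokens.append(token)
        features[token] += f                           # unigram, discourse-weighted
    for left, right in zip(tokens, tokens[1:]):
        features[left + " " + right] += 1              # bigram (N = 2)
    if any(hyp for _, _, _, hyp in annotated):
        features["HYP"] = 1                            # modal/conditional indicator
    return dict(features)

annotated = [("not", 1, 1, 0), ("like", 1, -1, 0), ("nokia", 1, -1, 0),
             ("but", 1, 1, 0), ("like", 2, 1, 0), ("samsung", 2, 1, 0)]
feats = build_features(annotated)
```

The resulting dictionary can be fed to any vectorizer that accepts feature-name-to-value mappings before training a classifier.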
Datasets
 Dataset 1 (Twitter – Manually Annotated)
 8507 tweets over 2000 entities from 20 domains
 Annotated by 4 annotators into positive, negative and
objective classes
 Dataset 2 (Twitter – Auto Annotated)
 15,214 tweets collected and annotated based on hashtags
 Positive hashtags - #positive, #joy, #excited, #happy
 Negative hashtags - #negative, #sad, #depressed,
#gloomy, #disappointed
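Hashtag-based annotation of Dataset 2 can be illustrated with a small labelling function. How conflicting or missing hashtags were handled is not stated on the slide; returning `None` in those cases is an assumption of this sketch, as is the name `label_by_hashtag`.

```python
import re

POSITIVE_TAGS = {"#positive", "#joy", "#excited", "#happy"}
NEGATIVE_TAGS = {"#negative", "#sad", "#depressed", "#gloomy", "#disappointed"}

def label_by_hashtag(tweet):
    """Return 'positive', 'negative', or None if hashtags are absent or conflicting."""
    tags = {t.lower() for t in re.findall(r"#\w+", tweet)}
    has_pos = bool(tags & POSITIVE_TAGS)
    has_neg = bool(tags & NEGATIVE_TAGS)
    if has_pos == has_neg:   # neither kind present, or both kinds present
        return None
    return "positive" if has_pos else "negative"

labels = [label_by_hashtag(t) for t in (
    "Finally got my tickets #Excited",
    "Rain again all week #gloomy",
    "Mixed feelings today #happy #sad",
    "Just a plain tweet",
)]
```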
Datasets Contd…



Manually Annotated Dataset
#Positive   #Negative   #Objective Not Spam   #Objective Spam   Total
2548        1209        2757                  1993              8507
Auto Annotated Dataset
#Positive   #Negative   Total
7348        7866        15214
 Dataset 3 (Travel Domain - Balamurali et al., EMNLP
2011)
 Each word is manually tagged with its
disambiguated WordNet sense
 Contains 595 polarity tagged documents of each
Dataset Domains
Movie, Restaurant, Television, Politics, Sports, Education, Philosophy, Travel,
Books, Technology, Banking & Finance, Business, Music, Environment,
Computers, Automobiles, Cosmetics brands, Amusement parks, Eatables, History
Baselines
 Twitter
 C-Feel-It (Joshi et al., 2011, ACL)
 Travel Reviews
 Balamurali et al., 2011, EMNLP
 Iterative Word-Sense Disambiguation
Algorithm (Khapra et al., 2010, GWC) is
used to auto sense-annotate the words
Classification Results in Twitter
(Datasets 1 and 2)
Comparison with C-Feel-It (Joshi et al., ACL 2011)
[Results tables: Lexicon-based Classification; Supervised Classification]
Classification Results in Travel Reviews (Dataset 3)
Comparison with Balamurali et al. (EMNLP 2011)
Lexicon-based Classification
Sentiment Evaluation Criterion    Accuracy
Baseline Bag-of-Words Model       69.62
Bag-of-Words Model + Discourse    71.78
Supervised Classification
Systems                                                        Accuracy (%)
Baseline Accuracy (Only Unigrams)                              84.90
Balamurali et al., 2011 (Only IWSD Sense of Unigrams)          85.48
Balamurali et al., 2011 (Unigrams + IWSD Sense of Unigrams)    86.08
Unigrams + IWSD Sense of Unigrams + Discourse Features         88.13
Drawbacks
 Usage of a generic lexicon in lexeme feature space
 Lexicons do not have entries for interjections like wow,
duh etc. which are strong indicators of sentiment
 Noisy Text (luv, gr8, spams, …)
 Sparse feature space (140 chars) for supervised
classification
 70% accuracy of IWSD in sense space for travel review
classification
Drawbacks Contd…
 I wanted(+2) to follow my dreams and ambitions(+2) despite all the
obstacles(-1), but I did not succeed(-2).
 want and ambition will get polarity +2 each, as they appear
before despite; obstacle will get polarity -1; and not succeed will
get a polarity of -2, as it appears after but
 Overall polarity is +1, whereas the overall sentiment should be
negative
 We do not consider the positional importance of a discourse marker
in the sentence and consider all markers equally important
 It would be better to rank the discourse markers based on their
positional importance
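The arithmetic behind the +1 can be checked directly. The tuples below restate the weights from the slide; the lexicon polarities (+1 for "wanted", "ambitions" and "succeed", -1 for "obstacles") are assumed for illustration.

```python
# (word, weight, flip, assumed lexicon polarity)
words = [
    ("wanted",    2,  1,  1),   # before "despite": weight 2
    ("ambitions", 2,  1,  1),   # before "despite": weight 2
    ("obstacles", 1,  1, -1),   # after "despite", before "but": weight 1
    ("succeed",   2, -1,  1),   # after "but": weight 2, flipped by "not"
]
overall = sum(weight * flip * pol for _, weight, flip, pol in words)
```

The sum is 2 + 2 - 1 - 2 = +1, so the post is scored as positive even though the clause after "but" should dominate.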
 Thank You
 Questions ?

Mais conteúdo relacionado

Mais procurados

Translation Techniques from English into Romanian and Russin
Translation  Techniques from English into Romanian and RussinTranslation  Techniques from English into Romanian and Russin
Translation Techniques from English into Romanian and RussinElena Shapa
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelHemantha Kulathilake
 
Fidelityandtransparency translation
Fidelityandtransparency translationFidelityandtransparency translation
Fidelityandtransparency translationEr Animo
 
5. phases of nlp
5. phases of nlp5. phases of nlp
5. phases of nlpmonircse2
 
Translation procedures
Translation proceduresTranslation procedures
Translation proceduresdanyshrek2013
 
Minimalist program
Minimalist programMinimalist program
Minimalist programRabbiaAzam
 
Translation techniques presentation
Translation  techniques  presentationTranslation  techniques  presentation
Translation techniques presentationAngelo pizzuto
 
8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for TranslationRIILP
 
Trasnlation shift
Trasnlation shiftTrasnlation shift
Trasnlation shiftBuhsra
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishHemantha Kulathilake
 
Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...iwan_rg
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Rajnish Raj
 
Some problems of ambiguity in translation with reference to english and arabic
Some problems of ambiguity in translation with reference to english and arabicSome problems of ambiguity in translation with reference to english and arabic
Some problems of ambiguity in translation with reference to english and arabicfalah_hasan77
 
NLP_KASHK:Parsing with Context-Free Grammar
NLP_KASHK:Parsing with Context-Free Grammar NLP_KASHK:Parsing with Context-Free Grammar
NLP_KASHK:Parsing with Context-Free Grammar Hemantha Kulathilake
 
Evaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutionsEvaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutionsSajeed Mahaboob
 

Mais procurados (19)

translation method
translation methodtranslation method
translation method
 
Translation Techniques from English into Romanian and Russin
Translation  Techniques from English into Romanian and RussinTranslation  Techniques from English into Romanian and Russin
Translation Techniques from English into Romanian and Russin
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
 
translation
translationtranslation
translation
 
Fidelityandtransparency translation
Fidelityandtransparency translationFidelityandtransparency translation
Fidelityandtransparency translation
 
5. phases of nlp
5. phases of nlp5. phases of nlp
5. phases of nlp
 
Translation procedures
Translation proceduresTranslation procedures
Translation procedures
 
Minimalist program
Minimalist programMinimalist program
Minimalist program
 
Transposition
TranspositionTransposition
Transposition
 
Translation techniques presentation
Translation  techniques  presentationTranslation  techniques  presentation
Translation techniques presentation
 
Phrase structure grammar
Phrase structure grammarPhrase structure grammar
Phrase structure grammar
 
8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation
 
Trasnlation shift
Trasnlation shiftTrasnlation shift
Trasnlation shift
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for English
 
Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...
 
Some problems of ambiguity in translation with reference to english and arabic
Some problems of ambiguity in translation with reference to english and arabicSome problems of ambiguity in translation with reference to english and arabic
Some problems of ambiguity in translation with reference to english and arabic
 
NLP_KASHK:Parsing with Context-Free Grammar
NLP_KASHK:Parsing with Context-Free Grammar NLP_KASHK:Parsing with Context-Free Grammar
NLP_KASHK:Parsing with Context-Free Grammar
 
Evaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutionsEvaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutions
 

Destaque

Sentiment tool Project presentaion
Sentiment tool Project presentaionSentiment tool Project presentaion
Sentiment tool Project presentaionRavindra Chaudhary
 
Sentiment Analaysis on Twitter
Sentiment Analaysis on TwitterSentiment Analaysis on Twitter
Sentiment Analaysis on TwitterNitish J Prabhu
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Dev Sahu
 
10 roses
10 roses10 roses
10 roseshoabido
 
A Fuzzy Approach For Multi-Domain Sentiment Analysis
A Fuzzy Approach For Multi-Domain Sentiment AnalysisA Fuzzy Approach For Multi-Domain Sentiment Analysis
A Fuzzy Approach For Multi-Domain Sentiment AnalysisMauro Dragoni
 
Sentiment mining- The Design and Implementation of an Internet Public Opinion...
Sentiment mining- The Design and Implementation of an Internet PublicOpinion...Sentiment mining- The Design and Implementation of an Internet PublicOpinion...
Sentiment mining- The Design and Implementation of an Internet Public Opinion...Prateek Singh
 
Mike davies sentiment_analysis_presentation_backup
Mike davies sentiment_analysis_presentation_backupMike davies sentiment_analysis_presentation_backup
Mike davies sentiment_analysis_presentation_backupm1ked
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationVikas Jain
 
02. naive bayes classifier revision
02. naive bayes classifier   revision02. naive bayes classifier   revision
02. naive bayes classifier revisionJeonghun Yoon
 
"Naive Bayes Classifier" @ Papers We Love Bucharest
"Naive Bayes Classifier" @ Papers We Love Bucharest"Naive Bayes Classifier" @ Papers We Love Bucharest
"Naive Bayes Classifier" @ Papers We Love BucharestStefan Adam
 
Sentiment Analysis in Twitter with Lightweight Discourse Analysis

  • 1. Sentiment Analysis in Twitter with Lightweight Discourse Analysis. Subhabrata Mukherjee and Pushpak Bhattacharyya (IBM Research Lab, India; Dept. of Computer Science and Engineering, Indian Institute of Technology, Bombay). 24th International Conference on Computational Linguistics (COLING 2012), IIT Bombay, Mumbai, Dec 8 - Dec 15, 2012.
  • 5. Discourse
      - An important component of language comprehension in most natural language contexts involves connecting clauses and phrases together in order to establish a coherent discourse (Wolf et al., 2004).
      - The presence of a discourse marker can alter the overall sentiment of a sentence.
      - In most bag-of-words models, discourse markers are ignored as stop words during feature vector creation.
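To make the last point concrete, here is a minimal sketch of how stop-word removal can erase the information a discourse marker carries. The stop-word list and the two sentences are illustrative assumptions, not taken from the slides.

```python
# Illustrative stop-word list (an assumption); note that it contains "but".
STOP_WORDS = {"the", "was", "but", "a", "is"}

def bag_of_words(text, stop_words=STOP_WORDS):
    """Return the set of lowercase tokens that survive stop-word removal."""
    tokens = (t.strip(".,!?") for t in text.lower().split())
    return {t for t in tokens if t and t not in stop_words}

# Two hypothetical sentences whose overall sentiment differs only
# because of what follows the discourse marker "but".
a = bag_of_words("The plot was good but the acting was terrible")
b = bag_of_words("The acting was terrible but the plot was good")
same_features = (a == b)
```

Both sentences reduce to the same feature set, so a classifier built on these features alone cannot tell them apart.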
  • 6. Motivation
      - "i'm quite excited about Tintin, despite not really liking original comics - probably because Joe Cornish had a hand in"
      - "Think i'll stay with the whole 'sci-fi' shit but this time...a classic movie."
  • 7. Motivation (contd.)
      - Traditional work in discourse analysis uses parsing of some form, such as a discourse parser or a dependency parser.
      - Most of these theories are well-founded for structured text, and structured discourse-annotated corpora are available to train the models.
      - However, using these methods for micro-blog discourse analysis poses some fundamental difficulties.
  • 8. Motivation (contd.)
      - Micro-blogs, like Twitter, do not place any restriction on the form and content of user posts.
      - Users do not use formal language to communicate in micro-blogs. As a result, spelling mistakes, abbreviations, slang, discontinuities and grammatical errors are abundant.
      - These errors cause natural language processing tools like parsers and taggers to fail frequently.
      - Increased processing time adds an overhead to real-time applications.
  • 9. Discourse Relation
      - A coherently structured discourse is a collection of sentences having some relation with each other.
      - A coherent relation reflects how different discourse segments interact.
      - Discourse segments are non-overlapping spans of text.
  • 10. Discourse Coherent Relations
      Table: Contentful conjunctions used to illustrate coherence relations (Wolf et al. 2005)
      Coherence relation: Conjunctions
      - Cause-effect: because; and so
      - Violated Expectations: although; but; while
      - Condition: if…(then); as long as; while
      - Similarity: and; (and) similarly
      - Contrast: by contrast; but
      - Temporal Sequence: (and) then; first, second, …; before; after; while
      - Attribution: according to …; …said; claim that …; maintain that …; stated
      - Example: for example; for instance
      - Elaboration: also; furthermore; in addition; note (furthermore) that; (for, in, with) which; who; (for, in, on, against, with) whom
      - Generalization: in general
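One thing the table makes visible is that some conjunctions signal more than one relation. The sketch below encodes a simplified subset of the table (single words and short phrases only, with optional parts dropped, which is an assumption made for brevity) and inverts it to find the ambiguous conjunctions.

```python
from collections import defaultdict

# Simplified subset of the table above; the selection is an assumption.
RELATIONS = {
    "Cause-effect": ["because", "and so"],
    "Violated Expectations": ["although", "but", "while"],
    "Condition": ["as long as", "while"],
    "Similarity": ["and", "similarly"],
    "Contrast": ["by contrast", "but"],
    "Temporal Sequence": ["then", "before", "after", "while"],
}

def invert(relations):
    """Map each conjunction to the set of relations it can signal."""
    index = defaultdict(set)
    for relation, conjunctions in relations.items():
        for conjunction in conjunctions:
            index[conjunction].add(relation)
    return dict(index)

INDEX = invert(RELATIONS)
# Conjunctions that appear under more than one relation.
ambiguous = sorted(c for c, rels in INDEX.items() if len(rels) > 1)
```

In this subset, "but" and "while" each map to several relations, so a conjunction lookup alone does not determine the relation.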
  • 11. Discourse Coherent Relations: Examples
      1. Cause-effect: (YES! I hope she goes with Chris) so (I can freak out like I did with Emmy Awards.)
      2. Violated Expectations: (i'm quite excited about Tintin), despite (not really liking original comics.)
      3. Condition: If (MicroMax improved its battery life), (it wud hv been a gr8 product).
      4. Similarity: (I lyk Nokia) and (Samsung as well).
      5. Contrast: (my daughter is off school very poorly), but (brightened up when we saw you on gmtv today).
      6. Temporal Sequence: (The film got boring) after a while.
      7. Attribution: (Parliament is a sausage-machine: the world) according to (Kenneth Clarke).
      8. Example: (Dhoni made so many mistakes…) for instance, (he shud’ve let Ishant bowl wn he was peaking).
      9. Elaboration: In addition (to the worthless direction), (the story lacked depth too).
      10. Generalization: In general, (movies made under the RGV banner) (are not worth a penny).
  • 12. Discourse Relations and Sentiment Analysis
      - Not all discourse relations are significant for sentiment analysis.
      - Discourse relations essential for sentiment analysis:
          - Relations that connect segments having contrasting information: Violated Expectations
          - Relations that place higher importance on certain discourse segments: Inferential Conjunctions
          - Relations that incorporate a hypothetical situation in the context: Conditionals
      - Semantic operators influencing discourse relations in sentiment analysis:
          - Operators that incorporate a hypothetical situation in the context: Modals
          - Operators that negate the information in the discourse segment: Negation
  • 14. Violated Expectations and Contrast  Violating expectation conjunctions oppose or refute the neighboring discourse segment  We categorize them into Conj_Fol and Conj_Prev  Conj_Fol is the set of conjunctions that give more importance to the discourse segment that follows them  Conj_Prev is the set of conjunctions that give more importance to the previous discourse segment  Examples: (i'm quite excited about Tintin), despite (not really liking original comics.) (my daughter is off school very poorly), but (brightened up when we saw you on gmtv today).
  • 16. Conclusive or Inferential Conjunctions  These are the conjunctions that tend to draw a conclusion or inference  Hence, the discourse segment following them should be given more weight  Example: @User I was not much satisfied with ur so-called good phone and subsequently decided to reject it.
  • 17. Conditionals  Conditionals introduce a hypothetical situation in the context  The if…then…else constructs depict situations which may or may not happen subject to certain conditions  In our work, the polarity of the discourse segment in a conditional statement is toned down in lexicon-based classification  In supervised classifiers, the conditionals are marked as features
  • 19. Modals  Events that have happened, events that are happening or events that are certain to occur are called realis events. Events that have possibly occurred or have some probability to occur in the distant future are called irrealis events. Modals depict irrealis events  We divide the modals into two sub-categories: Strong_Mod and Weak_Mod  Strong_Mod is the set of modals that express a higher degree of uncertainty in any situation  Weak_Mod is the set of modals that express a lesser degree of uncertainty and more emphasis on certain events or situations  In our work, the polarity of the discourse segment neighboring a strong modal is toned down in lexicon-based classification  In supervised classifiers, the modals are marked as features  Example (Strong Modals): Unless I missed the announcement their God is now featured on postage stamps, it might be a hard sell.  Example (Weak Modals): G.E 12 must be the most deadly General Election for politicians ever.
  • 20. Negation  The negation operator inverts the sentiment of the word following it  The usual way of handling negation in SA is to consider a window of size n (typically 3-5) and reverse the polarity of all the words in the window  Example (Negation): I do not like Nokia but I like Samsung  We consider a negation window of size 5 and reverse the polarity of all the words in the window, until either the window size is exceeded or a violating expectation (or contrast) conjunction is encountered
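The negation rule on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `negation_flips`, whitespace tokenisation and the single-word marker sets are hypothetical choices; the window of 5 and the stop-at-conjunction rule come from the slide.

```python
# Negation words and the conjunctions that close a negation window
# (single-word markers only; multi-word markers such as "in spite" are omitted).
NEG = {"not", "neither", "never", "no", "nor"}
STOP_CONJ = {"but", "however", "nevertheless", "otherwise", "yet", "still",
             "nonetheless", "till", "until", "despite", "though", "although"}

def negation_flips(tokens, window=5):
    """Return a list of booleans: True where the token's polarity is flipped."""
    flips = [False] * len(tokens)
    remaining = 0  # words left in the current negation window
    for i, tok in enumerate(tokens):
        t = tok.lower()
        if t in NEG:
            remaining = window      # open a new window after the negation word
        elif t in STOP_CONJ:
            remaining = 0           # a contrast conjunction closes the window
        elif remaining > 0:
            flips[i] = True
            remaining -= 1
    return flips

tokens = "I do not like Nokia but I like Samsung".split()
flipped = [t for t, f in zip(tokens, negation_flips(tokens)) if f]
print(flipped)
```

In the slide's example only the first "like" (and "Nokia") falls inside the window, because "but" closes it before the second "like".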
  • 21. Features
     Listed as discourse relation: attributes
     Conj_Fol: but, however, nevertheless, otherwise, yet, still, nonetheless
     Conj_Prev: till, until, despite, in spite, though, although
     Conj_Infer: therefore, furthermore, consequently, thus, as a result, subsequently, eventually, hence
     Conditionals: if
     Strong_Mod: might, could, can, would, may
     Weak_Mod: should, ought to, need not, shall, will, must
     Neg: not, neither, never, no, nor
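As a rough illustration, the feature table can be held in a dictionary and matched against a tweet. The marker lists are taken from the slide; the name `find_markers`, the regex tokenisation and the whole-word matching strategy are assumptions made for this sketch.

```python
import re

# Marker lists copied from the slide; the dictionary layout is an assumption.
MARKERS = {
    "Conj_Fol": ["but", "however", "nevertheless", "otherwise", "yet", "still",
                 "nonetheless"],
    "Conj_Prev": ["till", "until", "despite", "in spite", "though", "although"],
    "Conj_Infer": ["therefore", "furthermore", "consequently", "thus",
                   "as a result", "subsequently", "eventually", "hence"],
    "Conditionals": ["if"],
    "Strong_Mod": ["might", "could", "can", "would", "may"],
    "Weak_Mod": ["should", "ought to", "need not", "shall", "will", "must"],
    "Neg": ["not", "neither", "never", "no", "nor"],
}

def find_markers(text):
    """Return (category, marker) pairs that occur in text as whole words."""
    words = re.findall(r"[a-z']+", text.lower())
    padded = " " + " ".join(words) + " "   # padding makes whole-word checks easy
    found = []
    for category, markers in MARKERS.items():
        for marker in markers:
            if f" {marker} " in padded:
                found.append((category, marker))
    return found

tweet = "i'm quite excited about Tintin, despite not really liking original comics"
print(find_markers(tweet))
```

Padding with spaces lets multi-word markers such as "as a result" be matched with the same check as single words.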
  • 32. Algorithm
     Conj_Fol, Conj_Infer: but, however, nevertheless, otherwise, yet, still, nonetheless, therefore, furthermore, consequently, thus, as a result, subsequently, eventually, hence
       • Words after them are given more weight
       • Frequency count of those words is incremented by 1
       • The movie looked promising (+1), but it failed (-2) to make an impact in the box-office
     Conj_Prev: till, until, despite, in spite, though, although
       • Words before them are given more weight
       • Frequency count of those words is incremented by 1
       • India staged a marvelous victory (+2) down under despite all odds (-1).
     All sentences containing if are marked, in supervised classifiers
       • In lexicon-based classifiers, their weights are decreased
     All sentences containing strong modals are marked, in supervised classifiers
       • In lexicon-based classifiers, their weights are decreased
     Negation
       • A window of 5 is considered
       • Polarity of all words in the window is reversed until another violating expectation conjunction is encountered
       • The polarity reversals are specially marked
       • I do not like (-1) Nokia but I like (+2) Samsung.
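The weighting rules for Conj_Fol, Conj_Infer and Conj_Prev might be sketched like this. It is a simplified reading of the slide, assuming that every word after (or before) the marker in the same sentence is affected; the names `discourse_weights`, `CONJ_AFTER` and `CONJ_BEFORE` are hypothetical, and multi-word markers ("as a result", "in spite") are left out for brevity.

```python
import re

# Single-word markers from the slide; set names are illustrative.
CONJ_AFTER = {"but", "however", "nevertheless", "otherwise", "yet", "still",
              "nonetheless", "therefore", "furthermore", "consequently",
              "thus", "subsequently", "eventually", "hence"}
CONJ_BEFORE = {"till", "until", "despite", "though", "although"}

def discourse_weights(tokens):
    """Start every weight at 1 and add 1 per marker that emphasises the word."""
    weights = [1] * len(tokens)
    for j, tok in enumerate(tokens):
        t = tok.lower()
        if t in CONJ_AFTER:            # emphasise the words that follow
            for k in range(j + 1, len(tokens)):
                weights[k] += 1
        elif t in CONJ_BEFORE:         # emphasise the words that precede
            for k in range(j):
                weights[k] += 1
    return weights

movie = re.findall(r"[A-Za-z']+", "The movie looked promising, but it failed")
india = re.findall(r"[A-Za-z']+",
                   "India staged a marvelous victory down under despite all odds")
movie_w = discourse_weights(movie)
india_w = discourse_weights(india)
print(list(zip(movie, movie_w)))
print(list(zip(india, india_w)))
```

With these rules "promising" keeps weight 1 and "failed" gets 2, while "victory" gets 2 and "odds" keeps 1, matching the magnitudes shown in the slide's examples.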
  • 33. Algorithm Contd…  Let a user post R consist of m sentences s_i (i=1…m), where each s_i consists of n_i words w_ij (i=1…m, j=1…n_i)  Let f_ij be the weight of the word w_ij in sentence s_i, initialized to 1  The weight of a word w_ij is adjusted according to the presence of a discourse marker or a semantic operator  Let flip_ij be a variable which indicates whether the polarity of w_ij should be flipped or not  Let hyp_ij be a variable which indicates the presence of a conditional or a strong modal in s_i  Input: Review R  Output: w_ij, f_ij, flip_ij, hyp_ij
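One possible shape for the algorithm's output is a record per word holding the word, its weight, its flip flag and its hypothetical flag. The sketch below is an assumption-laden illustration: the `Word` dataclass, the `annotate` function, the sentence splitter and the single-word marker sets are all hypothetical, and only the rules stated on the slides are applied.

```python
import re
from dataclasses import dataclass

CONJ_AFTER = {"but", "however", "nevertheless", "otherwise", "yet", "still",
              "nonetheless", "therefore", "furthermore", "consequently",
              "thus", "subsequently", "eventually", "hence"}
CONJ_BEFORE = {"till", "until", "despite", "though", "although"}
NEG = {"not", "neither", "never", "no", "nor"}
HYP = {"if", "might", "could", "can", "would", "may"}  # conditional + strong modals

@dataclass
class Word:
    w: str               # the word itself
    f: int = 1           # discourse weight, initialised to 1
    flip: bool = False   # True if the polarity should be reversed
    hyp: bool = False    # True if the sentence has a conditional or strong modal

def annotate(post, window=5):
    """Return one list of Word records per sentence of the post."""
    result = []
    for sentence in re.split(r"[.!?]+", post):
        tokens = re.findall(r"[a-z0-9']+", sentence.lower())
        if not tokens:
            continue
        words = [Word(t) for t in tokens]
        hyp = any(t in HYP for t in tokens)
        remaining = 0  # words left in the current negation window
        for j, t in enumerate(tokens):
            words[j].hyp = hyp
            if t in CONJ_AFTER:
                for k in range(j + 1, len(words)):
                    words[k].f += 1
            elif t in CONJ_BEFORE:
                for k in range(j):
                    words[k].f += 1
            if t in NEG:
                remaining = window
            elif t in CONJ_AFTER or t in CONJ_BEFORE:
                remaining = 0
            elif remaining > 0:
                words[j].flip = True
                remaining -= 1
        result.append(words)
    return result

nokia = annotate("I do not like Nokia but I like Samsung.")[0]
likes = [(x.w, x.f, x.flip) for x in nokia if x.w == "like"]
print(likes)
```

For the slide's negation example this yields a flipped "like" with weight 1 and an unflipped "like" with weight 2.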
  • 34. Lexical Classification  Bing Liu sentiment lexicon (Hu et al., 2004) is used to find the polarity pol(w_ij) of a word w_ij
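The transcript does not preserve the scoring formula, so the following is only one plausible way to combine pol(w_ij) with the weight, flip and hypothetical variables: a signed, weighted sum. The toy lexicon entries, the `hyp_discount` factor of 0.5 and the "neutral" label for a zero score are assumptions for illustration, not values from the slides or from the Bing Liu lexicon.

```python
# Toy stand-in for a sentiment lexicon: +1 positive, -1 negative (illustrative).
TOY_LEXICON = {"like": 1, "promising": 1, "failed": -1}

def pol(word):
    """Polarity of a word; 0 when the word is not in the lexicon."""
    return TOY_LEXICON.get(word.lower(), 0)

def classify(annotated, hyp_discount=0.5):
    """annotated: iterable of (word, f, flip, hyp) tuples for one post."""
    score = 0.0
    for word, f, flip, hyp in annotated:
        p = pol(word) * f          # weight the lexicon polarity
        if flip:
            p = -p                 # negation reverses the polarity
        if hyp:
            p *= hyp_discount      # tone down hypothetical segments (assumed factor)
        score += p
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

nokia = [("like", 1, True, False), ("like", 2, False, False)]
movie = [("promising", 1, False, False), ("failed", 2, False, False)]
print(classify(nokia), classify(movie))
```

The two calls reproduce the slide annotations: -1 + 2 for the Nokia/Samsung tweet and +1 - 2 for the movie sentence.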
  • 35. Supervised Classification  Support Vector Machines are used with the following features:  N-grams (N=1,2)  Stop Word Removal (except discourse markers)  Discourse Weight of Features: f_ij  Modal and Conditional Indicators: hyp_ij  Stemming  Negation: flip_ij  Emoticons  Part-of-Speech Information  Feature Space  Lexeme: w_ij  Sense-Space: Synset-id(w_ij)
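Two of the listed features, n-grams with N=1,2 and stop-word removal that spares discourse markers, can be sketched in a few lines. The stop-word and marker sets below are small made-up samples and `ngram_features` is a hypothetical name; stemming, emoticons, part-of-speech tags and the SVM itself are not shown.

```python
import re

# Small illustrative sets; a real system would use full lists.
DISCOURSE = {"but", "despite", "if", "not", "however", "although", "might"}
STOP_WORDS = {"i", "the", "a", "an", "it", "to", "is", "was", "and", "of",
              "but", "if", "not"}

def ngram_features(text):
    """Unigrams and bigrams after stop-word removal that keeps discourse markers."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS or t in DISCOURSE]
    bigrams = [" ".join(pair) for pair in zip(kept, kept[1:])]
    return kept + bigrams

feats = ngram_features("The movie looked promising, but it failed")
print(feats)
```

Because "but" survives stop-word removal, bigrams such as "but failed" remain available to the classifier.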
  • 38. Datasets  Dataset 1 (Twitter, Manually Annotated)  8507 tweets over 2000 entities from 20 domains  Annotated by 4 annotators into positive, negative and objective classes  Dataset 2 (Twitter, Auto Annotated)  15,214 tweets collected and annotated based on hashtags  Positive hashtags: #positive, #joy, #excited, #happy  Negative hashtags: #negative, #sad, #depressed, #gloomy, #disappointed
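Hashtag-based labelling of the kind used for Dataset 2 could look roughly like the sketch below. The hashtag sets are from the slide; the function name `label_by_hashtag` and the decision to discard tweets with no hashtag or with conflicting hashtags are assumptions, since the slide does not say how such cases were handled.

```python
import re

POSITIVE_TAGS = {"#positive", "#joy", "#excited", "#happy"}
NEGATIVE_TAGS = {"#negative", "#sad", "#depressed", "#gloomy", "#disappointed"}

def label_by_hashtag(tweet):
    """Return 'positive', 'negative', or None when the tweet cannot be labelled."""
    tags = {t.lower() for t in re.findall(r"#\w+", tweet)}
    has_pos = bool(tags & POSITIVE_TAGS)
    has_neg = bool(tags & NEGATIVE_TAGS)
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    return None  # no sentiment hashtag, or conflicting hashtags (assumed policy)

print(label_by_hashtag("Finally got the tickets #Excited"))
print(label_by_hashtag("Rain again #gloomy"))
```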
  • 40. Datasets Contd…
     Dataset 3 (Travel Domain, Balamurali et al., EMNLP 2011)
       • Each word is manually tagged with its disambiguated WordNet sense
       • Contains 595 polarity tagged documents of each
     Manually Annotated Dataset
       • #Positive: 2548
       • #Negative: 1209
       • #Objective Not Spam: 2757
       • #Objective Spam: 1993
       • Total: 8507
     Auto Annotated Dataset
       • #Positive: 7348
       • #Negative: 7866
       • Total: 15214
  • 41. Dataset Domains  Movie, Restaurant, Television, Politics, Sports, Education, Philosophy, Travel, Books, Technology, Banking & Finance, Business, Music, Environment, Computers, Automobiles, Cosmetics brands, Amusement parks, Eatables, History
  • 42. Baselines  Twitter  C-Feel-It (Joshi et al., 2011, ACL)  Travel Reviews  Balamurali et al., 2011, EMNLP  Iterative Word-Sense Disambiguation Algorithm (Khapra et al., 2010, GWC) is used to auto sense-annotate the words
  • 44. Classification Results in Twitter (Datasets 1 and 2)  Comparison with C-Feel-It (Joshi et al., ACL 2011)  Lexicon-based Classification  Supervised Classification  [result tables not captured in the transcript]
  • 53. Classification Results in Travel Reviews (Dataset 3): comparison with Balamurali et al. (EMNLP 2011)
     Lexicon-based Classification (Sentiment Evaluation Criterion: Accuracy)
       • Baseline Bag-of-Words Model: 69.62
       • Bag-of-Words Model + Discourse: 71.78
     Supervised Classification (Systems: Accuracy in %)
       • Baseline Accuracy (Only Unigrams): 84.90
       • Balamurali et al., 2011 (Only IWSD Sense of Unigrams): 85.48
       • Balamurali et al., 2011 (Unigrams + IWSD Sense of Unigrams): 86.08
       • Unigrams + IWSD Sense of Unigrams + Discourse Features: 88.13
  • 54. Drawbacks  Usage of a generic lexicon in lexeme feature space  Lexicons do not have entries for interjections like wow, duh etc., which are strong indicators of sentiment  Noisy Text (luv, gr8, spams, …)  Sparse feature space (140 chars) for supervised classification  70% accuracy of IWSD in sense space for travel review classification
  • 55. Drawbacks Contd…
     I wanted (+2) to follow my dreams and ambitions (+2) despite all the obstacles (-1), but I did not succeed (-2).
     want and ambition get polarity +2 each, as they appear before despite; obstacle gets polarity -1; not succeed gets polarity -2, as it appears after but
     Overall polarity is +1, whereas the overall sentiment should be negative
     We do not consider the positional importance of a discourse marker in the sentence and treat all markers as equally important
     It would be better to rank the discourse markers by their positional importance
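The arithmetic behind this failure case can be checked directly. The tuples below restate the slide's annotations as (word, lexicon polarity, weight, flipped); the representation is a hypothetical one chosen for this sketch.

```python
# (word, lexicon polarity, discourse weight, polarity flipped by negation)
annotated = [
    ("wanted",    +1, 2, False),  # before "despite": weight 2
    ("ambitions", +1, 2, False),  # before "despite": weight 2
    ("obstacles", -1, 1, False),  # after "despite", before "but": weight 1
    ("succeed",   +1, 2, True),   # after "but": weight 2, flipped by "not"
]

contributions = [p * f * (-1 if flip else 1) for _, p, f, flip in annotated]
total = sum(contributions)
print(contributions, total)
```

The contributions are +2, +2, -1 and -2, so the post is scored +1 even though its final clause, the one a reader would treat as decisive, is negative.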
  • 56.  Thank You  Questions ?