3. Roadmap: We Are Here
● Introduction to Sentiment Analysis
● Introduction to SentiWordNet
● Building of SentiWordNet
● Enhancements in 3.0
4. Introduction to Sentiment Analysis
● The task of identifying the opinion expressed by a document.
● Can be carried out at various levels:
○ Word level
○ Sentence level
○ Document level
○ Aspect level, etc.
5. Tasks in Sentiment Analysis
● Determining Text SO-Polarity
○ Subjective vs. Objective
● Determining Text PN-Polarity
○ Positive vs. Negative
● Determining Strength of Text PN-Polarity
○ Weakly Positive vs. Strongly Positive
○ Weakly Negative vs. Strongly Negative
○ Star Rating
9. Introduction to SentiWordNet
● SentiWordNet is a sentiment lexicon that associates sentiment information with each WordNet synset.
● SentiWordNet = WordNet + Sentiment Information
10. Sentiment Information
For each WordNet synset s, the following information is available in SentiWordNet:
● Positive Score Pos(s)
● Negative Score Neg(s)
● Objective Score Obj(s)
Pos(s) + Neg(s) + Obj(s) = 1
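The constraint above means only two of the three scores are independent. A minimal sketch of a record that stores Pos(s) and Neg(s) and derives Obj(s) follows; the class name `SynsetScores` and its fields are hypothetical and not part of any SentiWordNet distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynsetScores:
    """Hypothetical container for the scores of one synset."""
    pos: float  # Pos(s)
    neg: float  # Neg(s)

    def __post_init__(self):
        # Both scores must be non-negative and leave room for Obj(s) >= 0.
        if self.pos < 0 or self.neg < 0 or self.pos + self.neg > 1:
            raise ValueError("need pos >= 0, neg >= 0 and pos + neg <= 1")

    @property
    def obj(self) -> float:
        # Obj(s) follows from Pos(s) + Neg(s) + Obj(s) = 1.
        return 1.0 - self.pos - self.neg


scores = SynsetScores(pos=0.625, neg=0.125)
print(scores.obj)
```

Deriving Obj(s) instead of storing it keeps the three values consistent by construction.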
12. Building SentiWordNet
● Trained a set of 8 ternary (P vs. N vs. O) classifiers, differing in
○ Training set
○ Learning algorithm
● Scored each synset by the fraction of classifiers voting for each label:
○ P score = Number of classifiers stating Positive / 8
○ N score = Number of classifiers stating Negative / 8
○ O score = Number of classifiers stating Objective / 8
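The scoring rule above can be sketched as a simple vote count. The function name `committee_scores` and the single-letter labels are illustrative assumptions; the votes below are made up.

```python
from collections import Counter


def committee_scores(votes):
    """Turn a list of ternary votes ("P", "N" or "O") into P/N/O scores.

    Each score is the fraction of classifiers that voted for that label,
    so the three scores always sum to 1.
    """
    counts = Counter(votes)
    total = len(votes)
    return {label: counts[label] / total for label in ("P", "N", "O")}


# Made-up example: 5 of 8 classifiers say Positive, 1 Negative, 2 Objective.
example_votes = ["P", "P", "P", "P", "P", "N", "O", "O"]
example_scores = committee_scores(example_votes)
print(example_scores)
```

With 8 classifiers, every score is a multiple of 1/8 (0.125).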
13. Classifiers: Training Sets
● Used a semi-supervised approach starting with a seed set of paradigmatic synsets (such as nice, nasty, etc.)
● Performed ‘k’ iterations of expansion using WordNet lexical relations:
○ Direct antonymy
○ Similarity
○ Derived from
○ Pertains to
○ Attribute
○ Also see
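The expansion step can be sketched on a toy graph. This sketch assumes that direct antonymy moves a synset to the opposite set while the other relations keep it in the same set; the slide does not spell this out. The dictionaries `same_orientation` and `antonyms` and the words in them are made-up stand-ins for real WordNet relations.

```python
def expand_seeds(pos_seeds, neg_seeds, same_orientation, antonyms, k):
    """Grow the positive and negative training sets for k iterations.

    same_orientation: dict mapping a synset to synsets assumed to share its
                      orientation (similarity, also-see, and so on).
    antonyms:         dict mapping a synset to its direct antonyms, which
                      are assumed to have the opposite orientation.
    """
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(k):
        new_pos, new_neg = set(pos), set(neg)
        for s in pos:
            new_pos.update(same_orientation.get(s, ()))
            new_neg.update(antonyms.get(s, ()))
        for s in neg:
            new_neg.update(same_orientation.get(s, ()))
            new_pos.update(antonyms.get(s, ()))
        pos, neg = new_pos, new_neg
    return pos, neg


# Toy relations (hypothetical, not extracted from WordNet).
same_orientation = {"nice": ["pleasant"], "pleasant": ["agreeable"], "nasty": ["awful"]}
antonyms = {"nice": ["nasty"], "pleasant": ["unpleasant"]}

pos2, neg2 = expand_seeds({"nice"}, {"nasty"}, same_orientation, antonyms, k=2)
print(sorted(pos2), sorted(neg2))
```

Each extra iteration reaches synsets further from the seeds, which enlarges the training set at the cost of noisier labels.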
15. Classifiers: Learning Algorithms
● The learning algorithms used were:
○ SVM
○ Rocchio
● Thus all combinations of 4 training sets and 2 learners yield 8 classifiers
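The count of 8 is simply the Cartesian product of the two choices. In the sketch below the training-set labels `train_1` to `train_4` are hypothetical placeholders; only the learner names come from the slide.

```python
from itertools import product

training_sets = ["train_1", "train_2", "train_3", "train_4"]  # placeholder names
learners = ["SVM", "Rocchio"]

# One classifier configuration per (training set, learner) pair.
classifier_configs = list(product(training_sets, learners))
print(len(classifier_configs))
```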
16. Classifiers: Assigning Categories
● Each ternary classifier is a combination of 2 binary classifiers:
○ Positive (P) vs. Not Positive (NP)
○ Negative (N) vs. Not Negative (NN)
● Categories are assigned as:
        P           NP
N       Objective   Negative
NN      Positive    Objective
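The category assignment can be written as a small function of the two binary decisions. The function name `ternary_label` is an illustrative assumption.

```python
def ternary_label(is_positive: bool, is_negative: bool) -> str:
    """Combine the two binary decisions into one ternary category.

    A synset judged both Positive and Negative, or neither, is Objective.
    """
    if is_positive and not is_negative:
        return "Positive"
    if is_negative and not is_positive:
        return "Negative"
    return "Objective"


for p in (True, False):
    for n in (True, False):
        print(p, n, ternary_label(p, n))
```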
17. Classifiers: Observations
● Effect of ‘k’:
○ Low ‘k’ -> Low Recall, High Precision
○ High ‘k’ -> High Recall, Low Precision
● Effect of learning algorithm:
○ SVM -> Favours set with higher cardinality
○ Rocchio -> Equal prior probabilities
18. Statistical Results: Average Scores
Part of Speech   Positive   Negative   Objective
Adjectives       0.106      0.151      0.743
Nouns            0.022      0.034      0.944
Verbs            0.026      0.034      0.940
Adverbs          0.235      0.067      0.698
All              0.043      0.054      0.903
20. Random Walk
● Views WordNet as a graph and performs a random walk on it
● Updates the P, N and O values until the process converges
● There is an edge from s1 to s2 if s1 occurs in the gloss of s2
21. Random Walk
● Two random walks are performed:
○ P Score
○ N Score
● O Score is assigned so that P + N + O = 1
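A rough sketch of the idea follows, using a generic PageRank-style power iteration rather than the exact procedure of SentiWordNet 3.0. The damping factor, the uniform handling of synsets without outgoing edges, the rescaling step, and the toy graph are all assumptions made for illustration.

```python
def random_walk_scores(edges, initial, damping=0.85, tol=1e-9, max_iter=1000):
    """Propagate scores along edges (s1, s2), meaning s1 occurs in the gloss of s2.

    initial maps every synset to a non-negative starting score (not all zero).
    Returns scores that sum to 1 over all synsets.
    """
    nodes = sorted(initial)
    out = {u: [] for u in nodes}
    for u, v in edges:
        out[u].append(v)
    total = sum(initial.values())
    prior = {u: initial[u] / total for u in nodes}
    score = dict(prior)
    for _ in range(max_iter):
        nxt = {u: (1 - damping) * prior[u] for u in nodes}
        for u in nodes:
            targets = out[u] or nodes  # assumption: dead ends spread uniformly
            share = damping * score[u] / len(targets)
            for v in targets:
                nxt[v] += share
        delta = sum(abs(nxt[u] - score[u]) for u in nodes)
        score = nxt
        if delta < tol:  # converged
            break
    return score


def pno_scores(edges, pos_init, neg_init):
    """Run one walk for P and one for N, then set O so that P + N + O = 1."""
    p = random_walk_scores(edges, pos_init)
    n = random_walk_scores(edges, neg_init)
    # Hypothetical rescaling so that P + N never exceeds 1 for any synset.
    peak = max(max(p[s] + n[s] for s in p), 1.0)
    return {s: (p[s] / peak, n[s] / peak, 1 - (p[s] + n[s]) / peak) for s in p}


# Toy gloss graph (made up): "good" appears in the glosses of "nice" and "pleasant".
toy_edges = [("good", "nice"), ("good", "pleasant"), ("bad", "nasty")]
toy_words = ["good", "nice", "pleasant", "bad", "nasty"]
pos_init = {w: 1.0 if w == "good" else 0.0 for w in toy_words}
neg_init = {w: 1.0 if w == "bad" else 0.0 for w in toy_words}
toy_scores = pno_scores(toy_edges, pos_init, neg_init)
```

In this toy graph, positivity flows from "good" to the synsets whose glosses mention it, so "nice" ends up with a higher P score than "nasty".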
23. Major References
● SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining by Andrea Esuli and Fabrizio Sebastiani, 2006
● SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining by Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani, 2010
25. Further Plan
● WordNet-Affect (2004) by Carlo Strapparava and Alessandro Valitutti, in Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon - IN PROGRESS
● Lexicon-Based Methods for Sentiment Analysis (2011) by Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll, and Manfred Stede, in the journal Computational Linguistics