This document presents an approach to opinion mining that uses a holistic lexicon-based method. It focuses on determining whether opinions on identified features are positive, negative, or neutral. It proposes using an opinion lexicon and handling context-dependent opinion words and implicit features indicated by adjectives. Rules are also introduced to determine opinion polarity across sentences. An evaluation shows this approach achieves precision, recall, and F-score of 0.92, 0.91, and 0.91 respectively.
Xiaowen Ding, Bing Liu and Philip Yu: A Holistic Lexicon-Based Approach to Opinion Mining
1. Xiaowen Ding, Bing Liu and Philip Yu
Presenter: Quang Nguyen
Date: 2010.10.18
Saltlux Vietnam Development Center
2. Feature-based Opinion Mining Tasks
Task 1: Identify and extract object features F that have been
commented on by an opinion holder (e.g., a reviewer).
Task 2: Determine whether the opinions on the features F are
positive, negative or neutral.
Task 3: Group feature synonyms.
• Produce a feature-based opinion summary of multiple reviews.
This paper focuses on Task 2 assuming that features
have been discovered
3. Opinion Words
• Positive: beautiful, wonderful, good, amazing
• Negative: bad, poor, terrible, cost someone an
arm and a leg (idiom).
One effective approach is to use an opinion lexicon,
i.e., a list of opinion words.
• Identify all opinion words in a sentence
• Aggregate these words to give the final opinion to
each feature.
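The two steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the lexicon is a toy list built from the example words on this slide, and the names `LEXICON`, `find_opinion_words`, and `naive_orientation` are hypothetical.

```python
# Toy opinion lexicon built from the example words on the slide
# (illustrative only, not the lexicon used in the paper).
LEXICON = {"beautiful": 1, "wonderful": 1, "good": 1, "amazing": 1,
           "bad": -1, "poor": -1, "terrible": -1}

def find_opinion_words(sentence):
    """Return (word, orientation) pairs for every lexicon word in the sentence."""
    tokens = [t.strip(".,!?\"'").lower() for t in sentence.split()]
    return [(t, LEXICON[t]) for t in tokens if t in LEXICON]

def naive_orientation(sentence):
    """Sum the orientations and map the sign of the sum to a label."""
    total = sum(so for _, so in find_opinion_words(sentence))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

A plain sum treats every opinion word as equally relevant to every feature in the sentence, which is one reason a more careful aggregation function is needed.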
4. Dictionary-based approaches
• Start from a set of seed opinion words
• Use WordNet’s hierarchy and synsets to acquire
more opinion words
Corpus-based approaches: extract opinion
words from large corpora using syntactic
rules and co-occurrence patterns
These approaches do not deal well with context-dependent
opinion words!
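The dictionary-based idea can be sketched as a breadth-first expansion from the seeds. To keep the example self-contained, a small hand-made thesaurus (`SYNONYMS`, `ANTONYMS`, both hypothetical) stands in for WordNet, and the sketch assumes that synonyms keep the seed's orientation while antonyms flip it, a common convention that the slide does not spell out.

```python
from collections import deque

# Hypothetical hand-made thesaurus standing in for WordNet synsets.
SYNONYMS = {"good": ["great", "fine"], "great": ["excellent"],
            "bad": ["poor"], "poor": ["lousy"]}
ANTONYMS = {"good": ["bad"], "bad": ["good"]}

def expand_lexicon(seeds):
    """Grow a {word: orientation} lexicon from seed words, breadth first."""
    lexicon = dict(seeds)
    queue = deque(seeds)          # iterating a dict yields its keys
    while queue:
        word = queue.popleft()
        so = lexicon[word]
        # Assumption: synonyms share the orientation, antonyms reverse it.
        neighbours = [(s, so) for s in SYNONYMS.get(word, [])]
        neighbours += [(a, -so) for a in ANTONYMS.get(word, [])]
        for other, other_so in neighbours:
            if other not in lexicon:
                lexicon[other] = other_so
                queue.append(other)
    return lexicon
```

For example, `expand_lexicon({"good": 1})` reaches "excellent" through "great" and reaches "lousy" through the antonym "bad". A word such as "small" would still receive one fixed orientation, which is the context-dependence problem noted above.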
5. Improve lexicon-based approaches using
context-dependent opinion words
• Negative: “The bedroom is very small”
• Positive: “The Nokia N3100 is so small as to be
put in any pockets”
Propose a function for aggregating multiple
opinion words in the same sentence
Consider explicit and implicit opinions
7. Opinions on both sides of “and” should be
the same
• E.g., “This camera takes great pictures and has a
long battery life”.
Not likely to say:
• “This camera takes great pictures and has a short
battery life.”
8. Sometimes, one may not use an explicit
conjunction “and”.
• Same opinion in same sentence, unless there is a
“but”-like clause
• E.g., “The camera has a long battery life, which is
great”
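The same-sentence rule from the last two slides can be sketched as below: a context-dependent word such as "long" borrows its orientation from known opinion words in the same sentence, and the orientation is reversed across a "but"-like word. This is a simplified illustration; the toy lexicon `KNOWN`, the marker list in `BUT_WORDS`, and the function name are assumptions, and real clause detection would need more than a regular expression.

```python
import re

KNOWN = {"great": 1, "amazing": 1, "terrible": -1}   # toy lexicon (assumption)
BUT_WORDS = r"\b(?:but|however)\b"                    # assumed "but"-like markers

def infer_orientation(sentence, context_word):
    """Guess the orientation (+1, -1, 0) of context_word from known
    opinion words that appear in the same sentence."""
    segments = re.split(BUT_WORDS, sentence.lower())
    tokenised = [re.findall(r"[a-z']+", seg) for seg in segments]
    # Find the segment that holds the context-dependent word.
    target = next((i for i, toks in enumerate(tokenised)
                   if context_word in toks), None)
    if target is None:
        return 0
    score = 0
    for i, toks in enumerate(tokenised):
        seg_sum = sum(KNOWN.get(tok, 0) for tok in toks)
        # Each "but"-like word between two segments flips the orientation once.
        sign = 1 if (i - target) % 2 == 0 else -1
        score += sign * seg_sum
    return (score > 0) - (score < 0)
```

With the slide's example, "long" in “The camera has a long battery life, which is great” comes out positive because "great" sits in the same segment.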
9. People usually express the same opinion
across sentences
• unless there is an indication of opinion change
using words such as “but” and “however”
• E.g., “The picture quality is amazing. The battery life is
long”
Not so natural to say:
• “The picture quality is amazing. The battery life is
short”
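A minimal sketch of this cross-sentence rule, assuming each sentence already carries an orientation of +1, -1, or 0 (undecided): an undecided sentence inherits the orientation of the sentence before it, reversed if it contains "but" or "however". Checking for the marker anywhere in the sentence is a simplification, and the names `CHANGE_MARKERS` and `propagate` are hypothetical.

```python
CHANGE_MARKERS = {"but", "however"}   # the two words named on the slide

def propagate(sentences, orientations):
    """Fill undecided (0) orientations from the preceding sentence."""
    resolved = list(orientations)
    for i in range(1, len(sentences)):
        # Skip sentences that are already decided or have nothing to inherit.
        if resolved[i] != 0 or resolved[i - 1] == 0:
            continue
        tokens = {t.strip(".,!?\"'").lower() for t in sentences[i].split()}
        flip = bool(tokens & CHANGE_MARKERS)
        resolved[i] = -resolved[i - 1] if flip else resolved[i - 1]
    return resolved
```

For “The picture quality is amazing. The battery life is long”, the second sentence starts undecided and inherits the positive orientation of the first.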
10. An opinion lexicon alone is far from sufficient. It needs
special handling:
• Negation/But Rule
• A negation word that does not negate, e.g., “I like this camera
not just because it is beautiful”
• A “but” that does not signal contrast, e.g., “I not only like the picture
quality of this camera, but also its size”
• …
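One way to sketch the negation part of this rule, including the exceptions above: flip the orientation of an opinion word when a negation word occurs shortly before it, after first removing phrases such as "not just" and "not only" that contain a negation word without negating. The negation list, the three-token window, and the function name are assumptions made for this illustration, not details taken from the paper.

```python
import re

NEGATIONS = {"not", "no", "never"}                  # toy negation list (assumption)
# Phrases from the slide that contain "not" but do not negate the opinion.
NON_NEGATING = [r"\bnot just\b", r"\bnot only\b"]

def apply_negation(sentence, opinion_word, orientation, window=3):
    """Flip orientation if a real negation occurs within `window`
    tokens before opinion_word; otherwise return it unchanged."""
    text = sentence.lower()
    for pattern in NON_NEGATING:
        text = re.sub(pattern, " ", text)
    tokens = re.findall(r"[a-z']+", text)
    if opinion_word not in tokens:
        return orientation
    idx = tokens.index(opinion_word)
    preceding = tokens[max(0, idx - window):idx]
    negated = any(tok in NEGATIONS for tok in preceding)
    return -orientation if negated else orientation
```

Both slide examples keep their positive orientation under this sketch, while a plain negation such as "The picture quality is not good" is flipped to negative.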
11. Implicit features are determined through
adjectives (implicit feature indicators)
• E.g., “This camera is very small”
• “small” is an indicator for “size”
• E.g., “This camera is very heavy”
• “heavy” is an indicator for “weight”
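The indicator idea amounts to a lookup from adjective to feature. The sketch below uses only the two pairs from this slide; the names `FEATURE_INDICATORS` and `implicit_features` are hypothetical, and a real mapping would be domain-specific.

```python
# Indicator adjectives taken from the slide; a real table would be larger
# and specific to the product domain.
FEATURE_INDICATORS = {"small": "size", "heavy": "weight"}

def implicit_features(sentence):
    """Return the features implied by indicator adjectives in the sentence."""
    tokens = [t.strip(".,!?\"'").lower() for t in sentence.split()]
    return [FEATURE_INDICATORS[t] for t in tokens if t in FEATURE_INDICATORS]
```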
12. An object O is an entity which can be a product,
person, event, organization, or topic
An object O is represented with a finite set of features,
F = {f1, f2, …, fn}.
• Each feature fi in F can be expressed with a finite set of words
or phrases Wi, which are synonyms.
Model of a review: An opinion holder j comments on a
subset of the features Sj ⊆ F of object O.
• For each feature fk ∈ Sj that j comments on, he/she
chooses a word or phrase from Wk to describe the
feature, and
expresses a positive, negative or neutral opinion on fk.
13. Input: a pair (f, s), where f is a product feature and s is a
sentence that contains f.
Output: whether the opinion on f in s is pos, neg, or neut.
wi: opinion word
V: set of all opinion words
dis(wi, f): distance between wi and f
SO: semantic orientation of wi (+1, -1, 0)
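The symbols above suggest a distance-weighted sum: each opinion word wi in s that belongs to V contributes its semantic orientation divided by dis(wi, f), so that words closer to the feature count more. The sketch below assumes that form, measures distance in tokens, and treats the feature as a single token; the function names and the example lexicon are hypothetical.

```python
def score(tokens, feature, lexicon):
    """Sum SO(w_i) / dis(w_i, f) over opinion words w_i in the sentence.

    tokens  : the sentence s as a list of lower-cased words
    feature : the feature f (assumed here to be a single token in s)
    lexicon : maps each opinion word in V to its orientation (+1, -1, 0)
    """
    f_pos = tokens.index(feature)
    total = 0.0
    for pos, tok in enumerate(tokens):
        if tok in lexicon and pos != f_pos:
            total += lexicon[tok] / abs(pos - f_pos)   # dis = token distance
    return total

def orientation(tokens, feature, lexicon):
    """Map the sign of the score to pos, neg, or neut."""
    s = score(tokens, feature, lexicon)
    return "pos" if s > 0 else "neg" if s < 0 else "neut"
```

In "the picture is amazing but the battery is poor", "amazing" is two tokens from "picture" and "poor" is seven tokens away, so the score for "picture" is 1/2 - 1/7 and the opinion is pos; for "battery" the nearer word is "poor", and the opinion is neg.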
16.
System              Precision   Recall   F-Score
FBS                 0.93        0.76     0.83
OPINE               0.86        0.89     0.87
Opinion Observer    0.92        0.91     0.91
FBS: M. Hu and B. Liu. Mining and summarizing customer reviews. KDD’04, 2004
OPINE: A-M. Popescu and O. Etzioni. Extracting Product Features and Opinions from Reviews. EMNLP-05, 2005.
Opinion Observer: this paper
17. Xiaowen Ding, Bing Liu, and Philip S. Yu, A Holistic
Lexicon-Based Approach to Opinion Mining, Proceedings
of the international conference on Web search and web
data mining, USA, 2008