Suppose a customer who has given a mobile phone a high rating writes the following review of the product: “The front camera of the phone is excellent! Truly speaking, this is the best front camera I have experienced so far.” From this review, we can understand two things. First, the customer holds a positive opinion about the phone. Second, the front camera is the feature targeted by the opinions expressed in the review. In this workshop, we are particularly interested in discovering patterns of the second kind. We will discuss a framework that first discovers the targets on which opinions are expressed in a review and then determines the polarity of those opinions. This kind of detailed analysis helps us discover the components or features of a product that customers liked or disliked, and thus helps us summarize the information better.
5. REVIEW
“The front camera is extremely clear. The phone comes with 8GB RAM. It has Gorilla
glass 6 on front & Gorilla glass 5 on back with aluminium frame on side. With 48MP
f/1.7 (Sony IMX 586 sensor) rear camera, it clicks amazing outdoor pics. The battery
backup is great. The speaker is not good though.”
6. NON-OPINIONATED PASSAGES: OBJECTIVE SENTENCES (in the same review)
7. OPINIONATED PASSAGES ON FEATURES: SUBJECTIVE SENTENCES (in the same review)
8. OPINIONS ON FEATURES (in the same review)
9. FEATURE-BASED SENTIMENT SUMMARY
Front camera - extremely clear (+3)
Rear camera - amazing (+4)
Battery backup - great (+5)
Speaker - not good (-2)
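A feature-based summary like the one above can be held in a small data structure. The following is a minimal sketch, assuming each extracted opinion arrives as a (feature, opinion phrase, score) tuple; the function name `summarize` and the tuple layout are illustrative choices, not part of the original framework.

```python
from collections import defaultdict


def summarize(opinions):
    """Group (feature, phrase, score) tuples by feature and by polarity."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for feature, phrase, score in opinions:
        if score == 0:
            continue  # a neutral score carries no polarity, so it is skipped
        polarity = "positive" if score > 0 else "negative"
        summary[feature.lower()][polarity].append((phrase, score))
    return dict(summary)


# The four opinions listed on the summary slide.
opinions = [
    ("Front camera", "extremely clear", 3),
    ("Rear camera", "amazing", 4),
    ("Battery backup", "great", 5),
    ("Speaker", "not good", -2),
]
summary = summarize(opinions)
for feature, groups in summary.items():
    print(feature, groups)
```

Keeping the positive and negative lists separate makes it easy to report, per feature, how many reviewers liked or disliked it.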
16. Bing Liu
Distinguished Professor
Department of Computer Science
University of Illinois Chicago (UIC)
Minqing Hu
Data Scientist at Signifyd
PhD – Computer Science
University of Illinois Chicago (UIC)
17. REFERENCES
1. M. Hu and B. Liu. Mining Opinion Features in Customer Reviews. Proceedings of AAAI, 2004.
2. M. Hu and B. Liu. Mining and Summarizing Customer Reviews. Proceedings of the ACM SIGKDD Conference on KDD, 2004.
3. B. Liu, M. Hu and J. Cheng. Opinion Observer: Analyzing and Comparing Opinions on the Web. Proceedings of WWW, 2005.
4. B. Liu. Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing, 2nd edition (2010), 627-666.
18. OBJECT
Object (root): Cellular Phone
Components: Camera | Battery | Display
Sub-components / attributes:
Camera: Front Camera | Back Camera | Rear Camera
Battery: Battery life | Battery size | Battery performance
Attributes: Size | Quality | Type
A feature can be represented by its synonyms (for example, back camera and rear camera).
Thus, an object can be represented as a tree, hierarchy or taxonomy.
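The tree view of an object can be sketched with nested dictionaries. This is only an illustration under stated assumptions: the node names are taken from the slide, the `SYNONYMS` map and the `find_path` helper are hypothetical, and the display node is left without children because the slide text does not make its attributes recoverable.

```python
# Object -> components -> sub-components / attributes, as nested dictionaries.
PHONE = {
    "cellular phone": {
        "camera": {"front camera": {}, "back camera": {}},
        "battery": {
            "battery life": {},
            "battery size": {},
            "battery performance": {},
        },
        "display": {},  # children omitted: not recoverable from the slide
    }
}

# A feature may be referred to by a synonym; map synonyms to one canonical node.
SYNONYMS = {"rear camera": "back camera"}


def find_path(tree, name, path=()):
    """Return the path from the root to the node called `name`, or None."""
    name = SYNONYMS.get(name, name)
    for node, children in tree.items():
        here = path + (node,)
        if node == name:
            return here
        found = find_path(children, name, here)
        if found is not None:
            return found
    return None


print(find_path(PHONE, "rear camera"))
```

Resolving synonyms before the lookup means that opinions about the "rear camera" and the "back camera" end up attached to the same node.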
20. FEATURES: EXPLICIT VS IMPLICIT
Explicit Feature Example:
“The battery life of this phone is too short”
Implicit Feature Examples:
“The phone doesn’t fit in a usual jeans pocket though.” (implied feature: size of the phone)
“Don’t know why I had spent so much money for the phone” (implied feature: not value for money)
21. OPINIONS: EXPLICIT VS IMPLICIT
Explicit Opinion Example:
“The display clarity of this phone is amazing!”
Implicit Opinion Example:
“The phone doesn’t fit in a usual jeans pocket though.” (a fact that expresses dissatisfaction / disappointment)
23. THE PROCESS FLOW
1. Identification of Frequent Features
2. Identification of Opinions on each feature
3. Opinion Orientation Identification
4. Infrequent Feature Identification
5. Summary Generation
24. Step 1: Frequent Feature Mining
Frequent Features
Mining
Opinion Word Extraction
Opinion Orientation
Identification
Infrequent Features
Mining
Summary Generation
25. POS TAGGING
Each word of the review is tagged with its part of speech. The nouns (tagged N) are the candidate features.
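The noun-extraction step can be sketched as a filter over tagged tokens. This sketch assumes the review has already been run through some POS tagger that emits Penn Treebank style tags (noun tags start with `NN`); the tags below are hand-assigned for illustration, and `extract_nouns` is a hypothetical helper name.

```python
def extract_nouns(tagged_tokens):
    """Return the lowercased nouns in order of first appearance, without repeats."""
    nouns = []
    for word, tag in tagged_tokens:
        word = word.lower()
        if tag.startswith("NN") and word not in nouns:
            nouns.append(word)
    return nouns


# Hand-tagged tokens for the last two sentences of the sample review.
tagged = [
    ("The", "DT"), ("battery", "NN"), ("backup", "NN"), ("is", "VBZ"),
    ("great", "JJ"), (".", "."),
    ("The", "DT"), ("speaker", "NN"), ("is", "VBZ"), ("not", "RB"),
    ("good", "JJ"), ("though", "RB"), (".", "."),
]
nouns = extract_nouns(tagged)
print(" | ".join(nouns))
```

Each review then becomes one list of nouns, which is the input to the frequent-feature step.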
26. EXTRACTING NOUNS
Review : front | camera | phone | RAM | gorilla | glass | aluminium | frame | rear | camera | battery | backup | speaker
27. EXTRACTING NOUNS
Review 1 : front | camera | phone | gorilla | glass | aluminium | frame | rear | battery | backup | speaker
Review 2 : price | descent | sound | battery | camera | body
Review 3 : phone | battery | performance | camera
Review 4 : phone | life | sound | quality | battery | picture
Review 5 : phone | buy
Review 6 : loudspeaker | time | sound | quality
30. ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: 0.7 | 0.6 | 0.4 | 0.4 | 0.3 | 0.2 | 0.3
31. ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: P(Camera, Front) = 0.4
33. ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: P(Battery, Life) = 0.4
34. ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: P(Camera, Buy) = 0.2
Minimum Support Threshold = 0.4 (say)
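The support of an itemset is the fraction of reviews that contain all of its words, and an itemset is kept as a frequent feature when its support reaches the minimum threshold. The sketch below computes this on the six noun lists from the "Extracting nouns" slide, so its numbers differ from the support values shown in the example slides. It enumerates candidates by brute force rather than with Apriori-style candidate pruning, and the names `support` and `frequent_itemsets` are illustrative.

```python
from itertools import combinations

# One set of nouns per review (Reviews 1-6 from the noun-extraction slide).
reviews = [
    {"front", "camera", "phone", "gorilla", "glass", "aluminium", "frame",
     "rear", "battery", "backup", "speaker"},
    {"price", "descent", "sound", "battery", "camera", "body"},
    {"phone", "battery", "performance", "camera"},
    {"phone", "life", "sound", "quality", "battery", "picture"},
    {"phone", "buy"},
    {"loudspeaker", "time", "sound", "quality"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets of up to `max_size` items."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


frequent = frequent_itemsets(reviews, min_support=0.4)
for itemset, s in frequent.items():
    print(itemset, round(s, 2))
```

On a real review database, a library implementation of Apriori or FP-growth would be preferable, since the brute-force search grows quickly with the vocabulary size.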
35. ASSOCIATION RULES MINING: EXPERIMENTAL RESULTS
One Plus 6 features extracted from the reviews written on www.amazon.in
36. FEATURE PRUNING: COMPACTNESS PRUNING
The method checks features that contain at least two words and removes those that are likely to be meaningless.
38. FEATURE PRUNING: COMPACTNESS PRUNING, EXAMPLE
Feature phrase: camera quality
“The camera quality is really good” (compact)
“I love the quality of the camera” (compact)
“awesome camera and the phone comes with a quality display” (not compact)
Counter-example:
“The phone has an awesome front camera and a quality display” (compact, but the two words have no dependency)
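A compactness check can be sketched as follows. It assumes the definition used by Hu and Liu (2004): a feature phrase is compact in a sentence when adjacent words of the phrase, in their order of appearance, are at most 3 word positions apart, and the phrase is kept if it is compact in at least 2 sentences. Whitespace tokenization and the use of each word's first occurrence are simplifications made for this sketch.

```python
def is_compact(phrase_words, sentence, max_gap=3):
    """True if all words of the phrase occur in the sentence close together."""
    tokens = sentence.lower().split()
    try:
        positions = sorted(tokens.index(word) for word in phrase_words)
    except ValueError:  # some word of the phrase is missing from the sentence
        return False
    return all(b - a <= max_gap for a, b in zip(positions, positions[1:]))


def is_compact_feature(phrase_words, sentences, min_sentences=2):
    """Keep the phrase only if it is compact in enough sentences."""
    hits = sum(is_compact(phrase_words, s) for s in sentences)
    return hits >= min_sentences


phrase = ("camera", "quality")
sentences = [
    "The camera quality is really good",
    "I love the quality of the camera",
    "awesome camera and the phone comes with a quality display",
    "The phone has an awesome front camera and a quality display",
]
for s in sentences:
    print(is_compact(phrase, s), "-", s)
```

The last sentence is reported as compact even though "camera" and "quality" are unrelated there, which is exactly the weakness that the counter-example points out.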
40. FEATURE PRUNING: COMPACTNESS PRUNING, COUNTER-EXAMPLES
“Both the camera and the battery is good” (compact)
“Although good camera but not good battery” (compact)
“lovely camera quality and nice battery” (compact)
However, note that the features here are separated by conjunctions, which is mostly the case.
41. FEATURE PRUNING: COMPACTNESS PRUNING, MODIFICATION
42. FEATURE PRUNING: EXPERIMENTAL RESULTS
One Plus 6 features extracted from the reviews written on www.amazon.in
43. FEATURE PRUNING: REDUNDANCY PRUNING
The method checks features that contain a single word and removes those that are likely to be meaningless.
p-support (pure support): the p-support of a feature f is the number of sentences in which f appears and which contain no feature phrase that is a superset of f.
Example:
Consider the feature: camera
Consider the other features that contain the word camera:
front camera | rear camera | back camera | camera quality
p-support of camera
= number of reviews in which camera occurred alone and not with any of its supersets
= 100 - (20 + 15 + 23 + 10)
= 32
A feature is considered meaningful if it satisfies the minimum threshold for p-support.
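The p-support count can be sketched directly from its definition. This sketch assumes whitespace tokenization and treats a superset phrase as present when all of its words occur in the sentence; the sentences and the helper name `p_support` are made up for illustration.

```python
def p_support(feature, supersets, sentences):
    """Count sentences that mention `feature` but none of its superset phrases."""
    count = 0
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        if feature not in tokens:
            continue  # the feature itself does not appear
        if any(set(phrase.split()) <= tokens for phrase in supersets):
            continue  # the feature appears only as part of a longer feature phrase
        count += 1
    return count


supersets = ["front camera", "rear camera", "back camera", "camera quality"]
sentences = [
    "the camera is good",
    "front camera is sharp",
    "camera quality is great",
    "i like the camera",
    "battery is fine",
]
camera_p_support = p_support("camera", supersets, sentences)
print(camera_p_support)
```

A single-word feature whose p-support falls below the chosen threshold would then be dropped as redundant.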
44. FEATURE PRUNING: EXPERIMENTAL RESULTS
One Plus 6 features extracted from the reviews written on www.amazon.in
45. FREQUENT FEATURE MINING: FLOWCHART
Review Database -> POS Tagging -> Frequent Feature Identification -> Feature Pruning -> Frequent Features
46. Step 2: Opinion Word Extraction
47. ADJECTIVES AS OPINION
Source: Mining and Summarizing Customer Reviews, M. Hu and B. Liu, Proceedings of the ACM SIGKDD Conference on KDD, 2004
This was based on previous research on subjectivity.
Examples:
“The camera of the phone is good” (adjective: good)
“The display looks dull” (adjective: dull)
“the sound quality of the speaker is fantastic” (adjective: fantastic)
“The phone has some really cool features” (adjective: cool)
The adjective nearest to the feature is considered the opinion word.
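The nearest-adjective rule can be sketched on tagged tokens. It assumes Penn Treebank style tags (adjective tags start with `JJ`) that were hand-assigned here, handles single-word features only, and uses a hypothetical helper name, `nearest_adjective`.

```python
def nearest_adjective(tagged_tokens, feature):
    """Return the adjective closest to the feature word, or None if there is none."""
    feature_positions = [
        i for i, (word, _) in enumerate(tagged_tokens) if word.lower() == feature
    ]
    adjective_positions = [
        i for i, (_, tag) in enumerate(tagged_tokens) if tag.startswith("JJ")
    ]
    if not feature_positions or not adjective_positions:
        return None
    best = min(
        adjective_positions,
        key=lambda i: min(abs(i - p) for p in feature_positions),
    )
    return tagged_tokens[best][0].lower()


display_sentence = [("The", "DT"), ("display", "NN"), ("looks", "VBZ"), ("dull", "JJ")]
features_sentence = [
    ("The", "DT"), ("phone", "NN"), ("has", "VBZ"), ("some", "DT"),
    ("really", "RB"), ("cool", "JJ"), ("features", "NNS"),
]
print(nearest_adjective(display_sentence, "display"))
print(nearest_adjective(features_sentence, "features"))
```

Because only adjectives are considered, modifiers such as "really" or negations such as "not" are ignored by this rule.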
48. ADJECTIVES AS OPINION: COUNTER-EXAMPLES
“The camera of the phone is extremely good” (Adverb + Adjective)
“The headphone is not working” (Negation + Verb)
“The speaker of the phone is doing great” (Verb + Adjective)
“The phone has some nice cool features” (Adjective + Adjective)
“The display is not bad” (Negation + Adjective)
49. OPINION EXTRACTION ALGORITHM
Opinion Word/s Extraction:
50. Step 3: Opinion Orientation Identification
51. OPINION ORIENTATION IDENTIFICATION: ONLY ADJECTIVES
Adjective list:
Seed list:
52. OPINION ORIENTATION IDENTIFICATION: WORDNET
In WordNet, adjectives are organized into bipolar clusters.
53. OPINION ORIENTATION IDENTIFICATION: WORDNET
Fast = +2
Seed list:
In general, adjectives share the same orientation as their synonyms and the opposite orientation to their antonyms.
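The synonym and antonym rule can be sketched as a label-propagation loop starting from a seed list. To stay self-contained, this sketch replaces WordNet with two tiny hand-written dictionaries; the word lists, the +1/-1 labels and the function name `propagate_orientation` are all assumptions made for illustration.

```python
def propagate_orientation(adjectives, seeds, synonyms, antonyms):
    """Spread +1/-1 labels from seed adjectives until nothing new can be labeled."""
    orientation = dict(seeds)
    pending = [a for a in adjectives if a not in orientation]
    changed = True
    while changed and pending:
        changed = False
        for adjective in list(pending):
            label = None
            for synonym in synonyms.get(adjective, ()):
                if synonym in orientation:
                    label = orientation[synonym]  # same orientation as a synonym
                    break
            if label is None:
                for antonym in antonyms.get(adjective, ()):
                    if antonym in orientation:
                        label = -orientation[antonym]  # opposite of an antonym
                        break
            if label is not None:
                orientation[adjective] = label
                pending.remove(adjective)
                changed = True
    return orientation, pending


seeds = {"good": 1, "bad": -1}
synonyms = {"great": ["good"], "awesome": ["great"], "poor": ["bad"]}  # toy lexicon
antonyms = {"dull": ["awesome"]}  # toy lexicon
adjectives = ["dull", "awesome", "great", "poor", "purple"]
orientation, unresolved = propagate_orientation(adjectives, seeds, synonyms, antonyms)
print(orientation, unresolved)
```

Each newly labeled adjective acts as an additional seed on the next pass, so "dull" is resolved only after "awesome" is. Words with no link to any labeled adjective remain unresolved.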
54. OPINION ORIENTATION IDENTIFICATION: ALGORITHM
55. OPINION ORIENTATION IDENTIFICATION: LIMITATION
Examples:
“The camera of the phone is extremely good” (Adverb + Adjective)
“The headphone is not working” (Negation + Verb)
“The speaker of the phone is doing great” (Verb + Adjective)
“The phone has some nice cool features” (Adjective + Adjective)
“The display is not bad” (Negation + Adjective)
56. FLOWCHART TILL NOW!
Review Database -> POS Tagging -> Frequent Feature Identification -> Feature Pruning -> Frequent Features -> Opinion Word Identification -> Opinion Orientation Identification
57. Step 4: Infrequent Feature Mining
58. INFREQUENT FEATURE MINING
“The picture is absolutely amazing.”
“The software that comes with it is amazing”
Note: The above two sentences share the same opinion word, ‘amazing’, yet describe different features.
59. INFREQUENT FEATURE MINING: COUNTER-EXAMPLE
“The delivery guy was amazingly patient”
This sentence shares the same opinion but does not describe a relevant feature.
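The idea behind these examples can be sketched as follows: in a sentence that contains a known opinion word but no frequent feature, the noun nearest to the opinion word is taken as an infrequent feature. This is a sketch under assumptions (hand-assigned Penn Treebank style tags, a made-up opinion word list, a hypothetical helper name), not the presenter's exact algorithm.

```python
def infrequent_features(tagged_sentences, frequent_features, opinion_words):
    """Collect the noun nearest to an opinion word in sentences lacking a frequent feature."""
    found = set()
    for tagged in tagged_sentences:
        words = [word.lower() for word, _ in tagged]
        if any(word in frequent_features for word in words):
            continue  # this sentence is already covered by a frequent feature
        noun_positions = [i for i, (_, tag) in enumerate(tagged) if tag.startswith("NN")]
        opinion_positions = [i for i, word in enumerate(words) if word in opinion_words]
        if not noun_positions or not opinion_positions:
            continue
        nearest = min(
            noun_positions,
            key=lambda i: min(abs(i - p) for p in opinion_positions),
        )
        found.add(words[nearest])
    return found


tagged_sentences = [
    [("The", "DT"), ("picture", "NN"), ("is", "VBZ"), ("absolutely", "RB"),
     ("amazing", "JJ")],
    [("The", "DT"), ("software", "NN"), ("that", "WDT"), ("comes", "VBZ"),
     ("with", "IN"), ("it", "PRP"), ("is", "VBZ"), ("amazing", "JJ")],
    [("The", "DT"), ("delivery", "NN"), ("guy", "NN"), ("was", "VBD"),
     ("amazingly", "RB"), ("patient", "JJ")],
]
result = infrequent_features(
    tagged_sentences,
    frequent_features={"picture", "camera"},
    opinion_words={"amazing", "patient"},
)
print(result)
```

With "patient" in the opinion word list, the sketch also returns "guy", which illustrates the counter-example: the nearest noun to an opinion word is not always a relevant product feature.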
60. INFREQUENT FEATURE MINING: ALGORITHM
61. FLOWCHART TILL NOW!
Review Database -> POS Tagging -> Frequent Feature Identification -> Feature Pruning -> Frequent Features -> Opinion Word Identification -> Opinion Words -> Opinion Orientation Identification
Opinion Words -> Infrequent Feature Identification -> Infrequent Features
62. Step 5: Summary Generation