1. Hierarchical Aspect-Sentiment Model &
Context-Dependent Conceptualization
Alice Oh
alice.oh@kaist.edu
http://uilab.kaist.ac.kr/
April 11, 2013
2. Overview
¤ Hierarchical Aspect-Sentiment Model (AAAI 2013)
¤ Suin Kim, et al.
¤ Collaboration with Microsoft Research Asia
¤ Context-Dependent Conceptualization (IJCAI 2013)
¤ Dongwoo Kim, Haixun Wang, Alice Oh
¤ Collaboration with Microsoft Research Asia
5. Hierarchical aspect-sentiment model
¤ Goal: To discover a hierarchy of aspects and associated
sentiments from a corpus of online reviews
¤ Assumptions
¤ Each sentence expresses a single aspect and a single sentiment
¤ An aspect (e.g., “battery life”) consists of neutral, positive, and
negative words
¤ Model: A hierarchical aspect-sentiment joint model using the
recursive Chinese restaurant processes (rCRP)
¤ Results
¤ A reasonable hierarchy of aspects discovered without supervision
¤ Sentiment classification accuracy comparable to other recent
sentiment-aspect joint models
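A minimal sketch of how a recursive CRP-style draw could place sentences in an aspect tree. This is a simplified illustration, not the model's inference procedure: `Node`, `draw_node`, `seat`, and the concentration parameter `gamma` are hypothetical names, and the weighting used here (own count to stay, subtree count to descend, `gamma` to open a new child) is an assumption about the general rCRP scheme.

```python
import random


class Node:
    """One aspect node in the tree (hypothetical structure)."""

    def __init__(self):
        self.own = 0          # sentences assigned directly to this node
        self.children = []    # child aspect nodes

    def total(self):
        # Sentences at this node plus all of its descendants.
        return self.own + sum(child.total() for child in self.children)


def draw_node(node, gamma, rng):
    """Pick a node by recursively choosing: stay, descend, or open a new child."""
    r = rng.random() * (node.total() + gamma)
    if r < node.own:                      # stay at the current node
        return node
    r -= node.own
    for child in node.children:           # descend into an existing child
        weight = child.total()
        if r < weight:
            return draw_node(child, gamma, rng)
        r -= weight
    new_child = Node()                    # otherwise open a new child
    node.children.append(new_child)
    return new_child


def seat(root, gamma, rng):
    """Assign one sentence to a node of the tree and return that node."""
    node = draw_node(root, gamma, rng)
    node.own += 1
    return node


rng = random.Random(0)
root = Node()
for _ in range(50):
    seat(root, gamma=1.0, rng=rng)
print(root.total(), "sentences placed;", len(root.children), "top-level aspects")
```

In this sketch the root starts with no sentences of its own, so it only serves as the entry point of the tree; larger `gamma` values open new children more often.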
6. Aspect-sentiment hierarchy
Goals
• To discover and organize the aspects and associated sentiments into a hierarchy
• To determine the aspect in each sentence
• To determine the sentiment of each sentence
7. Comparison to other models
[Figure: Multigrain Topic Model, ASUM & JST, Reverse JST, and Hierarchical Aspect-Sentiment Model, positioned by aspect granularity (General, Specific) and sentiment polarity (Positive, Neutral, Negative)]
9. Aspect-sentiment hierarchy
• Aspects tend to be general near the root and specific toward the leaves
• Each aspect node consists of positive and negative polarity
• Each sentence in a review is generated from a single aspect and sentiment
• Each word in a sentence is either neutral or subjective
11. “The screen is clear and the
picture quality is outstanding.”
13. the screen is and the picture
clear quality outstanding
21. Topic specialization
Evaluates the general-to-
specific nature of the
hierarchy by comparing
the average distance of the
aspect nodes from the root
at each tree depth
22. Hierarchical affinity
Measures whether a parent-child pair shows smaller distance compared to
a non-parent-child pair, one at level L and another at level L+1
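The affinity measure can be sketched as a comparison of two average distances. The slide does not name the distance function, so cosine distance between word distributions is an assumption here, and the function names and toy distributions are hypothetical.

```python
import math


def cosine_distance(p, q):
    """1 - cosine similarity between two equal-length word distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm


def hierarchical_affinity(parents, children, parent_of):
    """Average distance for parent-child pairs vs. non-parent-child pairs.

    parents:   name -> word distribution for nodes at level L
    children:  name -> word distribution for nodes at level L+1
    parent_of: child name -> parent name
    """
    related, unrelated = [], []
    for child, child_dist in children.items():
        for parent, parent_dist in parents.items():
            d = cosine_distance(parent_dist, child_dist)
            if parent_of[child] == parent:
                related.append(d)
            else:
                unrelated.append(d)
    return sum(related) / len(related), sum(unrelated) / len(unrelated)


# Toy distributions over a three-word vocabulary (made up for illustration).
parents = {"A": [0.7, 0.2, 0.1], "B": [0.1, 0.2, 0.7]}
children = {"a1": [0.8, 0.1, 0.1], "b1": [0.1, 0.1, 0.8]}
parent_of = {"a1": "A", "b1": "B"}

avg_related, avg_unrelated = hierarchical_affinity(parents, children, parent_of)
print(avg_related < avg_unrelated)
```

A hierarchy with good affinity should give a smaller average for the parent-child pairs than for the unrelated pairs.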
23. Aspect-sentiment consistency
Measures how in-node topics are
statistically coherent by comparing
• average intra-node topic distance
• average inter-node topic distance
24. Sentiment classification accuracy
• Sentiment classification using
short (<100 characters) reviews
• Small set contains positive
reviews of 5 stars, negative
reviews of 1 star
• Large set contains positive
reviews of 4~5 stars, negative
reviews of 1~2 stars
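The two evaluation sets described above could be assembled along these lines. This is only a sketch: the `(text, stars)` review format, the function name, and the sample reviews are hypothetical.

```python
def build_sentiment_set(reviews, positive_stars, negative_stars, max_chars=100):
    """Keep short reviews and label them by star rating.

    reviews: iterable of (text, stars) pairs (assumed format).
    Reviews with a rating in neither set are dropped.
    """
    labeled = []
    for text, stars in reviews:
        if len(text) >= max_chars:        # keep only reviews under 100 characters
            continue
        if stars in positive_stars:
            labeled.append((text, "positive"))
        elif stars in negative_stars:
            labeled.append((text, "negative"))
    return labeled


# Made-up reviews for illustration only.
reviews = [
    ("Great lens, sharp pictures.", 5),
    ("Decent, but pricey.", 4),
    ("Average.", 3),
    ("Battery died fast.", 2),
    ("Broke in a week.", 1),
    ("x" * 150, 5),                       # too long, always dropped
]

small_set = build_sentiment_set(reviews, {5}, {1})
large_set = build_sentiment_set(reviews, {4, 5}, {1, 2})
print(len(small_set), len(large_set))
```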
25. User scenario
Visualization of hierarchical
aspect-sentiments for a user
who is looking for a camera
with good picture quality
under low lights, a good LCD
screen, and high-end lenses
27. Semantic relatedness
Apple reveals new iPad
Microsoft introduces Surface
Surface vs iPad
Samsung’s new android tablets
iPhone 5, the best smart phone ever
By Topic Modeling
iPad
Apple
Microsoft
iPhone
Software
Samsung
SmartPhone
Android
Software Company
iOS
Mobile Phones
28. Contextual relatedness
Apple reveals new iPad
Fruit
Company
Food
Fresh fruit
Fruit tree
Brand
Crop
Flavor
Item
Manufacturer
Device
Platform
Technology
Mobile device
Tablet
Portable device
Tablet computer
Gadget
Apple product
Output device
29. Conceptualization given semantic context
Apple reveals new iPad
Fruit
Company
Food
Fresh fruit
Fruit tree
Brand
Crop
Flavor
Item
Manufacturer
Device
Platform
Technology
Mobile device
Tablet
Portable device
Tablet computer
Gadget
Apple product
Output device
iPad
Apple
Microsoft
iPhone
Software
Samsung
SmartPhone
Android
Software Company
iOS
Mobile Phones
Semantic Context of Sentence
Concept of Apple
Concept of iPad
30. Conceptualization given semantic context
Reinforcing concepts
Based on context
Fruit
Company
Food
Fresh fruit
Fruit tree
Brand
Crop
Flavor
Item
Manufacturer
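One way to picture "reinforcing concepts based on context" is to reweight a word's context-free concept distribution by how well each concept fits the surrounding context, then renormalize. The sketch below is an assumption made for illustration, not the formula used in the talk; the function name, the smoothing constant `eps`, and all probabilities are hypothetical.

```python
def reinforce(prior, context_support, eps=0.01):
    """Reweight p(concept | word) by context support and renormalize.

    prior:           concept -> probability without context
    context_support: concept -> how strongly the context supports the concept
    eps:             small support given to concepts the context never mentions
    """
    scores = {c: p * context_support.get(c, eps) for c, p in prior.items()}
    total = sum(scores.values())
    if total == 0:
        return dict(prior)
    return {c: s / total for c, s in scores.items()}


# Made-up numbers: "Apple" alone leans toward fruit, the context "iPad" does not.
apple_prior = {"fruit": 0.40, "company": 0.35, "food": 0.15, "brand": 0.10}
ipad_context = {"company": 0.5, "brand": 0.3, "device": 0.2}

apple_in_context = reinforce(apple_prior, ipad_context)
print(max(apple_in_context, key=apple_in_context.get))
```

With these toy values, the concepts shared with the context (company, brand) gain weight while fruit and food are pushed down.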
31. Context-dependent conceptualization
company 0.104
client 0.078
tree 0.069
corporation 0.050
computer 0.047
software company 0.041
oems 0.025
laptop 0.020
personal computer 0.019
host 0.019
Concept of Apple
Apple and iPad
fruit 0.039
food 0.035
company 0.026
brand 0.024
flavor 0.021
crop 0.020
juice 0.018
fresh fruit 0.017
plant 0.017
snack 0.015
Apple and Orchard
company 0.063
brand 0.041
client 0.038
corporation 0.033
tree 0.028
business 0.028
computer 0.027
crop 0.027
software company 0.022
computer company 0.021
32. Context-dependent conceptualization
Concept of Jordan
Jordan and Basketball
Jordan and Iraq
country 0.172
state 0.107
place 0.088
arab state 0.070
arab country 0.067
muslim country 0.052
arab nation 0.045
middle eastern country 0.042
islamic country 0.040
regime 0.023
place 0.284
player 0.240
team 0.177
nation 0.106
host country 0.041
professional athlete 0.021
great player 0.020
role model 0.020
shoe 0.018
offensive 0.016
country 0.172
state 0.107
place 0.088
arab state 0.070
arab country 0.067
muslim country 0.052
arab nation 0.045
middle eastern country 0.042
islamic country 0.040
regime 0.023
34. Experiments and Results
¤ Frame elements
¤ Background: Semantic role labeling depends heavily on
annotated data such as FrameNet
¤ Problem: Building FrameNet requires expertise, and while
FrameNet contains 170k annotated sentences, it lacks
coverage
¤ Approach: Expand FrameNet using CDC
1. Conceptualize the frame elements given a sentence as
the context
2. Find other instances given the most probable concepts
¤ Experiment: Compare likelihood of frame elements in
unseen sentences in FrameNet
35. Frame elements
Given sentence: I cook them in the oven
1. What is the frame of this sentence ?
1) abusing 2) closure 3) apply_heat
36. Frame elements
Given sentence: I cook them in the oven
1. What is the frame of this sentence ?
1) abusing 2) closure 3) apply_heat
2. What is the frame element of the word ‘oven’?
1) cooker 2) food 3) heat_source
37. Frame elements
Sentence: I cook them in the oven
Frame: Apply_Heat
Lexical unit (target): cook
FE: Cooker (I), FE: Food (them), FE: Heat source (oven)
Final goal: FE (frame element)
38. Frame elements: conceptualization for expansion
Frame Element : Heat_Source
… egg and chips was sizzling over camp-fires.
… the pig sizzled on the flames , spitting fat …
a large black kettle was sizzling on the hob.
Droplets of coffee sizzled on the hotplate.
… kitchen the meat sizzled in the oven and a big pan of potatoes …
… sizzled, now and then, upon the diminutive stove
☞ Conceptualize labeled frame elements with context
Labeled elements
40. Frame elements: experiment
Per-word heldout log-likelihood of the predicted frame
elements using five-fold validation. The naïve approach is
conceptualization using Probase without context (Song,
IJCAI 2012).
41. Experiments and Results
¤ Frame elements
¤ Word similarity in context
¤ Background: Recent work in word similarity prediction uses
annotated data of words in sentential context
¤ Problem: Existing methods for word similarity are specifically
tailored for word similarity only. Naïve conceptualization does not
consider sentential context.
¤ Approach
1. Given two words and their sentential contexts, conceptualize
the words
2. Estimate the similarity using cosine similarity of the concept
vectors
¤ Experiment: Compare the correlation between CDC-based
similarity and human judgment
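Step 2 of the approach, the cosine similarity of two concept vectors, can be sketched with sparse dictionaries. The concept vectors below are made up for illustration and the function name is hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse concept vectors (concept -> weight)."""
    dot = sum(weight * v[concept] for concept, weight in u.items() if concept in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical concept vectors for two words, each conceptualized in its own sentence.
word_a = {"food": 0.5, "dish": 0.3, "culture": 0.2}
word_b = {"food": 0.6, "meal": 0.3, "dish": 0.1}

print(round(cosine_similarity(word_a, word_b), 3))
```

Sparse dictionaries are convenient here because each word typically activates only a small part of the concept space.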
42. Word similarity in context
¤ … Native Chinese cuisine makes frequent use of Asian leafy
vegetables like bok choy and kai-lan and puts a greater
emphasis on fresh meat …
¤ … American Chinese food is usually less pungent than
authentic cuisine …
¤ Human evaluation = 9.2 (0~10 scale)
43. Word similarity in context
¤ ... This system would be implemented into the national
response plan for bioweapons attacks in the Netherlands .
Researchers at Ben Gurion University in Israel are
developing a different device called the BioPen , essentially
a “Lab-in-a-Pen” …
¤ … originally written in 1969 and performed extensively at
the time by an Israeli military performing group , has
become one of the anthems of the Israeli peace camp .
During the Arab uprising known as the First Intifada ,
Israeli singer Si Heyman sang “Yorim VeBokhim” …
¤ Human evaluation = 8.1 (0~10 scale)
44. Word similarity in context: Results
Note: State-of-the-art word similarity method
yields correlation of 0.66 (Huang ACL 2012)
45. Experiments and Results
¤ Frame elements
¤ Word similarity in context
¤ Query-ad clickthrough
¤ Background: Matching ads with user queries is an important but
difficult task. Clickthrough rate for sponsored links is generally
very low.
¤ Problem: Ad bids and user queries are short sequences of
keywords that do not benefit from full NLP techniques. But simple
keyword expansion methods are inaccurate.
¤ Approach: Use CDC for both ad bids and queries and match them
using cosine similarity of the concept vectors.
¤ Experiment: Using search results of Bing, compare the correlation
of query-ad concept similarity and CTR.
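The experiment compares concept similarity with CTR by correlation. A minimal sketch using the Pearson coefficient follows; the choice of Pearson is an assumption, the function name is hypothetical, and the similarity scores are invented placeholders rather than measured values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Placeholder query-ad concept similarities paired with click-through rates.
similarity = [0.9, 0.4, 0.1]
ctr = [0.0201, 0.0022, 0.0000]

print(pearson(similarity, ctr) > 0)
```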
46. Sponsored link bid keywords
Bid keywords for sponsored links=
{ Rockport, Shoes }
User Query =
{ Rockport men shoes }
Show sponsored links
when bid keywords and query match!
47. Query-ad clickthrough
Ad-bids         | Query                     | CTR
rockport shoes  | rockport men boots        | 0.0201
rockport shoes  | florsheim shoes           | 0.0022
rockport shoes  | men dockers shoes         | 0.0000
replica watches | breitling copy watches    | 0.0833
replica watches | replica                   | 0.0833
replica watches | tiffany replica bracelet  | 0.0064
free email      | e mail                    | 0.0454
free email      | windows mail              | 0.0294
free email      | set up free email account | 0.0232
48. Equal weighting phrase conceptualization
Bid keywords for sponsored links =
{ Rockport, Shoes }
CDC of "Rockport":
company 0.366
brand 0.255
town 0.183
shoe 0.071
shoe company 0.058
neighboring town 0.054
popular name brand 0.010
top brand 3.49E-08
popular brand 3.01E-08
top name 2.38E-08
CDC of "Shoes":
accessory 0.092
clothes 0.051
equipment 0.049
essential 0.045
garment 0.045
shoe 0.042
fashion accessory 0.034
touch 0.033
textile 0.029
surface 0.029
How to combine two CDC results?
49. URL title and Query Conceptualization
User Query =
{ Bayesian Topic Model }
Title of this page
{ Latent Dirichlet allocation –
Wikipedia, the free encyclopedia }
Retrieve web pages
based on concept similarities
between URL-title and query
50. IDF Weighting Phrase Conceptualization
Title of Web page
{ Latent Dirichlet allocation – Wikipedia, the free encyclopedia }
User Query =
{ Bayesian Topic Model }
Are these important concepts for retrieval?
How to combine CDC results of query and title?
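The two combination schemes named in the slide titles, equal weighting and IDF weighting, can be sketched as a weighted sum of per-word concept vectors. The function names, document frequencies, and concept vectors below are hypothetical, and the standard `log(N / df)` form of IDF is an assumption.

```python
import math


def combine(concepts_by_word, weights=None):
    """Weighted sum of per-word concept vectors, renormalized to sum to 1.

    concepts_by_word: word -> {concept: probability}
    weights:          word -> weight; equal weighting when omitted
    """
    if weights is None:
        weights = {word: 1.0 for word in concepts_by_word}
    combined = {}
    for word, vector in concepts_by_word.items():
        for concept, prob in vector.items():
            combined[concept] = combined.get(concept, 0.0) + weights[word] * prob
    total = sum(combined.values())
    if total == 0:
        return combined
    return {concept: score / total for concept, score in combined.items()}


def idf_weights(doc_freq, n_docs):
    """Inverse document frequency per word: log(N / df)."""
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


# Made-up concept vectors and document frequencies for two title words.
title_concepts = {
    "wikipedia": {"website": 1.0},
    "dirichlet": {"distribution": 1.0},
}
doc_freq = {"wikipedia": 500, "dirichlet": 5}

equal = combine(title_concepts)
weighted = combine(title_concepts, idf_weights(doc_freq, n_docs=1000))
print(equal["distribution"], round(weighted["distribution"], 3))
```

With IDF weighting, concepts from rare, informative words such as "Dirichlet" count for more than those from frequent words such as "Wikipedia".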
51. Correlation between CTR and avg. similarity
CDC achieves higher correlations between average similarity and CTR
Model       | Correlation
CDC-IDF-100 | 0.818
CDC-IDF-200 | 0.827
CDC-IDF-300 | 0.838
CDC-EQ-100  | 0.932
CDC-EQ-200  | 0.952
CDC-EQ-300  | 0.955
Keyword     | 0.259
IJCAI 11    | 0.243