SOCIAL METAPHOR DETECTION VIA TOPICAL ANALYSIS
Ting-Hao (Kenneth) Huang windx@cmu.edu
Language Technologies Institute, Carnegie Mellon University
2013/10/13
2/17
Selectional Preference
- http://www.flickr.com/photos/uxud/3205640524/
- http://www.flickr.com/photos/epc/313661914/
A verb has selectional preferences for its arguments.
Can you eat money?
3/17
Research Questions
Could we capture metaphors in social media by selectional preference?
If yes, how?
Is it for verbs only?
If not, why not?
Could a topic model help?
4/17
Outline
- Selectional Preference
- 3-Step Framework
  1. Pre-processing
  2. Modeling & Detection
  3. Post-processing
- Topical Analysis
- Experiment & Result
- Conclusion
5/17
Selectional Preference
- Selectional Association (SA) (Resnik, 1997)
p: predicate
c: noun class
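For reference, Resnik (1997) defines selectional association as each noun class's share of the predicate's selectional preference strength: SA(p, c) = (1 / S(p)) * P(c|p) * log(P(c|p) / P(c)), where S(p) = sum over c of P(c|p) * log(P(c|p) / P(c)). A minimal sketch of that computation follows, assuming probabilities are estimated from raw co-occurrence counts; the function name and the toy counts are hypothetical and not taken from the talk.

```python
import math
from collections import Counter, defaultdict


def selectional_association(pair_counts):
    """pair_counts maps (predicate, noun_class) -> co-occurrence count.

    Returns (strength, sa) where strength[p] is S(p) and
    sa[(p, c)] is SA(p, c), both following Resnik (1997).
    """
    total = sum(pair_counts.values())
    pred_counts = Counter()
    class_counts = Counter()
    for (pred, cls), n in pair_counts.items():
        pred_counts[pred] += n
        class_counts[cls] += n

    contrib = {}                    # P(c|p) * log(P(c|p) / P(c)) per pair
    strength = defaultdict(float)   # S(p): sum of contributions over classes
    for (pred, cls), n in pair_counts.items():
        if n == 0:
            continue                # unseen pairs contribute nothing
        p_c_given_p = n / pred_counts[pred]
        p_c = class_counts[cls] / total
        term = p_c_given_p * math.log(p_c_given_p / p_c)
        contrib[(pred, cls)] = term
        strength[pred] += term

    # Normalise each contribution by the predicate's strength.
    sa = {
        (pred, cls): term / strength[pred]
        for (pred, cls), term in contrib.items()
        if strength[pred] > 0
    }
    return dict(strength), sa


# Hypothetical toy counts, for illustration only.
counts = {
    ("eat", "food"): 8, ("eat", "money"): 1, ("eat", "human"): 1,
    ("spend", "money"): 9, ("spend", "food"): 1,
}
strength, sa = selectional_association(counts)
for pair in sorted(sa):
    print(pair, round(sa[pair], 3))
```

With these toy counts, "money" is under-represented as an argument of "eat", so SA(eat, money) comes out negative, which is the kind of pair the framework treats as a candidate violation.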
6/17
3-Step Framework
1. Pre-processing: Word Extraction & Noun Clustering
2. Modeling & Detection: SA Outlier Detection
3. Post-processing: SA Strength Filter
7/17
Step 1: Pre-processing (1)
- Word Extraction
  - Why?
    - Parsing and POS tagging are hard on noisy data
  - How?
    - Use the lemma form
    - Set a minimum term frequency (TF)
    - Set a minimum “POS rate”
      - The proportion of a word's occurrences tagged with a given POS
    - Thresholds for predicates should be stricter than those for nouns
      - Noun: TF > 5, POS rate >= 0.7
      - Verb & Adj: TF > 50, POS rate >= 0.8
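The thresholds above can be sketched as a simple filter over tagger output. This is an illustration under stated assumptions, not the talk's implementation: TF is assumed to count all occurrences of a lemma regardless of tag, the tag names `NOUN`, `VERB`, and `ADJ` are placeholders for whatever the tagger emits, and `extract_words` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Thresholds from the slide: TF must exceed the first value,
# POS rate must be at least the second value.
THRESHOLDS = {"NOUN": (5, 0.7), "VERB": (50, 0.8), "ADJ": (50, 0.8)}


def extract_words(tagged_lemmas):
    """tagged_lemmas: iterable of (lemma, pos) pairs.

    Returns {pos: set of lemmas} passing both thresholds for that POS.
    Assumption: TF is the lemma's total frequency over all tags.
    """
    tf = Counter()
    pos_tf = defaultdict(Counter)
    for lemma, pos in tagged_lemmas:
        tf[lemma] += 1
        pos_tf[lemma][pos] += 1

    kept = {pos: set() for pos in THRESHOLDS}
    for lemma, total in tf.items():
        for pos, (min_tf, min_rate) in THRESHOLDS.items():
            pos_rate = pos_tf[lemma][pos] / total
            if total > min_tf and pos_rate >= min_rate:
                kept[pos].add(lemma)
    return kept


# Hypothetical toy input: "light" is split evenly between ADJ and NOUN,
# so its POS rate is too low for either list.
tagged = (
    [("money", "NOUN")] * 8
    + [("run", "VERB")] * 45 + [("run", "NOUN")] * 10
    + [("take", "VERB")] * 60
    + [("light", "ADJ")] * 30 + [("light", "NOUN")] * 30
)
kept = extract_words(tagged)
print(kept)
```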
8/17
Step 1: Pre-processing (2)
- Semantic Noun Clustering
  1. Top 100 similar nouns (e.g., money: 1. funds, 2. cash, 3. profits, 4. millions, 5. monies, 6. dollars, 7. royalties, …)
  2. Weighted directed graph for nouns
  3. Spectral clustering
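One way the graph-building step might look is sketched below. The slide does not say how edges are weighted, so the reciprocal-rank weight here is purely an assumption, as are the helper name `build_affinity` and the toy lists. Spectral clustering generally expects a symmetric affinity matrix, so the sketch averages the two directions of each edge; the result could then be handed to an off-the-shelf implementation such as scikit-learn's `SpectralClustering` with `affinity="precomputed"`.

```python
def build_affinity(similar, nouns):
    """similar: {noun: ranked list of its most similar nouns}.

    Returns a symmetric affinity matrix (list of lists) over `nouns`.
    Assumption: a directed edge noun -> other has weight 1 / rank.
    """
    index = {noun: i for i, noun in enumerate(nouns)}
    size = len(nouns)
    directed = [[0.0] * size for _ in range(size)]
    for noun, ranked in similar.items():
        for rank, other in enumerate(ranked, start=1):
            if noun in index and other in index:
                directed[index[noun]][index[other]] = 1.0 / rank

    # Symmetrise by averaging the two directions of each edge.
    return [
        [(directed[i][j] + directed[j][i]) / 2 for j in range(size)]
        for i in range(size)
    ]


# Hypothetical toy similarity lists (the talk uses the top 100 per noun).
nouns = ["money", "funds", "cash", "apple"]
similar = {
    "money": ["funds", "cash"],
    "cash": ["money", "funds"],
    "funds": ["money"],
    "apple": [],
}
affinity = build_affinity(similar, nouns)
for noun, row in zip(nouns, affinity):
    print(noun, row)
```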
9/17
Step 2: Modeling & Detection
- Selectional Association
Another Candidate: Semantic Outlier Word Detection
- “Semantic Coherence” outlier (Inkpen et al., 2005)
- Based on pair-wise word semantic similarity
- Very high false-positive rate
  - The influence of “general words”
  - Semantic similarity is not reliable
[Figure: the predicate “Eat” and noun classes Seafood, Fruit, Money, Human, …, with scores SA1, SA2, …, SA(n-1), SAn; one entry is marked “!”]
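The SA-based detection step can be sketched as flagging (predicate, noun class) pairs whose score falls below a cut-off. Reading "outlier" as "negative SA" is an assumption suggested by the experiment setting later in the deck; the function name, the default threshold of 0, and the scores below are hypothetical.

```python
def flag_outliers(sa_scores, threshold=0.0):
    """sa_scores: {(predicate, noun_class): SA value}.

    Returns the pairs scoring below `threshold`, lowest score first.
    Assumption: an outlier is simply a pair with SA below the threshold.
    """
    flagged = [(pair, score) for pair, score in sa_scores.items()
               if score < threshold]
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical SA values for the predicate "eat".
sa_scores = {
    ("eat", "seafood"): 0.62,
    ("eat", "fruit"): 0.55,
    ("eat", "human"): 0.04,
    ("eat", "money"): -0.21,
}
outliers = flag_outliers(sa_scores)
print(outliers)
```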
10/17
Step 3: Post-processing
- SA Strength Filtering (Shutova et al., 2010)
- SA Strength
  - Strong (e.g., filmmake)
  - Weak (e.g., “light verbs”: put, take, …)
- Predicates with weak selectional preference rarely “violate” their own preference.
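The filter can be sketched as dropping candidate pairs whose predicate has low selectional preference strength S(p). The cut-off value, the strength numbers, and the helper name below are hypothetical; the slide does not state the threshold that was used.

```python
def filter_by_strength(candidates, strength, min_strength):
    """Keep candidate (predicate, noun_class) pairs whose predicate has
    selectional preference strength of at least `min_strength`.

    Predicates missing from `strength` are treated as having strength 0.
    """
    return [
        (pred, cls) for (pred, cls) in candidates
        if strength.get(pred, 0.0) >= min_strength
    ]


# Hypothetical strengths: a strongly selective verb and a light verb.
strength = {"filmmake": 2.4, "take": 0.3}
candidates = [("filmmake", "idea"), ("take", "idea")]
kept_pairs = filter_by_strength(candidates, strength, min_strength=1.0)
print(kept_pairs)
```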
11/17
Topical Analysis
[Figure: topic labels Entertainment, Food & Cook, Finance, Politics]
12/17
Data
- Online breast cancer support community
- All public posts from Oct 2001 to Jan 2011
- 90,242 unique users who posted 1,562,459 messages belonging to 68,158 discussion threads (Wang et al., 2012; Wen et al., 2013)
13/17
Experiment Setting
Pre-processing:
- Stanford NLP/Parser
- 55k nouns, 3k adjectives, and 1.8k verbs
Modeling & Detection:
- 3 dependency types: nsubj, dobj, amod
- Observe negative pairs
Post-processing:
- Follow Shutova et al. (2010)
Topic Model: JGibbLDA, 20 topics (k = 20)
14/17
Result
- Most outliers are NOT metaphors
  - Parsing error: “…yearly breast MRI…”: amod(breast, yearly)
  - Non-metaphor: “…cancer cells float around in my blood…”: dobj(float, cancer)
  - Metonymy: “If John win tomorrow night, …”: dobj(win, tomorrow)
- Only very few metaphors are identified
  - “…keep my head occupied …”: nsubj(occupy, head)
  - “… my belly has overtaken the boobs …”: nsubj(overtake, belly)
- The topic model does NOT help much
15/17
Discussion & Conclusion
Could we capture metaphors in social media by selectional preference?
If yes, how? Is it for verbs only?
If not, why not? Could a topic model help?
- Maybe not by fully automatic approaches.
- Good parsing is challenging on social media.
- Outliers of SA are not always metaphors.
- Topic modeling does not help much.
- Maybe a seed-expansion method works better.
- No, it could also work for the amod dependency.
16/17
Thanks!
- Acknowledgement
  - Zi Yang, Prof. Teruko Mitamura, and Prof. Eric Nyberg for academic support, and Yi-Chia Wang and Dong Nguyen for data collection.
  - Supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Defense US Army Research Laboratory contract number W911NF-12-C-0020.
- Main References
  - Resnik, P. (1997). Selectional preference and sense disambiguation. In Proceedings of the ACL'97 SIGLEX Workshop.
  - Shutova, E., Sun, L., & Korhonen, A. (2010). Metaphor identification using verb and noun clustering. ACL'10.
  - Wang, Y.-C., Kraut, R. E., & Levine, J. M. (2012). To Stay or Leave? The Relationship of Emotional and Informational Support to Commitment in Online Health Support Groups. CSCW'2012.
17/17
License
- Except where otherwise noted, content on this slide is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
- Material used
  - Background image of slide 2: uxud@Flickr, “Plate of money”, CC BY 2.0
    http://www.flickr.com/photos/uxud/3205640524/
  - Background image of slides 3 and 15: jasonahowie@Flickr, “Social Media apps”, CC BY 2.0
    http://www.flickr.com/photos/jasonahowie/8583949219/
  - Background image of slide 12: susangkomenforthecure@Flickr, “Susan G. Komen walkers gear up and take on Day 1 for breast cancer awareness.”, CC BY-NC-ND 2.0
    http://www.flickr.com/photos/susangkomenforthecure/9623480334/
  - Background image of slide 16: jam_project@Flickr, “IMG_0405 [2011-08-23]”, CC BY-NC-ND 2.0
    http://www.flickr.com/photos/jam_project/6075715482/
