Ganesan & Zhai 2012, Information Retrieval, Vol 15, Number 2




Kavita Ganesan (www.kavita-ganesan.com)
University of Illinois at Urbana-Champaign
   Currently: No easy or direct way of finding
    entities (e.g. products, people, businesses)
    based on online opinions

   You need to read opinions about different
    entities to find entities that fulfill personal
    criteria
    e.g. finding mp3 players with ‘good sound quality’
   This is a time-consuming process that impairs
    user productivity!
   Use existing opinions to rank entities based on
    a set of unstructured user preferences

   Example of user preferences:
     Finding a hotel: “clean rooms, heated pools”
     Finding a restaurant: “authentic food, good ambience”
   Most obvious way: use results of existing
    opinion mining methods
     Find sentiment ratings on various aspects
      ▪ For example, for an mp3 player: find ratings for screen, sound,
        battery life aspects
      ▪ Then, rank entities based on these discovered aspect ratings
     Problem: this is not practical!
      ▪ Costly – It is costly to mine large amounts of textual content
      ▪ Prior knowledge – You need to know the set of queryable
        aspects in advance, so you may have to define aspects for
        each domain, either manually or through text mining
      ▪ Supervision – Most of the existing methods rely on some form
        of supervision like the presence of overall user ratings. Such
        information may not always be available.
   Leverage Existing Text Retrieval Models
   Why?
     Retrieval models can scale up to large amounts of
      textual content
     The models themselves can be tweaked or
      redefined
     This does not require costly information extraction
      or text mining
Leveraging robust text retrieval models
   [Diagram: the reviews of each entity (Entity 1, Entity 2, Entity 3) are indexed. A retrieval model (BM25, LM, PL2) ranks the entities by keyword match between the user preferences (the query) and the textual reviews, producing a ranked list such as Entity 3, Entity 2, Entity 1.]
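
The basic setup can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that all reviews of an entity are concatenated into one document, and it uses one common BM25 formulation with assumed parameter values `k1=1.2` and `b=0.75`. The function names `tokenize` and `bm25_rank` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def bm25_rank(entity_reviews, query, k1=1.2, b=0.75):
    """Rank entities by the BM25 score of their reviews against the query.

    entity_reviews maps an entity name to a list of review strings.
    Returns (entity, score) pairs sorted by descending score.
    """
    # Assumption: one document per entity, built from all of its reviews.
    docs = {e: tokenize(" ".join(rs)) for e, rs in entity_reviews.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs

    # Document frequency: in how many entity documents each term occurs.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for entity, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[entity] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    reviews = {
        "Hotel A": ["clean rooms and a heated pool"],
        "Hotel B": ["dirty rooms, rude staff"],
        "Hotel C": ["great location"],
    }
    for entity, score in bm25_rank(reviews, "clean rooms, heated pools"):
        print(entity, round(score, 3))
```

Note that plain keyword matching treats "pool" and "pools" as different terms; a real index would normally apply stemming.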
   Based on the basic setup, this ranking problem seems
    similar to the regular document retrieval problem
   However, there are important differences:
1. The query is meant to express a user's preferences in keywords
    Query is expected to be longer than regular keyword queries
    Query may contain sub-queries expressing preferences for different
     aspects
    It may actually be beneficial to model these semantic aspects

2. Ranking is to capture how well an entity satisfies a user's
   preferences
    Not the relevance of a document to a query (as in regular retrieval)
    The matching of opinion/sentiment words would be important in
     this case
   Investigate use of text retrieval models for the
    task of Opinion-Based Entity Ranking

   Explore some extensions over IR models

   Propose evaluation method for the ranking task

   User Study
     To determine if results make sense to users
     Validate effectiveness of evaluation method
   In standard text retrieval we cannot distinguish
    the multiple preferences in a query.
    For example: “clean rooms, cheap, good service”
     Would be treated as a long keyword query even
      though there are 3 preferences in the query
     Problem with this is that an entity may score highly
      because of matching one aspect extremely well

   To improve this:
     We try to score each preference separately and then
      combine the results
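Aspect Queries
   [Diagram: without aspect modeling, the whole query “clean rooms, cheap, good service” goes to the retrieval model and yields a single result list. With aspect queries, “clean rooms”, “cheap”, and “good service” are scored separately by the retrieval model, giving result sets 1, 2, and 3, which are then combined into the final results.]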
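
The idea of scoring each preference separately and then combining the results can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: `toy_score` is a hypothetical stand-in for a real retrieval model, splitting the query on commas is an assumed way of finding the aspects, and the combination rule (scale each aspect's scores to [0, 1] across entities, then average) is an assumption, since the slides only say that the results are combined.

```python
def split_aspects(query):
    """Split a comma-separated preference query into aspect queries."""
    return [part.strip() for part in query.split(",") if part.strip()]


def toy_score(document, query):
    """Hypothetical stand-in for a retrieval model: raw query-term counts."""
    words = document.lower().split()
    terms = query.lower().replace(",", " ").split()
    return float(sum(words.count(term) for term in terms))


def qam_rank(entity_docs, query, score_fn=toy_score):
    """Score every aspect query separately, then combine per entity."""
    aspects = split_aspects(query)
    combined = {entity: 0.0 for entity in entity_docs}
    for aspect in aspects:
        raw = {e: score_fn(doc, aspect) for e, doc in entity_docs.items()}
        top = max(raw.values())
        for entity, score in raw.items():
            # Assumed combination: scale each aspect to [0, 1], then average,
            # so that no single aspect can dominate the final score.
            scaled = score / top if top > 0 else 0.0
            combined[entity] += scaled / len(aspects)
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "Hotel A": "cheap cheap cheap cheap cheap cheap",
        "Hotel B": "clean rooms cheap good service",
    }
    query = "clean rooms, cheap, good service"
    print("long query:", {e: toy_score(d, query) for e, d in docs.items()})
    print("aspect queries:", qam_rank(docs, query))
```

In this toy data, Hotel A wins under the long keyword query only because it matches "cheap" many times, while the aspect-based ranking prefers Hotel B, which matches all three preferences.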
   In standard retrieval models the matching of
    an opinion word & a standard topic word is
    not distinguished

   However, with Opinion-Based Entity Ranking:
     It is important to match opinion words in the
      query, but opinion words tend to have more
      variation than topic words
     Solution: Expand a query with similar opinion
      words to help emphasize the matching of opinions
   [Diagram: the query “Fantastic battery life” has a similar meaning to review text such as “Good battery life”, “Great battery life”, and “Excellent battery life”. Adding synonyms of the word “fantastic” gives the expanded query “fantastic, good, great, excellent… battery life”, which matches these review documents.]
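
Opinion expansion can be sketched as a simple query rewrite. The synonym table `OPINION_SYNONYMS` below is hypothetical and tiny; the slides do not say where the similar opinion words come from, so any real system would need its own resource for that step.

```python
# Hypothetical synonym table, for illustration only.
OPINION_SYNONYMS = {
    "fantastic": ["good", "great", "excellent"],
    "clean": ["spotless", "tidy"],
}


def expand_opinions(query, synonyms=OPINION_SYNONYMS):
    """Add similar opinion words after each opinion word in the query."""
    expanded = []
    for word in query.lower().split():
        expanded.append(word)
        for synonym in synonyms.get(word, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return " ".join(expanded)


if __name__ == "__main__":
    print(expand_opinions("Fantastic battery life"))
    # prints "fantastic good great excellent battery life"
```

Topic words such as "battery" and "life" are left untouched, so only the opinion part of the query gains extra ways to match.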
   Document Collection

   Gold Standard: Relevance Judgement

   User Queries

   Evaluation Measure
   Document Collection:
     Reviews of Hotels – Tripadvisor
     Reviews of Cars – Edmunds



   [Diagram: each review consists of free text plus numerical aspect ratings; the numerical aspect ratings provide the gold standard.]
   Gold Standard:
     Needed to assess performance of ranking task


   For each entity & for each aspect (in dataset):
     Average numerical ratings across reviews. This will
      give the judgment score for each aspect
     Assumption:
      Since the numerical ratings were given by users,
      this would be a good approximation to actual
      human judgment
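
The averaging step can be sketched as follows. The input layout (one `(entity, ratings)` pair per review) and the function name `aspect_judgments` are assumptions made for this example; the rating values are invented.

```python
from collections import defaultdict


def aspect_judgments(reviews):
    """Average the numerical ratings per entity and per aspect.

    reviews is an iterable of (entity, {aspect: rating}) pairs, one per review.
    Returns {entity: {aspect: mean rating}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for entity, ratings in reviews:
        for aspect, rating in ratings.items():
            totals[entity][aspect] += rating
            counts[entity][aspect] += 1
    return {
        entity: {a: totals[entity][a] / counts[entity][a] for a in totals[entity]}
        for entity in totals
    }


if __name__ == "__main__":
    sample = [
        ("Car X", {"performance": 5, "comfort": 3}),
        ("Car X", {"performance": 4}),
        ("Car Y", {"performance": 2}),
    ]
    print(aspect_judgments(sample))
```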
   Gold Standard:
    Ex. User looking for cars with “good performance”
     Ideally, the system should return cars with high
      numerical ratings on the performance aspect
     Otherwise, we can say that the system is not doing
      well in ranking
   User Queries
     Semi-synthetic queries
     We were not able to obtain a natural sample of queries

     Ask users to specify preferences on different aspects
      of car & hotel based on aspects available in dataset
      ▪ Seed queries
      ▪ Ex. Fuel: “good gas mileage”, “great mpg”

     Randomly combine seed queries from different
      aspects to form synthetic queries
      ▪ Ex. Query 1: “great mpg, reliable car”
      ▪ Ex. Query 2: “comfortable, good performance”
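
The random combination of seed queries can be sketched as follows. The number of aspects per query and the function name `make_synthetic_queries` are assumptions; the seed phrases are taken from the examples above.

```python
import random


def make_synthetic_queries(seed_queries, n_queries, aspects_per_query=2, rng=None):
    """Build queries by joining seed queries drawn from different aspects.

    seed_queries maps an aspect name to a list of user-written seed queries.
    """
    if rng is None:
        rng = random.Random()
    aspect_names = sorted(seed_queries)
    queries = []
    for _ in range(n_queries):
        # Pick distinct aspects, then one seed query for each of them.
        chosen = rng.sample(aspect_names, aspects_per_query)
        queries.append(", ".join(rng.choice(seed_queries[a]) for a in chosen))
    return queries


if __name__ == "__main__":
    seeds = {
        "fuel": ["good gas mileage", "great mpg"],
        "reliability": ["reliable car"],
        "comfort": ["comfortable"],
        "performance": ["good performance"],
    }
    for query in make_synthetic_queries(seeds, 3, rng=random.Random(0)):
        print(query)
```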
   Evaluation Measure: nDCG
     This measure is ideal because it is based on
      multiple levels of ranking
     The numerical ratings used as judgment scores have
      a range of values, and nDCG supports such graded
      judgments.
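
A minimal nDCG sketch with graded judgments is shown below. The slides do not state which DCG variant or cutoff was used, so the linear gain, the log2 discount, and the default `k=10` are assumptions, as are the function names.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: linear gain with a log2(rank + 1) discount."""
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains))


def ndcg_at_k(ranked_entities, judgments, k=10):
    """nDCG@k of a system ranking against graded judgment scores.

    judgments maps an entity to its judgment score, for example the
    average numerical rating on the queried aspect.
    """
    gains = [judgments.get(entity, 0.0) for entity in ranked_entities[:k]]
    ideal_gains = sorted(judgments.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    scores = {"Car X": 4.5, "Car Y": 3.0, "Car Z": 1.0}
    print(ndcg_at_k(["Car X", "Car Y", "Car Z"], scores))
    print(ndcg_at_k(["Car Z", "Car Y", "Car X"], scores))
```

A ranking that orders entities by descending judgment score reaches the maximum value, and any other ordering of entities with different scores gets a lower one.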
   Users were asked to manually determine the relevance
    of system generated rankings to a set of queries

Two reasons for user study:
 Validate that results made sense to real users
     On average, users thought that the entities retrieved by the
      system were a reasonable match to the queries

   Validate effectiveness of gold standard rankings
     Gold standard ranking has relatively strong agreement
      with user rankings. This means the gold standard based on
      numerical ratings is a good approximation to human
      judgment
   [Figure: two bar charts, Hotels (y-axis 0.0% to 8.0%) and Cars (y-axis 0.0% to 2.5%), showing the improvement in ranking using QAM and using QAM + OpinExp for PL2, LM, and BM25. Both charts carry the annotation “Most effective on BM25 (p23)”.]
   Lightweight approach to ranking entities based
    on opinions
     Use existing text retrieval models

   Explored some enhancements over retrieval
    models
     Namely opinion expansion & query aspect modeling
     Both showed some improvement in ranking

   Proposed evaluation method using user ratings
     User study shows that the evaluation method is sound
     This method can be used for future evaluation tasks

Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Opinion-Based Entity Ranking

  • 1. Ganesan & Zhai 2012, Information Retrieval, Vol 15, Number 2. Kavita Ganesan (www.kavita-ganesan.com), University of Illinois @ Urbana Champaign. Links: Journal, Project Page
  • 2. Currently: no easy or direct way of finding entities (e.g. products, people, businesses) based on online opinions
      ▪ You need to read opinions about different entities to find entities that fulfill personal criteria, e.g. finding mp3 players with ‘good sound quality’
  • 3. Currently: no easy or direct way of finding entities (e.g. products, people, businesses) based on online opinions
      ▪ You need to read opinions about different entities to find entities that fulfill personal criteria, e.g. finding mp3 players with ‘good sound quality’
      ▪ This is a time-consuming process that impairs user productivity!
  • 4. Use existing opinions to rank entities based on a set of unstructured user preferences
      ▪ Examples of user preferences:
          ▪ Finding a hotel: “clean rooms, heated pools”
          ▪ Finding a restaurant: “authentic food, good ambience”
  • 5. Most obvious way: use results of existing opinion mining methods
      ▪ Find sentiment ratings on various aspects
          ▪ For example, for an mp3 player: find ratings for the screen, sound, and battery life aspects
          ▪ Then, rank entities based on these discovered aspect ratings
      ▪ The problem is that this is not practical!
          ▪ Costly: it is costly to mine large amounts of textual content
          ▪ Prior knowledge: you need to know the set of queryable aspects in advance, so you may have to define aspects for each domain either manually or through text mining
          ▪ Supervision: most of the existing methods rely on some form of supervision, like the presence of overall user ratings. Such information may not always be available.
  • 6. Leverage existing text retrieval models. Why?
      ▪ Retrieval models can scale up to large amounts of textual content
      ▪ The models themselves can be tweaked or redefined
      ▪ This does not require costly information extraction or text mining
  • 7. Leveraging robust text retrieval models. [Diagram: the user preferences (query) are matched by keyword against the indexed reviews of Entity 1, Entity 2 and Entity 3 using retrieval models (BM25, LM, PL2), which produce a rank for each entity.]
  • 8. Leveraging robust text retrieval models. [Diagram: the user preferences (query) are matched by keyword against the indexed entity reviews using retrieval models (BM25, LM, PL2); the entities are now shown in ranked order: Entity 3, Entity 2, Entity 1.]
  • 9. Based on the basic setup, this ranking problem seems similar to the regular document retrieval problem. However, there are important differences:
      1. The query is meant to express a user's preferences in keywords
          ▪ The query is expected to be longer than regular keyword queries
          ▪ The query may contain sub-queries expressing preferences for different aspects
          ▪ It may actually be beneficial to model these semantic aspects
      2. Ranking is to capture how well an entity satisfies a user's preferences
          ▪ Not the relevance of a document to a query (as in regular retrieval)
          ▪ The matching of opinion/sentiment words would be important in this case
  • 10. Investigate the use of text retrieval models for the task of Opinion-Based Entity Ranking
      ▪ Explore some extensions over IR models
      ▪ Propose an evaluation method for the ranking task
      ▪ User study
          ▪ To determine if results make sense to users
          ▪ To validate the effectiveness of the evaluation method
  • 11. In standard text retrieval we cannot distinguish the multiple preferences in a query. For example, “clean rooms, cheap, good service” would be treated as one long keyword query even though it contains 3 preferences.
      ▪ The problem with this is that an entity may score highly because it matches one aspect extremely well
      ▪ To improve this, we try to score each preference separately and then combine the results
  • 12. Aspect Queries. [Diagram: the query “clean rooms, cheap, good service” is split into the aspect queries “clean rooms”, “cheap” and “good service”; each aspect query is scored separately by the retrieval model, giving result sets 1, 2 and 3, and the results are then combined.]
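
The aspect-query idea on slides 11 and 12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_score` is a stand-in for a real retrieval model such as BM25, splitting on commas and averaging the per-aspect scores are assumed choices (the slides do not say how aspects are detected or how results are combined), and all function names are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_score(query: str, doc: str) -> float:
    """Stand-in scorer (NOT BM25): fraction of query terms found in the document."""
    q_terms = tokenize(query)
    d_terms = set(tokenize(doc))
    if not q_terms:
        return 0.0
    return sum(t in d_terms for t in q_terms) / len(q_terms)


def split_aspects(query: str) -> list[str]:
    """Assumption: preferences in the query are separated by commas."""
    return [part.strip() for part in query.split(",") if part.strip()]


def qam_score(query: str, doc: str, scorer=toy_score) -> float:
    """Score each aspect query separately, then combine (here: plain average)."""
    aspects = split_aspects(query)
    if not aspects:
        return 0.0
    return sum(scorer(a, doc) for a in aspects) / len(aspects)


def rank_entities(query: str, reviews: dict[str, str]) -> list[tuple[str, float]]:
    """Rank entities by the combined aspect score of their review text."""
    scored = [(entity, qam_score(query, text)) for entity, text in reviews.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With this combination rule, an entity whose reviews only ever mention “clean rooms” gets credit for one of the three aspects, however strongly it matches that one aspect.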
  • 13. In standard retrieval models, the matching of an opinion word and the matching of a standard topic word are not distinguished. However, with Opinion-Based Entity Ranking:
      ▪ It is important to match opinion words in the query, but opinion words tend to have more variation than topic words
      ▪ Solution: expand the query with similar opinion words to help emphasize the matching of opinions
  • 14. [Diagram: the query “Fantastic battery life” shown next to review documents that contain phrases with a similar meaning: “Good battery life”, “Great battery life”, “Excellent battery life”.]
  • 15. [Diagram: synonyms of the word “fantastic” are added to the query “Fantastic battery life”, giving the expanded query “Fantastic, good, great, excellent… battery life”, which can now match review documents containing “Good battery life”, “Great battery life” and “Excellent battery life”.]
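
The opinion expansion on slides 13 to 15 might look roughly like the following sketch. The synonym table is hand-written for illustration and `expand_query` is a hypothetical name; the slides do not say where the similar opinion words come from.

```python
# Hypothetical, hand-written synonym table; only the slide's example is included.
OPINION_SYNONYMS = {
    "fantastic": ["good", "great", "excellent"],
}


def expand_query(query: str, synonyms=OPINION_SYNONYMS) -> list[str]:
    """Return the query terms, adding similar opinion words after each opinion word."""
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        # Topic words such as "battery" have no entry and are left unchanged.
        expanded.extend(synonyms.get(term, []))
    return expanded
```

For the slide's example, `expand_query("Fantastic battery life")` yields the terms fantastic, good, great, excellent, battery, life, which can then be passed to the retrieval model as the expanded query.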
  • 16. Evaluation setup:
      ▪ Document Collection
      ▪ Gold Standard: Relevance Judgement
      ▪ User Queries
      ▪ Evaluation Measure
  • 17. Document Collection:
      ▪ Reviews of hotels: Tripadvisor
      ▪ Reviews of cars: Edmunds
      [Diagram labels: numerical aspect ratings (gold standard); free text reviews]
  • 18. Gold Standard:
      ▪ Needed to assess performance of the ranking task
      ▪ For each entity and for each aspect (in the dataset): average the numerical ratings across reviews. This gives the judgment score for each aspect
      ▪ Assumption: since the numerical ratings were given by users, this would be a good approximation to actual human judgment
  • 19. Gold Standard: e.g. a user looking for cars with “good performance”
      ▪ Ideally, the system should return cars with high numerical ratings on the performance aspect
      ▪ Otherwise, we can say that the system is not doing well in ranking
      [Diagram callout: “Should have high ratings on performance”]
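
The gold standard on slides 18 and 19 is an average of numerical ratings per entity and per aspect. A minimal sketch, assuming each review is available as an `(entity, {aspect: rating})` pair; that input layout and the function name are assumptions, not the dataset's actual format:

```python
from collections import defaultdict


def gold_standard(reviews):
    """Average the numerical ratings per entity and per aspect.

    reviews: iterable of (entity, {aspect: rating}) pairs (assumed layout).
    Returns {entity: {aspect: mean rating}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for entity, ratings in reviews:
        for aspect, rating in ratings.items():
            totals[entity][aspect] += rating
            counts[entity][aspect] += 1
    return {
        entity: {a: totals[entity][a] / counts[entity][a] for a in totals[entity]}
        for entity in totals
    }
```

For a query such as “good performance”, the judgment score of each car would then be its averaged performance rating.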
  • 20. User Queries
      ▪ Semi-synthetic queries
      ▪ Not able to obtain a natural sample of queries
      ▪ Ask users to specify preferences on different aspects of cars and hotels, based on the aspects available in the dataset
          ▪ These are the seed queries
          ▪ Ex. Fuel: “good gas mileage”, “great mpg”
      ▪ Randomly combine seed queries from different aspects to form synthetic queries
          ▪ Ex. Query 1: “great mpg, reliable car”
          ▪ Ex. Query 2: “comfortable, good performance”
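
The random combination of seed queries on slide 20 could be sketched as below. Only the fuel seeds and the example phrases come from the slide; the other aspect names, the number of aspects per query, and the function name are assumptions made for illustration.

```python
import random

# Seed queries per aspect. The "fuel" entries are from the slide; the other
# aspect names are assumed labels for the phrases in the slide's example queries.
SEED_QUERIES = {
    "fuel": ["good gas mileage", "great mpg"],
    "reliability": ["reliable car"],
    "comfort": ["comfortable"],
    "performance": ["good performance"],
}


def make_queries(seeds, n_queries, aspects_per_query=2, rng=None):
    """Build synthetic queries by joining one seed query from each of several
    randomly chosen, distinct aspects."""
    rng = rng or random.Random()
    queries = []
    for _ in range(n_queries):
        aspects = rng.sample(sorted(seeds), aspects_per_query)
        queries.append(", ".join(rng.choice(seeds[a]) for a in aspects))
    return queries
```

Passing a seeded `random.Random` instance as `rng` makes the generated query set reproducible.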
  • 21. Evaluation Measure: nDCG
      ▪ This measure is ideal because it is based on multiple levels of relevance rather than binary judgments
      ▪ The numerical ratings used as judgment scores have a range of values, and nDCG supports this
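
For reference, nDCG over graded judgment scores can be computed as in the sketch below. It uses the common gain / log2(rank + 1) form of DCG; the slides do not state which nDCG variant or cutoff was used, so both are assumptions here.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is discounted by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains_in_system_order, k=None):
    """nDCG of a ranking, given the judgment score of each returned entity
    in the order the system ranked them. k is an optional cutoff."""
    gains = list(gains_in_system_order)
    ideal = sorted(gains, reverse=True)  # best possible ordering of the same items
    if k is not None:
        gains, ideal = gains[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Here the gains would be the averaged aspect ratings from the gold standard, so a ranking that places highly rated entities first scores close to 1.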
  • 22. Users were asked to manually determine the relevance of system-generated rankings to a set of queries. Two reasons for the user study:
      ▪ Validate that results made sense to real users
          ▪ On average, users thought that the entities retrieved by the system were a reasonable match to the queries
      ▪ Validate the effectiveness of the gold standard rankings
          ▪ The gold standard ranking has relatively strong agreement with user rankings. This means the gold standard based on numerical ratings is a good approximation to human judgment
  • 23. [Charts, one for Hotels and one for Cars: improvement in ranking (in %) using QAM and using QAM + OpinExp, for the PL2, LM and BM25 retrieval models. Callout on both charts: “Most effective on BM25 (p23)”.]
  • 24. Lightweight approach to ranking entities based on opinions
      ▪ Use existing text retrieval models
      ▪ Explored some enhancements over retrieval models
          ▪ Namely opinion expansion and query aspect modeling
          ▪ Both showed some improvement in ranking
      ▪ Proposed an evaluation method using user ratings
          ▪ The user study shows that the evaluation method is sound
          ▪ This method can be used for future evaluation tasks

Editor's Notes

  1. So this long keyword query will be split into 3 separate queries, each called an aspect query. These aspect queries are scored separately and the results are then combined.
  2. -
  3. For each entity, average the numerical ratings of each aspect. Assumption: this would be a good approximation to human judgment.
  4. Otherwise, this tells you that the system is not really doing well in ranking.
  5. Could not obtain natural queries, so we used semi-synthetic queries. What we did was… And then we randomly combined queries… to form a set of queries.
  6. Then finally we conducted a user study where users were asked to manually determine the relevance of the system-generated results to the query. This is to validate that the results made sense to real users, and also to validate the effectiveness of the gold standard rankings, which is based on the… Based on this we found that… Which means that this evaluation method can be safely used for similar ranking tasks…