Current Approaches in
Search Result Diversification
         Mario Sangiorgio
Presentation outline
Problem definition

What is diversity?

The relevance/diversity trade-off

Performance evaluation

Open issues and conclusions
Why is result diversification needed?

   A couple of real life examples
Ambiguous query


     Flash
Unambiguous query



 Nuclear power plant
Problem definition


Search result diversification is an optimization
problem: from all relevant results, select the
subset of k items that is both most relevant
and most diverse.
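
One possible way to write this down, as a sketch only. The symbols R (the relevant results), w (the relevance measure), d (the diversity measure) and f (the diversification objective) are notation assumed here; the slides do not fix any.

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq R,\ |S| = k} f(S, w, d)
```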
What is needed



Relevance measure       Diversity measure




           Diversification objective
The result diversification process



Items are ranked by relevance                          Diversity is measured




                 The two measures are used to get the final ranking
What is diversity?
How can items be diverse?

  Word sense diversity,
from ambiguous queries




                  Information source
                     diversity, from
                  unambiguous queries
Measures of diversity

Diversity is tightly coupled with the concept of
                     similarity

To address the different aspects of the problem
         several measures emerged:
             Semantic distance
            Categorical distance
             Novel information
Semantic distance
Diversifies on content dissimilarity

Uses the min-hashing scheme to get the sketch of a document:
$$S_d = \{ MH_{h_1}(d), \ldots, MH_{h_n}(d) \}$$

Distance is computed from Jaccard similarity:
$$sim(u, v) = \frac{|S_u \cap S_v|}{|S_u \cup S_v|}$$
$$d(u, v) = 1 - sim(u, v)$$

Does not work well when document lengths differ
greatly or the sketch size is small
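
A minimal sketch of the semantic distance, under stated assumptions: documents are split into word 3-grams, the hash family is simulated by salting SHA-1 with a seed, and the sketch size defaults to 64. None of these choices come from the slides, and all function names are hypothetical.

```python
import hashlib


def shingles(text, k=3):
    """Split a text into overlapping word k-grams (assumed tokenization)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(token, seed):
    """One member of the hash family: SHA-1 salted with the seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_sketch(text, n=64, k=3):
    """Sketch S_d: the minimum hash value under each of n hash functions.

    Each value is tagged with its seed so that values from different
    hash functions are never confused when sketches are intersected.
    """
    items = shingles(text, k)
    return {(seed, min(_hash(t, seed) for t in items)) for seed in range(n)}


def semantic_distance(text_u, text_v, n=64, k=3):
    """d(u, v) = 1 - |S_u & S_v| / |S_u | S_v|, computed on the sketches."""
    s_u = minhash_sketch(text_u, n, k)
    s_v = minhash_sketch(text_v, n, k)
    return 1.0 - len(s_u & s_v) / len(s_u | s_v)
```

A small `n` makes the estimate noisy, which matches the caveat about small sketch sizes.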
Categorical distance
Emphasizes word sense diversification

  It is based on metadata (Taxonomy)

The measure is a weighted tree distance:
$$d(u, v) = \sum_{i = lca(u, v)}^{l(u)} \frac{1}{2 e^{i-1}} + \sum_{i = lca(u, v)}^{l(v)} \frac{1}{2 e^{i-1}}$$


        Examples of taxonomies:
      /Top/Health vs /Top/Finance
/Top/Sport/Racing vs /Top/Sport/Football
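
A small sketch of the weighted tree distance on taxonomy paths. It assumes that l(u) is the depth of a category with the root at depth 1, that lca(u, v) is the depth of the lowest common ancestor, and that e is a base passed as a parameter (defaulting to Euler's number, since the slide does not say). Returning 0 for identical categories is also an assumption.

```python
import math


def categorical_distance(path_u, path_v, e=math.e):
    """Weighted tree distance between two taxonomy paths like '/Top/Health'."""
    u = [p for p in path_u.split("/") if p]
    v = [p for p in path_v.split("/") if p]
    if u == v:
        return 0.0  # assumption: identical categories are at distance 0

    # Depth of the lowest common ancestor = length of the common prefix.
    lca = 0
    for a, b in zip(u, v):
        if a != b:
            break
        lca += 1
    if lca == 0:
        raise ValueError("paths must share a root category")

    def branch(depth):
        # Sum of 1 / (2 * e**(i - 1)) for i from lca to the node's depth.
        return sum(1.0 / (2.0 * e ** (i - 1)) for i in range(lca, depth + 1))

    return branch(len(u)) + branch(len(v))


far = categorical_distance("/Top/Health", "/Top/Finance")
near = categorical_distance("/Top/Sport/Racing", "/Top/Sport/Football")
```

With these assumptions the two categories that split near the root (`far`) end up more distant than the two siblings deeper in the tree (`near`), because edge weights shrink with depth.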
Novel information
Diversifies in a general sense on content
dissimilarity. Good for subtopics
Results are represented with unigram language
models (used in natural language processing)
For each document, the Kullback-Leibler
divergence measures how much novel
information it brings into the set
How many extra bits are needed to describe
the new document using only the documents
already selected in the set?
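
A minimal sketch of the novelty computation, assuming whitespace tokenization and add-one smoothing over the joint vocabulary (smoothing is needed so the divergence stays finite; the slides do not name a scheme). The function names are hypothetical.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, mu=1.0):
    """Smoothed unigram language model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}


def novelty_bits(new_doc, selected_docs, mu=1.0):
    """KL(new || selected) in bits: extra bits needed to encode the new
    document with a model built only from the already selected documents."""
    new_tokens = new_doc.lower().split()
    seen_tokens = [t for doc in selected_docs for t in doc.lower().split()]
    vocab = set(new_tokens) | set(seen_tokens)
    if not vocab:
        return 0.0
    p = unigram_model(new_tokens, vocab, mu)
    q = unigram_model(seen_tokens, vocab, mu)
    return sum(p[w] * math.log2(p[w] / q[w]) for w in vocab)
```

A document that repeats the selected set scores 0 bits, while a document on a different subtopic scores higher and is therefore preferred by a diversifier.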
Diversity measures: open issues

  Some aspects not taken into account:

    intrinsic properties of the document

          genre of the document

       sentiment regarding the topic
The relevance/diversity
 optimization problem
Diversification objectives
It has been proved impossible to find a function
       that has all the required properties:

               scale invariance
                 consistency
                   richness
                    stability
     independence of irrelevant attributes
                 monotonicity
            strength of relevance
            strength of similarity
Diversification objectives
Several functions proposed:

Max sum (No stability)
Max min (No consistency nor stability)
Max sum of max score (Maximizes relevancy and then diversity)
Mono objective (No consistency)
Max product (It is based on the already chosen results)
Categorical (Results have to cover a set of categories)
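
To make two of these objectives concrete, here is a sketch of max sum and max min as they are commonly stated in the axiomatic framework. The exact constants, the trade-off parameter `lam`, and the toy data are assumptions; the slides give no formulas.

```python
from itertools import combinations


def max_sum(S, w, d, lam=0.5):
    """Sum of relevance plus sum of pairwise distances (assumed form)."""
    k = len(S)
    relevance = (k - 1) * sum(w[u] for u in S)
    diversity = 2 * lam * sum(d(u, v) for u, v in combinations(S, 2))
    return relevance + diversity


def max_min(S, w, d, lam=0.5):
    """Worst relevance plus worst pairwise distance (needs |S| >= 2)."""
    return min(w[u] for u in S) + lam * min(d(u, v) for u, v in combinations(S, 2))


def best_subset(items, k, objective, w, d, lam=0.5):
    """Exhaustive search over all k-subsets; only feasible for tiny inputs."""
    return max(combinations(items, k), key=lambda S: objective(S, w, d, lam))


# Hypothetical toy data: two results on topic X, one on topic Y.
w = {"a": 1.0, "b": 0.9, "c": 0.6}
category = {"a": "X", "b": "X", "c": "Y"}


def d(u, v):
    return 0.0 if category[u] == category[v] else 1.0


chosen = best_subset(["a", "b", "c"], 2, max_sum, w, d)
```

On this toy input both objectives prefer the pair that mixes the two topics over the pair with the highest total relevance.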
Diversification algorithms
Finding the best solution is an NP-hard problem

Algorithm depends on the objective function
Approximation
Greedy

Open issues:
Is off-line pre-computation applicable?
Are there efficient data structures?
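
A generic greedy heuristic, sketched under assumptions: at each step it picks the candidate with the best mix of relevance and distance to the nearest already selected item. This is an illustrative scheme in the spirit of greedy diversification, not the specific algorithm of any of the cited papers; `lam` and the toy data are hypothetical.

```python
def greedy_diversify(candidates, relevance, distance, k, lam=0.5):
    """Select k items one at a time by marginal relevance/diversity gain."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def gain(c):
            # Distance to the closest selected item; 0 for the first pick.
            nearest = min((distance(c, s) for s in selected), default=0.0)
            return (1 - lam) * relevance[c] + lam * nearest

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "a" and "b" share a topic, "c" is on another one.
relevance = {"a": 1.0, "b": 0.9, "c": 0.6}
topic = {"a": "X", "b": "X", "c": "Y"}


def distance(u, v):
    return 0.0 if topic[u] == topic[v] else 1.0


diverse = greedy_diversify(["a", "b", "c"], relevance, distance, k=2, lam=0.5)
by_relevance = greedy_diversify(["a", "b", "c"], relevance, distance, k=2, lam=0.0)
```

Each step costs one pass over the remaining candidates, so the whole selection is polynomial, at the price of no guarantee of reaching the optimum.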
How to evaluate diversity in
         search
Data set for the evaluation
                     Full text
                 TREC Interactive
   Top results from commercial search engine

               Structured data
      Taxonomies (Open Directory Project)

                 Ground truth
        Wikipedia disambiguation pages
   Judgements from Amazon Mechanical Turk

Task-specific standard datasets are needed
Benchmarks
Adaptation from existing metrics:

Alpha-NDCG: Normalized discounted cumulative gain
Subtopic recall and precision: Number of subtopics covered
User intent: Results distribution should reflect what the user is asking for
Comparison against the optimum
Alpha-nDCG

  Based on information nuggets (Answer to a
                 question)

A document is relevant when it contains a nugget
              needed by the user

 Quality of results graded by human assessors

The more nuggets the result set contains, the better
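
A sketch of how alpha-nDCG is usually computed: each document is represented by the set of nuggets the assessors found in it, and a nugget seen n times before is worth (1 - alpha)^n. The value alpha = 0.5, the log2 discount, and the greedy approximation of the ideal ranking are assumptions of this sketch.

```python
import math


def _gain(nuggets, seen, alpha):
    """Gain of one document: repeated nuggets are discounted by (1 - alpha)."""
    return sum((1 - alpha) ** seen.get(n, 0) for n in nuggets)


def alpha_dcg(ranking, alpha=0.5, depth=None):
    """Discounted cumulative gain over a list of nugget sets."""
    seen, score = {}, 0.0
    for rank, nuggets in enumerate(ranking[:depth], start=1):
        score += _gain(nuggets, seen, alpha) / math.log2(1 + rank)
        for n in nuggets:
            seen[n] = seen.get(n, 0) + 1
    return score


def greedy_ideal(pool, alpha, depth):
    """Approximate the ideal ranking by always taking the best next gain."""
    remaining, seen, ideal = list(pool), {}, []
    while remaining and len(ideal) < depth:
        best = max(remaining, key=lambda ns: _gain(ns, seen, alpha))
        remaining.remove(best)
        ideal.append(best)
        for n in best:
            seen[n] = seen.get(n, 0) + 1
    return ideal


def alpha_ndcg(ranking, pool, alpha=0.5, depth=10):
    """alpha-DCG of the ranking divided by that of the (greedy) ideal."""
    ideal = alpha_dcg(greedy_ideal(pool, alpha, depth), alpha, depth)
    return alpha_dcg(ranking, alpha, depth) / ideal if ideal > 0 else 0.0


diverse = [{"a"}, {"b"}, {"a"}]
redundant = [{"a"}, {"a"}, {"b"}]
```

With the same three documents, the ranking that shows a new nugget in second place scores higher than the one that repeats the first nugget.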
Subtopic recall and precision

Is the result set exhaustive?
$$\text{s-recall at } k = \frac{\text{number of subtopics covered by the first } k \text{ documents}}{\text{total number of subtopics}}$$

Is the result set efficient?
$$\text{s-precision at } r = \frac{minRank(S_{opt}, r)}{minRank(S, r)}$$
Conclusions


Diversification can really improve the quality of
search results

There is still some work to do in order to achieve
   good results in all the possible scenarios
Open issues

  There is room for improvement in defining new
           diversity types and metrics

Ranking functions should take diversity into account
  from the beginning in an integrated process

  Datasets to evaluate each notion of diversity
                should be built
References
   Minack, E., Demartini, G., Nejdl, W.: Current Approaches to Search
           Result Diversification. In: Proceedings of ISWC '09

     Gollapudi, S., Sharma, A.: An Axiomatic Approach for Result
             Diversification. In: Proceedings of WWW '09

 Zhai, C.X., Cohen, W.W., Lafferty, J.: Beyond Independent Relevance:
Methods and Evaluation Metrics for Subtopic Retrieval. In: Proceedings
                              of SIGIR '03

 Agrawal, R., Gollapudi, S., Halverson, A., Ieong, S.: Diversifying Search
                  Results. In: Proceedings of WSDM '09

 Clough, P., Sanderson, M., Abouammoh, M., Navarro, S., Paramita, M.:
 Multiple Approaches to Analysing Query Diversity. In: Proceedings of
                               SIGIR '09

   Clarke, C.L., Kolla, M., Cormack, G.V., Vechtomova, O., Ashkan, A.,
   Büttcher, S., MacKinnon, I.: Novelty and Diversity in Information
           Retrieval Evaluation. In: Proceedings of SIGIR '08