ECDL 2010
6-10 September 2010




          Measuring Effectiveness of
           Geographic IR Systems
              in Digital Libraries:
          Evaluation and Case Study

         Damien Palacio, Guillaume Cabanac,
            Christian Sallaberry, Gilles Hubert

              Damien Palacio - damien.palacio@univ-pau.fr   1
Outline

1. Motivation        Topical IR → Geographic IR
                     Hypothesis: GIRS > IRS

2. Context           IRS evaluation
   Issue             Current evaluation frameworks
                     = partial

3. Contribution      GIRS evaluation framework

4. Experiments       Case study with PIV GIRS
                     Hypothesis validated

5. Conclusion and Future Work

                                                     2
1. Motivation – Why Geographic IR?


Geographic Information Retrieval
➔   Query = ''trip around Glasgow in summer 2010''

➔   Search Engines
      ➔   Topical          term     ∈ {trip, Glasgow, summer, 2010}

      ➔   Geographic       spatial  ∈ {citiesNearGlasgow ...}
                           temporal ∈ {21june .. 22sept 2010}
                           term     ∈ {trip, Glasgow, summer, 2010}


➔   ≈ 1/6 Queries = Geographic Queries
      ➔   Excite       (Sanderson et al., 2004)
      ➔   AOL          (Gan et al., 2008)
      ➔   Yahoo!       (Jones et al., 2008)


➔   A current and realistic issue
                                                                       4
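The decomposition of the example query can be sketched as a small data structure. This is only an illustration: the `GeoQuery` class, its field names, and the expansion of `citiesNearGlasgow` to Paisley and Dumbarton are assumptions made for the example, not part of the presented system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class GeoQuery:
    """Hypothetical container for the three dimensions of a geographic query."""
    terms: frozenset      # topical dimension: plain keywords
    places: frozenset     # spatial dimension: place names the query covers
    start: date           # temporal dimension: first day of the interval
    end: date             # temporal dimension: last day of the interval

    def covers_date(self, day: date) -> bool:
        """True if `day` falls inside the query's temporal interval."""
        return self.start <= day <= self.end


# "trip around Glasgow in summer 2010"
query = GeoQuery(
    terms=frozenset({"trip", "Glasgow", "summer", "2010"}),
    # Assumed expansion of citiesNearGlasgow, for illustration only.
    places=frozenset({"Glasgow", "Paisley", "Dumbarton"}),
    start=date(2010, 6, 21),
    end=date(2010, 9, 22),
)
```

A topical engine would only see `query.terms`; a geographic engine can additionally test places and dates against the other two fields.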
1. Motivation – Why Geographic IR?


A Geographic IRS: How Does It Work?
➔   3 Dimensions to Process:
      ➔   Spatial, temporal and topical

➔   1 Index per Dimension
      ➔   Topical      bag of words, vector space model, ...
      ➔   Spatial      named entity recognition, ...
      ➔   Temporal     named entity recognition, ...




                                                               5
1. Motivation – Why Geographic IR?


A Geographic IRS: How Does It Work?
➔   Spatial Processing




                                      6
1. Motivation – Why Geographic IR?


A Geographic IRS: How Does It Work?
➔   Retrieval
      ➔   Usually by filtering (STEWARD, SPIRIT, CITER, …)


➔   Issue: Performance of GIRS vs. topical IRS
➔   Hypothesis: Geographic IRS better than topical IRS
                                                               7
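Retrieval by filtering can be sketched as follows: the topical ranking is kept, and documents that do not satisfy the spatial and temporal constraints are dropped. This is a minimal sketch under assumed inputs (a scored topical ranking and two sets of matching document ids); it does not reproduce how STEWARD, SPIRIT or CITER are actually implemented, and the function name is hypothetical.

```python
def filter_retrieval(topical_ranking, spatial_hits, temporal_hits):
    """Keep topical results that also match the spatial and temporal constraints.

    topical_ranking: list of (doc_id, score), best first
    spatial_hits / temporal_hits: sets of doc ids matching each constraint
    """
    return [
        (doc_id, score)
        for doc_id, score in topical_ranking
        if doc_id in spatial_hits and doc_id in temporal_hits
    ]


topical = [("d1", 0.9), ("d2", 0.7), ("d3", 0.4)]
filtered = filter_retrieval(topical, spatial_hits={"d1", "d3"}, temporal_hits={"d2", "d3"})
```

With filtering, the geographic dimensions act as a yes/no gate: they never change the order of the surviving documents, which is still decided by the topical score alone.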
2. Context and Issue: IRS Partial Evaluation


Evaluating an IR System
➔   System = efficiency + effectiveness
      ➔   Efficiency       computation time, storage needed   (Geo IR literature)
      ➔   Effectiveness    quality                            (Topical IR literature)

➔   Effectiveness Evaluation
      ➔   Topical          TREC, CLEF, ...
      ➔   Temporal         TempEval
      ➔   Spatial          Bucher et al. (2005), GeoCLEF
      ➔   Temporal + topical + spatial      evaluation framework proposed

                                                                         16
3. Proposition – GIRS Evaluation Framework


Evaluation Framework for the 3 Dimensions (1/2)
 ➔   Goal: measuring GIRS quality

 ➔   Means: building on TREC framework (1992-)

 ➔   ''Cranfield'' methodology
       ➔   Test collection
              ➔   Corpus
              ➔   ≥ 25 Topics
              ➔   Qrels


       ➔   Measures: P@X, MAP, NDCG, ...          [Voorhees, 2007]

                                                                18
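Two of the measures named above, P@X and average precision (whose mean over all topics gives MAP), can be sketched as below. This is a minimal sketch with binary relevance; the function names and the toy ranking are assumptions for illustration, and an official evaluation would rely on trec_eval rather than on this code.

```python
def precision_at(ranked, relevant, x):
    """P@X: fraction of the top-x retrieved documents that are relevant."""
    return sum(1 for doc_id in ranked[:x] if doc_id in relevant) / x


def average_precision(ranked, relevant):
    """AP for one topic: mean of the precision values at each relevant hit."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs, relevant_by_topic):
    """MAP: AP averaged over all topics of the test collection."""
    scores = [average_precision(runs.get(t, []), rel) for t, rel in relevant_by_topic.items()]
    return sum(scores) / len(scores) if scores else 0.0


ranked = ["a", "b", "c", "d"]   # toy system output for one topic
relevant = {"a", "c"}           # toy binary qrels for that topic
```

Here the relevant documents sit at ranks 1 and 3, so AP averages a precision of 1/1 and a precision of 2/3.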
3. Proposition – GIRS Evaluation Framework


Evaluation Framework for the 3 Dimensions (2/2)
 ➔   TREC Framework Extension
       ➔   Test collection
              ➔   ≥ 25 Topics
              ➔   Corpus covering the 3 dimensions
              ➔   Gradual qrels
              ➔   + geographic resources

       ➔   About qrels …
              ➔   Relevance (doc, topic) ∈ {0;1;2;3;4}
              ➔   Scale: no dimension (0) … 3 dimensions + global = satisfied topic (4)
              ➔   3 dimensions (example): Topic: ''trip around Glasgow''
                                          Doc: trip + Bob born in Dumbarton
              ➔   Principle: ''the more satisfied dimensions there are, the
                    better it is''

       ➔   Gradual qrels aware measure:
             Normalized Discounted Cumulative Gain            [Järvelin & Kekäläinen, 2002]

              ➔   By topic: NDCG for each topic
              ➔   Global:   meanNDCG for the system                                        21
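NDCG with graded (''gradual'') qrels can be sketched as below. This is a minimal sketch, assuming the common log2(rank + 1) discount and using the relevance grade directly as gain; the exact variant computed by trec_eval or defined by Järvelin & Kekäläinen may differ in detail, and the function names are hypothetical.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains, start=1))


def ndcg(ranked, qrels):
    """NDCG for one topic; qrels maps doc_id to a grade in {0, 1, 2, 3, 4}."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked]   # unjudged docs count as 0
    ideal_dcg = dcg(sorted(qrels.values(), reverse=True))  # best possible ordering
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(runs, qrels_by_topic):
    """meanNDCG for the system: NDCG averaged over all topics."""
    scores = [ndcg(runs.get(topic, []), qrels) for topic, qrels in qrels_by_topic.items()]
    return sum(scores) / len(scores) if scores else 0.0


qrels = {"a": 4, "b": 2, "c": 0}   # toy graded judgments for one topic
```

Because the gain grows with the grade, a system that ranks a grade-4 document (topic satisfied) above a grade-2 document scores higher than one returning them in the opposite order, which is what the ''more satisfied dimensions'' principle asks for.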
4. Experiments – Case Study with PIV GIRS


Case Study: PIV System
➔   Indexing: 1 index per dimension
      ➔   Topical = Terrier IRS   [Ounis et al., 2005]
      ➔   Spatial = map segmentation into tiles
      ➔   Temporal = timeline segmentation into tiles


➔   Retrieval
      ➔   Result document list for each index
      ➔   Result combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]
                                                                      23
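The idea of indexing by map tiles can be sketched with a regular latitude/longitude grid. The slides do not describe how PIV actually segments the map, so the grid, the tile size, the `TileIndex` class and the coordinates below are all assumptions made for illustration. The same scheme would apply to a timeline cut into fixed-length intervals.

```python
from collections import defaultdict


def tile_id(lat, lon, tile_deg=1.0):
    """Map a coordinate to the (row, col) of a regular grid tile (assumed scheme)."""
    row = int((lat + 90.0) // tile_deg)
    col = int((lon + 180.0) // tile_deg)
    return (row, col)


class TileIndex:
    """Hypothetical spatial index: tile -> {doc_id: number of places in that tile}."""

    def __init__(self, tile_deg=1.0):
        self.tile_deg = tile_deg
        self.postings = defaultdict(dict)

    def add(self, doc_id, lat, lon):
        tile = tile_id(lat, lon, self.tile_deg)
        self.postings[tile][doc_id] = self.postings[tile].get(doc_id, 0) + 1

    def search(self, lat, lon):
        """Documents mentioning a place in the same tile as the query point."""
        return dict(self.postings.get(tile_id(lat, lon, self.tile_deg), {}))


index = TileIndex(tile_deg=1.0)
index.add("doc1", 55.94, -4.57)   # approximate coordinates of Dumbarton
index.add("doc2", 48.85, 2.35)    # approximate coordinates of Paris
near_glasgow = index.search(55.86, -4.25)   # approximate coordinates of Glasgow
```

The per-tile counts can serve as a spatial score, giving the spatial index its own ranked result list.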
4. Experiments – Case Study with PIV GIRS


CombMNZ Principle [Fox & Shaw, 1993; Lee 1997]
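The principle can be sketched as follows: each index returns its own scored list, scores are normalized per list, and a document's fused score is the sum of its normalized scores multiplied by the number of lists that retrieved it. This is a minimal sketch, not the PIV implementation: the min-max normalization and the choice to count every list containing the document (even with a normalized score of 0) are assumptions, and the toy scores are invented for the example.

```python
def min_max_normalize(scores):
    """Rescale one result list to [0, 1]; a constant list maps to 1.0."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def comb_mnz(result_lists):
    """Fuse several {doc_id: score} lists with CombMNZ, best document first."""
    summed = {}   # sum of normalized scores per document
    hits = {}     # number of lists that retrieved the document
    for scores in result_lists:
        for doc_id, score in min_max_normalize(scores).items():
            summed[doc_id] = summed.get(doc_id, 0.0) + score
            hits[doc_id] = hits.get(doc_id, 0) + 1
    fused = {doc_id: summed[doc_id] * hits[doc_id] for doc_id in summed}
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


topical = {"d1": 2.0, "d2": 1.0}
spatial = {"d1": 0.5, "d3": 0.9}
temporal = {"d1": 3.0}
fused = comb_mnz([topical, spatial, temporal])
```

In this toy run `d1` is retrieved by all three indexes, so its summed score is multiplied by 3 and it ends up first, ahead of documents found by a single index.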




                                                 24
4. Experiments – Case Study with PIV GIRS


Case Study: MIDR_2010 collection
➔   Building Qrels: 12 volunteers (thanks!!!)
      ➔   31 topics
      ➔   5645 documents (= paragraphs)
      ➔   Qrels: relevance judgments ∈ {0;1;2;3;4}
      ➔   Map for tracking spatial information


                                                              27
4. Experiments – Hypothesis Validated


Analysis of Collected Data
➔   IRS Evaluation
      ➔   ResultsList × Qrels → trec_eval → NDCG


➔   Results: geographic IRS most effective


Hypothesis validated


                                                      28
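Feeding a result list and qrels to trec_eval requires two plain-text files in the usual TREC layouts: `topic Q0 doc rank score tag` for a run and `topic 0 doc relevance` for qrels. The helpers below are a minimal sketch of that formatting step; the function names, the `piv` run tag and the toy values are assumptions for illustration.

```python
def format_run(topic_id, ranked, tag="piv"):
    """One run line per retrieved document: topic Q0 doc rank score tag."""
    return [
        f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {tag}"
        for rank, (doc_id, score) in enumerate(ranked, start=1)
    ]


def format_qrels(topic_id, judgments):
    """One qrels line per judged document: topic 0 doc relevance."""
    return [f"{topic_id} 0 {doc_id} {rel}" for doc_id, rel in sorted(judgments.items())]


run_lines = format_run("1", [("d3", 6.0), ("d1", 1.5)])
qrels_lines = format_qrels("1", {"d3": 4, "d1": 0})
```

Once both files are written, a recent trec_eval can typically report NDCG with a call such as `trec_eval -m ndcg qrels.txt run.txt`; the available options depend on the installed version.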
Evaluation Framework for Geographic IR Systems


Conclusions and Future Work (1/2)
➔   Evaluation Framework for Geographic IR Systems
      ➔   Reusable
      ➔   Generalizable for more dimensions: confidence,
            freshness, ... [Costa Pereira et al., 2009]
      ➔   No gradual relevance per dimension


➔   Case Study with PIV System
      ➔   Creation of a specific test collection (≥ 25 topics)
      ➔   French test collection
      ➔   Limited collection (number of documents)




                                                                 31
Evaluation Framework for Geographic IR Systems


Conclusions and Future Work (2/2)
➔   Hypothesis Validated
      ➔   The 3 dimensions improve IR (+66.5%)


➔   Future Work
      ➔   More precise analysis: by query
      ➔   Quantify PIV improvements: various indexes combinations
      ➔   Organize a GIRS evaluation campaign: anyone interested?




                                                                    32
ECDL 2010
6-10 September 2010




                           Thank you!




              Damien Palacio - damien.palacio@univ-pau.fr   33
Spatial Interface




                    34
Temporal Interface




                     36
Spatial Tiling




                 38

Sur les étagères des bibliothèques numériques clandestines:
 
Les altmetrics : estimer l'engouement pour la recherche sur les médias sociaux
Les altmetrics : estimer l'engouement pour la recherche sur les médias sociauxLes altmetrics : estimer l'engouement pour la recherche sur les médias sociaux
Les altmetrics : estimer l'engouement pour la recherche sur les médias sociaux
 
A Journey in Scientometrics: quantitative studies of science at the crossroad...
A Journey in Scientometrics: quantitative studies of science at the crossroad...A Journey in Scientometrics: quantitative studies of science at the crossroad...
A Journey in Scientometrics: quantitative studies of science at the crossroad...
 
Bibliogifts ? Les bibliothèques clandestines de l'édition scientifique
Bibliogifts ? Les bibliothèques clandestines de l'édition scientifiqueBibliogifts ? Les bibliothèques clandestines de l'édition scientifique
Bibliogifts ? Les bibliothèques clandestines de l'édition scientifique
 
Le renfort des liens forts - dynamique relationnelle du coauthorship
Le renfort des liens forts - dynamique relationnelle du coauthorshipLe renfort des liens forts - dynamique relationnelle du coauthorship
Le renfort des liens forts - dynamique relationnelle du coauthorship
 

ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

  • 1. ECDL 2010, 6-10 September 2010. Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study. Damien Palacio, Guillaume Cabanac, Christian Sallaberry, Gilles Hubert. Contact: Damien Palacio - damien.palacio@univ-pau.fr
  • 2. Outline ➔ 1. Motivation: topical IR → geographic IR; hypothesis: GIRS > IRS ➔ 2. Context and issue: IRS evaluation; current evaluation frameworks = partial ➔ 3. Contribution: GIRS evaluation framework ➔ 4. Experiments: case study with PIV GIRS; hypothesis validated ➔ 5. Conclusion and future work
  • 4. 1. Motivation – Why Geographic IR? Geographic Information Retrieval ➔ Query = ''trip around Glasgow in summer 2010'' ➔ Search engines ➔ Topical: term ∈ {trip, Glasgow, summer, 2010} ➔ Geographic: spatial ∈ {citiesNearGlasgow ...}, temporal ∈ {21june .. 22sept 2010}, term ∈ {trip, Glasgow, summer, 2010} ➔ ≈ 1/6 queries = geographic queries ➔ Excite (Sanderson et al., 2004) ➔ AOL (Gan et al., 2008) ➔ Yahoo! (Jones et al., 2008) ➔ A current and realistic issue
  • 5. 1. Motivation – Why Geographic IR? A Geographic IRS: How Does It Work? ➔ 3 dimensions to process: spatial, temporal and topical ➔ 1 index per dimension ➔ Topical: bag of words, vector space model, ... ➔ Spatial: named entity recognition, ... ➔ Temporal: named entity recognition, ...
  • 6. 1. Motivation – Why Geographic IR? A Geographic IRS: How Does It Work? ➔ Spatial processing
  • 7. 1. Motivation – Why Geographic IR? A Geographic IRS: How Does It Work? ➔ 3 dimensions to process: spatial, temporal and topical ➔ 1 index per dimension ➔ Topical: bag of words, vector space model, ... ➔ Spatial: named entity recognition, ... ➔ Temporal: named entity recognition, ... ➔ Retrieval: usually by filtering (STEWARD, SPIRIT, CITER, …) ➔ Issue: performance of GIRS vs. topical IRS ➔ Hypothesis: a geographic IRS is better than a topical IRS
  • 9-16. 2. Context and Issue: IRS Partial Evaluation. Evaluating an IR System ➔ System = efficiency + effectiveness ➔ Efficiency: computation time, storage needed ➔ Effectiveness: quality ➔ Sources: Geo IR literature, topical IR literature ➔ Effectiveness evaluation across the temporal, topical and spatial dimensions ➔ TREC, CLEF, ... ➔ TempEval ➔ Bucher et al. (2005) ➔ GeoCLEF ➔ Evaluation framework proposed
  • 18. 3. Proposition – GIRS Evaluation Framework. Evaluation Framework for the 3 Dimensions (1/2) ➔ Goal: measuring GIRS quality ➔ Means: building on the TREC framework (1992-) ➔ ''Cranfield'' methodology ➔ Test collection: corpus, ≥ 25 topics, qrels ➔ Measures: P@X, MAP, NDCG, ... [Voorhees, 2007]
  • 19-21. 3. Proposition – GIRS Evaluation Framework. Evaluation Framework for the 3 Dimensions (2/2) ➔ TREC framework extension ➔ Test collection ➔ ≥ 25 topics ➔ Corpus covering the 3 dimensions ➔ Gradual qrels ➔ + geographic resources ➔ About qrels ➔ Example: topic ''trip around Glasgow''; doc: trip + Bob born in Dumbarton ➔ Relevance (doc, topic) ∈ {0;1;2;3;4}, from ''no dimension'' satisfied to ''3 dimensions + global'' satisfied ➔ Principle: ''the more satisfied dimensions there are, the better it is'' ➔ Measure aware of gradual qrels: Normalized Discounted Cumulative Gain [Järvelin & Kekäläinen, 2002] ➔ By topic: NDCG for each topic ➔ Global: meanNDCG for the system
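
A minimal sketch of how NDCG could be computed per topic over graded qrels in {0;1;2;3;4}, and then averaged into a mean NDCG for a system. It assumes the common gain / log2(rank + 1) discount; the slides do not state which NDCG variant was used. The function names and the toy documents are hypothetical.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of a list of graded gains, in rank order."""
    # Assumption: discount is log2(rank + 1), with ranks starting at 1.
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_doc_ids, qrels):
    """NDCG of one ranked list against graded qrels {doc_id: grade}."""
    # Documents without a judgment are treated as grade 0.
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    # The ideal ranking lists all judged documents by decreasing grade.
    ideal_dcg = dcg(sorted(qrels.values(), reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(runs, qrels_by_topic):
    """Average NDCG over all judged topics; a topic with no results scores 0."""
    scores = [ndcg(runs.get(topic, []), qrels)
              for topic, qrels in qrels_by_topic.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example with hypothetical grades.
qrels = {"t1": {"d1": 4, "d2": 2, "d3": 0}}
print(ndcg(["d1", "d2", "d3"], qrels["t1"]))   # ideal ordering
print(ndcg(["d3", "d2", "d1"], qrels["t1"]))   # worst ordering, lower score
```

Because the gain is the grade itself, a document satisfying more dimensions contributes more to the score when it is ranked early, which matches the stated principle.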
  • 23. 4. Experiments – Case Study with PIV GIRS. Case Study: PIV System ➔ Indexing: 1 index per dimension ➔ Topical = Terrier IRS [Ounis et al., 2005] ➔ Spatial = map segmentation into tiles ➔ Temporal = timeline segmentation into tiles ➔ Retrieval ➔ Result document list for each index ➔ Results combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]
  • 24-26. 4. Experiments – Case Study with PIV GIRS. CombMNZ Principle [Fox & Shaw, 1993; Lee, 1997]
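
The CombMNZ figure is not recoverable from the transcript, so here is a minimal sketch of the fusion rule as it is usually stated: each document's normalized scores are summed across the result lists, and the sum is multiplied by the number of lists that retrieved the document. The min-max normalization and the function names are assumptions for illustration, not necessarily what PIV implements.

```python
def min_max_normalize(scores):
    """Rescale one result list {doc_id: score} into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: give every retrieved document full weight.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def comb_mnz(result_lists):
    """Fuse several result lists with CombMNZ; returns (doc_id, score) pairs."""
    normalized = [min_max_normalize(r) for r in result_lists]
    fused = {}
    for doc_id in set().union(*normalized):
        hits = [n[doc_id] for n in normalized if doc_id in n]
        # Sum of normalized scores times the number of lists retrieving doc_id.
        fused[doc_id] = sum(hits) * len(hits)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from the topical, spatial and temporal indexes.
topical = {"d1": 2.0, "d2": 1.0}
spatial = {"d2": 0.9, "d3": 0.3}
temporal = {"d2": 5.0, "d1": 1.0}
print(comb_mnz([topical, spatial, temporal]))
```

In this toy run the document retrieved by all three indexes rises to the top, which is the behavior that makes CombMNZ a natural fit for combining one result list per dimension.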
  • 27. 4. Experiments – Case Study with PIV GIRS. Case Study: MIDR_2010 collection ➔ Building qrels: 12 volunteers (thanks!!!) ➔ 31 topics ➔ 5645 documents (paragraphs) ➔ Relevance judgments = {0;1;2;3;4} ➔ Map for tracking spatial information
  • 28. 4. Experiments – Hypothesis Validated. Analysis of Collected Data ➔ IRS evaluation with trec_eval ➔ ResultsList × Qrels → NDCG ➔ Results: geographic IRS most effective ➔ Hypothesis validated
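
As a sketch of the ''ResultsList × Qrels'' step, the helpers below read the two plain-text inputs that trec_eval expects: qrels lines of the form `topic iteration docno relevance` and run lines of the form `topic Q0 docno rank score tag`. The function names and sample lines are hypothetical, and breaking score ties by document id is an assumption made here for determinism.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Read qrels lines into {topic: {doc_id: grade}}."""
    qrels = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, doc_id, grade = parts
        qrels[topic][doc_id] = int(grade)
    return dict(qrels)


def parse_run(lines):
    """Read run lines into {topic: [doc_id, ...]} ordered by decreasing score."""
    scored = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        topic, _q0, doc_id, _rank, score, _tag = parts
        scored[topic].append((float(score), doc_id))
    return {
        topic: [doc_id for _, doc_id in
                sorted(items, key=lambda item: (-item[0], item[1]))]
        for topic, items in scored.items()
    }


# Hypothetical sample input.
qrels = parse_qrels(["1 0 d1 4", "1 0 d2 0", "2 0 d7 3"])
run = parse_run(["1 Q0 d2 1 0.4 demo", "1 Q0 d1 2 0.9 demo"])
print(qrels["1"], run["1"])
```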
  • 29. 4. Experiments – Hypothesis Validated Analysis of Collected Data ➔ Results: geographic IRS most effective 29
  • 31. Evaluation Framework for Geographic IR Systems. Conclusions and Future Work (1/2) ➔ Evaluation framework for geographic IR systems ➔ Reusable ➔ Generalizable to more dimensions: confidence, freshness, ... [Costa Pereira et al., 2009] ➔ Limitation: relevance is not gradual within each dimension ➔ Case study with the PIV system ➔ Creation of a specific test collection (≥ 25 topics) ➔ French test collection ➔ Limited collection (number of documents)
  • 32. Evaluation Framework for Geographic IR Systems. Conclusions and Future Work (2/2) ➔ Hypothesis validated ➔ The 3 dimensions improve IR (+66.5%) ➔ Future work ➔ More precise analysis: by query ➔ Quantify PIV improvements: various index combinations ➔ Organize a GIRS evaluation campaign: anyone interested?
  • 33. ECDL 2010, 6-10 September 2010. Thank you! Damien Palacio - damien.palacio@univ-pau.fr