ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

ECDL 2010
6-10 September 2010

Measuring Effectiveness of
Geographic IR Systems
in Digital Libraries:
Evaluation and Case Study

Damien Palacio, Guillaume Cabanac,
Christian Sallaberry, Gilles Hubert

Damien Palacio - damien.palacio@univ-pau.fr

Outline

1. Motivation: Topical IR → Geographic IR
Hypothesis: GIRS > IRS

2. Context: IRS evaluation
Issue: Current evaluation frameworks = partial

3. Contribution: GIRS evaluation framework

4. Experiments: Case study with PIV GIRS
Hypothesis validated

5. Conclusion and Future Works

1. Motivation – Why Geographic IR?

Geographic Information Retrieval
➔ Query = ''trip around Glasgow in summer 2010''

➔ Search Engines
➔ Topical: term ∈ {trip, Glasgow, summer, 2010}
➔ Geographic: spatial ∈ {citiesNearGlasgow ...}, temporal ∈ {21june .. 22sept 2010}, term ∈ {trip, Glasgow, summer, 2010}

➔ ≈ 1/6 Queries = Geographic Queries
➔ Excite (Sanderson et al., 2004)
➔ AOL (Gan et al., 2008)
➔ Yahoo! (Jones et al., 2008)

➔ A current and realistic issue


A Geographic IRS: How Does It Work?
➔ 3 Dimensions to Process:
➔ Spatial, temporal and topical

➔ 1 Index per Dimension
➔ Topical: bag of words, vector space model, ...
➔ Spatial: named entity recognition, ...
➔ Temporal: named entity recognition, ...


➔ Spatial Processing

➔ Retrieval
➔ Usually by filtering (STEWARD, SPIRIT, CITER, …)

➔ Issue: Performance of GIRS vs. topical IRS
➔ Hypothesis: Geographic IRS better than topical IRS

2. Context and Issue: IRS Partial Evaluation

Evaluating an IR System
➔ System = efficiency + effectiveness
➔ Efficiency: computation time, storage needed
➔ Effectiveness: quality
➔ Effectiveness Evaluation
[Figure: evaluation work in the Geo IR and topical IR literature, positioned against the temporal, topical and spatial dimensions: TREC, CLEF, ...; TempEval; Bucher et al. (2005); GeoClef; and the evaluation framework proposed]


3. Proposition – GIRS Evaluation Framework

Evaluation Framework for the 3 Dimensions (1/2)
➔ Goal: measuring GIRS quality

➔ Means: building on TREC framework (1992-)

➔ ''Cranfield'' methodology
➔ Test collection
➔ Corpus
➔ ≥ 25 Topics
➔ Qrels

➔ Measures: P@X, MAP, NDCG, ...
[Voorhees, 2007]
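
The binary-relevance measures named above can be sketched as follows. This is a minimal illustration of P@X and (mean) average precision under their usual textbook definitions; the function names and the toy ranking are assumptions for illustration, not taken from the slides.

```python
def precision_at_k(ranked, relevant, k):
    """P@k: fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """AP: precision averaged over the ranks at which relevant documents appear."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(run, relevant_by_topic):
    """MAP: mean of AP over all topics that have relevance judgments."""
    aps = [average_precision(run.get(topic, []), rel)
           for topic, rel in relevant_by_topic.items()]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical ranking for one topic: documents "a" and "c" are relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
p_at_2 = precision_at_k(ranked, relevant, 2)
ap = average_precision(ranked, relevant)
```

These measures only see relevant / not relevant, which is why a graded measure is needed once qrels take more than two values.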



➔ TREC Framework Extension
➔ Test collection
➔ ≥ 25 Topics
➔ Corpus covering the 3 dimensions
➔ Gradual qrels
➔ + geographic resources

➔ About qrels …
➔ Relevance (doc, topic) ∈ {0;1;2;3;4}
➔ 0 = no dimension satisfied; 4 = 3 dimensions + global topic satisfied
➔ 3 dimensions: Topic: ''trip around Glasgow''; Doc: trip + Bob born in Dumbarton
➔ Principle: ''the more satisfied dimensions there are, the better it is''
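
The grading principle can be sketched in a few lines. The slides only fix the two ends of the scale (0 and 4); mapping the intermediate grades 1 to 3 to the number of satisfied dimensions is an assumption made here, and the `Judgment` structure is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One assessor judgment for a (document, topic) pair (hypothetical structure)."""
    spatial: bool          # spatial dimension of the topic is satisfied
    temporal: bool         # temporal dimension of the topic is satisfied
    topical: bool          # topical dimension of the topic is satisfied
    satisfies_topic: bool  # the document answers the topic as a whole


def graded_relevance(j: Judgment) -> int:
    """Return a grade in {0, 1, 2, 3, 4}.

    Assumption: grades 1-3 count the satisfied dimensions; grade 4 requires
    all three dimensions plus global satisfaction of the topic.
    """
    satisfied = sum((j.spatial, j.temporal, j.topical))
    if satisfied == 3 and j.satisfies_topic:
        return 4
    return satisfied


# All three dimensions match, but the document does not answer the topic.
partial = graded_relevance(Judgment(True, True, True, satisfies_topic=False))
# All three dimensions match and the topic is satisfied as a whole.
full = graded_relevance(Judgment(True, True, True, satisfies_topic=True))
```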

➔ Measure aware of gradual qrels: Normalized Discounted Cumulative Gain (NDCG) [Järvelin & Kekäläinen, 2002]

➔ By topic: NDCG for each topic
➔ Global: meanNDCG for the system
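
A minimal sketch of per-topic NDCG and system-level mean NDCG over gradual qrels. It assumes the common log2(rank + 1) discount; the slides do not say which NDCG variant was used, and the function names and sample data are illustrative.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_doc_ids, qrels):
    """NDCG for one topic; `qrels` maps doc id -> grade in {0, 1, 2, 3, 4}."""
    gains = [qrels.get(doc, 0) for doc in ranked_doc_ids]  # unjudged docs count as 0
    ideal_dcg = dcg(sorted(qrels.values(), reverse=True))  # best possible ordering
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(run, qrels_by_topic):
    """Mean of per-topic NDCG over all judged topics."""
    scores = [ndcg(run.get(topic, []), qrels)
              for topic, qrels in qrels_by_topic.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical judgments for one topic.
qrels = {"d1": 4, "d2": 2, "d3": 0}
perfect = ndcg(["d1", "d2", "d3"], qrels)   # ideal ordering
reversed_ = ndcg(["d3", "d2", "d1"], qrels)  # worst ordering of the same documents
```

Because the gain is the grade itself, a system that ranks grade-4 documents above grade-1 documents scores higher than one that merely retrieves the same set.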


4. Experiments – Case Study with PIV GIRS

Case Study: PIV System
➔ Indexing: 1 index per dimension
➔ Topical = Terrier IRS [Ounis et al., 2005]
➔ Spatial = map segmentation into tiles
➔ Temporal = timeline segmentation into tiles
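
The idea of segmenting a map or a timeline into tiles can be illustrated as follows. This is only a sketch under loud assumptions: the slides do not give PIV's tile shape or size, so a regular 1-degree latitude/longitude grid and calendar-month time tiles are chosen here purely for illustration, and all names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from datetime import date


def spatial_tile(lat, lon, tile_deg=1.0):
    """Map a coordinate to a (row, col) cell of a regular grid (assumed tiling)."""
    row = math.floor((lat + 90.0) / tile_deg)
    col = math.floor((lon + 180.0) / tile_deg)
    return (row, col)


def temporal_tile(d):
    """Map a date to a (year, month) cell of the timeline (assumed tiling)."""
    return (d.year, d.month)


def build_tile_index(doc_tiles):
    """Inverted index: tile -> Counter of document ids (tile frequency per doc)."""
    index = defaultdict(Counter)
    for doc_id, tiles in doc_tiles.items():
        for tile in tiles:
            index[tile][doc_id] += 1
    return index


# Hypothetical documents with already-geocoded place mentions.
docs = {
    "doc1": [spatial_tile(55.86, -4.25), spatial_tile(55.95, -4.57)],  # Glasgow, Dumbarton
    "doc2": [spatial_tile(48.85, 2.35)],                               # Paris
}
index = build_tile_index(docs)
```

With such an index, a tile plays the role a term plays in the topical index: documents can be scored by how often they mention the tiles covered by the query.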


➔ Retrieval
➔ Result document list for each index
➔ Results combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]


CombMNZ Principle [Fox & Shaw, 1993; Lee, 1997]
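
A minimal sketch of CombMNZ fusion: each document's normalized scores are summed across the result lists, then multiplied by the number of lists that retrieved it. Min-max normalization and counting every list that contains the document are assumptions made here; the slides do not detail how PIV normalizes scores, and the sample scores are invented.

```python
def min_max_normalize(scores):
    """Rescale one result list's scores to [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_lists):
    """Fuse several {doc_id: score} lists with CombMNZ.

    fused(doc) = (sum of normalized scores) * (number of lists retrieving doc)
    Returns (doc_id, fused_score) pairs, best first; ties broken by doc id.
    """
    normalized = [min_max_normalize(r) for r in result_lists]
    fused = {}
    for doc in set().union(*normalized):
        scores = [n[doc] for n in normalized if doc in n]
        fused[doc] = sum(scores) * len(scores)
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical result lists from the three indexes.
topical = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
spatial = {"d2": 2.0, "d4": 1.0}
temporal = {"d2": 0.9, "d1": 0.3}
ranking = comb_mnz([topical, spatial, temporal])
```

Here `d2` is retrieved by all three indexes, so its summed score is tripled and it moves to the top, which is the behaviour that rewards documents satisfying several dimensions.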


Case Study: MIDR_2010 collection
➔ Building Qrels: 12 volunteers (thanks!!!)

➔ 31 topics
➔ 5645 documents (paragraphs)
➔ Qrels: relevance judgments = {0;1;2;3;4}
➔ Map for tracking spatial information

4. Experiments – Hypothesis Validated

Analysis of Collected Data
➔ IRS Evaluation
➔ trec_eval: ResultsList × Qrels → NDCG
➔ Results: geographic IRS most effective
➔ Hypothesis validated
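
For readers unfamiliar with the inputs of trec_eval, the sketch below writes a result list and qrels in the standard TREC text formats (`topic Q0 docno rank score tag` for results, `topic 0 docno relevance` for qrels). The helper names, run tag and sample values are hypothetical; the tool is typically invoked with the qrels file first and the results file second.

```python
def run_lines(topic_id, ranked, tag="girs_sketch"):
    """Format one topic's ranked (doc_id, score) pairs as TREC result lines."""
    return [
        f"{topic_id} Q0 {doc} {rank} {score:.4f} {tag}"
        for rank, (doc, score) in enumerate(ranked, start=1)
    ]


def qrels_lines(topic_id, judgments):
    """Format one topic's {doc_id: grade} judgments as TREC qrels lines."""
    return [f"{topic_id} 0 {doc} {grade}" for doc, grade in sorted(judgments.items())]


# Hypothetical data for topic 1.
results = run_lines("1", [("d2", 7.5), ("d1", 2.0)])
qrels = qrels_lines("1", {"d2": 4, "d1": 1})

# Writing "\n".join(results) and "\n".join(qrels) to two text files gives
# the results file and the qrels file expected by the evaluation tool.
```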


Evaluation Framework for Geographic IR Systems

Conclusions and Future Works (1/2)
➔ Evaluation Framework for Geographic IR Systems
➔ Reusable
➔ Generalizable to more dimensions: confidence, freshness, ... [Costa Pereira et al., 2009]
➔ Relevance per dimension is not gradual

➔ Case Study with PIV System
➔ Creation of a specific test collection (≥ 25 topics)
➔ French test collection
➔ Limited collection (number of documents)

Conclusions and Future Works (2/2)
➔ Hypothesis Validated
➔ The 3 dimensions improve IR (+66.5%)

➔ Future Works
➔ More precise analysis: by query
➔ Quantify PIV improvements: various index combinations
➔ Organize a GIRS evaluation campaign: anyone interested?

Thank you!


Spatial Interface

Temporal Interface
