Eastwood presentation on_kellyetal2010

Effects of Position and Number of Relevant Documents on Users’ Evaluations of System Performance. A presentation by Meg Eastwood on the 2010 paper by D. Kelly, X. Fu, and C. Shah. INF 384H, September 26th, 2011

Diane Kelly, Associate Professor, School of Library and Information Science, UNC Chapel Hill

Ph.D., Rutgers University (Information Science)

MLS, Rutgers University (Information Retrieval)

BA, University of Alabama (Psychology and English)

Graduate Certificate in Cognitive Science, Rutgers Center for Cognitive Science

Primary Aim of Research: “to investigate the relationship between actual system performance and users’ evaluations of system performance” (pg 9:2)

Secondary Aim of Research: “to develop an experimental method that can be used to isolate and study specific aspects of the search process” (pg 9:2)

Previous Experimental Protocols. Traditional lab-based (TREC Interactive Track): study entire search episodes. Naturalistic (Thomas and Hawking, 2006): trade control for “ecological validity”. Both designs include so many variables that it can be “difficult to establish causal relationships” (pg 9:2).

Literature Review. Main criticisms of previous studies: evaluation measures were calculated from TREC assessors’ relevance judgments, not user judgments; users were not provided with explicit instructions; users may have been fatigued; sample sizes were low.

Studies 1 and 2: effect of the position of relevant documents on users’ evaluations of system performance. Study 3: effect of the number of relevant documents.

Participants were asked to help researchers evaluate four search engines. For each search engine, they read a topic and posed one query.

After issuing the query, all participants were redirected to the same results page with 10 standardized results.

Participants were asked to evaluate the full text of each search result in the order presented and judge its relevance.

After evaluating all the documents on the results page, participants were asked to evaluate the search engine.

Study 1: operationalized average precision at n. Subjects were required to evaluate all 10 documents.

Study 2: also operationalized average precision at n. Subjects were instructed to find five relevant documents.

Study 3: operationalized precision at n.
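
The two measures named in Studies 1 to 3 can be illustrated with a minimal sketch. It assumes binary relevance (1 = relevant, 0 = not relevant) over a 10-result list, and it assumes average precision at n is normalized by the number of relevant documents within the top n; the paper's exact definition may differ. The function names and the two example rankings are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def precision_at_n(relevance: Sequence[int], n: int) -> float:
    """Fraction of the top-n results that are relevant (1) rather than not (0)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(relevance[:n]) / n


def average_precision_at_n(relevance: Sequence[int], n: int) -> float:
    """Mean of precision-at-k over the ranks k <= n that hold a relevant document.

    Assumption: normalized by the number of relevant documents in the top n.
    """
    top = relevance[:n]
    num_relevant = sum(top)
    if num_relevant == 0:
        return 0.0
    total = 0.0
    hits = 0
    for k, rel in enumerate(top, start=1):
        if rel:
            hits += 1
            total += hits / k  # precision at rank k
    return total / num_relevant


# Hypothetical rankings: the same five relevant documents, in different positions.
relevant_first = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
relevant_last = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

for name, ranking in [("relevant first", relevant_first), ("relevant last", relevant_last)]:
    print(name, precision_at_n(ranking, 10), round(average_precision_at_n(ranking, 10), 3))
```

Both rankings have the same precision at 10, but their average precision differs. This is why a position manipulation needs average precision, while a manipulation of the number of relevant documents can be expressed with precision alone.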

Topics and Documents: selected topics associated with newspaper articles about current events; selected documents with “high probability of being judged relevant or not relevant” (pg 9:12).

Study Participants: “Convenient sample” (pg 9:27) of undergraduates from UNC; 27 participants for each study (1-3). Demographic information collected: sex, age, major, search experience, search frequency.

Results: Relevance Assessments

Did users’ relevance judgments agree with baseline assessments?

Did the topic affect differences in relevance assessments?

How much did relevance assessments vary between documents?

Results: Evaluations of System Performance

Did participants modify evaluation ratings?

Participant ratings compared between performance levels and studies

Study 1 showed no significant differences in ratings according to performance level.

Studies 2 and 3 did show significant differences in ratings according to performance level.

What are the differences between study 1 and study 2? Intended difference: completion time?

What are the differences between study 1 and study 2? Unintended differences: instructions for study 2 provided a clearer performance objective; subjects felt more successful in study 2?

User-Experienced Precision: “experimental manipulations [of precision] were only 90% effective” (pg 9:24)
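
One plausible reading of user-experienced precision is precision recomputed from each participant's own relevance judgments rather than from the baseline assessments. The sketch below assumes that reading. The function names and the judgment lists are hypothetical illustrations, not data or definitions from the paper.

```python
from typing import Sequence


def experienced_precision(user_judgments: Sequence[int], n: int) -> float:
    """Precision at n computed from the participant's own judgments (1 = relevant)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(user_judgments[:n]) / n


def agreement_rate(user_judgments: Sequence[int], baseline: Sequence[int]) -> float:
    """Share of documents where the participant's judgment matches the baseline."""
    if len(user_judgments) != len(baseline):
        raise ValueError("judgment lists must have the same length")
    matches = sum(1 for u, b in zip(user_judgments, baseline) if u == b)
    return matches / len(baseline)


# Hypothetical example: the list was built to have precision 0.5,
# but this participant judged two of the intended-relevant documents not relevant.
baseline = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
participant = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]

print(experienced_precision(baseline, 10))     # intended precision
print(experienced_precision(participant, 10))  # precision as this participant experienced it
print(agreement_rate(participant, baseline))
```

Under this reading, the intended performance level and the level a participant actually experienced can diverge whenever the participant disagrees with the baseline, which is why the presentation asks whether ratings track user-experienced precision.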

Are user-experienced precision values correlated with user ratings of system performance?
