In a Nutshell 
3 runs, Amazon Mechanical Turk, External HITs 
One HIT for each set of 5 documents = 435 HITs (2175 judgments) 
$0.20 per HIT = $0.04 per document 
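The figures above can be checked with a few lines of arithmetic. This is only a sketch: the 10% commission rate is an assumption that is not stated on the poster, chosen because it reproduces the $95.7 total cost reported for each run.

```python
from decimal import Decimal

DOCS_PER_HIT = 5
HITS = 435
REWARD_PER_HIT = Decimal("0.20")
# Assumption: a 10% platform commission on rewards (not stated on the poster).
ASSUMED_FEE_RATE = Decimal("0.10")

judgments = HITS * DOCS_PER_HIT                  # one judgment per document
cost_per_document = REWARD_PER_HIT / DOCS_PER_HIT
rewards = HITS * REWARD_PER_HIT                  # paid to workers
total_with_fee = rewards * (1 + ASSUMED_FEE_RATE)

print(judgments, cost_per_document, rewards, total_with_fee)
```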
Run 3
Stepwise execution of the GetAnotherLabel algorithm. Hypothesis: workers who are bad for one type of topic are not necessarily bad for others. For each worker wi, compute the expected quality qi on all topics and the quality qij on each topic type tj. For topics in tj, use only workers with qij > qi. Topic categorization: TREC category (closed, advice, navigational, etc.), topic subject (politics, shopping, etc.) and rarity of the topic words.
Runs 1 & 2
Train rule-based and SVM-based ML models. Features:
•Worker confusion matrix from GetAnotherLabel: 
•For all workers, average posterior probability of relevant/nonrelevant 
•For all workers, average correct-to-incorrect ratio when saying relevant or not 
•For the document, relevant-to-nonrelevant ratio 
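The Run 3 worker filter (for topics of type tj, keep only workers with qij > qi) can be sketched as follows. Note the swap: the poster takes qi and qij from GetAnotherLabel's expected worker quality, whereas this sketch uses plain accuracy against a set of reference labels as a stand-in. The function name, the input layout and the reference labels are all hypothetical.

```python
from collections import defaultdict


def select_workers_per_type(judgments, reference, topic_type):
    """Return {topic type: set of workers whose quality on that type
    exceeds their overall quality}.

    judgments:  iterable of (worker, topic, doc, label) tuples
    reference:  {(topic, doc): label}, a stand-in for expected quality
    topic_type: {topic: type}
    """
    overall = defaultdict(lambda: [0, 0])    # worker -> [correct, total]
    per_type = defaultdict(lambda: [0, 0])   # (worker, type) -> [correct, total]
    for worker, topic, doc, label in judgments:
        correct = int(label == reference[(topic, doc)])
        for counter in (overall[worker], per_type[(worker, topic_type[topic])]):
            counter[0] += correct
            counter[1] += 1

    selected = defaultdict(set)
    for (worker, ttype), (c, n) in per_type.items():
        q_ij = c / n
        q_i = overall[worker][0] / overall[worker][1]
        if q_ij > q_i:  # strict inequality, as stated on the poster
            selected[ttype].add(worker)
    return dict(selected)


judgments = [
    ("w1", "t1", "d1", 1), ("w1", "t1", "d2", 1),
    ("w1", "t2", "d3", 0), ("w1", "t2", "d4", 0),
    ("w2", "t1", "d1", 1), ("w2", "t2", "d3", 1),
]
reference = {("t1", "d1"): 1, ("t1", "d2"): 1, ("t2", "d3"): 1, ("t2", "d4"): 1}
topic_type = {"t1": "navigational", "t2": "advice"}
selected = select_workers_per_type(judgments, reference, topic_type)
print(selected)
```

Read literally, the strict inequality also drops a worker who is equally good on every topic type (w2 in this toy data), so a real pipeline would probably need a fallback for topic types left without workers.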
The University Carlos III of Madrid at TREC 2011 Crowdsourcing Track 
Julián Urbano, Mónica Marrero, Diego Martín, Jorge Morato, Karina Robles and Juan Lloréns 
Gaithersburg, USA November 16th, 2011 
                                      run 1        run 2        run 3
Hours to complete                     8.5          38           20.5
HITs submitted (overhead)             438 (+1%)    535 (+23%)   448 (+3%)
Submitted workers (just previewers)   29 (102)     83 (383)     30 (163)
Average documents per worker          76           32           75
Total cost (including fees)           $95.7        $95.7        $95.7
much better control of the whole process 
fair for most workers (previous trials) 
2. Display Modes 
•With images 
•Black & white, same layout but no images 
Topic key terms (run 3) 
3. Task focus: keywords (runs 1 & 2) or relevance (run 3) 
4. Tabbed design 
5. Quality Control 
Worker Level 
50 HITs at most, at least 100 approved and 95% approval (98% in run 3) 
Implicit Task Level: Work Time 
At least 4.5 s/document (preview+work) 
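A minimal sketch of this implicit check, assuming the preview and work durations are logged per HIT in seconds (the function and parameter names are hypothetical):

```python
MIN_SECONDS_PER_DOCUMENT = 4.5
DOCS_PER_HIT = 5


def passes_time_check(preview_seconds, work_seconds, n_docs=DOCS_PER_HIT):
    """True if preview + work time averages at least 4.5 s per document."""
    return (preview_seconds + work_seconds) / n_docs >= MIN_SECONDS_PER_DOCUMENT


print(passes_time_check(10, 12.5))  # 22.5 s over 5 documents
print(passes_time_check(2, 15))     # 17 s over 5 documents
```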
Explicit Task Level: Comprehension
Which set of keywords better describes the document?
•Correct: top 3 by TF + 2 from next 5 
•Incorrect: 5 random in last 25 
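The two keyword sets could be built along these lines. This is a sketch under assumptions: terms are ranked by raw term frequency over an already tokenized document, the "last 25" are taken to be the 25 lowest-ranked terms, and the document is required to have at least 33 distinct terms so the two sets cannot overlap. The function name and the tokenization are hypothetical.

```python
import random
from collections import Counter


def keyword_sets(tokens, rng):
    """Return (correct, incorrect) keyword sets for the comprehension question."""
    # Rank distinct terms by descending term frequency.
    ranked = [term for term, _ in Counter(tokens).most_common()]
    if len(ranked) < 33:
        raise ValueError("need at least 33 distinct terms (8 top + 25 last)")
    # Correct set: top 3 by TF plus 2 sampled from the next 5.
    correct = ranked[:3] + rng.sample(ranked[3:8], 2)
    # Incorrect set: 5 sampled from the 25 lowest-TF terms.
    incorrect = rng.sample(ranked[-25:], 5)
    return correct, incorrect


# Toy document: term t0 occurs 40 times, t1 39 times, ..., t39 once.
tokens = [f"t{i}" for i in range(40) for _ in range(40 - i)]
correct, incorrect = keyword_sets(tokens, random.Random(0))
print(correct, incorrect)
```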
Observations:
•Some workers do the task while previewing
•Subjects always recognize the top 1-2 keywords by TF
Rejecting & Blocking
Action   Failure   run 1      run 2        run 3
Reject   Keyword   1          0            1
Reject   Time      2          1            1
Block    Keyword   1          1            1
Block    Time      2          1            1

                   run 1      run 2        run 3
HITs rejected      3 (1%)     100 (23%)    13 (3%)
Workers blocked    0 (0%)     40 (48%)     4 (13%)
7. Relevance Labels
Binary
•run 1: bad = 0, fair or good = 1
•runs 2 & 3: normalize slider range to [0, 1]; if value > 0.4 then 1, else 0
Ranking
•run 1: order by relevance, then by failures in keywords and then by time spent
•runs 2 & 3: explicit in sliders
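The labelling rules above can be sketched as follows. Several details are assumptions rather than facts from the poster: the slider's raw range (passed in as `lo` and `hi`), the direction of the two tie-breakers in run 1 (fewer keyword failures first, then more time spent first), and all function and field names.

```python
GRADES = {"bad": 0, "fair": 1, "good": 2}


def binarize_run1(grade):
    """run 1: bad = 0, fair or good = 1."""
    return 0 if GRADES[grade] == 0 else 1


def binarize_slider(value, lo, hi, threshold=0.4):
    """runs 2 & 3: normalize the slider value to [0, 1], then threshold at 0.4."""
    normalized = (value - lo) / (hi - lo)
    return 1 if normalized > threshold else 0


def rank_run1(judgments):
    """run 1: order by relevance, then keyword failures, then time spent.

    Assumed tie-break directions: fewer failures first, longer time first.
    """
    return sorted(
        judgments,
        key=lambda j: (-GRADES[j["grade"]], j["keyword_failures"], -j["seconds"]),
    )


judgments = [
    {"doc": "a", "grade": "fair", "keyword_failures": 1, "seconds": 30},
    {"doc": "b", "grade": "good", "keyword_failures": 0, "seconds": 10},
    {"doc": "c", "grade": "fair", "keyword_failures": 0, "seconds": 20},
]
ranking = [j["doc"] for j in rank_run1(judgments)]
print(ranking)
```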
Task I
          Acc.    Rec.    Prec.   Spec.   AP      NDCG
Median    .623    .729    .773    .536    .931    .922
run 1     .748    .802    .841    .632    .922    .958
run 2     .690    .720    .821    .607    .889    .935
run 3     .731    .737    .857    .728    .894    .932

Task II
          Acc.    Rec.    Prec.   Spec.   AP      NDCG
Median    .640    .754    .625    .560    .111    .359
run 1     .699    .754    .679    .644    .166    .415
run 2     .714    .750    .700    .678    .082    .331
run 3     .571    .659    .560    .484    .060    .299
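For reference, the Acc., Rec., Prec. and Spec. figures reported above are, under their standard definitions, simple functions of a binary confusion matrix. The sketch below uses made-up counts purely for illustration; they are not taken from the poster, and AP and NDCG are left out because they depend on the full ranking.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn),          # relevant documents labelled relevant
        "precision": tp / (tp + fp),       # labelled relevant that are relevant
        "specificity": tn / (tn + fp),     # nonrelevant labelled nonrelevant
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=6, fp=2, tn=8, fn=4)
print(metrics)
```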
according to Wordnet 
unbiased majority voting 
1. Document Preprocessing 
Cleanup for smooth loading and safe rendering: remove everything unrelated to style or layout 
6. Relevance: run 1, run 2, run 3
* Unofficial, as per NIST gold labels
