Did you mean crowdsourcing for recommender systems?
OMAR ALONSO 
6-OCT-2014 
CROWDREC 2014
Disclaimer 
The views and opinions expressed in this talk are mine and do not necessarily reflect the official policy or position of Microsoft.
Outline 
A bit on human computation 
Crowdsourcing in information retrieval
Opportunities for recommender systems 
Human Computation 
Human Computation 
You are a computer 
Human-based computation 
Use humans as processors in a distributed system 
Address problems that computers aren’t good at
Games with a purpose 
Examples 
◦ESP game 
◦Captcha 
◦ReCaptcha 
Some definitions 
Human computation is a computation that is performed by a human 
Human computation system is a system that organizes human efforts to carry out computation 
Crowdsourcing is a tool that a human computation system can use to distribute tasks. 
Edith Law and Luis von Ahn. Human Computation. Morgan & Claypool Publishers, 2011.
HC at the core of RecSys 
“In a typical recommender system people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients” –Resnick and Varian (CACM 1997) 
S. Perugini, M. Gonçalves, E. Fox: Recommender Systems Research: A Connection-Centric Survey. J. Intell. Inf. Syst. 23(2): 107-143 (2004)
{where to go on vacation} 
MTurk: 50 answers, $1.80 
Quora: 2 answers 
Y! Answers: 2 answers 
FB: 1 answer 
Tons of results 
Read title + snippet + URL 
Explore a few pages in detail 
{where to go on vacation} 
Countries 
Cities 
Information Retrieval and Crowdsourcing 
The rise of crowdsourcing in IR 
Crowdsourcing is hot 
Lots of interest in the research community 
◦Articles showing good results 
◦Journal special issues (IR, IEEE Internet Computing, etc.)
◦Workshops and tutorials (SIGIR, NAACL, WSDM, WWW, VLDB, RecSys, CHI, etc.)
◦HCOMP 
◦CrowdConf 
Large companies using crowdsourcing 
Big data 
Start-ups 
Venture capital investment 
Why is this interesting? 
Easy to prototype and test new experiments 
Cheap and fast 
No need to set up infrastructure
Introduce experimentation early in the cycle 
In the context of IR, implement and experiment as you go 
For new ideas, this is very helpful 
Caveats and clarifications 
Trust and reliability 
Wisdom of the crowd revisited
Adjust expectations 
Crowdsourcing is another data point for your analysis 
Complementary to other experiments 
Why now? 
The Web 
Use humans as processors in a distributed system 
Address problems that computers aren’t good at
Scale 
Reach 
Motivating example: relevance judging 
Relevance of search results is difficult to judge 
◦Highly subjective 
◦Expensive to measure 
Professional editors commonly used 
Potential benefits of crowdsourcing 
◦Scalability (time and cost) 
◦Diversity of judgments 
Matt Lease and Omar Alonso. “Crowdsourcing for search evaluation and social-algorithmic search”, ACM SIGIR 2012 Tutorial.
Crowdsourcing and relevance evaluation 
For relevance, it combines two main approaches 
◦Explicit judgments 
◦Automated metrics 
Other features 
◦Large scale 
◦Inexpensive 
◦Diversity 
Development framework 
Incremental approach 
Measure, evaluate, and adjust as you go 
Suitable for repeatable tasks 
O. Alonso. “Implementing crowdsourcing-based relevance experimentation: an industrial perspective”. Information Retrieval, 16(2), 2013.
Asking questions 
Ask the right questions 
Part art, part science 
Instructions are key 
Workers may not be IR experts so don’t assume the same understanding in terms of terminology 
Show examples 
Hire a technical writer 
◦Engineer writes the specification 
◦Writer communicates 
N. Bradburn, S. Sudman, and B. Wansink. Asking Questions: The Definitive Guide to Questionnaire Design, Jossey-Bass, 2004.
UX design 
Time to apply all those usability concepts 
Experiment should be self-contained. 
Keep it short and simple. Brief and concise. 
Be very clear with the relevance task. 
Engage with the worker. Avoid boring stuff. 
Document presentation & design 
Need to grab attention 
Always ask for feedback (open-ended question) in an input box. 
Localization 
Other design principles 
Text alignment 
Legibility 
Reading level: complexity of words and sentences 
Attractiveness (worker’s attention & enjoyment) 
Multi-cultural / multi-lingual 
Who is the audience (e.g. target worker community) 
Special needs communities (e.g. simple color blindness) 
Cognitive load: mental rigor needed to perform task 
Exposure effect 
When to assess work quality? 
Beforehand (prior to main task activity) 
◦How: “qualification tests” or similar mechanism 
◦Purpose: screening, selection, recruiting, training 
During 
◦How: assess labels as worker produces them 
◦Like random checks on a manufacturing line 
◦Purpose: calibrate, reward/penalize, weight 
After 
◦How: compute accuracy metrics post-hoc 
◦Purpose: filter, calibrate, weight, retain (HR) 
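The "random checks on a manufacturing line" idea for the "During" stage can be sketched as below. This is only an illustration under stated assumptions: the function name `sample_for_audit`, the 10% audit rate, and the `(worker_id, item_id, label)` tuple format are hypothetical and do not come from the talk.

```python
import random

def sample_for_audit(labels, rate=0.1, seed=None):
    """Return a random subset of incoming labels for a trusted judge to check.

    labels: list of (worker_id, item_id, label) tuples (hypothetical format).
    rate:   fraction of labels to audit (assumed value; tune per task).
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)
    k = round(len(labels) * rate)
    # Sampling without replacement: each label is audited at most once.
    return rng.sample(labels, k)

# Fifty synthetic labels from five workers.
incoming = [("w%d" % (i % 5), "item%d" % i, i % 2) for i in range(50)]
audit = sample_for_audit(incoming, rate=0.1, seed=42)
print(len(audit))  # 5
```

The audited labels can then feed the calibrate, reward/penalize, or weight decisions listed above.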
How do we measure work quality? 
Compare worker’s label vs. 
◦Known (correct, trusted) label 
◦Other workers’ labels 
◦Model predictions of workers and labels 
Verify worker’s label 
◦Yourself 
◦Tiered approach (e.g. Find-Fix-Verify) 
Comparing to known answers 
AKA: gold, honey pot, verifiable answer, trap 
Assumes you have known answers 
Cost vs. Benefit 
◦Producing known answers (experts?) 
◦% of work spent re-producing them 
Finer points 
◦What if workers recognize the honey pots? 
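A minimal sketch of scoring workers against known answers, assuming judgments arrive as `(worker_id, item_id, label)` tuples and gold labels as a dictionary. Both formats and the function name `accuracy_on_gold` are assumptions made for illustration.

```python
from collections import defaultdict

def accuracy_on_gold(judgments, gold):
    """Per-worker accuracy on items that have a known (gold) answer.

    judgments: iterable of (worker_id, item_id, label)
    gold:      dict mapping item_id -> trusted label
    Items without a gold label are ignored.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, item, label in judgments:
        if item in gold:
            seen[worker] += 1
            correct[worker] += int(label == gold[item])
    return {w: correct[w] / seen[w] for w in seen}

gold = {"d1": "relevant", "d2": "not relevant"}
judgments = [
    ("w1", "d1", "relevant"), ("w1", "d2", "not relevant"),
    ("w2", "d1", "not relevant"), ("w2", "d2", "not relevant"),
    ("w2", "d3", "relevant"),  # d3 has no gold answer, so it is skipped
]
scores = accuracy_on_gold(judgments, gold)
print(scores)  # {'w1': 1.0, 'w2': 0.5}
```

The share of items in `gold` relative to all items is the "% of work spent re-producing them" cost from the slide.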
Comparing to other workers 
AKA: consensus, plurality, redundant labeling 
Well-known metrics for measuring agreement 
Cost vs. Benefit: % of work that is redundant 
Finer points 
◦Is consensus “truth” or systematic bias of group? 
◦What if no one really knows what they’re doing? 
◦Low-agreement across workers indicates problem is with the task (or a specific example), not the workers 
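Consensus labeling can be sketched with a plurality vote that also reports how strong the consensus is, so low-agreement items can be inspected rather than silently accepted. The function name `majority_vote` and the label strings are illustrative assumptions.

```python
from collections import Counter

def majority_vote(labels):
    """Return (winning_label, share_of_votes), or (None, share) on a tie."""
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels).most_common()
    top_label, top_count = counts[0]
    share = top_count / len(labels)
    if len(counts) > 1 and counts[1][1] == top_count:
        return None, share  # tie: ask an expert or collect more judgments
    return top_label, share

print(majority_vote(["rel", "rel", "not", "not", "not"]))  # ('not', 0.6)
print(majority_vote(["rel", "not"]))                        # (None, 0.5)
```

A winning share close to 0.5 on many items is a hint to revisit the task design before blaming the workers.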
Methods for measuring agreement 
What to look for 
◦Agreement, reliability, validity 
Inter-agreement level 
◦Agreement between judges 
◦Agreement between judges and the gold set 
Some statistics 
◦Percentage agreement 
◦Cohen’s kappa (2 raters) 
◦Fleiss’ kappa (any number of raters) 
◦Krippendorff’s alpha 
With majority vote, what if 2 say relevant, 3 say not? 
◦Use expert to break ties 
◦Collect more judgments as needed to reduce uncertainty 
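Two of the statistics above can be sketched in a few lines. The code follows the standard textbook definitions of percentage agreement and Fleiss' kappa. The function names and the input layout (a table of per-item category counts) are assumptions for illustration, not taken from the talk.

```python
def percent_agreement(a, b):
    """Share of items on which two raters gave the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def fleiss_kappa(table):
    """Fleiss' kappa for a table of per-item category counts.

    table[i][j] = number of raters who put item i in category j.
    Every item must have the same number of raters (at least 2).
    """
    n_items = len(table)
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    # Observed agreement: mean share of agreeing rater pairs per item.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ) / n_items
    # Chance agreement from the overall category proportions.
    n_cats = len(table[0])
    p_j = [sum(row[j] for row in table) / (n_items * n_raters)
           for j in range(n_cats)]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1 - p_e)

print(percent_agreement(["r", "r", "n", "n"], ["r", "n", "n", "n"]))  # 0.75
# Three raters, two categories, unanimous on every item.
print(fleiss_kappa([[3, 0], [0, 3], [3, 0], [0, 3]]))  # 1.0
# Every item split 2 vs. 1: agreement below chance.
print(round(fleiss_kappa([[2, 1], [1, 2], [2, 1], [1, 2]]), 3))  # -0.333
```

Percentage agreement does not correct for chance, which is why the kappa and alpha statistics are usually reported alongside it.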
Pause 
Crowdsourcing works 
◦Fast turnaround, easy to experiment, few dollars to test
◦But: you have to design experiments carefully, quality, platform limitations 
Crowdsourcing in production 
◦Large scale data sets (millions of labels) 
◦Continuous execution 
◦Difficult to debug 
Multiple contingent factors 
How do you know the experiment is working?
Goal: framework for ensuring reliability on crowdsourcing tasks 
O. Alonso, C. Marshall and M. Najork. “Crowdsourcing a subjective labeling task: A human centered framework to ensure reliable results” http://research.microsoft.com/apps/pubs/default.aspx?id=219755.
Labeling tweets: an example of a task
Is this tweet interesting? 
Subjective activity 
Not focused on specific events 
Findings 
◦Difficult problem, low inter-rater agreement (Fleiss’ kappa, Krippendorff’s alpha)
◦Tested many designs, number of workers, platforms (MTurk and others)
Multiple contingent factors 
◦Worker performance 
◦Work 
◦Task design 
O. Alonso, C. Marshall & M. Najork. “Are some tweets more interesting than others? #hardquestion”. HCIR 2013.
Designs that include in-task CAPTCHA 
Borrowed idea from reCAPTCHA -> use of control term 
Adapt your labeling task 
2 more questions as control 
◦1 algorithmic 
◦1 semantic 
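One way to use an in-task control question is to drop assignments whose control answer is wrong before analyzing the main question. The sketch below is hypothetical throughout: the talk does not say what the control questions were, so the algorithmic check (`expected_q1`, "does the tweet contain a hashtag?") and the field names `q1` and `q3` are invented for illustration.

```python
def expected_q1(tweet):
    """Hypothetical algorithmic control: does the tweet contain a hashtag?"""
    return "#" in tweet

def keep_trusted(assignments):
    """Keep only assignments whose control answer matches the computed one."""
    return [a for a in assignments if a["q1"] == expected_q1(a["tweet"])]

assignments = [
    {"worker": "w1", "tweet": "Rain again #seattle", "q1": True,
     "q3": "interesting"},
    {"worker": "w2", "tweet": "Rain again #seattle", "q1": False,
     "q3": "interesting"},
    {"worker": "w3", "tweet": "Lunch time", "q1": False,
     "q3": "not interesting"},
]
trusted = keep_trusted(assignments)
print([a["worker"] for a in trusted])  # ['w1', 'w3']
```

Because the expected answer is computed from the tweet itself, no separate gold set has to be produced for the control question.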
Production example #1 
Q1 (k = 0.91, alpha = 0.91) 
Q2 (k = 0.771, alpha = 0.771) 
Q3 (k = 0.033, alpha = 0.035) 
In-task captcha 
Tweet de-branded 
The main question
Production example #2 
• Q3 Worthless (alpha = 0.033)
• Q3 Trivial (alpha = 0.043)
• Q3 Funny (alpha = -0.016)
• Q3 Makes me curious (alpha = 0.026)
• Q3 Contains useful info (alpha = 0.048)
• Q3 Important news (alpha = 0.207)
Q2 (k = 0.728, alpha = 0.728) 
Q1 (k = 0.907, alpha = 0.907) 
In-task captcha 
Breakdown by categories to get better signal 
Tweet de-branded
Findings from designs 
No quality control issues
Eliminating workers who did a poor job on question #1 didn’t affect inter-rater agreement for questions #2 and #3.
Interestingness is a fully subjective notion 
We can still build a classifier that identifies tweets that are interesting to a majority of users
Careful with That Data, Eugene
In the area of big data and machine learning: 
◦labels -> features -> predictive model -> optimization 
Labeling/experimentation perceived as boring 
Don’t rush labeling 
◦Human and machine 
Label quality is very important
◦Don’t outsource it 
◦Own it end to end 
◦Large scale 
More on label quality 
Data gathering is not a free lunch 
You can’t outsource label acquisition and quality 
Labels for the machine != labels for humans 
Emphasis on algorithms, models/optimizations and mining from labels 
Not so much on algorithms for ensuring high quality labels 
Training sets 
People are more than HPUs 
Why is Facebook popular? People are social. 
Information needs are contextually grounded in our social experiences and social networks 
Our social networks also embody additional knowledge about us, our needs, and the world 
We relate to recommendations 
The social dimension complements computation 
Opportunities in RecSys 
Humans in the loop 
Computation loops that mix humans and machines
Kind of active learning 
Double goal: 
◦Human checking on the machine
◦Machine checking on humans
Example: classifiers for social data
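A loop that mixes humans and machines can be sketched as confidence-based triage, in the spirit of uncertainty sampling from active learning. Everything here is an assumption for illustration: the function names, the 0.8 threshold, and the idea that the classifier outputs a probability for the positive class.

```python
def split_by_confidence(predictions, threshold=0.8):
    """Split items into machine-labeled and human-review queues.

    predictions: dict item_id -> probability of the positive class (binary).
    threshold:   minimum confidence to accept the machine label (assumed).
    """
    auto, to_humans = {}, []
    for item, p in predictions.items():
        confidence = max(p, 1 - p)
        if confidence >= threshold:
            auto[item] = p >= 0.5       # machine label is accepted
        else:
            to_humans.append(item)      # human checks on the machine
    return auto, to_humans

def flag_for_recheck(human_labels, auto):
    """Machine checks on humans: labels that contradict a confident prediction."""
    return [i for i, lab in human_labels.items()
            if i in auto and auto[i] != lab]

preds = {"t1": 0.95, "t2": 0.55, "t3": 0.10, "t4": 0.40}
auto, to_humans = split_by_confidence(preds)
print(auto)       # {'t1': True, 't3': False}
print(to_humans)  # ['t2', 't4']
print(flag_for_recheck({"t1": False, "t3": False}, auto))  # ['t1']
```

Labels collected for the uncertain items can be added to the training set, which closes the loop.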
Collaborative Filtering v2 
Collaboration with recipients 
Interactive 
Learning new data 
What’s in a label? 
Clicks, reviews, ratings, etc. 
Better or novel systems if we focus more on label quality? 
New ways of collecting data 
Training sets 
Evaluation & measurement 
Routing 
Expertise detection and routing 
Social load balancing 
When to switch between machines and humans
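Expertise routing with a simple load cap might look like the sketch below. It is a toy illustration, not a method from the talk: the `route` function, the per-topic accuracy estimates, and the `max_load` limit are all hypothetical.

```python
def route(task_topic, expertise, load, max_load=3):
    """Pick the available worker with the highest estimated accuracy on a topic.

    expertise: dict worker -> dict topic -> estimated accuracy (0..1)
    load:      dict worker -> number of tasks currently assigned (updated here)
    Returns None when everyone is at capacity, e.g. to fall back to the machine.
    """
    candidates = [w for w in expertise if load.get(w, 0) < max_load]
    if not candidates:
        return None
    best = max(candidates, key=lambda w: expertise[w].get(task_topic, 0.0))
    load[best] = load.get(best, 0) + 1
    return best

expertise = {
    "ana": {"sports": 0.9, "music": 0.6},
    "bo": {"sports": 0.7, "music": 0.8},
}
load = {"ana": 2, "bo": 0}
print(route("sports", expertise, load))  # ana
print(route("sports", expertise, load))  # bo (ana has reached max_load)
```

Returning `None` when no worker is available gives the caller a natural point to switch from humans back to the machine.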
Conclusions 
Crowdsourcing at scale works but requires a solid framework 
Three aspects that need attention: workers, work and task design 
Labeling social data is hard 
Traditional IR approaches don’t seem to work for Twitter data 
Label quality 
Outlined areas where RecSys can benefit from crowdsourcing
Thank you
@elunca

WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - Kanchana
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AI
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...
WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...
WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 

Did you mean crowdsourcing for recommender systems?

  • 1. Did you mean crowdsourcing for recommender systems? OMAR ALONSO 6-OCT-2014 CROWDREC 2014
  • 2. Disclaimer: The views and opinions expressed in this talk are mine and do not necessarily reflect the official policy or position of Microsoft.
  • 3. Outline: A bit on human computation. Crowdsourcing in information retrieval. Opportunities for recommender systems.
  • 5. Human Computation: You are a computer
  • 6. Human-based computation: Use humans as processors in a distributed system. Address problems that computers aren’t good at. Games with a purpose. Examples: ESP game; CAPTCHA; reCAPTCHA.
  • 7. Some definitions: Human computation is a computation that is performed by a human. A human computation system is a system that organizes human efforts to carry out computation. Crowdsourcing is a tool that a human computation system can use to distribute tasks. (Edith Law and Luis von Ahn. Human Computation. Morgan & Claypool Publishers, 2011.)
  • 8. HC at the core of RecSys: “In a typical recommender system people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients” (Resnick and Varian, CACM 1997). S. Perugini, M. Gonçalves, E. Fox: Recommender Systems Research: A Connection-Centric Survey. J. Intell. Inf. Syst. 23(2): 107-143 (2004).
  • 9. {where to go on vacation} MTurk: 50 answers, $1.80. Quora: 2 answers. Y! Answers: 2 answers. FB: 1 answer. Tons of results. Read title + snippet + URL. Explore a few pages in detail.
  • 10. {where to go on vacation} Countries. Cities.
  • 11. Information Retrieval and Crowdsourcing
  • 12. The rise of crowdsourcing in IR: Crowdsourcing is hot. Lots of interest in the research community: articles showing good results; journal special issues (IR, IEEE Internet Computing, etc.); workshops and tutorials (SIGIR, NAACL, WSDM, WWW, VLDB, RecSys, CHI, etc.); HCOMP; CrowdConf. Large companies using crowdsourcing. Big data. Start-ups. Venture capital investment.
  • 13. Why is this interesting? Easy to prototype and test new experiments. Cheap and fast. No need to set up infrastructure. Introduce experimentation early in the cycle. In the context of IR, implement and experiment as you go. For new ideas, this is very helpful.
  • 14. Caveats and clarifications: Trust and reliability. Wisdom of the crowd revisited. Adjust expectations. Crowdsourcing is another data point for your analysis. Complementary to other experiments.
  • 15. Why now? The Web. Use humans as processors in a distributed system. Address problems that computers aren’t good at. Scale. Reach.
  • 16. Motivating example: relevance judging. Relevance of search results is difficult to judge: highly subjective; expensive to measure. Professional editors commonly used. Potential benefits of crowdsourcing: scalability (time and cost); diversity of judgments. (Matt Lease and Omar Alonso. “Crowdsourcing for search evaluation and social-algorithmic search”, ACM SIGIR 2012 Tutorial.)
  • 17. Crowdsourcing and relevance evaluation: For relevance, it combines two main approaches: explicit judgments; automated metrics. Other features: large scale; inexpensive; diversity.
  • 18. Development framework: Incremental approach. Measure, evaluate, and adjust as you go. Suitable for repeatable tasks. (O. Alonso. “Implementing crowdsourcing-based relevance experimentation: an industrial perspective”. Information Retrieval, 16(2), 2013.)
  • 19. Asking questions: Ask the right questions. Part art, part science. Instructions are key. Workers may not be IR experts, so don’t assume the same understanding in terms of terminology. Show examples. Hire a technical writer: the engineer writes the specification; the writer communicates. (N. Bradburn, S. Sudman, and B. Wansink. Asking Questions: The Definitive Guide to Questionnaire Design, Jossey-Bass, 2004.)
  • 20. UX design: Time to apply all those usability concepts. Experiment should be self-contained. Keep it short and simple. Brief and concise. Be very clear with the relevance task. Engage with the worker. Avoid boring stuff. Document presentation & design. Need to grab attention. Always ask for feedback (open-ended question) in an input box. Localization.
  • 21. Other design principles: Text alignment. Legibility. Reading level: complexity of words and sentences. Attractiveness (worker’s attention & enjoyment). Multi-cultural / multi-lingual. Who is the audience (e.g. target worker community). Special needs communities (e.g. simple color blindness). Cognitive load: mental rigor needed to perform task. Exposure effect.
  • 22. When to assess work quality? Beforehand (prior to main task activity). How: “qualification tests” or similar mechanism. Purpose: screening, selection, recruiting, training. During. How: assess labels as worker produces them, like random checks on a manufacturing line. Purpose: calibrate, reward/penalize, weight. After. How: compute accuracy metrics post-hoc. Purpose: filter, calibrate, weight, retain (HR).
  • 23. How do we measure work quality? Compare worker’s label vs.: known (correct, trusted) label; other workers’ labels; model predictions of workers and labels. Verify worker’s label: yourself; tiered approach (e.g. Find-Fix-Verify).
  • 24. Comparing to known answers: AKA gold, honey pot, verifiable answer, trap. Assumes you have known answers. Cost vs. benefit: producing known answers (experts?); % of work spent re-producing them. Finer points: what if workers recognize the honey pots?
  • 25. Comparing to other workers: AKA consensus, plurality, redundant labeling. Well-known metrics for measuring agreement. Cost vs. benefit: % of work that is redundant. Finer points: Is consensus “truth” or systematic bias of the group? What if no one really knows what they’re doing? Low agreement across workers indicates the problem is with the task (or a specific example), not the workers.
  • 26. Methods for measuring agreement: What to look for: agreement, reliability, validity. Inter-agreement level: agreement between judges; agreement between judges and the gold set. Some statistics: percentage agreement; Cohen’s kappa (2 raters); Fleiss’ kappa (any number of raters); Krippendorff’s alpha. With majority vote, what if 2 say relevant, 3 say not? Use an expert to break ties; collect more judgments as needed to reduce uncertainty.
  • 27. Pause: Crowdsourcing works: fast turnaround, easy to experiment, few dollars to test. But you have to design experiments carefully and deal with quality and platform limitations. Crowdsourcing in production: large scale data sets (millions of labels); continuous execution; difficult to debug. Multiple contingent factors. How do you know the experiment is working? Goal: framework for ensuring reliability on crowdsourcing tasks. (O. Alonso, C. Marshall and M. Najork. “Crowdsourcing a subjective labeling task: A human centered framework to ensure reliable results” http://research.microsoft.com/apps/pubs/default.aspx?id=219755.)
  • 28. Labeling tweets: an example of a task. Is this tweet interesting? Subjective activity. Not focused on specific events. Findings: difficult problem, low inter-rater agreement (Fleiss’ kappa, Krippendorff’s alpha); tested many designs, number of workers, platforms (MTurk and others). Multiple contingent factors: worker performance; work; task design. (O. Alonso, C. Marshall & M. Najork. “Are some tweets more interesting than others? #hardquestion”. HCIR 2013.)
  • 29. Designs that include in-task CAPTCHA: Borrowed idea from reCAPTCHA: use of control term. Adapt your labeling task. 2 more questions as control: 1 algorithmic; 1 semantic.
  • 30. Production example #1: Q1 (k = 0.91, alpha = 0.91). Q2 (k = 0.771, alpha = 0.771). Q3 (k = 0.033, alpha = 0.035). In-task captcha. Tweet de-branded. The main question.
  • 31. Production example #2: Q1 (k = 0.907, alpha = 0.907). Q2 (k = 0.728, alpha = 0.728). Q3 by category: Worthless (alpha = 0.033); Trivial (alpha = 0.043); Funny (alpha = -0.016); Makes me curious (alpha = 0.026); Contains useful info (alpha = 0.048); Important news (alpha = 0.207). In-task captcha. Breakdown by categories to get better signal. Tweet de-branded.
  • 32. Findings from designs: No quality control issues. Eliminating workers who did a poor job on question #1 didn’t affect inter-rater agreement for questions #2 and #3. Interestingness is a fully subjective notion. We can still build a classifier that identifies tweets that are interesting to a majority of users.
  • 33. Careful with That Axe Data, Eugene: In the area of big data and machine learning: labels -> features -> predictive model -> optimization. Labeling/experimentation perceived as boring. Don’t rush labeling: human and machine. Label quality is very important: don’t outsource it; own it end to end; large scale.
  • 34. More on label quality: Data gathering is not a free lunch. You can’t outsource label acquisition and quality. Labels for the machine != labels for humans. Emphasis on algorithms, models/optimizations and mining from labels. Not so much on algorithms for ensuring high quality labels. Training sets.
  • 35. People are more than HPUs: Why is Facebook popular? People are social. Information needs are contextually grounded in our social experiences and social networks. Our social networks also embody additional knowledge about us, our needs, and the world. We relate to recommendations. The social dimension complements computation.
  • 36. Opportunities in RecSys
  • 37. Humans in the loop: Computation loops that mix humans and machines. Kind of active learning. Double goal: human checking on the machine; machine checking on humans. Example: classifiers for social data.
  • 38. Collaborative Filtering v2: Collaboration with recipients. Interactive. Learning new data.
  • 39. What’s in a label? Clicks, reviews, ratings, etc. Better or novel systems if we focus more on label quality? New ways of collecting data. Training sets. Evaluation & measurement.
  • 40. Routing: Expertise detection and routing. Social load balancing. When to switch between machines and humans.
  • 41. Conclusions: Crowdsourcing at scale works but requires a solid framework. Three aspects that need attention: workers, work and task design. Labeling social data is hard. Traditional IR approaches don’t seem to work for Twitter data. Label quality. Outlined areas where RecSys can benefit from crowdsourcing.
  • 42. Thank you. @elunca
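
Slides 24 and 25 describe two ways to check a worker's labels: against known (gold) answers and against other workers. The sketch below is one possible illustration, not the method used in the talk. The function names, the (worker, item, label) tuple layout, and the choice to return None on a tie are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def gold_accuracy(judgments, gold):
    """Per-worker accuracy on items that have a known (gold) label.

    judgments: iterable of (worker_id, item_id, label) tuples (assumed layout).
    gold: dict mapping item_id -> trusted label.
    Workers who never saw a gold item are left out of the result.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, item, label in judgments:
        if item in gold:
            seen[worker] += 1
            correct[worker] += int(label == gold[item])
    return {worker: correct[worker] / seen[worker] for worker in seen}


def majority_vote(judgments):
    """Plurality label per item; None marks a tie that needs more judgments."""
    votes = defaultdict(Counter)
    for _worker, item, label in judgments:
        votes[item][label] += 1
    result = {}
    for item, counts in votes.items():
        ranked = counts.most_common(2)
        tied = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        result[item] = None if tied else ranked[0][0]
    return result


if __name__ == "__main__":
    # Hypothetical judgments: three workers on document d1, two on d2.
    sample = [
        ("w1", "d1", "relevant"), ("w2", "d1", "relevant"),
        ("w3", "d1", "not"), ("w1", "d2", "relevant"), ("w2", "d2", "not"),
    ]
    print(gold_accuracy(sample, {"d1": "relevant"}))
    print(majority_vote(sample))
```

Returning None for a tie keeps the decision open, which matches the options on slide 26: send the item to an expert or collect more judgments.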
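
Slide 26 lists percentage agreement, Cohen's kappa, and Fleiss' kappa. A minimal pure-Python sketch of the three statistics follows, using the standard textbook formulas. The function names and input layouts are assumptions for illustration, and returning 1.0 when chance agreement is already 1 is a convention chosen here, since kappa is undefined in that case.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which two raters gave the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two raters labeling the same items in the same order."""
    n = len(a)
    p_o = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)


def fleiss_kappa(table):
    """Fleiss' kappa for any number of raters.

    table[i][j] = number of raters who put item i in category j.
    Every item must have the same total number of raters (at least 2).
    """
    n_items = len(table)
    n_raters = sum(table[0])
    # Mean per-item agreement across all rater pairs.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ) / n_items
    # Chance agreement from the overall category proportions.
    totals = [sum(col) for col in zip(*table)]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    return 1.0 if p_e == 1 else (p_bar - p_e) / (1 - p_e)
```

Values near 0, like the Q3 figures on slides 30 and 31, mean agreement is close to what chance alone would produce. Values near 1, like Q1, mean near-perfect agreement.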
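
Slides 37 and 40 raise the question of when to switch between machines and humans. The talk does not give a rule, so the sketch below only illustrates one common heuristic: keep the machine's label when its confidence is high and route the rest to human workers. The function name, the (label, confidence) input layout, and the 0.8 threshold are hypothetical.

```python
def route(predictions, threshold=0.8):
    """Split items between automatic labeling and human review.

    predictions: dict mapping item_id -> (label, confidence in [0, 1]).
    threshold: hypothetical cut-off; tune it on held-out labeled data.
    Returns (auto_labels, items_for_humans).
    """
    auto_labels = {}
    items_for_humans = []
    for item, (label, confidence) in predictions.items():
        if confidence >= threshold:
            auto_labels[item] = label       # machine is confident enough
        else:
            items_for_humans.append(item)   # uncertain: ask the crowd
    return auto_labels, items_for_humans
```

Labels collected for the routed items can be fed back as training data, which gives the loop the active-learning flavor slide 37 mentions.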