Statistical Crowdsourcing: From Aggregating
Judgments to Search Engine Evaluation
Matt Lease
ir.ischool.utexas.edu
School of Information @mattlease
University of Texas at Austin ml@utexas.edu
Undergraduate Mentors at UW
Roadmap
• What are Crowdsourcing & Human Computation? 4-16
– A great research area for iSchools: something for everyone!
• Benchmarking Statistical Consensus Methods 18-26
• Psychometrics & Crowds for Relevance Judging 28-35
3
Matt Lease <ml@utexas.edu>
Crowdsourcing
• Jeff Howe. WIRED, June 2006
• Rise of digital work & internet
empowers a global workforce
via open call solicitations
• New application of principles
from open source movement
4
5
• Online marketplace for paid labor since 2005
• On-demand, elastic, 24/7 global workforce
• API integrates human labor with computation
Amazon Mechanical Turk (MTurk)
6
A New Scale of Labeled Data for AI
Snow et al., EMNLP 2008
• MTurk labels for 5 NLP Tasks
• 22K labels for only $26
• While individual annotations
noisy, aggregated consensus
labels show high agreement
with expert labels (“gold”)
7
AI + Human Computation =
A new breed of hybrid intelligent systems
PlateMate (Noronha et al., UIST’10)
8
Social & Behavioral Sciences
• A Guide to Behavioral Experiments
on Mechanical Turk
– W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
– L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research:
Insights from Mechanical Turk
– Conley & Tosti-Kharas (2010). Academy of Management
• Amazon's Mechanical Turk : A New Source of
Inexpensive, Yet High-Quality, Data?
– M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
– see also: Amazon Mechanical Turk Guide for Social Scientists
9
August 12, 2012
10
Ethics of Crowdsourcing?
11
Paul Hyman. Communications of the ACM, Vol. 56 No. 8, Pages 19-21, August 2013.
Who are
the workers?
• A. Baio, November 2008. The Faces of Mechanical Turk.
• P. Ipeirotis. March 2010. The New Demographics of
Mechanical Turk
• J. Ross, et al. Who are the Crowdworkers? CHI 2010.
12
13
Safeguarding Participant Data
• “What are the characteristics of MTurk workers?... the MTurk
system is set up to strictly protect workers’ anonymity….”
14
Amazon profile page
URLs use the same
IDs as used on MTurk!
Lease et al., SSRN’13
15
Crowdsourcing & the Law: Independent
Contractors vs. Employees
• Wolfson & Lease, ASIS&T’11
• Some platforms classify online contributors as
independent contractors (vs. employees)
• While Employment is legally-defined (e.g., FLSA
and past court decisions), the definition leaks
• It seems unlikely Congress will provide clarity
• Class action litigation pending in the courts
16
Roadmap
• What are Crowdsourcing & Human Computation? 4-16
• Benchmarking Statistical Consensus Methods 18-26
• Psychometrics & Crowds for Relevance Judging 28-35
17
Science of Measurement & Benchmarks
• “If you cannot measure it, you cannot improve it.”
• Drive field innovation by clear challenge tasks
– e.g., David Tse’s FIST 2012 Keynote (Comp. Biology)
• Many things we can learn
– What is the current state-of-the-art?
– How do current methods compare?
– What works, what doesn’t, and why?
– How has field progressed over time?
18
Finding Consensus in Human Computation
• For an objective labeling task, how do we
resolve disagreement between responses?
• Simple baseline: majority voting
• Research pre-dates crowdsourcing
– Dawid and Skene’79, Smyth et al., ’95
• One of the most studied problems in HCOMP
– Laymen likely to err more than experts
– Methods in many areas: ML, Vision, NLP, IR, DB, …
19
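The majority-voting baseline named above can be sketched in a few lines. This is a minimal illustration, not code from the talk or from SQUARE: the `(worker, item, label)` triple format, the function name `majority_vote`, and the toy labels are all assumptions, and ties are broken arbitrarily in favor of the label seen first.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return {item: consensus label} from (worker, item, label) triples.

    Ties go to the label that was seen first for that item (an arbitrary
    choice made for this sketch).
    """
    votes = defaultdict(Counter)
    for _worker, item, label in labels:
        votes[item][label] += 1
    # most_common(1) returns the highest count; equal counts keep insertion order.
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}


# Hypothetical toy judgments: three workers label two documents.
example = [
    ("w1", "doc1", "relevant"), ("w2", "doc1", "relevant"), ("w3", "doc1", "not"),
    ("w1", "doc2", "not"), ("w2", "doc2", "not"), ("w3", "doc2", "relevant"),
]
consensus = majority_vote(example)
print(consensus)
```

Majority voting treats every worker as equally reliable, which is the assumption the statistical methods on the following slides relax.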
20
SQUARE:
A Benchmark
for Research on
Computing
Crowd
Consensus
@HCOMP’13
ir.ischool.utexas.edu/square
(open source)
21
Datasets
Methods
Include popular and/or open-source methods
• Majority Voting
• Expectation-Maximization (Dawid-Skene, 1979)
• Naïve Bayes (Snow et al., 2008)
• GLAD (Whitehill et al., 2009)
• ZenCrowd (Demartini et al., 2012)
• Raykar et al. (2012)
• CUBAM (Welinder et al., 2010)
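As a rough illustration of the Expectation-Maximization entry in the list above, the following is a simplified sketch in the spirit of Dawid and Skene (1979): it alternates between estimating a confusion matrix per worker and a posterior over each item's true class. It is not the SQUARE implementation; the function name `dawid_skene`, the input format, the `smoothing` constant, the fixed iteration count, and the toy data are assumptions made for this example.

```python
from collections import defaultdict


def dawid_skene(labels, classes, iterations=20, smoothing=0.01):
    """labels: list of (worker, item, label). Returns {item: {class: posterior}}."""
    by_item = defaultdict(list)
    for worker, item, label in labels:
        by_item[item].append((worker, label))
    workers = {w for w, _, _ in labels}

    # Initialise posteriors from per-item vote fractions (a soft majority vote).
    post = {
        item: {c: sum(1 for _, l in obs if l == c) / len(obs) for c in classes}
        for item, obs in by_item.items()
    }

    for _ in range(iterations):
        # M-step: class priors and smoothed per-worker confusion matrices.
        prior = {c: sum(p[c] for p in post.values()) / len(post) for c in classes}
        conf = {w: {c: {l: smoothing for l in classes} for c in classes} for w in workers}
        for item, obs in by_item.items():
            for worker, label in obs:
                for c in classes:
                    conf[worker][c][label] += post[item][c]
        for w in workers:
            for c in classes:
                total = sum(conf[w][c].values())
                for l in classes:
                    conf[w][c][l] /= total

        # E-step: posterior over each item's true class given all its labels.
        for item, obs in by_item.items():
            scores = {}
            for c in classes:
                score = prior[c]
                for worker, label in obs:
                    score *= conf[worker][c][label]
                scores[c] = score
            norm = sum(scores.values())
            post[item] = {c: s / norm for c, s in scores.items()}
    return post


# Hypothetical toy data: w1-w3 always agree with the truth, w4 always answers 0.
truth = {"i1": 1, "i2": 0, "i3": 1, "i4": 0}
toy = []
for item, t in truth.items():
    toy += [("w1", item, t), ("w2", item, t), ("w3", item, t), ("w4", item, 0)]
posteriors = dawid_skene(toy, classes=[0, 1])
consensus = {i: max(p, key=p.get) for i, p in posteriors.items()}
print(consensus)
```

Multiplying raw probabilities is fine for a handful of labels per item; with many labels per item, summing log-probabilities would be the safer choice.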
22
Results: Unsupervised Accuracy
Relative gain/loss vs. majority voting
23
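[Figure: relative accuracy gain/loss vs. majority voting (y-axis from -15% to 15%) for methods DS, ZC, RY, GLAD, and CUBAM on datasets BM, HCB, SpamCF, WVSCM, WB, RTE, TEMP, WSD, AC2, HC, and ALL]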
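One plausible way to compute the quantity plotted here is sketched below. The slide does not state the formula, so the definition used (accuracy difference divided by majority-voting accuracy) is an assumption, as are the function names and the toy labels, which are made up for illustration and are not benchmark results.

```python
def accuracy(predicted, gold):
    """Fraction of gold-labelled items whose predicted label matches gold."""
    return sum(predicted[i] == g for i, g in gold.items()) / len(gold)


def relative_gain(method_acc, baseline_acc):
    """Relative change of a method over a baseline; 0.05 means +5%."""
    return (method_acc - baseline_acc) / baseline_acc


# Hypothetical consensus labels for four items.
gold = {"a": 1, "b": 0, "c": 1, "d": 0}
majority = {"a": 1, "b": 0, "c": 0, "d": 0}   # 3 of 4 correct
other = {"a": 1, "b": 0, "c": 1, "d": 0}      # 4 of 4 correct

gain = relative_gain(accuracy(other, gold), accuracy(majority, gold))
print(f"{gain:+.1%}")
```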
Results: Varying Supervision
24
Findings
• Majority voting never best, but rarely much worse
• No method performs far better than others
• Each method often best for some condition
– e.g., original dataset method was designed for
• DS & RY tend to perform best (RY adds priors)
25
Why Don’t We See Bigger Gains?
• Of course contributions aren’t just empirical…
• Maybe gold is too noisy to detect improvement?
– Cormack & Kolcz’09, Klebanov & Beigman’10
• Might we see bigger differences from
– Different tasks/scenarios?
– Better benchmark tests?
– Different methods or tuning?
• We invite community contributions!
26
Roadmap
• What are Crowdsourcing & Human Computation? 4-16
• Benchmarking Statistical Consensus Methods 18-26
• Psychometrics & Crowds for Relevance Judging 28-35
27
Multidimensional Relevance Modeling
via Psychometrics and Crowdsourcing
Joint work with
Yinglong Zhang Jin Zhang Jacek Gwizdka
Paper @ SIGIR 2014
28
Background: Evaluating IR Systems
• Classic Cranfield method (Cleverdon et al., 1966)
– Given a document collection & set of queries
– Judge documents for topical relevance to each query
– Evaluate on these queries & documents
• Problem: Scaling manual data labeling is difficult
• Idea: try Crowdsourcing
– Alonso et al. (SIGIR Forum 2008)
– Grady & Lease, 2010
– TREC 2011-2013 Crowdsourcing Track
29
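The Cranfield steps above (fixed queries, relevance judgments, then scoring) can be sketched as follows. The slides do not name an evaluation metric, so precision at rank k is swapped in here as one common choice; the query IDs, document IDs, rankings, and judgments are hypothetical.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that were judged relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


# Relevance judgments: for each query, the set of documents judged relevant.
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}

# A system's ranked results for the same queries.
run = {"q1": ["d1", "d2", "d3"], "q2": ["d3", "d2", "d1"]}

scores = {q: precision_at_k(run[q], qrels[q], k=2) for q in qrels}
mean_p_at_2 = sum(scores.values()) / len(scores)
print(scores, mean_p_at_2)
```

In a crowdsourced setting, the `qrels` sets would come from aggregated worker judgments rather than from trusted assessors.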
But Problems are Deeper
• User relevance > simple topical relevance
– The Great Divide in IR: systems-centered vs. user-centered
– What other factors to model, & what is their relative
importance? Long history of studies, little consensus.
– Dearth of labeled data for training/evaluating systems
• Even trusted assessors disagree often on “simple”
topical relevance judgments
– Often attributed to subjectivity, but can we do better?
• How do we ensure quality of subjective data?
– Largely unstudied in HCOMP community to date
30
Psychology to the Rescue!
• A Guide to Behavioral Experiments
on Mechanical Turk
– W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
– L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research:
Insights from Mechanical Turk
– Conley & Tosti-Kharas (2010). Academy of Management
• Amazon's Mechanical Turk : A New Source of
Inexpensive, Yet High-Quality, Data?
– M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
– see also: Amazon Mechanical Turk Guide for Social Scientists
31
August 12, 2012
32
Key Ideas from Psychometrics
• Use standard survey techniques for collecting
multi-dimensional relevance judgments
– Ask repeated, similar questions, & change polarity
• Analyze data via Structural Equation Modeling
– cousin to graphical models in statistics/AI
– Posit questions associated with latent factors
– Use Exploratory Factor Analysis to determine factors
& question associations, then prune questions
– Use Confirmatory Factor Analysis to assess
correlations, test significance, and compare models
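The "change polarity" idea above implies a preprocessing step before any factor analysis: answers to negatively worded questions are reverse-scored so that all items for a factor point the same way. The sketch below shows only that step, not SEM, EFA, or CFA. The 5-point scale, the question IDs, and the function names are assumptions for illustration.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Flip a rating on a bounded scale, e.g. 2 becomes 4 on a 1-5 scale."""
    return scale_max + scale_min - response


def factor_score(responses, reversed_items, scale_min=1, scale_max=5):
    """Mean rating after aligning polarity.

    responses: {question_id: rating}
    reversed_items: question ids that were worded with flipped polarity
    """
    aligned = [
        reverse_score(r, scale_min, scale_max) if q in reversed_items else r
        for q, r in responses.items()
    ]
    return sum(aligned) / len(aligned)


# Hypothetical answers from one judge to three similar questions on one factor.
answers = {"topical_pos_1": 4, "topical_neg_1": 2, "topical_pos_2": 5}
score = factor_score(answers, reversed_items={"topical_neg_1"})
print(round(score, 3))
```

A judge who gives the same rating to both a question and its reversed counterpart produces inconsistent aligned scores, which is one signal that can be used to screen subjective responses.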
33
34
Future Directions
• Strong foundation for ongoing positivist research of
alternative relevance factors
– For different user groups, search scenarios, etc.
– Need more data to support normative claims
• Train/test operational systems for varying factors
• Improve judging agreement by making task more
natural and/or assessing impact of latent factors
• Intra-subject vs. inter-subject aggregation?
• SEM vs. graphical modeling?
• Other methods for ensuring subjective data quality?
35
The Future of Crowd Work, CSCW’13
Kittur, Nickerson, Bernstein, Gerber,
Shaw, Zimmerman, Lease, and Horton
36
Thank You!
ir.ischool.utexas.edu
Slides: www.slideshare.net/mattlease
38
A Few Moral Dilemmas
• A “fair” price for online work in a global economy?
– Is it better to pay nothing (i.e., volunteers, gamification)
rather than pay something small for valuable work?
• Are we obligated to inform people how their
participation / work products will be used?
– If my IRB doesn’t require me to obtain informed consent,
is there some other moral obligation to do so?
• A worker finds his ID posted in a researcher’s online
source code and asks that it be removed. This can’t
be done without recreating the repo, which many
people use. What should be done?
39
Ethical Crowdsourcing
• Assume researchers have good intentions, and
so issues of gross negligence are rare
– Withholding promised pay after work performed
– Not obtaining or complying with IRB oversight
• Instead, a great challenge is how to recognize our
impacts and appropriate actions in a complex world
– Educating ourselves takes time & effort
– Failing to educate ourselves could cause harm to others
• How can we strike a reasonable balance between
complete apathy vs. being overly alarmist?
40
• Contribute to society and human well-being
• Avoid harm to others
• Be honest and trustworthy
• Be fair and take action not to discriminate
• Respect the privacy of others
COMPLIANCE WITH THE CODE. As an ACM member I will
– Uphold and promote the principles of this Code
– Treat violations of this code as inconsistent with
membership in the ACM
41
CS2008 Curriculum Update (ACM, IEEE)
There is reasonably wide agreement that this topic of legal, social,
professional and ethical should feature in all computing degrees.
…financial and economic imperatives …Which approaches are less
expensive and is this sensible? With the advent of outsourcing and
off-shoring these matters become more complex and take on new
dimensions …there are often related ethical issues concerning
exploitation… Such matters ought to feature in courses on legal,
ethical and professional practice.
if ethical considerations are covered only in the standalone course and
not “in context,” it will reinforce the false notion that technical processes
are void of ethical issues. Thus it is important that several traditional
courses include modules that analyze ethical considerations in the
context of the technical subject matter … It would be explicitly against
the spirit of the recommendations to have only a standalone course.
42
“Contribute to society and human
well-being; avoid harm to others”
• Do we have a moral obligation to try to ascertain
conditions under which work is performed? Or the
impact we have upon those performing the work?
• Do we feel differently when work is performed by
– Political refugees? Children? Prisoners? Disabled?
• How do we know who is doing the work, or if a
decision to work (for a given price) is freely made?
– Does it matter why someone accepts offered work?
43
Some Notable Prior Research
• Silberman, Irani, and Ross (2010)
– “How should we… conceptualize the role of these people
who we ask to power our computing?”
– “abstraction hides detail” - some details may be worth
keeping conspicuously present (Jessica Hullman)
• Irani and Silberman (2013)
– “…AMT helps employers see themselves as builders of
innovative technologies, rather than employers unconcerned
with working conditions.”
– “…human computation currently relies on worker invisibility.”
• Fort, Adda, and Cohen (2011)
– “…opportunities for our community to deliberately value
ethics above cost savings.”
44
Power Asymmetry on MTurk
45
• Mistakes happen, such as wrongly rejecting work – e.g., error by
new student, software bug, poor instructions, noisy gold, etc.
• How do we balance the harm caused by our mistakes to workers
(our liability) vs. our cost/effort of preventing such mistakes?
Task Decomposition
By minimizing context, greater task efficiency &
accuracy can often be achieved in practice
– e.g. “Can you name who is in this photo?”
• Much research on ways to streamline work
and decompose complex tasks
46
Context & Informed Consent
• Assume we wish to obtain informed consent
• Without context, consent cannot be informed
– Zittrain, Ubiquitous human computing (2008)
47
Consequences of Human Computation
as a Panacea where AI Falls Short
• The Googler who Looked at the Worst of the Internet
• Policing the Web’s Lurid Precincts
• Facebook content moderation
• The dirty job of keeping Facebook clean
• Even linguistic annotators report stress &
nightmares from reading news articles!
48
What about Freedom?
• Crowdsourcing vision: empowering freedom
– work whenever you want for whomever you want
• Risk: people compelled to perform work
– Chinese prisoners farming gold online
– Digital sweat shops? Digital slaves?
– We know relatively little today about work conditions
– How might we monitor and mitigate risk/growth of
crowd work inflicting harm to at-risk populations?
– Traction? Human Trafficking at MSR Summit’12
49
Robert Sim, MSR Summit’12
50
Join the conversation!
Crowdwork-ethics, by Six Silberman
http://crowdwork-ethics.wtf.tw
an informal, occasional blog for researchers
interested in ethical issues in crowd work
51
Additional References
• Irani, Lilly C. The Ideological Work of Microwork. In preparation,
draft available online.
• Adda, Gilles, et al. Crowdsourcing for language resource
development: Critical analysis of amazon mechanical turk
overpowering use. Proceedings of the 5th Language and Technology
Conference (LTC). 2011.
• Adda, Gilles, and Joseph J. Mariani. Economic, Legal and Ethical
analysis of Crowdsourcing for Speech Processing. (2013).
• Harris, Christopher G., and Padmini Srinivasan. Crowdsourcing and
Ethics. Security and Privacy in Social Networks. 67-83. 2013.
• Harris, Christopher G. Dirty Deeds Done Dirt Cheap: A Darker Side
to Crowdsourcing. IEEE 3rd conference on social computing
(socialcom). 2011.
• Horton, John J. The condition of the Turking class: Are online
employers fair and honest?. Economics Letters 111.1 (2011): 10-12.
52
Additional References (2)
• Bederson, B. B., & Quinn, A. J. Web workers unite! addressing challenges
of online laborers. In CHI 2011 Human Computation Workshop, 97-106.
• Bederson, B. B., & Quinn, A. J. Participation in Human Computation. In
CHI 2011 Human Computation Workshop.
• Felstiner, Alek. Working the Crowd: Employment and Labor Law in the
Crowdsourcing Industry. Berkeley J. Employment & Labor Law 32.1 2011
• Felstiner, Alek. Sweatshop or Paper Route?: Child Labor Laws and In-
Game Work. CrowdConf (2010).
• Larson, Martha. Toward Responsible and Sustainable Crowdsourcing.
Blog post + Slides from Dagstuhl, September 2013.
• Vili Lehdonvirta and Paul Mezier. Identity and Self-Organization in
Unstructured Work. Unpublished working paper. 16 October 2013.
• Zittrain, Jonathan. Minds for Sale. YouTube.
53

Mais conteúdo relacionado

Mais procurados

Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Matthew Lease
 
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Matthew Lease
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Matthew Lease
 
Data Center Computing for Data Science: an evolution of machines, middleware,...
Data Center Computing for Data Science: an evolution of machines, middleware,...Data Center Computing for Data Science: an evolution of machines, middleware,...
Data Center Computing for Data Science: an evolution of machines, middleware,...Paco Nathan
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Matthew Lease
 
Roger hoerl say award presentation 2013
Roger hoerl say award presentation 2013Roger hoerl say award presentation 2013
Roger hoerl say award presentation 2013Roger Hoerl
 
DSSG Speaker Series: Paco Nathan
DSSG Speaker Series: Paco NathanDSSG Speaker Series: Paco Nathan
DSSG Speaker Series: Paco NathanPaco Nathan
 
Algorithmic fairness
Algorithmic fairnessAlgorithmic fairness
Algorithmic fairnessAnthonyMelson
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
Big_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_ReddiboinaBig_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_ReddiboinaMadhu Reddiboina
 
Semantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextSemantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextMurad Daryousse
 
Australian Legal Education in 2017: Taking Stock for an Uncertain Future
Australian Legal Education in 2017: Taking Stock for an Uncertain FutureAustralian Legal Education in 2017: Taking Stock for an Uncertain Future
Australian Legal Education in 2017: Taking Stock for an Uncertain FutureSally Kift
 
Challenges in Analytics for BIG Data
Challenges in Analytics for BIG DataChallenges in Analytics for BIG Data
Challenges in Analytics for BIG DataPrasant Misra
 
Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...
Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...
Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...Brendan Aldrich
 
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...Krishnaram Kenthapadi
 

Mais procurados (18)

Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation
 
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
 
Data Center Computing for Data Science: an evolution of machines, middleware,...
Data Center Computing for Data Science: an evolution of machines, middleware,...Data Center Computing for Data Science: an evolution of machines, middleware,...
Data Center Computing for Data Science: an evolution of machines, middleware,...
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
 
Roger hoerl say award presentation 2013
Roger hoerl say award presentation 2013Roger hoerl say award presentation 2013
Roger hoerl say award presentation 2013
 
DSSG Speaker Series: Paco Nathan
DSSG Speaker Series: Paco NathanDSSG Speaker Series: Paco Nathan
DSSG Speaker Series: Paco Nathan
 
Algorithmic fairness
Algorithmic fairnessAlgorithmic fairness
Algorithmic fairness
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
Big_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_ReddiboinaBig_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_Reddiboina
 
What is Data Science
What is Data ScienceWhat is Data Science
What is Data Science
 
Semantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextSemantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data Context
 
Big data trends in 2020
Big data trends in 2020Big data trends in 2020
Big data trends in 2020
 
Australian Legal Education in 2017: Taking Stock for an Uncertain Future
Australian Legal Education in 2017: Taking Stock for an Uncertain FutureAustralian Legal Education in 2017: Taking Stock for an Uncertain Future
Australian Legal Education in 2017: Taking Stock for an Uncertain Future
 
Challenges in Analytics for BIG Data
Challenges in Analytics for BIG DataChallenges in Analytics for BIG Data
Challenges in Analytics for BIG Data
 
2EPS
2EPS2EPS
2EPS
 
Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...
Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...
Founding a Data Democracy: How Ivy Tech is Leading a Revolution in Higher Edu...
 
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WW...
 

Semelhante a Crowdsourcing: From Aggregation to Search Engine Evaluation

Crowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsCrowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsMatthew Lease
 
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsBeyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsMatthew Lease
 
Metrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-ComputingMetrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-ComputingMatthew Lease
 
Toward Better Crowdsourcing Science
 Toward Better Crowdsourcing Science Toward Better Crowdsourcing Science
Toward Better Crowdsourcing ScienceMatthew Lease
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopMatthew Lease
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)Matthew Lease
 
Stated preference methods and analysis
Stated preference methods and analysisStated preference methods and analysis
Stated preference methods and analysisHabet Madoyan
 
Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)Matthew Lease
 
Technology in Employee Recruitment and Selection
Technology in Employee Recruitment and SelectionTechnology in Employee Recruitment and Selection
Technology in Employee Recruitment and SelectionIoannis Nikolaou
 
01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...teodroscampaus
 
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)Matthew Lease
 
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Matthew Lease
 
Web search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionWeb search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionAli Dasdan
 
Cyber-Social Learning Systems
Cyber-Social Learning SystemsCyber-Social Learning Systems
Cyber-Social Learning Systemsdiannepatricia
 
The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)Matthew Lease
 
Csls 20160821 v1
Csls 20160821 v1Csls 20160821 v1
Csls 20160821 v1ISSIP
 
Ten reasons 20130621 v3
Ten reasons 20130621 v3Ten reasons 20130621 v3
Ten reasons 20130621 v3ISSIP
 

Semelhante a Crowdsourcing: From Aggregation to Search Engine Evaluation (20)

Crowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsCrowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to Ethics
 
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsBeyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
 
Metrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-ComputingMetrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-Computing
 
Toward Better Crowdsourcing Science
 Toward Better Crowdsourcing Science Toward Better Crowdsourcing Science
Toward Better Crowdsourcing Science
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loop
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)
 
Stated preference methods and analysis
Stated preference methods and analysisStated preference methods and analysis
Stated preference methods and analysis
 
Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)
 
Technology in Employee Recruitment and Selection
Technology in Employee Recruitment and SelectionTechnology in Employee Recruitment and Selection
Technology in Employee Recruitment and Selection
 
Data Scientists
 Data Scientists Data Scientists
Data Scientists
 
01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...
 
Machine learning in Banks
Machine learning in BanksMachine learning in Banks
Machine learning in Banks
 
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
 
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
 
Web search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionWeb search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introduction
 
Digital Economics
Digital EconomicsDigital Economics
Digital Economics
 
Cyber-Social Learning Systems
Cyber-Social Learning SystemsCyber-Social Learning Systems
Cyber-Social Learning Systems
 
The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)
 
Csls 20160821 v1
Csls 20160821 v1Csls 20160821 v1
Csls 20160821 v1
 
Ten reasons 20130621 v3
Ten reasons 20130621 v3Ten reasons 20130621 v3
Ten reasons 20130621 v3
 

Crowdsourcing: From Aggregation to Search Engine Evaluation

  • 1. Statistical Crowdsourcing: From Aggregating Judgments to Search Engine Evaluation Matt Lease ir.ischool.utexas.edu School of Information @mattlease University of Texas at Austin ml@utexas.edu
  • 3. Roadmap • What are Crowdsourcing & Human Computation? 4-16 – A great research area for iSchools: something for everyone! • Benchmarking Statistical Consensus Methods 18-26 • Psychometrics & Crowds for Relevance Judging 28-35
  • 4. Crowdsourcing • Jeff Howe. WIRED, June 2006 • Rise of digital work & internet empowers a global workforce via open call solicitations • New application of principles from open source movement
  • 6. Amazon Mechanical Turk (MTurk) • Online marketplace for paid labor since 2005 • On-demand, elastic, 24/7 global workforce • API integrates human labor with computation
  • 7. A New Scale of Labeled Data for AI (Snow et al., EMNLP 2008) • MTurk labels for 5 NLP Tasks • 22K labels for only $26 • While individual annotations are noisy, aggregated consensus labels show high agreement with expert labels (“gold”)
  • 8. AI + Human Computation = A new breed of hybrid intelligent systems PlateMate (Noronha et al., UIST’10)
  • 9. Social & Behavioral Sciences • A Guide to Behavioral Experiments on Mechanical Turk – W. Mason and S. Suri (2010). SSRN online. • Crowdsourcing for Human Subjects Research – L. Schmidt (CrowdConf 2010) • Crowdsourcing Content Analysis for Behavioral Research: Insights from Mechanical Turk – Conley & Tosti-Kharas (2010). Academy of Management • Amazon's Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data? – M. Buhrmester et al. (2011). Perspectives… 6(1):3-5. – see also: Amazon Mechanical Turk Guide for Social Scientists
  • 11. Ethics of Crowdsourcing? Paul Hyman. Communications of the ACM, Vol. 56 No. 8, Pages 19-21, August 2013.
  • 12. Who are the workers? • A. Baio, November 2008. The Faces of Mechanical Turk. • P. Ipeirotis. March 2010. The New Demographics of Mechanical Turk • J. Ross, et al. Who are the Crowdworkers? CHI 2010.
  • 14. Safeguarding Participant Data • “What are the characteristics of MTurk workers?... the MTurk system is set up to strictly protect workers’ anonymity….”
  • 15. Amazon profile page URLs use the same IDs as used on MTurk! Lease et al., SSRN’13
  • 16. Crowdsourcing & the Law: Independent Contractors vs. Employees • Wolfson & Lease, ASIS&T’11 • Some platforms classify online contributors as independent contractors (vs. employees) • While Employment is legally-defined (e.g., FLSA and past court decisions), the definition leaks • It seems unlikely Congress will provide clarity • Class action litigation pending in the courts
  • 17. Roadmap • What are Crowdsourcing & Human Computation? 4-16 • Benchmarking Statistical Consensus Methods 18-26 • Psychometrics & Crowds for Relevance Judging 28-35
  • 18. Science of Measurement & Benchmarks • “If you cannot measure it, you cannot improve it.” • Drive field innovation by clear challenge tasks – e.g., David Tse’s FIST 2012 Keynote (Comp. Biology) • Many things we can learn – What is the current state-of-the-art? – How do current methods compare? – What works, what doesn’t, and why? – How has field progressed over time?
  • 19. Finding Consensus in Human Computation • For an objective labeling task, how do we resolve disagreement between responses? • Simple baseline: majority voting • Research pre-dates crowdsourcing – Dawid and Skene’79, Smyth et al., ’95 • One of the most studied problems in HCOMP – Laymen likely to err more than experts – Methods in many areas: ML, Vision, NLP, IR, DB, …
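The majority-voting baseline named on slide 19 can be sketched in a few lines. This is a minimal illustration, not the SQUARE implementation: the `(item, worker, label)` triple format, the function name, and the tie-breaking rule (pick the smallest label) are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return one consensus label per item.

    labels: iterable of (item_id, worker_id, label) triples (assumed format).
    """
    votes = defaultdict(Counter)
    for item, _worker, label in labels:
        votes[item][label] += 1

    consensus = {}
    for item, counts in votes.items():
        top = max(counts.values())
        # Tie-break deterministically by taking the smallest label;
        # this rule is an arbitrary choice for the sketch.
        consensus[item] = min(l for l, c in counts.items() if c == top)
    return consensus


# Hypothetical judgments: three workers on d1, two on d2.
example = [
    ("d1", "w1", "rel"), ("d1", "w2", "rel"), ("d1", "w3", "non"),
    ("d2", "w1", "non"), ("d2", "w2", "rel"),
]
print(majority_vote(example))
```

Every worker counts equally here, which is exactly what the weighted and model-based methods on the following slides try to improve on.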
  • 20. SQUARE: A Benchmark for Research on Computing Crowd Consensus @HCOMP’13 ir.ischool.utexas.edu/square (open source)
  • 22. Methods – Include popular and/or open-source methods • Majority Voting • Expectation-Maximization (Dawid-Skene, 1979) • Naïve Bayes (Snow et al., 2008) • GLAD (Whitehill et al., 2009) • ZenCrowd (Demartini et al., 2012) • Raykar et al. (2012) • CUBAM (Welinder et al., 2010)
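To make the Expectation-Maximization entry on slide 22 concrete, here is a simplified sketch in the spirit of Dawid and Skene (1979): each worker gets a confusion matrix, and item posteriors and worker parameters are re-estimated in turn. It is an illustration only. The data format, the additive smoothing constant, the fixed iteration count, and the vote-fraction initialisation are assumptions, and it is not the code benchmarked in SQUARE.

```python
from collections import defaultdict


def dawid_skene(labels, classes, iterations=20, smoothing=0.01):
    """Simplified Dawid-Skene style EM over (item, worker, label) triples."""
    by_item = defaultdict(list)
    for item, worker, label in labels:
        by_item[item].append((worker, label))
    workers = {w for _, w, _ in labels}

    # Initialise each item's posterior with its vote fractions (soft majority vote).
    post = {
        item: {k: sum(1 for _, l in obs if l == k) / len(obs) for k in classes}
        for item, obs in by_item.items()
    }

    for _ in range(iterations):
        # M-step: class priors and per-worker confusion matrices
        # conf[w][k][l] ~ P(worker w answers l | true class k).
        prior = {k: sum(p[k] for p in post.values()) / len(post) for k in classes}
        conf = {w: {k: {l: smoothing for l in classes} for k in classes}
                for w in workers}
        for item, obs in by_item.items():
            for worker, label in obs:
                for k in classes:
                    conf[worker][k][label] += post[item][k]
        for w in workers:
            for k in classes:
                total = sum(conf[w][k].values())
                for l in classes:
                    conf[w][k][l] /= total

        # E-step: recompute each item's posterior over the true class.
        for item, obs in by_item.items():
            scores = {}
            for k in classes:
                s = prior[k]
                for worker, label in obs:
                    s *= conf[worker][k][label]
                scores[k] = s
            z = sum(scores.values())
            post[item] = {k: scores[k] / z for k in classes}

    consensus = {item: max(classes, key=lambda k: p[k]) for item, p in post.items()}
    return consensus, post


# Hypothetical data: workers a and b agree; worker c always answers the opposite.
data = [
    ("i1", "a", 1), ("i1", "b", 1), ("i1", "c", 0),
    ("i2", "a", 0), ("i2", "b", 0), ("i2", "c", 1),
    ("i3", "a", 1), ("i3", "b", 1), ("i3", "c", 0),
    ("i4", "a", 0), ("i4", "b", 0), ("i4", "c", 1),
]
consensus, posteriors = dawid_skene(data, classes=[0, 1])
print(consensus)
```

Unlike majority voting, the learned confusion matrix lets a consistently contrarian worker such as `c` still carry information, since an answer of 0 from `c` becomes evidence for class 1.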
  • 23. Results: Unsupervised Accuracy. Relative gain/loss vs. majority voting. [Figure: bar chart, axis from -15% to 15%; datasets BM, HCB, SpamCF, WVSCM, WB, RTE, TEMP, WSD, AC2, HC, and ALL; methods DS, ZC, RY, GLAD, CUBAM]
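The quantity plotted on slide 23 can be illustrated as follows. The slide does not spell out the formula, so this sketch assumes "relative gain/loss" means the accuracy difference divided by the majority-voting accuracy; the function names and the numbers in the example are hypothetical, not results from the benchmark.

```python
def accuracy(predicted, gold):
    """Fraction of gold-labelled items whose predicted label matches gold."""
    return sum(1 for item, g in gold.items() if predicted.get(item) == g) / len(gold)


def relative_gain(method_accuracy, baseline_accuracy):
    """Relative change of a method's accuracy with respect to a baseline.

    Assumed definition: (method - baseline) / baseline.
    """
    return (method_accuracy - baseline_accuracy) / baseline_accuracy


# Hypothetical accuracies: a method at 0.88 against majority voting at 0.80.
print(f"{relative_gain(0.88, 0.80):+.1%}")
```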
  • 24. Results: Varying Supervision
  • 25. Findings • Majority voting never best, but rarely much worse • No method performs far better than others • Each method often best for some condition – e.g., original dataset method was designed for • DS & RY tend to perform best (RY adds priors)
  • 26. Why Don’t We See Bigger Gains? • Of course contributions aren’t just empirical… • Maybe gold is too noisy to detect improvement? – Cormack & Kolcz’09, Klebanov & Beigman’10 • Might we see bigger differences from – Different tasks/scenarios? – Better benchmark tests? – Different methods or tuning? • We invite community contributions!
  • 27. Roadmap • What are Crowdsourcing & Human Computation? 4-16 • Benchmarking Statistical Consensus Methods 18-26 • Psychometrics & Crowds for Relevance Judging 28-35
  • 28. Multidimensional Relevance Modeling via Psychometrics and Crowdsourcing • Joint work with Yinglong Zhang, Jin Zhang, Jacek Gwizdka • Paper @ SIGIR 2014
  • 29. Background: Evaluating IR Systems • Classic Cranfield method (Cleverdon et al., 1966) – Given a document collection & set of queries – Judge documents for topical relevance to each query – Evaluate on these queries & documents • Problem: Scaling manual data labeling is difficult • Idea: try Crowdsourcing – Alonso et al. (SIGIR Forum 2008) – Grady & Lease, 2010 – TREC 2011-2013 Crowdsourcing Track
  • 30. But Problems are Deeper • User relevance > simple topical relevance – The Great Divide in IR: systems-centered vs. user-centered – What other factors to model, & what is their relative importance? Long history of studies, little consensus. – Dearth of labeled data for training/evaluating systems • Even trusted assessors disagree often on “simple” topical relevance judgments – Often attributed to subjectivity, but can we do better? • How do we ensure quality of subjective data? – Largely unstudied in HCOMP community to date
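The Cranfield-style loop on slide 29 (judge documents per query, then score systems against those judgments) can be sketched with two standard metrics. The slide does not name a metric, so precision at k and average precision are chosen here purely for illustration; the document IDs and judgments are hypothetical.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that were judged relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Hypothetical system output for one query, plus judged-relevant documents.
ranking = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(precision_at_k(ranking, relevant, 2), average_precision(ranking, relevant))
```

In a crowdsourced setting, the `relevant` set would come from aggregated worker judgments rather than from trusted assessors, so label quality feeds directly into these scores.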
  • 31. Psychology to the Rescue! • A Guide to Behavioral Experiments on Mechanical Turk – W. Mason and S. Suri (2010). SSRN online. • Crowdsourcing for Human Subjects Research – L. Schmidt (CrowdConf 2010) • Crowdsourcing Content Analysis for Behavioral Research: Insights from Mechanical Turk – Conley & Tosti-Kharas (2010). Academy of Management • Amazon's Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data? – M. Buhrmester et al. (2011). Perspectives… 6(1):3-5. – see also: Amazon Mechanical Turk Guide for Social Scientists
  • 33. Key Ideas from Psychometrics • Use standard survey techniques for collecting multi-dimensional relevance judgments – Ask repeated, similar questions, & change polarity • Analyze data via Structural Equation Modeling – cousin to graphical models in statistics/AI – Posit questions associated with latent factors – Use Exploratory Factor Analysis to determine factors & question associations, then prune questions – Use Confirmatory Factor Analysis to assess correlations, test significance, and compare models
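The "ask repeated, similar questions, & change polarity" idea on slide 33 implies a preprocessing step before any factor analysis: negatively worded items are reverse-scored so that all items for a dimension point the same way. The sketch below assumes a 1 to 5 Likert scale and hypothetical question IDs, neither of which is stated on the slide, and it covers only this alignment step, not the SEM or factor analysis itself.

```python
def reverse_score(value, low=1, high=5):
    """Map a response on a reversed-polarity item back onto the common scale."""
    return low + high - value


def dimension_score(responses, reversed_items, low=1, high=5):
    """Average one respondent's answers for a dimension after aligning polarity.

    responses: dict of question_id -> numeric answer (assumed format).
    reversed_items: set of question_ids that were worded with opposite polarity.
    """
    aligned = [
        reverse_score(v, low, high) if q in reversed_items else v
        for q, v in responses.items()
    ]
    return sum(aligned) / len(aligned)


# Hypothetical answers to three questions on one dimension; q2 is negatively worded.
answers = {"q1": 4, "q2": 2, "q3": 5}
print(dimension_score(answers, reversed_items={"q2"}))
```

A respondent who gives the same answer to both a positively and a negatively worded version of a question is likely not reading carefully, which is one way such repeated items support quality checks on subjective data.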
  • 35. Future Directions • Strong foundation for ongoing positivist research of alternative relevance factors – For different user groups, search scenarios, etc. – Need more data to support normative claims • Train/test operational systems for varying factors • Improve judging agreement by making task more natural and/or assessing impact of latent factors • Intra-subject vs. inter-subject aggregation? • SEM vs. graphical modeling? • Other methods for ensuring subjective data quality?
  • 36. The Future of Crowd Work, CSCW’13 Kittur, Nickerson, Bernstein, Gerber, Shaw, Zimmerman, Lease, and Horton
  • 39. A Few Moral Dilemmas • A “fair” price for online work in a global economy? – Is it better to pay nothing (i.e., volunteers, gamification) rather than pay something small for valuable work? • Are we obligated to inform people how their participation / work products will be used? – If my IRB doesn’t require me to obtain informed consent, is there some other moral obligation to do so? • A worker finds his ID posted in a researcher’s online source code and asks that it be removed. This can’t be done without recreating the repo, which many people use. What should be done?
  • 40. Ethical Crowdsourcing • Assume researchers have good intentions, and so issues of gross negligence are rare – Withholding promised pay after work performed – Not obtaining or complying with IRB oversight • Instead, a great challenge is how to recognize our impacts and take appropriate actions in a complex world – Educating ourselves takes time & effort – Failing to educate ourselves could cause harm to others • How can we strike a reasonable balance between complete apathy vs. being overly alarmist?
  • 41. • Contribute to society and human well-being • Avoid harm to others • Be honest and trustworthy • Be fair and take action not to discriminate • Respect the privacy of others COMPLIANCE WITH THE CODE. As an ACM member I will – Uphold and promote the principles of this Code – Treat violations of this code as inconsistent with membership in the ACM
  • 42. CS2008 Curriculum Update (ACM, IEEE) There is reasonably wide agreement that this topic of legal, social, professional and ethical should feature in all computing degrees. …financial and economic imperatives …Which approaches are less expensive and is this sensible? With the advent of outsourcing and off-shoring these matters become more complex and take on new dimensions …there are often related ethical issues concerning exploitation… Such matters ought to feature in courses on legal, ethical and professional practice. if ethical considerations are covered only in the standalone course and not “in context,” it will reinforce the false notion that technical processes are void of ethical issues. Thus it is important that several traditional courses include modules that analyze ethical considerations in the context of the technical subject matter … It would be explicitly against the spirit of the recommendations to have only a standalone course.
  • 43. “Contribute to society and human well-being; avoid harm to others” • Do we have a moral obligation to try to ascertain conditions under which work is performed? Or the impact we have upon those performing the work? • Do we feel differently when work is performed by – Political refugees? Children? Prisoners? Disabled? • How do we know who is doing the work, or if a decision to work (for a given price) is freely made? – Does it matter why someone accepts offered work?
  • 44. Some Notable Prior Research • Silberman, Irani, and Ross (2010) – “How should we… conceptualize the role of these people who we ask to power our computing?” – “abstraction hides detail” - some details may be worth keeping conspicuously present (Jessica Hullman) • Irani and Silberman (2013) – “…AMT helps employers see themselves as builders of innovative technologies, rather than employers unconcerned with working conditions.” – “…human computation currently relies on worker invisibility.” • Fort, Adda, and Cohen (2011) – “…opportunities for our community to deliberately value ethics above cost savings.”
  • 45. Power Asymmetry on MTurk • Mistakes happen, such as wrongly rejecting work – e.g., error by new student, software bug, poor instructions, noisy gold, etc. • How do we balance the harm caused by our mistakes to workers (our liability) vs. our cost/effort of preventing such mistakes?
  • 46. Task Decomposition By minimizing context, greater task efficiency & accuracy can often be achieved in practice – e.g. “Can you name who is in this photo?” • Much research on ways to streamline work and decompose complex tasks
  • 47. Context & Informed Consent • Assume we wish to obtain informed consent • Without context, consent cannot be informed – Zittrain, Ubiquitous human computing (2008)
  • 48. Consequences of Human Computation as a Panacea where AI Falls Short • The Googler who Looked at the Worst of the Internet • Policing the Web’s Lurid Precincts • Facebook content moderation • The dirty job of keeping Facebook clean • Even linguistic annotators report stress & nightmares from reading news articles!
  • 49. What about Freedom? • Crowdsourcing vision: empowering freedom – work whenever you want for whomever you want • Risk: people compelled to perform work – Chinese prisoners farming gold online – Digital sweat shops? Digital slaves? – We know relatively little today about work conditions – How might we monitor and mitigate risk/growth of crowd work inflicting harm to at-risk populations? – Traction? Human Trafficking at MSR Summit’12
  • 50. Robert Sim, MSR Summit’12
  • 51. Join the conversation! Crowdwork-ethics, by Six Silberman http://crowdwork-ethics.wtf.tw an informal, occasional blog for researchers interested in ethical issues in crowd work
  • 52. Additional References • Irani, Lilly C. The Ideological Work of Microwork. In preparation, draft available online. • Adda, Gilles, et al. Crowdsourcing for language resource development: Critical analysis of amazon mechanical turk overpowering use. Proceedings of the 5th Language and Technology Conference (LTC). 2011. • Adda, Gilles, and Joseph J. Mariani. Economic, Legal and Ethical analysis of Crowdsourcing for Speech Processing. (2013). • Harris, Christopher G., and Padmini Srinivasan. Crowdsourcing and Ethics. Security and Privacy in Social Networks. 67-83. 2013. • Harris, Christopher G. Dirty Deeds Done Dirt Cheap: A Darker Side to Crowdsourcing. IEEE 3rd conference on social computing (socialcom). 2011. • Horton, John J. The condition of the Turking class: Are online employers fair and honest?. Economics Letters 111.1 (2011): 10-12.
  • 53. Additional References (2) • Bederson, B. B., & Quinn, A. J. Web workers unite! addressing challenges of online laborers. In CHI 2011 Human Computation Workshop, 97-106. • Bederson, B. B., & Quinn, A. J. Participation in Human Computation. In CHI 2011 Human Computation Workshop. • Felstiner, Alek. Working the Crowd: Employment and Labor Law in the Crowdsourcing Industry. Berkeley J. Employment & Labor Law 32.1 2011 • Felstiner, Alek. Sweatshop or Paper Route?: Child Labor Laws and In-Game Work. CrowdConf (2010). • Larson, Martha. Toward Responsible and Sustainable Crowdsourcing. Blog post + Slides from Dagstuhl, September 2013. • Vili Lehdonvirta and Paul Mezier. Identity and Self-Organization in Unstructured Work. Unpublished working paper. 16 October 2013. • Zittrain, Jonathan. Minds for Sale. YouTube.