Mechanical Turk for Social Science
Sean Munson, Eytan Bakshy
School of Information, UMich
28 October 2009
11:00 am - Problem: Need to classify thousands of blogs according to category.
Lunch* (*not actual lunch)
1:00 pm - 50 blogs classified 5x each
Mechanical Turk for Social Science Awesome: An API made of people!
Sean Munson, Eytan Bakshy
Overview
  Who are the Turkers?
  Tasks suitable for Mechanical Turk, and workarounds for tasks that are semi-suitable
  Tasks from Turkers' and requesters' points of view
  Examples
    Classifying links
    Reacting to collections of links
  Practicalities
    Tools
    Paying Turkers at UMich
    Human subjects
Slides will be available online.
Who are the Turkers?
Andy Baio, Faces of Mechanical Turk
Survey of 300 Turkers by Panos Ipeirotis
Limited by self-selection issues (the sample is people willing to do a task with only one instance available, and at that pay).
By country: 76% US; 8% India; 3% UK; 2% Canada
Ideal types of tasks
  Short duration
  Repetitive: Turker learns once, repeats many times
  No particular expertise required
From the requester's perspective: human input is verifiable with less effort than it would take to do it yourself or to pay an expert. For example, for tasks that require people to write something, assess quality using multiple raters. But you can use it in other ways.
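One way to use multiple raters, as in the "50 blogs classified 5x each" example, is a simple majority vote per item. The sketch below is illustrative only: the blog IDs, the category labels, and the 60% agreement threshold are assumptions, not details from the talk.

```python
from collections import Counter


def majority_label(labels, min_agreement=0.6):
    """Return the most common label, or None when raters disagree too much.

    min_agreement is a hypothetical threshold: the share of raters that
    must agree before the label is accepted without manual review.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    if count / len(labels) >= min_agreement:
        return label
    return None


# Hypothetical ratings: each blog was classified by five Turkers.
ratings = {
    "blog-a": ["politics", "politics", "politics", "sports", "politics"],
    "blog-b": ["tech", "politics", "sports", "tech", "food"],
}

consensus = {blog: majority_label(votes) for blog, votes in ratings.items()}
```

Items that come back as None (here "blog-b") could be sent out for more ratings or checked by hand, which is still less effort than classifying everything yourself.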
Turker: Task listing (preview & select task) → Complete task → Get paid → Automatically accept another task of this type, or go find a new task
Requester: Create task type → Load task instances (prepay) → Approve or reject tasks
Image: Flickr, Michelle Gibson
Turkers as Classifiers
Large-scale study of diffusion and influence on Twitter
  How does the spread of a URL over the Twitter network depend on the content?
  What proportion of "influential" users are mass media vs. individuals?
Requires thousands of labels of URLs and users. Needs to be fast and cheap.
Turkers as Subjects
Turkers as Subjects: Challenges
  Hard to check answer quality when you want opinions!
  Screening & treatment randomization
  mTurk is not optimized for one-time (1x) tasks
How to screen? Liberal, Republican, Democrat, Conservative
Turker: Task listing (preview & select task) → Take qualification → Complete task → Get paid → Automatically accept another task of this type, or go find a new task
Requester: Create task type → Load task instances (prepay) → Approve or reject tasks
Qualification:
  Require 95% task approval rating
  Require US location
  Ask demographics, political preferences
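As a rough sketch, the three screening rules could be written as a list of qualification requirements. The dictionary shape below mirrors the `QualificationRequirements` parameter of the current MTurk API (as used by boto3's `create_hit`), which postdates this talk; the talk's own setup used the command line tools with XML input. The two system qualification IDs should be checked against Amazon's documentation, and the custom qualification ID is a hypothetical placeholder.

```python
# System qualification type IDs from Amazon's current MTurk documentation.
# Treat them as assumptions and verify them before use.
PERCENT_APPROVED_QUAL = "000000000000000000L0"  # percent of assignments approved
LOCALE_QUAL = "00000000000000000071"            # worker location


def screening_requirements(min_approval_pct=95, country="US", custom_qual_id=None):
    """Build the screening rules from the slide as plain dictionaries."""
    requirements = [
        {
            "QualificationTypeId": PERCENT_APPROVED_QUAL,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [min_approval_pct],
        },
        {
            "QualificationTypeId": LOCALE_QUAL,
            "Comparator": "EqualTo",
            "LocaleValues": [{"Country": country}],
        },
    ]
    if custom_qual_id is not None:
        # The demographics / political preferences questionnaire would be a
        # custom qualification; here we only require that the worker holds it.
        requirements.append(
            {"QualificationTypeId": custom_qual_id, "Comparator": "Exists"}
        )
    return requirements


# Placeholder ID for illustration only.
requirements = screening_requirements(custom_qual_id="HYPOTHETICAL_DEMOGRAPHICS_QUAL_ID")
```

Nothing here contacts Amazon; the list would be passed along when the task type is created.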
Turker: Task listing (preview & select task) → Take qualification → Complete task → Get paid → Automatically accept another task of this type, or go find a new task
Requester: Create or use existing qualification → Create task type → Load task instances (prepay) → Evaluate qualification: grant or reject → Approve or reject tasks
Checking for validity
  Couldn't ask verifiable questions (Kittur and Chi) about the collection without affecting how the subjects look at the list.
  Did have demographic info from the qualification.
  Randomly selected a qualification question to repeat; removed people for gender changes, aging backwards, or major changes in political preferences.
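The repeated-question check can be sketched as a comparison between the qualification answers and the one re-asked answer. Everything specific here is an assumption made for illustration: the field names, the 1-7 political scale, and the thresholds for what counts as "aging backwards" or a "major" change.

```python
def is_consistent(qualification, repeat, max_age_gap=1, max_politics_shift=1):
    """Compare qualification answers with a repeated question.

    qualification: dict of answers from the qualification test.
    repeat: dict holding only the one question that was asked again.
    The thresholds are hypothetical, not the values used in the study.
    """
    for field, new_value in repeat.items():
        old_value = qualification[field]
        if field == "gender" and new_value != old_value:
            return False  # gender changed
        if field == "age" and not (0 <= new_value - old_value <= max_age_gap):
            return False  # aged backwards, or jumped implausibly
        if field == "politics" and abs(new_value - old_value) > max_politics_shift:
            return False  # major change on an assumed 1-7 scale
    return True


# Hypothetical workers: (qualification answers, repeated answer).
workers = {
    "W1": ({"gender": "F", "age": 34, "politics": 2}, {"age": 34}),
    "W2": ({"gender": "M", "age": 41, "politics": 5}, {"age": 29}),
    "W3": ({"gender": "F", "age": 27, "politics": 2}, {"politics": 6}),
    "W4": ({"gender": "F", "age": 52, "politics": 4}, {"gender": "M"}),
}

kept = [wid for wid, (qual, rep) in workers.items() if is_consistent(qual, rep)]
```

Only W1 survives in this toy data; the other three show the three failure types named on the slide.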
Total cost: $382 for 485 collection ratings
Had to pay more (~$12/hr) because only one task was available at a time, plus a required (unpaid) qualification.
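The per-rating cost follows directly from the two totals. The second figure below, minutes per rating, is only a back-of-the-envelope estimate: it assumes the whole $382 went to worker pay at exactly $12/hr, which the slide does not state (it may, for example, include Amazon's fees).

```python
total_cost = 382.00     # dollars, from the slide
ratings = 485           # collection ratings, from the slide
hourly_target = 12.00   # approximate hourly rate, from the slide

cost_per_rating = total_cost / ratings
# Assumption: all of the cost was worker pay at the stated hourly rate.
implied_minutes = cost_per_rating / hourly_target * 60

print(f"${cost_per_rating:.2f} per rating, about {implied_minutes:.1f} minutes each")
# prints "$0.79 per rating, about 3.9 minutes each"
```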
Practicalities
Tools
Web interface: WYSIWYG editor, CSV upload of tasks. Many task templates to use as starting points. Very simple and fast to use, but limited in capability.
Command line tools: Required to create custom qualifications or use multiple quals. Much more flexibility. Input format is XML. Documentation is adequate; overall experience is clunky.
Other libraries (e.g. http://developer.amazonwebservices.com/connect/entry.jspa?externalID=827&categoryID=85)
3rd party tools: Almost as easy to use as Amazon's web interface, and they support nearly all features of the command line tools. But they take a cut.
  CrowdFlower, from Dolores Labs: crowdflower.com
  Smartsheet: smartsheet.com/product/smartsourcing
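For the web interface route, the task file is a plain CSV with one row per task instance and a header row whose column names match the placeholders in the task template. A minimal sketch of producing such a file, assuming a hypothetical template with a single `${blog_url}` placeholder and made-up example URLs:

```python
import csv
import io


def build_task_csv(urls, column="blog_url"):
    """Return CSV text with a header row and one task instance per URL.

    The column name is hypothetical; it must match whatever placeholder
    the task template actually uses.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()


csv_text = build_task_csv([
    "http://example.com/blog-one",
    "http://example.com/blog-two",
])
```

Writing `csv_text` to a file gives something that can be uploaded alongside the template. To get each blog classified 5x, the number of assignments per task is set on the task type rather than by repeating rows.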
Human subjects?
Human subjects status varies with design.
  Categorizing content: not human subjects
  Asking for reactions to content: human subjects
Informed consent
  My preference has been to argue for waiver of informed consent. (Mechanical Turk terms of service prohibit collection of identifiable information.)
  You can use qualifications if you have a task where you feel informed consent is appropriate, have extended consent information, and have repetitive tasks.
Subject payment
mTurk handles all payment, but associate your account with the University of Michigan employer ID number, in case any one person earns more than the IRS reporting limit from all Michigan mTurk studies. Stacy Callahan or I have more information.
Turker: Task listing (preview & select task) → Take qualification → Complete task → Get paid → Automatically accept another task of this type, or go find a new task
Requester: Create or use existing qualification → Create task type → Load task instances (prepay) → Evaluate qualification: grant or reject → Approve or reject tasks
Create or use existing qualification
  Built-in quals for location, reputation
  Can assign people to dummy qualifications to allow them to take follow-up studies, and you can email them through mTurk. Also can exclude people this way to maintain a fresh (previously unexposed) sample.
Scoring
  Download & score: good for participant screening, fast turnaround (run every minute), random assignment
  Can set limits on retaking
  Too many rejects? Revoke qualification.
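The random-assignment step of a download-and-score run could look something like the sketch below. The worker IDs, condition names, and seed are all hypothetical. It shuffles the workers who passed screening and deals them out round-robin so the conditions stay balanced.

```python
import random


def assign_conditions(worker_ids, conditions, seed=2009):
    """Randomly assign screened workers to conditions in balanced groups.

    Sorting first makes the result reproducible for a given seed,
    whatever order the workers were downloaded in.
    """
    rng = random.Random(seed)
    shuffled = sorted(worker_ids)
    rng.shuffle(shuffled)
    return {
        worker: conditions[index % len(conditions)]
        for index, worker in enumerate(shuffled)
    }


# Hypothetical workers who passed the screening qualification.
passed = ["W1", "W2", "W3", "W4", "W5", "W6"]
assignment = assign_conditions(passed, ["list_A", "list_B", "list_C"])
```

Each condition could then be recorded as the value of the qualification granted to that worker, or as a dummy qualification per condition, so that only the matching task is available to them.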
Some references & resources
General
  Dolores Labs blog: http://blog.doloreslabs.com/
  Turker Nation forums: http://turkers.proboards.com
  5 Study how-tos from Markus Jakobsson (PARC): http://blogs.parc.com/blog/2009/07/experimenting-on-mechanical-turk-5-how-tos/
  Turker Demographics Survey by Panos Ipeirotis: http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html
  Turker demographics vs. Internet demographics: http://behind-the-enemy-lines.blogspot.com/2009/03/turker-demographics-vs-internet.html
  Why do people participate: http://behind-the-enemy-lines.blogspot.com/2008/03/why-people-participate-on-mechanical.html
  Why do people participate (more): http://www.floozyspeak.com/blog/archives/2008/08/valley_of_the_t.html

Mechanical Turk for Social Science Introduction

  • 1. Mechanical Turk for Social Science Sean Munson EytanBakshy School of Information, UMich 28 October 2009
  • 2. 11:00 am - Problem:Need to classify thousands of blogs according to category.
  • 4.
  • 5. 1:00 pm 50 blogs classified5x each
  • 6. Mechanical Turk for Social Science Awesome Sean Munson EytanBakshy An API made of people!
  • 7. Overview Who are the Turkers? Tasks suitable for Mechanical Turk and workarounds for tasks that are semi-suitable Tasks from Turkers’ and requesters’ points of view Examples Classifying links Reacting to collections of links Practicalities Tools Paying Turkers at UMich Human Subjects Slides will be available online.
  • 8. Who are the Turkers?
  • 9. Andy Baio, Faces of Mechanical Turk
  • 10. Andy Baio, Faces of Mechanical Turk
  • 11. Andy Baio, Faces of Mechanical Turk
  • 12. 300 Turker Survey from PanosIpeirotis Limited by self-selection issues (people who do tasks w/ only one available, and at that pay). By country: 76% US; 8% India; 3% UK; 2% Canada
  • 13.
  • 14.
  • 15.
  • 16. Ideal types of tasks Short duration Repetitive – Turker learns once, repeats many No particular expertise required From requester perspective: Human input is verifiable with less effort than it would take to do it yourself or to pay an expert, e.g. tasks that require people to write something assess quality using multiple raters but you can use it in other ways.
  • 17. Turker’s view: Task listing (preview & select task) → Complete task → Get paid → Automatically accept another task of this type, or go find a new task.
  • 21. Turker’s view: Task listing (preview & select task) → Complete task → Get paid → Automatically accept another task of this type, or go find a new task. Requester’s view: Create task type → Load task instances (prepay). Photo: Flickr, Michelle Gibson.
  • 27. Turker’s view: Task listing (preview & select task) → Complete task → Get paid → Automatically accept another task of this type, or go find a new task. Requester’s view: Create task type → Load task instances (prepay) → Approve or reject tasks.
  • 29. Large-scale study of diffusion and influence on Twitter. How does the spread of a URL over the Twitter network depend on its content? What proportion of “influential” users are mass media vs. individuals? Requires thousands of labels of URLs and users. Needs to be fast and cheap.
  • 34. Turkers as subjects: challenges. Hard to check answer quality when you want opinions! Screening & treatment randomization. mTurk not optimized for 1x (one-time) tasks.
  • 36. Turker’s view: Task listing (preview & select task) → Complete task → Get paid → Automatically accept another task of this type, or go find a new task. Requester’s view: Create task type → Load task instances (prepay) → Approve or reject tasks.
  • 37. How to screen? Ideology: Liberal, Conservative. Party: Republican, Democrat.
  • 38. Turker’s view: Task listing (preview & select task) → Take qualification → Complete task → Get paid → Automatically accept another task of this type, or go find a new task. Requester’s view: Create task type → Load task instances (prepay) → Approve or reject tasks. Qualification: require 95% task approval rating; require US location; ask demographics and political preferences.
  • 39. Turker’s view: Task listing (preview & select task) → Take qualification → Complete task → Get paid → Automatically accept another task of this type, or go find a new task. Requester’s view: Create or use existing qualification → Create task type → Load task instances (prepay) → Evaluate qualification: grant or reject → Approve or reject tasks.
  • 40. Checking for validity. Couldn’t ask for verifiable information (Kittur and Chi) about the collection without affecting how the subjects look at the list. Did have demographic info from the qualification. Randomly selected a question to repeat, then removed people for gender changes, aging backwards, or major changes in political preferences.
  • 41. Total cost: $382 for 485 collection ratings (about $0.79 per rating). Had to pay more (~$12/hr) because only one task was available at a time, plus the required (unpaid) qualification.
  • 43. Tools. Web interface: WYSIWYG editor, CSV upload of tasks. Many task templates to use as starting points. Very simple and fast to use, but limited in capability. Command line tools: required to create custom qualifications or to use multiple quals. Much more flexibility. Input format is XML. Documentation is adequate; overall experience is clunky. Other libraries (e.g. http://developer.amazonwebservices.com/connect/entry.jspa?externalID=827&categoryID=85). 3rd party tools: almost as easy to use as Amazon’s web interface, and they support nearly all features of the command line tools, but they take a cut. CrowdFlower (from Dolores Labs): crowdflower.com. Smartsheet: smartsheet.com/product/smartsourcing
  • 44. Human subjects? Human subjects status varies with design. Categorizing content: not human subjects. Asking for reactions to content: human subjects. Informed consent: my preference has been to argue for a waiver of informed consent. (Mechanical Turk terms of service prohibit collection of identifiable information.) You can use qualifications if you have a task where you feel informed consent is appropriate, have extended consent information, and have repetitive tasks.
  • 45. Subject payment: mTurk handles all payment, but associate your account with the University of Michigan employer ID number, in case any one person earns more than the IRS reporting limit from all Michigan mTurk studies. Stacy Callahan or I have more information.
  • 48. Built-in quals for location, reputation. The requester can assign people to dummy qualifications to allow them to take follow-up studies, and you can email them through mTurk. You can also exclude people this way to maintain a previously unexposed sample.
  • 49. Some references & resources. General: Dolores Labs blog (http://blog.doloreslabs.com/); Turker Nation forums (http://turkers.proboards.com); 5 study how-tos from Markus Jakobsson, PARC (http://blogs.parc.com/blog/2009/07/experimenting-on-mechanical-turk-5-how-tos/). Turker demographics: survey by Panos Ipeirotis (http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html); Turker demographics vs. Internet demographics (http://behind-the-enemy-lines.blogspot.com/2009/03/turker-demographics-vs-internet.html); why do people participate (http://behind-the-enemy-lines.blogspot.com/2008/03/why-people-participate-on-mechanical.html); why do people participate, more (http://www.floozyspeak.com/blog/archives/2008/08/valley_of_the_t.html)
  • 50. Some references & resources. Improving answer quality: Aniket Kittur, Ed H. Chi, and Bongwon Suh (2008). “Crowdsourcing user studies with Mechanical Turk,” CHI 2008. Answer quality and dealing with bad answers: Carpenter, Bob (2008). Hierarchical Bayesian Models of Categorical Data. Raykar et al. (2009). Supervised Learning from Multiple Experts: Whom to Trust when Everyone Lies a Bit, ICML. Worker quality & HIT difficulty (http://behind-the-enemy-lines.blogspot.com/2008/08/mechanical-turk-worker-quality-and-hit.html). Also see the literature on scoring a test without an answer key.
  • 51. Some references & resources. Turker effort, skills, participation rate, and pay: W. Mason, D. Watts (2009). Financial Incentives and the Performance of Crowds. KDD Workshop on Human Computation. Self-report on skills (http://behind-the-enemy-lines.blogspot.com/2009/01/how-good-are-you-turker.html). Human subjects: consent in qualification tests (http://behind-the-enemy-lines.blogspot.com/2009/08/get-consent-form-for-irb-on-mturk-using.html); discussion (http://behind-the-enemy-lines.blogspot.com/2009/01/mechanical-turk-human-subjects-and-irbs.html)
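
Slides 5 and 16 describe having each item labeled by several Turkers ("50 blogs classified 5x each") and assessing quality using multiple raters. One simple way to combine such redundant labels is a majority vote with an agreement threshold. The sketch below is an illustration only, not the method used in the talk: the function name, the 0.6 threshold, and the sample data are all assumptions, and the references on slide 50 describe more principled models.

```python
from collections import Counter


def aggregate_labels(ratings, min_agreement=0.6):
    """Majority-vote the labels collected for each item.

    ratings: dict mapping item id -> list of labels from different raters.
    Returns a dict mapping item id -> (label or None, agreement), where the
    label is None when the top label's share of votes is below min_agreement.
    The threshold is an assumed value, not one taken from the slides.
    """
    results = {}
    for item, labels in ratings.items():
        top_label, votes = Counter(labels).most_common(1)[0]
        agreement = votes / len(labels)
        results[item] = (top_label if agreement >= min_agreement else None, agreement)
    return results


if __name__ == "__main__":
    # Made-up data: two blogs, each classified by five raters.
    sample = {
        "blog-a": ["politics", "politics", "politics", "tech", "politics"],
        "blog-b": ["tech", "sports", "politics", "tech", "sports"],
    }
    for item, (label, agreement) in aggregate_labels(sample).items():
        print(item, label, f"{agreement:.0%}")
```

Items that come back with no label are candidates for more ratings or for a manual check.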

Editor's Notes

  1. Tasks can be sorted by price or number of HITs available, among other things. To increase participation, you generally want to appear higher on at least one of these lists.
  2. For this study, we wanted Conservative Republicans and Liberal Democrats, not people with neutral views, Liberal Republicans, or Conservative Democrats.
  3. Restricting who can participate.
  4. If not automatically scored, the qualification introduces an even bigger delay in the process, and you’ll lose workers. But scoring it yourself allows a lot more control, and lets you retain Turker answer data.
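
Notes 2 and 4, together with slide 40, describe scoring the qualification yourself: admit only Conservative Republicans and Liberal Democrats, then repeat one randomly chosen qualification question later and drop workers whose answers are implausibly different. A minimal sketch of that logic follows. The field names, the 1 to 5 ideology scale, and the one-point tolerance for a "major change" are assumptions made for illustration; the slides do not specify them, and nothing here calls the Mechanical Turk API.

```python
import random

# Hypothetical qualification fields that can be re-asked later.
REPEATABLE = ("gender", "age", "ideology")


def screen(answers):
    """Return the study group for a worker, or None to reject the qualification.

    Assumes ideology is self-reported on a 1 (very liberal) to 5 (very
    conservative) scale and party is a lowercase string.
    """
    if answers["ideology"] <= 2 and answers["party"] == "democrat":
        return "liberal_democrat"
    if answers["ideology"] >= 4 and answers["party"] == "republican":
        return "conservative_republican"
    return None  # neutral views, Liberal Republicans, Conservative Democrats


def pick_repeat_question(rng=random):
    """Randomly choose which qualification question to ask again."""
    return rng.choice(REPEATABLE)


def is_consistent(answers, field, repeated, max_ideology_shift=1):
    """Compare a repeated answer with the original qualification answer."""
    original = answers[field]
    if field == "gender":
        return repeated == original  # gender changes are flagged
    if field == "age":
        return repeated >= original  # "aging backwards" is flagged
    if field == "ideology":
        return abs(repeated - original) <= max_ideology_shift
    raise ValueError(f"unknown field: {field}")


if __name__ == "__main__":
    worker = {"gender": "f", "age": 34, "ideology": 5, "party": "republican"}
    print(screen(worker))
    print(pick_repeat_question())
    print(is_consistent(worker, "age", 33))
```

Keeping the raw qualification answers, as note 4 suggests, is what makes the later consistency check possible.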