6. Mechanical Turk for Social Science Awesome
Sean Munson, Eytan Bakshy
"An API made of people!"
7. Overview
- Who are the Turkers?
- Tasks suitable for Mechanical Turk, and workarounds for tasks that are semi-suitable
- Tasks from Turkers' and requesters' points of view
- Examples: classifying links; reacting to collections of links
- Practicalities: tools; paying Turkers at UMich; human subjects
Slides will be available online.
12. 300-Turker survey from Panos Ipeirotis. Limited by self-selection issues (it reaches only people who chose to do that task, at that pay). By country: 76% US; 8% India; 3% UK; 2% Canada.
16. Ideal types of tasks
- Short duration
- Repetitive: the Turker learns once, repeats many times
- No particular expertise required
From the requester's perspective: human input is verifiable with less effort than doing the work yourself or paying an expert. E.g., for tasks that require people to write something, assess quality using multiple raters. But you can use Mechanical Turk in other ways too.
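A multi-rater quality check like the one above fits in a few lines. A sketch; the function name and the tie-breaking behavior are my own choices, not from the slides:

```python
from collections import Counter

def majority_label(labels):
    """Resolve several raters' labels for one item by majority vote.

    Returns the winning label and the fraction of raters who agreed.
    Ties break toward the label seen first (Counter preserves insertion
    order), which a real study might want to handle differently.
    """
    (label, count), = Counter(labels).most_common(1)
    return label, count / len(labels)
```

For example, `majority_label(["spam", "ham", "spam"])` returns `("spam", 2/3)`; items with low agreement can be flagged for more raters.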
17. The worker's loop: task listing (preview & select task) → complete task → get paid → automatically accept another task of this type, or go find a new task.
21. The requester's side, added to the worker's loop: create task type → load task instances (prepay). (Image: Flickr, Michelle Gibson)
27. Full lifecycle: the requester creates a task type and loads task instances (prepay); workers preview, select, and complete tasks; the requester approves or rejects the work; approved work gets paid; workers automatically accept another task of this type or go find a new one.
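The requester side of this lifecycle can be scripted against the MTurk API. A minimal sketch using the modern boto3 client (which postdates these slides); every parameter value below is an illustrative placeholder:

```python
def hit_params(title, reward_usd, n_assignments, question_xml):
    """Build keyword arguments for create_hit (pure, so it is testable
    without AWS credentials). Values are illustrative only."""
    return {
        "Title": title,
        "Description": title,               # a real task wants a fuller description
        "Reward": f"{reward_usd:.2f}",      # MTurk expects a string amount in USD
        "MaxAssignments": n_assignments,    # distinct workers per task instance
        "AssignmentDurationInSeconds": 600,
        "LifetimeInSeconds": 86400,
        "Question": question_xml,           # QuestionForm / ExternalQuestion XML
    }

def run_lifecycle(question_xml):
    import boto3  # deferred so hit_params stays usable without boto3 installed
    mturk = boto3.client("mturk")  # point endpoint_url at the sandbox to test
    hit = mturk.create_hit(**hit_params("Label a URL", 0.05, 3, question_xml))
    hit_id = hit["HIT"]["HITId"]
    # Later, once workers have submitted: approve (pay) or reject each assignment.
    for a in mturk.list_assignments_for_hit(HITId=hit_id)["Assignments"]:
        mturk.approve_assignment(AssignmentId=a["AssignmentId"])
```

Approval is what releases the prepaid funds to the worker, which is why the approve/reject step closes the loop.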
29. Large-scale study of diffusion and influence on Twitter. How does the spread of a URL over the Twitter network depend on its content? What proportion of "influential" users are mass media vs. individuals? Requires thousands of labels of URLs and users; needs to be fast and cheap.
34. Turkers as subjects: challenges
- Hard to check answer quality when you want opinions!
- Screening & treatment randomization
- mTurk is not optimized for one-time tasks
38. Adding a qualification step: workers take a qualification before they can select the task. Qualifications used here: require a 95% task approval rating; require US location; ask demographics and political preferences.
39. Requester side of qualifications: create a new qualification or use an existing one; evaluate qualification submissions and grant or reject them.
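Creating a qualification type can also be scripted. A sketch with boto3; the qualification name, description, and test duration are hypothetical placeholders:

```python
def qualification_spec(name, description, test_xml=None, duration_s=600):
    """Build keyword arguments for create_qualification_type (pure helper).

    With no test XML, workers request the qualification and you grant or
    reject by hand; with a test, they answer it before working. All values
    here are illustrative, not from the slides.
    """
    spec = {
        "Name": name,
        "Description": description,
        "QualificationTypeStatus": "Active",
    }
    if test_xml is not None:
        spec["Test"] = test_xml                 # QuestionForm XML
        spec["TestDurationInSeconds"] = duration_s
    return spec

def create_screening_qual(test_xml):
    import boto3  # deferred import; actually running this needs AWS credentials
    mturk = boto3.client("mturk")
    return mturk.create_qualification_type(
        **qualification_spec("Demographics screen",
                             "Short demographics survey", test_xml))
```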
40. Checking for validity. Couldn't ask for verifiable information (as Kittur and Chi suggest) about the collection without affecting how subjects look at the list. Did have demographic info from the qualification. Randomly selected a question to repeat; removed people for gender changes, aging backwards, or major changes in political preferences.
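The repeated-question consistency check is easy to automate. A sketch; the field names and the one-year age tolerance are my assumptions, not the study's actual rules:

```python
def consistent(first, repeat, max_age_gain=1):
    """Compare a worker's original qualification answers against answers
    to a randomly repeated question. Field names are hypothetical.
    Flags gender changes, aging backwards, implausible aging, and
    changes in political preference.
    """
    if repeat["gender"] != first["gender"]:
        return False
    age_change = repeat["age"] - first["age"]
    if age_change < 0 or age_change > max_age_gain:
        return False
    if repeat["politics"] != first["politics"]:
        return False
    return True
```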
41. Total cost: $382 for 485 collection ratings. Had to pay more (~$12/hr) because only one task was available at a time, plus the required (unpaid) qualification.
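The arithmetic behind the effective hourly rate is simple; the four-minutes-per-rating figure below is my illustration, not a number from the slides:

```python
def hourly_rate(pay_per_task_usd, seconds_per_task):
    """Effective hourly wage implied by a per-task payment."""
    return pay_per_task_usd * 3600 / seconds_per_task

# $382 / 485 ratings is about $0.79 per rating; if a rating takes
# roughly four minutes, that works out to about $11.85/hour.
```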
43. Tools
- Web interface: WYSIWYG editor, CSV upload of tasks. Many task templates to use as starting points. Very simple and fast to use, but limited in capability.
- Command-line tools: required to create custom qualifications or to use multiple quals. Much more flexibility. Input format is XML. Documentation is adequate; the overall experience is clunky.
- Other libraries (e.g. http://developer.amazonwebservices.com/connect/entry.jspa?externalID=827&categoryID=85)
- 3rd-party tools: almost as easy to use as Amazon's web interface and support nearly all features of the command-line tools, but they take a cut. CrowdFlower (from Dolores Labs): crowdflower.com. Smartsheet: smartsheet.com/product/smartsourcing
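For the web interface's CSV upload, the task file is easy to generate. A sketch using Python's csv module; the `url` column name is hypothetical and must match the `${url}` placeholder in your task template:

```python
import csv
import io

def tasks_csv(urls):
    """Serialize one task input per row for the MTurk bulk-upload CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["url"])        # header row names the template variable
    for url in urls:
        writer.writerow([url])
    return buf.getvalue()
```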
44. Human subjects? Human-subjects status varies with design. Categorizing content: not human subjects. Asking for reactions to content: human subjects. Informed consent: my preference has been to argue for a waiver of informed consent. (Mechanical Turk's terms of service prohibit collecting identifiable information.) If you have a task where you feel informed consent is appropriate, you can present extended consent information in a qualification; this works best for repetitive tasks.
45. Subject payment. mTurk handles all payment, but associate your account with the University of Michigan employer ID number, in case any one person earns more than the IRS reporting limit from all Michigan mTurk studies. Stacy Callahan or I have more information.
48. Built-in quals exist for location and reputation. Requesters can assign people to dummy qualifications to allow them to take follow-up studies, and you can email them through mTurk. You can also exclude people this way to maintain a virgin sample.
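Assigning a dummy qualification to past participants can be scripted. A sketch with boto3; the qualification type ID and the notification choice are placeholders:

```python
def needs_tag(worker_ids, already_tagged):
    """Workers not yet holding the dummy qualification (pure, testable)."""
    return sorted(set(worker_ids) - set(already_tagged))

def tag_participants(qual_type_id, worker_ids, already_tagged=()):
    import boto3  # deferred so needs_tag works without boto3 installed
    mturk = boto3.client("mturk")
    for worker_id in needs_tag(worker_ids, already_tagged):
        mturk.associate_qualification_with_worker(
            QualificationTypeId=qual_type_id,
            WorkerId=worker_id,
            IntegerValue=1,          # presence alone marks prior participation
            SendNotification=False)  # don't email workers just for bookkeeping
```

Requiring this qualification invites prior participants to a follow-up; requiring its absence keeps the sample virgin.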
49. Some references & resources
General
- Dolores Labs blog: http://blog.doloreslabs.com/
- Turker Nation forums: http://turkers.proboards.com
- 5 study how-tos from Markus Jakobsson (PARC): http://blogs.parc.com/blog/2009/07/experimenting-on-mechanical-turk-5-how-tos/
- Turker demographics survey by Panos Ipeirotis: http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html
- Turker demographics vs. Internet demographics: http://behind-the-enemy-lines.blogspot.com/2009/03/turker-demographics-vs-internet.html
- Why do people participate: http://behind-the-enemy-lines.blogspot.com/2008/03/why-people-participate-on-mechanical.html
- Why do people participate (more): http://www.floozyspeak.com/blog/archives/2008/08/valley_of_the_t.html
50. Some references & resources
Improving answer quality
- Aniket Kittur, Ed H. Chi, and Bongwon Suh (2008). "Crowdsourcing user studies with Mechanical Turk," CHI 2008.
Answer quality and dealing with bad answers
- Carpenter, Bob (2008). Hierarchical Bayesian Models of Categorical Data.
- Raykar et al. (2009). Supervised Learning from Multiple Experts: Whom to Trust when Everyone Lies a Bit, ICML.
- Worker quality & HIT difficulty: http://behind-the-enemy-lines.blogspot.com/2008/08/mechanical-turk-worker-quality-and-hit.html
- Also see the literature on scoring a test without an answer key
51. Some references & resources
Turker effort, skills, participation rate, and pay
- W. Mason, D. Watts (2009). Financial Incentives and the Performance of Crowds. KDD Workshop on Human Computation.
- Self-report on skills: http://behind-the-enemy-lines.blogspot.com/2009/01/how-good-are-you-turker.html
Human subjects
- Consent in qualification tests: http://behind-the-enemy-lines.blogspot.com/2009/08/get-consent-form-for-irb-on-mturk-using.html
- Discussion: http://behind-the-enemy-lines.blogspot.com/2009/01/mechanical-turk-human-subjects-and-irbs.html
Editor's Notes
Tasks can be sorted by price or number of HITs available, among other things. To increase participation, you generally want to appear higher on at least one of these lists.
For this study, we wanted Conservative Republicans and Liberal Democrats, not people with neutral views, Liberal Republicans, or Conservative Democrats.
Restricting who can participate.
If the qualification is not automatically scored, it introduces an even bigger delay in the process, and you'll lose workers. But scoring it yourself allows a lot more control, and lets you retain Turker answer data.
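Scoring submissions yourself, as this note suggests, is a small amount of code. A sketch; the answer format and the 80% pass mark are assumptions:

```python
def score_submission(answers, answer_key, pass_mark=0.8):
    """Grade one qualification submission against your own answer key.

    Both arguments map question ids to responses. Returns the fraction
    correct and whether to grant the qualification; keeping `answers`
    around is what lets you retain the Turker's data for later analysis.
    """
    correct = sum(1 for q, expected in answer_key.items()
                  if answers.get(q) == expected)
    fraction = correct / len(answer_key)
    return fraction, fraction >= pass_mark
```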