Getting Started with Mechanical Turk
Emily Tucker Prud’hommeaux
June 15, 2010
Outline
1. Overview of Mechanical Turk concept.
2. Creating and funding your account.
3. Using the GUI.
• Designing your tasks.
• Submitting your tasks.
• Reviewing and approving your results.
4. Getting fancy with the GUI: audio and video.
5. Using the command line tools.
6. Getting fancy with the command line: external pages.
Mechanical Turk, a.k.a. MTurk
What is Mechanical Turk?
• Then: A chess-playing “robot” -- actually a guy in a box.
• Now: A service run by Amazon.com that allows people
worldwide to do work or answer questions for you.
Mechanical Turk Terminology
• Requester: You, the person asking the
questions.
• Workers (or Turkers): The people answering
your questions.
• Human Intelligence Task (HIT): The question or
set of questions you want them to answer.
• Reward: How much you pay a Worker for a HIT.
MTurk vs. Traditional Methods
• Mechanical Turk: Many workers answer a few questions in a short period.
  Traditional methods: Few subjects answer lots of questions over a long period.
• Mechanical Turk: Not a lot of interaction -- may be hard to explain task.
  Traditional methods: Tons of interaction -- lots of opportunity to explain things.
• Mechanical Turk: Who are these people?!?
  Traditional methods: You know your subjects.
• Mechanical Turk: Very cheap, and you don’t have to pay if they do a bad job.
  Traditional methods: Not so cheap, and you have to pay the people anyway.
• Mechanical Turk: Quality control is tricky.
  Traditional methods: Quality control is not so hard.
• Mechanical Turk: Less opportunity for bias on the part of the experimenter.
  Traditional methods: More opportunity for bias.
Creating Your Accounts
1. Create an Amazon Mechanical Turk Requester
account. You need this to use Mechanical Turk.
https://requester.mturk.com/mturk/beginsignin
2. (Optional) Create an Amazon Web Services (AWS)
account. You need this to be able to use the command line
tools and possibly for some other things:
https://aws-portal.amazon.com/gp/aws/developer/registration/index.html
Funding Your Account
Creating a HIT
1. Click the Design tab.
Select a Template
2. Select a HIT template. Let’s try Data Collection.
Enter Properties
• Don’t give people too much time.
• Other criteria can be helpful (e.g., must live in US). Amazon
displays your HIT only to the people who meet the criteria.
• Reward: usually just a few cents, unless the HIT is really long.
• Be brief but descriptive.
Design Layout
Click here to edit the HTML.
Ah, much better!
Design Layout
Input data variables. You’ll
upload a CSV file containing their
values. Format them this way and
MTurk will interpret them for you.
This is how worker responses
get stored, just like a regular old
HTML form, which you already
know all about.
Hint: If you want some specific type of HTML form input (e.g., radio
buttons, drop down menu, checkbox), look at the Blank Template template.
Preview and Finish
Recall: we will upload a CSV file
to fill in these blanks for each HIT.
Publishing Your HIT
Create and Upload CSV File
You create the CSV file on your computer and upload it here. It will look
something like this for this example.
name,address,phone
Bread and Ink,3600 SE Hawthorne,503-555-1212
Three Doors Down,1415 SE 38th,503-555-1213
Cha cha cha!,3375 SE Hawthorne,503-555-1214
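If your data already lives in a program rather than a spreadsheet, you can generate the input file instead of typing it. Here is a minimal Python sketch, assuming the three column names from the example above; the output filename `restaurants.csv` is made up for illustration. The header names have to match the variables used in the HIT template.

```python
import csv

# Column names must match the ${...} variables in the HIT template.
FIELDS = ["name", "address", "phone"]

# The three example rows from the slide.
ROWS = [
    {"name": "Bread and Ink", "address": "3600 SE Hawthorne", "phone": "503-555-1212"},
    {"name": "Three Doors Down", "address": "1415 SE 38th", "phone": "503-555-1213"},
    {"name": "Cha cha cha!", "address": "3375 SE Hawthorne", "phone": "503-555-1214"},
]

def write_input_csv(path, rows, fields=FIELDS):
    """Write one header line plus one line per HIT."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)

if __name__ == "__main__":
    # Hypothetical filename; upload the resulting file in the GUI.
    write_input_csv("restaurants.csv", ROWS)
```

Using the csv module rather than joining strings by hand means a value that happens to contain a comma gets quoted properly.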
Preview your HIT
The ${name}, ${phone}, and
${address} variables got
filled with the values from
your CSV file.
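You can imitate this substitution on your own computer before uploading anything. Python's `string.Template` happens to use the same `${name}` placeholder syntax, so the sketch below fills a template fragment from one CSV row. The HTML fragment is invented for illustration, and this only mimics the effect of the preview; it is not how MTurk itself does the substitution.

```python
from string import Template

# A made-up fragment of template HTML using the same ${...} variables.
TEMPLATE = Template(
    "<p>Restaurant: ${name}<br>Address: ${address}<br>Phone: ${phone}</p>"
)

def preview(row):
    """Fill the placeholders with one CSV row (a dict of column -> value).

    Raises KeyError if the template uses a variable the row does not have,
    which is a cheap way to catch a misspelled column name.
    """
    return TEMPLATE.substitute(row)

row = {"name": "Bread and Ink", "address": "3600 SE Hawthorne", "phone": "503-555-1212"}
print(preview(row))
```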
Confirm and Publish
Manage HITs and Results
Review and Download Results
Approve or reject that worker’s work.
Download results to your computer.
You get to process your
results file however you like --
open it in Excel or write a
program to make it look nice.
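As one example of "write a program", here is a small Python sketch that pulls the worker ID and the answers out of a results file. It assumes the file is a CSV with a header row, that there is a `WorkerId` column, and that the answer columns start with `Answer.`; check the header of your own download, since these names are assumptions here. The sample data is made up.

```python
import csv
import io

def summarize(results_file):
    """Return one dict per row: the worker ID plus every answer column."""
    rows = []
    for record in csv.DictReader(results_file):
        # Keep only the answer columns, dropping the assumed "Answer." prefix.
        answers = {
            key[len("Answer."):]: value
            for key, value in record.items()
            if key and key.startswith("Answer.")
        }
        rows.append({"worker": record.get("WorkerId", ""), **answers})
    return rows

# Tiny made-up sample standing in for a downloaded results file.
SAMPLE = io.StringIO(
    "HITId,WorkerId,Input.name,Answer.rating\n"
    "H1,W1,Bread and Ink,5\n"
    "H1,W2,Bread and Ink,4\n"
)

for row in summarize(SAMPLE):
    print(row)
```

With a real download, replace `SAMPLE` with `open("your_results.csv", newline="", encoding="utf-8")`.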
Including Audio without Flash
• For audio, you can convert your wavs to mp3, put them on
the web, have the links to the mp3s be your variables in the
CSV file, then force the links to open in a new window.
• If you want something more reliable, embed the audio in a
Flash player, which I am about to describe.
• If you need more control (e.g., you want to prevent the
worker from listening to the audio more than once), you
might need to use something fancier like Javascript.
CSV file:
audiofile1,audiofile2
http://etucker.com/a1.mp3,http://etucker.com/a2.mp3
Template HTML:
<a target="_blank" href="${audiofile1}">Audio1</a>
Including Audio with Flash
• If you don’t want the audio to open in a new window,
embed the audio in a Flash player.
• I use the Google audio Flash player, which works well and
has nice controls.
• The html will look something like this:
<embed src="http://www.google.com/reader/ui/3523697345-audio-player.swf"
flashvars="audioUrl=${audiofile}" width="400" height="27" quality="best"
type="application/x-shockwave-flash"></embed>
• The input file will look something like this:
audiofile
http://www.csee.ogi.edu/mechturk/audio1.mp3
http://www.csee.ogi.edu/mechturk/audio2.mp3
http://www.csee.ogi.edu/mechturk/audio3.mp3
http://www.csee.ogi.edu/mechturk/audio4.mp3
Including Video
• For videos, I have been using Flash.
• Flash works reliably in all browsers (when it doesn’t crash
them or take up the whole CPU) and everyone has it.
• If a lot of Workers start using iPads, this might not be a
good solution.
• It’s all super easy, so why am I presenting this?
• Because it took me so long to find the best tools and figure
out the best way to do the HTML so that it would work in
MTurk and in all browsers.
Video with Flash: Preparation
1. Convert your videos to .flv format. I have used FLVCrunch:
http://download.cnet.com/FLV-Crunch/3000-2194_4-10909295.html
2. Get a Flash player. I have used the free JW Player:
http://www.longtailvideo.com/players/jw-flv-player
3. Put both the player components (as described in the JW
Player instructions) and your .flv videos on the internet
somewhere. Sean created this directory for me on the
csee.ogi.edu servers:
/vol0/projects/www/CSE/public_html_noredirect/mech
which you can access on the web with this URL:
http://www.csee.ogi.edu/mech
Video with Flash: MTurk Part
4. Include your videos as variables in your CSV file like this (each row is a single line):
video1,video2
http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo1.flv,http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo2.flv
5. In the template for your HIT, include a line like this for each
video you want to include in that HIT:
<embed height="300" width="300" src="${video1}" name="player1" id="player1"></embed>
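Those long URLs are easy to mistype, so it can help to generate the CSV. Here is a minimal Python sketch, assuming the directory layout shown above (player at `/mech/player.swf`, videos under `/mech/video/`); the function names are made up for illustration.

```python
import csv
import io

BASE = "http://www.csee.ogi.edu/mech"   # web directory from the slides
PLAYER = BASE + "/player.swf"

def player_url(video_name):
    """URL that loads the player and points it at one .flv file."""
    return f"{PLAYER}?file={BASE}/video/{video_name}"

def video_csv(pairs):
    """Build the CSV text: one line per HIT, two videos per HIT."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["video1", "video2"])
    for first, second in pairs:
        writer.writerow([player_url(first), player_url(second)])
    return out.getvalue()

print(video_csv([("myawesomevideo1.flv", "myawesomevideo2.flv")]))
```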
Command Line Tools: Why?
Instead of using the GUI to set up your MTurk experiment,
you can use command line tools.
Advantages:
1. Approval/rejection process is easier when you have lots
of data from lots of workers.
2. More power to manage workers: block workers, set
qualifications for workers.
3. Possible to change properties for HIT already in progress.
4. Can use the sandbox to try out your experiments.
5. With external pages, much more flexibility in what
kind of web stuff you can do, like Javascript.
Command Line Tools: Basics
1. Download and install command line tools from here:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=694
2. Sign up for an AWS account, if you didn’t before:
https://aws-portal.amazon.com/gp/aws/developer/registration/index.html
3. Associate your installation with your AWS identifiers.
a) Find your identifiers:
http://s3.amazonaws.com/mturk/tools/pages/aws-access-identifiers/aws-identifier.html
b) Enter those identifiers in the bin/mturk.properties file:
access_key=[Your AWS Access Key]
secret_key=[Your Secret Key]
Command Line Tools: Documentation
There is some good documentation for the Mechanical Turk
command line tools:
1. The UserGuide.html that comes with the tools: definitely
use it to get started with everything.
2. The samples directory:
• Anything you’d like to do with the command line tools is
pretty easy to figure out just by copying the samples...
• ...except setting up an external page, which is poorly
documented, which is why that is our next topic.
External Pages
• Get started using the samples/external_page directory
in your command line tools installation.
-rw-r--r-- 1 emtucker emtucker 119 Apr 24 2008 external_hit.input
-rw-r--r-- 1 emtucker emtucker 619 Apr 24 2008 external_hit.properties
-rw-r--r-- 1 emtucker emtucker 621 Feb 8 22:59 external_hit.question
-rw-r--r-- 1 emtucker emtucker 2218 Apr 24 2008 externalpage.htm
-rwxr-xr-x 1 emtucker emtucker 667 Apr 24 2008 approveAndDeleteResults.sh
-rwxr-xr-x 1 emtucker emtucker 705 Apr 24 2008 getResults.sh
-rwxr-xr-x 1 emtucker emtucker 671 Apr 24 2008 reviewResults.sh
-rwxr-xr-x 1 emtucker emtucker 799 Apr 24 2008 run.sh
• external_hit.input: This is like the input file you used with the GUI, but tab separated instead of comma separated.
• external_hit.properties: Title, description, reward, qualifications, time allotted, what your input variables are called.
• external_hit.question: Link to external page plus how to get your input variables into your page. More on this shortly.
• externalpage.htm: The external web page itself. More on this shortly.
• *.sh: All of the pre-written scripts for submitting your HITs, downloading the results, and approving/rejecting the work.
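If you already have a comma-separated file from the GUI workflow, converting it to the tab-separated form is a few lines of Python. This is only a sketch: the column names `id1` and `sent1` are borrowed from the external page example, the sample row is made up, and whether the command line tools accept quoted fields is not something checked here, so keep tabs and quotes out of your values.

```python
import csv
import io

def csv_to_tab_separated(csv_text):
    """Convert comma-separated text to tab-separated text, header included."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Made-up sample: one header line and one HIT.
print(csv_to_tab_separated("id1,sent1\n1,The cat sat.\n"))
```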
Data Files
external_hit.input
external_hit.properties
external_hit.question
external_hit.question
http://www.csee.ogi.edu/page.html?id1=${helper.urlencode($id1)}&amp;sent1=${helper.urlencode($sent1)}
The URL to your
external page, wherever
you decide to put it.
The helper.urlencode bit is
how MTurk puts the
values of your input
variables (which it gets
from the .input file) into
the URL for the page for
each HIT.
Then, in your external web
page, you’ll use Javascript (or
something else of your choice)
to read these items out of the
URL in order to use them in
your page where you need
them.
MTurk also automatically inserts the AssignmentID
variable into the URL. That is, if a worker accepts the
HIT, a unique Assignment ID will be created and included
in the URL. You will have to use that information when
you post the results to MTurk in your external page.
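To see what a worker's URL ends up looking like, here is a Python sketch that does the same kind of substitution by hand. It uses `urllib.parse.quote_plus` as a stand-in for `helper.urlencode`; the exact encoding MTurk produces may differ in details (for example `%20` versus `+` for spaces), and the sample values are made up. Note that the `&amp;` in the .question file is just the escaped form of `&`; the URL the browser sees contains a plain `&`.

```python
from urllib.parse import quote_plus

PAGE = "http://www.csee.ogi.edu/page.html"  # URL from the example above

def hit_url(id1, sent1):
    """Build the page URL for one line of the .input file."""
    return f"{PAGE}?id1={quote_plus(id1)}&sent1={quote_plus(sent1)}"

print(hit_url("7", "Where is the dog?"))
# http://www.csee.ogi.edu/page.html?id1=7&sent1=Where+is+the+dog%3F
```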
The External Page
Needs to have a few important things:
• Javascript (or other) code for extracting the values of your
input variables out of the URL.
• Javascript (or other) code for accessing the Assignment ID
and for posting the worker’s responses to MTurk.
This is all included in the externalpage.htm file in the
samples/external_page directory of the command line tools
installation.
The example external page is very helpful, but poorly
commented.
External Web Page:
Javascript code for extracting
URL parameters.
External Web Page:
Javascript code for using
extracted URL parameters.
This part is very important! The worker must accept the HIT before being able to
complete it. Be sure to include this check (or something like it) in your external page.
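The logic of those two pieces, sketched in Python for illustration only (the real page does this in Javascript, as in the sample externalpage.htm): read the parameters out of the URL, then check whether the worker has actually accepted the HIT. The parameter name `assignmentId` and the placeholder value `ASSIGNMENT_ID_NOT_AVAILABLE` used during preview are stated here as assumptions; confirm them against the sample file before relying on them.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed placeholder that appears while a worker is only previewing.
PREVIEW_ID = "ASSIGNMENT_ID_NOT_AVAILABLE"

def read_params(url):
    """Return the query parameters of a URL as a plain dict (first value wins)."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}

def accepted(params):
    """True only if the worker has accepted the HIT (not just previewing)."""
    return params.get("assignmentId", PREVIEW_ID) != PREVIEW_ID

url = ("http://www.csee.ogi.edu/page.html"
       "?id1=7&sent1=Where+is+the+dog%3F&assignmentId=" + PREVIEW_ID)
params = read_params(url)
print(params["sent1"], accepted(params))
```

In the page itself, the equivalent of `accepted` returning false is the cue to disable the submit button and tell the worker to accept the HIT first.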
Command Line Tools: Sandbox
• Good idea to try out your experiments in the sandbox.
• Sandbox lets you see exactly how your HIT will look to
potential workers.
1. In your bin/mturk.properties file, comment out this line:
#service_url=http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester
and uncomment this line:
service_url=http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester
2. In your external html page, replace references to
http://www.mturk.com/mturk/externalSubmit
with
http://workersandbox.mturk.com/mturk/externalSubmit
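If you switch back and forth a lot, step 1 can be scripted. Here is a Python sketch that takes the text of a properties file and returns a version with the production `service_url` line commented out and the sandbox one enabled. It assumes the file contains both lines in the form shown above and tells them apart by the word "sandbox"; the function name is made up.

```python
def use_sandbox(properties_text):
    """Comment out the production service_url line and enable the sandbox one."""
    result = []
    for line in properties_text.splitlines():
        # Strip any leading '#' so commented and uncommented lines look alike.
        bare = line.strip().lstrip("#").strip()
        if bare.startswith("service_url="):
            line = bare if "sandbox" in bare else "#" + bare
        result.append(line)
    return "\n".join(result) + "\n"

SAMPLE = (
    "service_url=http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester\n"
    "#service_url=http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester\n"
)
print(use_sandbox(SAMPLE))
```

Remember that step 2, the externalSubmit URL in your HTML page, still has to be changed separately.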
Lots of Other Topics
• Using command line tools to interact more closely with
workers, design ways of determining who is a good worker
and recruiting those workers, banning specific workers.
• Using the Amazon Mechanical Turk SDK.
• Practical concerns: What kinds of projects can you do with
Mechanical Turk? Are some projects better carried out with
traditional methods?
• How much money do we save using Mechanical Turk?
Sometimes it might be cheaper and easier to use a few
carefully chosen local workers, or even people currently
employed at OGI.

Mais conteúdo relacionado

Último

办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书rnrncn29
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一Fs
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
Intellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptxIntellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptxBipin Adhikari
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
Q4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxQ4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxeditsforyah
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一Fs
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Sonam Pathan
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一Fs
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012rehmti665
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Paul Calvano
 

Último (20)

办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
Intellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptxIntellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptx
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
 
Q4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxQ4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptx
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
 
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 Documentation
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predi
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24
 

Destaque

PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...DevGAMM Conference
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationErica Santiago
 

Destaque (20)

PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 

Getting Started with Mechanical Turk

  • 1. Getting Started with Mechanical Turk Emily Tucker Prud’hommeaux June 15, 2010
  • 2. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Designing your tasks. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools: 6. Getting fancy with the command line: external pages.
  • 3. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Designing your tasks. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  • 4. Mechanical Turk, a.k.a Mturk What is Mechanical Turk? • Then: A chess-playing “robot” -- actually a guy in a box. • Now: A service run by Amazon.com that allows people worldwide to do work or answer questions for you.
  • 5. Mechanical Turk Terminology • Requester: You, the person asking the questions. • Workers (or Turkers): The people answering your questions. • Human Intelligence Task (HIT): The question or set of questions you want them to answer. • Reward: How much you pay a Worker for a HIT.
  • 6. MTurk vs. Traditional Methods Mechanical Turk Traditional Methods Many workers answer a few questions in a short period. Few subjects answer lots of questions over a long period. Not a lot of interaction -- may be hard to explain task. Tons of interaction -- lots of opportunity to explain things. Who are these people?!? You know your subjects. Very cheap, and you don’t have to pay if they do a bad job. Not so cheap, and you have to pay the people anyway. Quality control is tricky. Quality control is not so hard. Less opportunity for bias on the part of the experimenter. More opportunity for bias.
  • 7. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Designing your tasks. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  • 8. Creating Your Accounts 1. Create an Amazon Mechanical Turk Requester account. You need this to use Mechanical Turk. https://requester.mturk.com/mturk/beginsignin 2. (Optional) Create an Amazon Web Services (AWS) account. You need this to be able use the command line tools and possibly for some other things: https://aws-portal.amazon.com/gp/aws/developer/registration/index.html
  • 11. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Setting up your first experiment. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  • 12. Creating a HIT 1. Click the Design tab
  • 13. Select a Template Letʼs try Data Collection 2. Select a HIT template.
  • 14. Enter Properties Don’t give people too much time. Other criteria can be helpful (e.g., must live in US). Amazon displays your HIT only to the people who meet the criteria. Reward: usually just a few cents, unless it’s really long. Be brief but descriptive.
  • 15. Design Layout Click here to edit the HTML. Ah, much better!
  • 16. Design Layout Input data variables. You’ll upload a CSV file containing their values. Format them this way and MTurk will interpret them for you. This is how worker responses get stored, just like a regular old HTML form, which you already know all about. Hint: If you want some specific type of HTML form input (e.g., radio buttons, drop down menu, checkbox), look at the Blank Template template.
  • 17. Preview and Finish Recall: we will upload a CSV file to fill in these blanks for each HIT.
  • 19. Create and Upload CSV File You create the CSV file on your computer and upload it here. For this example, it will look something like this:
    name,address,phone
    Bread and Ink,3600 SE Hawthorne,503-555-1212
    Three Doors Down,1415 SE 38th,503-555-1213
    Cha cha cha!,3375 SE Hawthorne,503-555-1214
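The CSV file from this slide can also be generated by a script instead of typed by hand, which helps once you have more than a handful of HITs. A minimal Python sketch, assuming the three restaurant rows from the slide; the function name `build_csv` and the filename `restaurants.csv` are made up for illustration:

```python
import csv
import io

# Rows from the slide; the header names become the ${...} variables in the template.
HEADER = ["name", "address", "phone"]
ROWS = [
    ["Bread and Ink", "3600 SE Hawthorne", "503-555-1212"],
    ["Three Doors Down", "1415 SE 38th", "503-555-1213"],
    ["Cha cha cha!", "3375 SE Hawthorne", "503-555-1214"],
]

def build_csv(header, rows):
    """Return CSV text with one header line and one line per HIT."""
    buffer = io.StringIO()
    # csv.writer quotes any field that itself contains a comma.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = build_csv(HEADER, ROWS)
print(csv_text)

# To save it for upload (hypothetical filename):
# with open("restaurants.csv", "w", newline="") as f:
#     f.write(csv_text)
```

Using the `csv` module rather than joining strings with commas matters as soon as a value (say, an address) contains a comma of its own.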
  • 20. Preview your HIT The ${name}, ${phone}, and ${address} variables got filled with the values from your CSV file.
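To check locally how the blanks get filled before uploading anything, you can mimic the substitution yourself. This is only a sketch: Python's `string.Template` happens to use the same `${name}` syntax, but it is not MTurk's own implementation, and the HTML fragment below is a made-up stand-in for a real template.

```python
from string import Template

# Hypothetical fragment of a HIT template; only the ${...} names come from the slides.
hit_template = Template(
    "<p>Find the website for ${name}, ${address} (phone: ${phone}).</p>"
)

# One row of the CSV file, keyed by the header names.
row = {
    "name": "Bread and Ink",
    "address": "3600 SE Hawthorne",
    "phone": "503-555-1212",
}

# substitute() raises KeyError if the row lacks a variable the template uses,
# which catches a misspelled column name before you upload.
filled = hit_template.substitute(row)
print(filled)
```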
  • 22. Manage HITs and Results
  • 23. Review and Download Results Approve or reject that worker’s work. Download results to your computer. You get to process your results file however you like -- open it in Excel or write a program to make it look nice.
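As an example of "write a program to make it look nice", here is a minimal Python sketch that groups worker answers by input item. The sample data is made up, and the column names (`Input.name`, `Answer.website`) are an assumption about the results file layout; check the header row of your own download and adjust.

```python
import csv
import io

# Made-up sample standing in for a downloaded results file.
# The column names are an assumption; check the header of your own download.
SAMPLE = """HITId,WorkerId,Input.name,Answer.website
H1,W1,Bread and Ink,http://example.com/a
H1,W2,Bread and Ink,http://example.com/b
H2,W1,Three Doors Down,http://example.com/c
"""

def answers_by_item(results_file, input_col, answer_col):
    """Map each input value to the list of answers workers gave for it."""
    grouped = {}
    for row in csv.DictReader(results_file):
        grouped.setdefault(row[input_col], []).append(row[answer_col])
    return grouped

grouped = answers_by_item(io.StringIO(SAMPLE), "Input.name", "Answer.website")
for item, answers in grouped.items():
    print(item, "->", ", ".join(answers))
```

For a real file, pass `open("results.csv", newline="")` (a hypothetical filename) in place of the `io.StringIO` sample.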
  • 24. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Designing your tasks. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  • 25. Including Audio without Flash • For audio, you can convert your wavs to mp3, put them on the web, have the links to the mp3s be your variables in the CSV file, then force the links to open in a new window. • If you want something more reliable, embed the audio in a Flash player, which I am about to describe. • If you need more control (e.g., you want to prevent the worker from listening to the audio more than once), you might need to use something fancier like Javascript. CSV file: audiofile1,audiofile2 http://etucker.com/a1.mp3,http://etucker.com/a2.mp3 Template HTML: <a target="_blank" href="${audiofile1}">Audio1</a>
  • 26. Including Audio with Flash • If you donʼt want the audio to open in a new window, embed the audio in a Flash player. • I use the Google audio Flash player, which works well and has nice controls. • The html will look something like this: <embed src="http://www.google.com/reader/ui/3523697345-audio-player.swf" flashvars="audioUrl=${audiofile}" width="400" height="27" quality="best" type="application/x-shockwave-flash"></embed> • The input file will look something like this: audiofile http://www.csee.ogi.edu/mechturk/audio1.mp3 http://www.csee.ogi.edu/mechturk/audio2.mp3 http://www.csee.ogi.edu/mechturk/audio3.mp3 http://www.csee.ogi.edu/mechturk/audio4.mp3
  • 27. Including Video • For videos, I have been using Flash. • Flash works reliably in all browsers (when it doesnʼt crash them or take up the whole CPU) and everyone has it. • If a lot of Workers start using iPads, this might not be a good solution. • Itʼs all super easy, so why am I presenting this? • Because it took me so long to find the best tools and figure out the best way to do the HTML so that it would work in MTurk and in all browsers.
  • 28. Video with Flash: Preparation 1. Convert your videos to .flv format. I have used FLVCrunch: http://download.cnet.com/FLV-Crunch/3000-2194_4-10909295.html 2. Get a Flash player. I have used the free JW Player: http://www.longtailvideo.com/players/jw-flv-player 3. Put both the player components (as described in the JW Player instructions) and your .flv videos on the internet somewhere. Sean created this directory for me on the csee.ogi.edu servers: /vol0/projects/www/CSE/public_html_noredirect/mech which you can access on the web with this URL: http://www.csee.ogi.edu/mech
  • 29. Video with Flash: MTurk Part 4. Include your videos as variables in your CSV file like this: video1,video2 http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo1.flv,http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo2.flv 5. In the template for your hit, include a line like this for each video you want to include in that hit: <embed height="300" width="300" src="${video1}" name="player1" id="player1"></embed>
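Those long player URLs are tedious to type and easy to get wrong, so it can help to generate the CSV. A minimal Python sketch, assuming the player and video locations from the slide; the function names `player_link` and `build_video_csv` are made up for illustration:

```python
import csv
import io

PLAYER_URL = "http://www.csee.ogi.edu/mech/player.swf"  # from the slide
VIDEO_BASE = "http://www.csee.ogi.edu/mech/video/"      # from the slide

def player_link(video_filename):
    """Return the player URL with the video passed as its file= parameter."""
    return f"{PLAYER_URL}?file={VIDEO_BASE}{video_filename}"

def build_video_csv(pairs):
    """pairs: list of (first_video, second_video) filenames, one pair per HIT."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["video1", "video2"])  # must match ${video1}, ${video2}
    for first, second in pairs:
        writer.writerow([player_link(first), player_link(second)])
    return buffer.getvalue()

video_csv = build_video_csv([("myawesomevideo1.flv", "myawesomevideo2.flv")])
print(video_csv)
```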
  • 30. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Setting up your first experiment. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  • 31. Command Line Tools: Why? Instead of using the GUI to set up your MTurk experiment, you can use command line tools. Advantages: 1. Approval/rejection process is easier when you have lots of data from lots of workers. 2. More power to manage workers: block workers, set qualifications for workers. 3. Possible to change properties for a HIT already in progress. 4. Can use the sandbox to try out your experiments. 5. With external pages, much more flexibility in what kind of web stuff you can do, like Javascript.
  • 32. Command Line Tools: Basics 1. Download and install command line tools from here: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=694 2. Sign up for an AWS account, if you didnʼt before: https://aws-portal.amazon.com/gp/aws/developer/registration/index.html 3. Associate your installation with your AWS identifiers a) Find your identifiers: http://s3.amazonaws.com/mturk/tools/pages/aws-access-identifiers/aws-identifier.html b) Enter those identifiers in the bin/mturk.properties file: access_key=[Your AWS Access Key] secret_key=[Your Secret Key]
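A quick sanity check for step 3b is to confirm that both identifiers have actually been filled in. The Python sketch below is an illustration only: it handles just simple `key=value` lines (not every form a Java properties file allows), and the function names are made up.

```python
def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props

def missing_credentials(props):
    """Return the credential keys that are absent, empty, or still a [placeholder]."""
    return [
        key
        for key in ("access_key", "secret_key")
        if not props.get(key) or props[key].startswith("[")
    ]

# Made-up file contents: one placeholder left in, one key filled with a dummy value.
sample = "access_key=[Your AWS Access Key]\nsecret_key=EXAMPLEONLY\n"
print(missing_credentials(read_properties(sample)))
```

On a real installation you would read the text with `open("bin/mturk.properties").read()` instead of the inline sample.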
  • 33. Command Line Tools: Documentation There is some good documentation for the Mechanical Turk command line tools: 1. The UserGuide.html that comes with the tools: definitely use it to get started with everything. 2. The samples directory: • Anything youʼd like to do with the command line tools is pretty easy to figure out just by copying the samples... • ...except setting up an external page, which is poorly documented, which is why that is our next topic.
  • 34. External Pages • Get started using the samples/external_page directory in your command line tools installation.
    -rw-r--r-- 1 emtucker emtucker 119 Apr 24 2008 external_hit.input
    -rw-r--r-- 1 emtucker emtucker 619 Apr 24 2008 external_hit.properties
    -rw-r--r-- 1 emtucker emtucker 621 Feb 8 22:59 external_hit.question
    -rw-r--r-- 1 emtucker emtucker 2218 Apr 24 2008 externalpage.htm
    -rwxr-xr-x 1 emtucker emtucker 667 Apr 24 2008 approveAndDeleteResults.sh
    -rwxr-xr-x 1 emtucker emtucker 705 Apr 24 2008 getResults.sh
    -rwxr-xr-x 1 emtucker emtucker 671 Apr 24 2008 reviewResults.sh
    -rwxr-xr-x 1 emtucker emtucker 799 Apr 24 2008 run.sh
    external_hit.input: This is like the input file you used with the GUI, but tab separated instead of comma separated.
    external_hit.properties: Title, description, reward, qualifications, time allotted, what your input variables are called.
    external_hit.question: Link to external page plus how to get your input variables into your page. More on this shortly.
    externalpage.htm: The external web page itself. More on this shortly.
    *.sh: All of the pre-written scripts for submitting your HITs, downloading the results, and approving/rejecting the work.
  • 36. external_hit.question http://www.csee.ogi.edu/page.html?id1=${helper.urlencode($id1)}&amp;sent1=${helper.urlencode($sent1)} The URL to your external page, wherever you decide to put it. The helper.urlencode bit is how MTurk puts the values of your input variables (which it gets from the .input file) into the URL for the page for each HIT. Then, in your external web page, you’ll use Javascript (or something else of your choice) to read these items out of the URL in order to use them in your page where you need them. MTurk also automatically inserts the AssignmentID variable into the URL. That is, if a worker accepts the HIT, a unique Assignment ID will be created and included in the URL. You will have to use that information when you post the results to MTurk in your external page.
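To see what the URL encoding buys you, here is a minimal Python sketch of the round trip: values from a tab-separated input file are encoded into the page URL, then read back out. The input row is made up, and `quote_plus` is a stand-in for `helper.urlencode`; the exact encoding MTurk produces may differ in detail (for example `%20` rather than `+` for spaces). In the external page itself, the reading step would be done in Javascript.

```python
import csv
import io
from urllib.parse import parse_qs, quote_plus, urlsplit

PAGE_URL = "http://www.csee.ogi.edu/page.html"  # from the slide

# Made-up tab-separated rows in the style of external_hit.input.
INPUT_TEXT = "id1\tsent1\n17\tDogs & cats, mostly?\n"

def hit_url(row):
    """Build the per-HIT URL with each input value URL-encoded."""
    return f"{PAGE_URL}?id1={quote_plus(row['id1'])}&sent1={quote_plus(row['sent1'])}"

rows = list(csv.DictReader(io.StringIO(INPUT_TEXT), delimiter="\t"))
url = hit_url(rows[0])
print(url)

# Reading the values back out, as the external page has to do.
params = parse_qs(urlsplit(url).query)
print(params["sent1"][0])
```

Without the encoding step, the `&` and `?` inside the sentence would be read as URL syntax and the value would arrive truncated.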
  • 37. The External Page Needs to have a few important things: • Javascript (or other) code for extracting the values of your input variables out of the URL. • Javascript (or other) code for accessing the Assignment ID and for posting the workerʼs responses to MTurk. This is all included in the externalpage.htm file in the samples/external_page directory of the command line tools installation. The example external page is very helpful, but poorly commented.
  • 38. External Web Page: Javascript code for extracting URL parameters.
  • 39. External Web Page: Javascript code for using extracted URL parameters. This part is very important! The worker must accept the HIT before being able to complete it. Be sure to include this (or something like it) in your external page.
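The logic of the acceptance check is simple enough to sketch outside the page. When a worker is only previewing a HIT, MTurk passes the placeholder value `ASSIGNMENT_ID_NOT_AVAILABLE` as the `assignmentId` URL parameter; a real ID appears only after the worker accepts. The Python below illustrates just that test (the page itself would do this in Javascript, as in the sample externalpage.htm); the function name and example URLs are made up.

```python
from urllib.parse import parse_qs, urlsplit

# Value MTurk puts in the URL while the worker is only previewing the HIT.
PREVIEW_MARKER = "ASSIGNMENT_ID_NOT_AVAILABLE"

def hit_is_accepted(page_url):
    """True only if the URL carries a real assignment ID."""
    params = parse_qs(urlsplit(page_url).query)
    # A missing parameter is treated the same as a preview.
    assignment_id = params.get("assignmentId", [PREVIEW_MARKER])[0]
    return assignment_id != PREVIEW_MARKER

# Made-up example URLs.
preview = "http://www.csee.ogi.edu/page.html?id1=17&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE"
accepted = "http://www.csee.ogi.edu/page.html?id1=17&assignmentId=EXAMPLE123"
print(hit_is_accepted(preview), hit_is_accepted(accepted))
```

In the page, the usual response to a preview is to disable the submit button and tell the worker to accept the HIT first.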
  • 40. Command Line Tools: Sandbox • Good idea to try out your experiments in the sandbox. • Sandbox lets you see exactly how your HIT will look to potential workers. 1. In your bin/mturk.properties file, comment out this line: #service_url=http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester and uncomment this line: service_url=http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester 2. In your external html page, replace references to http://www.mturk.com/mturk/externalSubmit with http://workersandbox.mturk.com/mturk/externalSubmit
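If you switch between sandbox and production often, the comment/uncomment step can be scripted. A minimal Python sketch, assuming the two `service_url` lines appear in the properties file exactly as on the slide; the function name `use_sandbox` is made up:

```python
PRODUCTION = "http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester"
SANDBOX = "http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester"

def use_sandbox(properties_text, sandbox=True):
    """Comment or uncomment service_url lines so only the wanted one is active."""
    wanted = SANDBOX if sandbox else PRODUCTION
    out = []
    for line in properties_text.splitlines():
        bare = line.lstrip("#").strip()
        if bare.startswith("service_url="):
            url = bare.split("=", 1)[1]
            # Activate the wanted URL, comment out any other service_url line.
            line = bare if url == wanted else "#" + bare
        out.append(line)
    return "\n".join(out) + "\n"

before = f"access_key=X\nservice_url={PRODUCTION}\n#service_url={SANDBOX}\n"
after = use_sandbox(before)
print(after)
```

Calling `use_sandbox(text, sandbox=False)` switches back to production. The externalSubmit URL in your external page still has to be changed separately, as in step 2.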
  • 41. Lots of Other Topics • Using command line tools to interact more closely with workers, design ways of determining who is a good worker and recruiting those workers, banning specific workers. • Using the Amazon Mechanical Turk SDK. • Practical concerns: What kinds of projects can you do with Mechanical Turk? Are some projects better carried out with traditional methods? • How much money do we save using Mechanical Turk? Sometimes it might be cheaper and easier to use a few carefully chosen local workers, or even people currently employed at OGI.