Getting Started with Mechanical Turk Emily Tucker Prud’hommeaux June 15, 2010

Page 1: Getting Started with Mechanical Turk

Getting Started with Mechanical Turk

Emily Tucker Prud’hommeaux
June 15, 2010

Page 2: Getting Started with Mechanical Turk

Outline

1. Overview of Mechanical Turk concept.
2. Creating and funding your account.
3. Using the GUI.
• Designing your tasks.
• Submitting your tasks.
• Reviewing and approving your results.
4. Getting fancy with the GUI: audio and video.
5. Using the command line tools.
6. Getting fancy with the command line: external pages.

Page 4: Getting Started with Mechanical Turk

Mechanical Turk, a.k.a. MTurk

What is Mechanical Turk?

• Then: A chess-playing “robot” -- actually a guy in a box.

• Now: A service run by Amazon.com that allows people worldwide to do work or answer questions for you.

Page 5: Getting Started with Mechanical Turk

Mechanical Turk Terminology

• Requester: You, the person asking the questions.

• Workers (or Turkers): The people answering your questions.

• Human Intelligence Task (HIT): The question or set of questions you want them to answer.

• Reward: How much you pay a Worker for a HIT.

Page 6: Getting Started with Mechanical Turk

MTurk vs. Traditional Methods

Mechanical Turk | Traditional Methods
Many workers answer a few questions in a short period. | Few subjects answer lots of questions over a long period.
Not a lot of interaction -- may be hard to explain task. | Tons of interaction -- lots of opportunity to explain things.
Who are these people?!? | You know your subjects.
Very cheap, and you don’t have to pay if they do a bad job. | Not so cheap, and you have to pay the people anyway.
Quality control is tricky. | Quality control is not so hard.
Less opportunity for bias on the part of the experimenter. | More opportunity for bias.

Page 8: Getting Started with Mechanical Turk

Creating Your Accounts

1. Create an Amazon Mechanical Turk Requester account. You need this to use Mechanical Turk.

https://requester.mturk.com/mturk/beginsignin

2. (Optional) Create an Amazon Web Services (AWS) account. You need this to be able to use the command line tools and possibly for some other things:

https://aws-portal.amazon.com/gp/aws/developer/registration/index.html

Page 9: Getting Started with Mechanical Turk

Funding Your Account

Page 10: Getting Started with Mechanical Turk

Funding Your Account

Page 12: Getting Started with Mechanical Turk

Creating a HIT

1. Click the Design tab.

Page 13: Getting Started with Mechanical Turk

Select a Template

2. Select a HIT template.

Let’s try Data Collection.

Page 14: Getting Started with Mechanical Turk

Enter Properties

Don’t give people too much time.

Other criteria can be helpful (e.g., must live in US). Amazon displays your HIT only to the people who meet the criteria.

Reward: usually just a few cents, unless the HIT is really long.

Be brief but descriptive.

Page 15: Getting Started with Mechanical Turk

Design Layout

Click here to edit the HTML.

Ah, much better!

Page 16: Getting Started with Mechanical Turk

Design Layout

Input data variables, written as ${name}. You’ll upload a CSV file containing their values. Format them this way and MTurk will fill them in for you.

This is how worker responses get stored, just like a regular old HTML form, which you already know all about.

Hint: If you want some specific type of HTML form input (e.g., radio buttons, drop down menu, checkbox), look at the Blank Template template.

Page 17: Getting Started with Mechanical Turk

Preview and Finish

Recall: we will upload a CSV file to fill in these blanks for each HIT.

Page 18: Getting Started with Mechanical Turk

Publishing Your HIT

Page 19: Getting Started with Mechanical Turk

Create and Upload CSV File

You create the CSV file on your computer and upload it here. For this example, it will look something like this:

name,address,phone
Bread and Ink,3600 SE Hawthorne,503-555-1212
Three Doors Down,1415 SE 38th,503-555-1213
Cha cha cha!,3375 SE Hawthorne,503-555-1214
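If the input file comes from a script rather than being typed by hand, a CSV library can take care of quoting (for example, a restaurant name that contains a comma). The following is only a sketch in Python using the example rows from this slide; the function name build_input_csv is made up for illustration.

```python
import csv
import io

# Example rows from the slide; the column names must match the
# ${...} variables used in the HIT template.
FIELDS = ["name", "address", "phone"]
ROWS = [
    {"name": "Bread and Ink", "address": "3600 SE Hawthorne", "phone": "503-555-1212"},
    {"name": "Three Doors Down", "address": "1415 SE 38th", "phone": "503-555-1213"},
    {"name": "Cha cha cha!", "address": "3375 SE Hawthorne", "phone": "503-555-1214"},
]


def build_input_csv(rows, fields):
    """Return CSV text with a header row followed by one line per HIT."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_input_csv(ROWS, FIELDS), end="")
```

Saving the returned text to a .csv file gives something you can upload on the Publish page.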

Page 20: Getting Started with Mechanical Turk

Preview your HIT

The ${name}, ${phone}, and ${address} variables got filled with the values from your CSV file.
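To get a feel for the substitution before uploading anything, you can imitate it locally. This is a rough sketch, not how MTurk itself does it: Python's string.Template happens to use the same ${name} syntax, and the question wording below is invented for illustration.

```python
import csv
import io
from string import Template

# A stand-in for the HIT layout; the real template is the HTML edited
# in the Design tab. The ${...} names match the CSV header.
TEMPLATE = Template("Is the phone number for ${name} at ${address} still ${phone}?")

CSV_TEXT = """name,address,phone
Bread and Ink,3600 SE Hawthorne,503-555-1212
Three Doors Down,1415 SE 38th,503-555-1213
"""


def render_hits(template, csv_text):
    """Yield one filled-in template per CSV data row (one HIT per row)."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield template.substitute(row)


if __name__ == "__main__":
    for hit_text in render_hits(TEMPLATE, CSV_TEXT):
        print(hit_text)
```

A variable that appears in the template but not in the CSV header raises a KeyError here, which is a handy early check for misspelled names.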

Page 21: Getting Started with Mechanical Turk

Confirm and Publish

Page 22: Getting Started with Mechanical Turk

Manage HITs and Results

Page 23: Getting Started with Mechanical Turk

Review and Download Results

Approve or reject that worker’s work.

Download results to your computer.

You get to process your results file however you like -- open it in Excel or write a program to make it look nice.
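As one example of "write a program", here is a small Python sketch that tallies answers per input item. The column names (WorkerId, Input.name, Answer.phone_correct) and the sample rows are assumptions made up for illustration; check them against the header of the file you actually download.

```python
import csv
import io
from collections import Counter

# Hypothetical results file: the column names are assumptions, so check
# them against the header of the file you actually download.
RESULTS_TEXT = """WorkerId,Input.name,Answer.phone_correct
W1,Bread and Ink,yes
W2,Bread and Ink,yes
W3,Bread and Ink,no
W1,Three Doors Down,no
"""


def tally_answers(results_text, item_col, answer_col):
    """Count how often each answer was given for each input item."""
    counts = {}
    for row in csv.DictReader(io.StringIO(results_text)):
        counts.setdefault(row[item_col], Counter())[row[answer_col]] += 1
    return counts


if __name__ == "__main__":
    tallies = tally_answers(RESULTS_TEXT, "Input.name", "Answer.phone_correct")
    for item, answers in tallies.items():
        print(item, dict(answers))
```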

Page 25: Getting Started with Mechanical Turk

Including Audio without Flash

• For audio, you can convert your wavs to mp3, put them on the web, have the links to the mp3s be your variables in the CSV file, then force the links to open in a new window.

• If you want something more reliable, embed the audio in a Flash player, which I am about to describe.

• If you need more control (e.g., you want to prevent the worker from listening to the audio more than once), you might need to use something fancier like Javascript.

CSV file:
audiofile1,audiofile2
http://etucker.com/a1.mp3,http://etucker.com/a2.mp3

Template HTML:
<a target="_blank" href="${audiofile1}">Audio1</a>

Page 26: Getting Started with Mechanical Turk

Including Audio with Flash

• If you don’t want the audio to open in a new window, embed the audio in a Flash player.
• I use the Google audio Flash player, which works well and has nice controls.
• The html will look something like this:

<embed src="http://www.google.com/reader/ui/3523697345-audio-player.swf" flashvars="audioUrl=${audiofile}" width="400" height="27" quality="best" type="application/x-shockwave-flash"></embed>

• The input file will look something like this:
audiofile
http://www.csee.ogi.edu/mechturk/audio1.mp3
http://www.csee.ogi.edu/mechturk/audio2.mp3
http://www.csee.ogi.edu/mechturk/audio3.mp3
http://www.csee.ogi.edu/mechturk/audio4.mp3

Page 27: Getting Started with Mechanical Turk

Including Video

• For videos, I have been using Flash.
• Flash works reliably in all browsers (when it doesn’t crash them or take up the whole CPU) and everyone has it.
• If a lot of Workers start using iPads, this might not be a good solution.
• It’s all super easy, so why am I presenting this?
• Because it took me so long to find the best tools and figure out the best way to do the HTML so that it would work in MTurk and in all browsers.

Page 28: Getting Started with Mechanical Turk

Video with Flash: Preparation

1. Convert your videos to .flv format. I have used FLVCrunch:
http://download.cnet.com/FLV-Crunch/3000-2194_4-10909295.html

2. Get a Flash player. I have used the free JW Player:
http://www.longtailvideo.com/players/jw-flv-player

3. Put both the player components (as described in the JW Player instructions) and your .flv videos on the internet somewhere. Sean created this directory for me on the csee.ogi.edu servers:
/vol0/projects/www/CSE/public_html_noredirect/mech

which you can access on the web with this URL:
http://www.csee.ogi.edu/mech

Page 29: Getting Started with Mechanical Turk

Video with Flash: MTurk Part

4. Include your videos as variables in your CSV file like this:
video1,video2
http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo1.flv,http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo2.flv

5. In the template for your HIT, include a line like this for each video you want to include in that HIT:
<embed height="300" width="300" src="${video1}" name="player1" id="player1"></embed>
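Typing those long player URLs by hand gets error-prone once there are more than a few videos. Here is a sketch of one way to build the CSV rows from bare file names, assuming Python and the directory layout shown in these slides; the helper names player_link and video_csv are hypothetical.

```python
# URLs from the slide; swap in wherever your player and videos live.
PLAYER_URL = "http://www.csee.ogi.edu/mech/player.swf"
VIDEO_BASE = "http://www.csee.ogi.edu/mech/video/"


def player_link(filename):
    """Return the player URL with the video passed in the file parameter."""
    return f"{PLAYER_URL}?file={VIDEO_BASE}{filename}"


def video_csv(pairs):
    """Return CSV text with a video1,video2 header and one row per HIT."""
    lines = ["video1,video2"]
    for first, second in pairs:
        lines.append(f"{player_link(first)},{player_link(second)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(video_csv([("myawesomevideo1.flv", "myawesomevideo2.flv")]), end="")
```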

Page 31: Getting Started with Mechanical Turk

Command Line Tools: Why?

Instead of using the GUI to set up your MTurk experiment, you can use command line tools.

Advantages:
1. Approval/rejection process is easier when you have lots of data from lots of workers.
2. More power to manage workers: block workers, set qualifications for workers.
3. Possible to change properties for a HIT already in progress.
4. Can use the sandbox to try out your experiments.
5. With external pages, much more flexibility in what kind of web stuff you can do, like Javascript.

Page 32: Getting Started with Mechanical Turk

Command Line Tools: Basics

1. Download and install command line tools from here:

http://developer.amazonwebservices.com/connect/entry.jspa?externalID=694

2. Sign up for an AWS account, if you didn’t before:
https://aws-portal.amazon.com/gp/aws/developer/registration/index.html

3. Associate your installation with your AWS identifiers.

a) Find your identifiers:
http://s3.amazonaws.com/mturk/tools/pages/aws-access-identifiers/aws-identifier.html

b) Enter those identifiers in the bin/mturk.properties file:
access_key=[Your AWS Access Key]
secret_key=[Your Secret Key]

Page 33: Getting Started with Mechanical Turk

Command Line Tools: Documentation

There is some good documentation for the Mechanical Turk command line tools:
1. The UserGuide.html that comes with the tools: definitely use it to get started with everything.
2. The samples directory:
• Anything you’d like to do with the command line tools is pretty easy to figure out just by copying the samples...
• ...except setting up an external page, which is poorly documented, which is why that is our next topic.

Page 34: Getting Started with Mechanical Turk

External Pages

• Get started using the samples/external_page directory in your command line tools installation.

-rw-r--r-- 1 emtucker emtucker 119 Apr 24 2008 external_hit.input
-rw-r--r-- 1 emtucker emtucker 619 Apr 24 2008 external_hit.properties
-rw-r--r-- 1 emtucker emtucker 621 Feb 8 22:59 external_hit.question
-rw-r--r-- 1 emtucker emtucker 2218 Apr 24 2008 externalpage.htm
-rwxr-xr-x 1 emtucker emtucker 667 Apr 24 2008 approveAndDeleteResults.sh
-rwxr-xr-x 1 emtucker emtucker 705 Apr 24 2008 getResults.sh
-rwxr-xr-x 1 emtucker emtucker 671 Apr 24 2008 reviewResults.sh
-rwxr-xr-x 1 emtucker emtucker 799 Apr 24 2008 run.sh

external_hit.input

This is like the input file you used with the GUI, but tab separated instead of comma separated.

external_hit.properties

Title, description, reward, qualifications, time allotted, what your input variables are called.

external_hit.question

Link to external page plus how to get your input variables into your page. More on this shortly.

externalpage.htm

The external web page itself. More on this shortly.

*.sh

All of the pre-written scripts for submitting your HITs, downloading the results, and approving/rejecting the work.

Page 35: Getting Started with Mechanical Turk

Data Files

external_hit.input

external_hit.properties

external_hit.question
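Since the slide images for these files are not reproduced here, the following sketch shows one way to produce a tab-separated input file. It assumes Python and a header line of variable names, as in the GUI's CSV file; id1 and sent1 are the variable names from the example external_hit.question, while the sentences and the function name build_input_file are invented for illustration.

```python
import csv
import io

# id1 and sent1 are the variable names used in the example
# external_hit.question; the sentences here are made up.
ROWS = [("s001", "The cat sat on the mat."), ("s002", "Time flies like an arrow.")]


def build_input_file(rows, header=("id1", "sent1")):
    """Return tab-separated text: a header line, then one line per HIT."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_input_file(ROWS), end="")
```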

Page 36: Getting Started with Mechanical Turk

external_hit.question

http://www.csee.ogi.edu/page.html?id1=${helper.urlencode($id1)}&amp;sent1=${helper.urlencode($sent1)}

The URL to your external page, wherever you decide to put it.

The helper.urlencode bit is how MTurk puts the values of your input variables (which it gets from the .input file) into the URL for the page for each HIT.

Then, in your external web page, you’ll use Javascript (or something else of your choice) to read these items out of the URL in order to use them in your page where you need them.

MTurk also automatically inserts the AssignmentID variable into the URL. That is, if a worker accepts the HIT, a unique Assignment ID will be created and included in the URL. You will have to use that information when you post the results to MTurk in your external page.
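The round trip (values encoded into the URL, then read back out by the page) can be tried offline. This is only a sketch in Python of what the encoding and decoding amount to; the real external page would do the decoding in Javascript. The parameter name assignmentId is how I understand MTurk to spell it, so confirm it against the sample page, and the helper names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

PAGE_URL = "http://www.csee.ogi.edu/page.html"  # page URL from the slide


def build_hit_url(id1, sent1, assignment_id):
    """Mimic the URL a worker's browser would load for one HIT."""
    query = urlencode({"id1": id1, "sent1": sent1, "assignmentId": assignment_id})
    return f"{PAGE_URL}?{query}"


def read_params(url):
    """Decode the query string back into a plain dict of single values."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


if __name__ == "__main__":
    url = build_hit_url("s001", "Time flies & so on", "ABC123")
    print(url)
    print(read_params(url))
```

The encoding step matters because input values can contain spaces, ampersands, or question marks, which would otherwise break the URL.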

Page 37: Getting Started with Mechanical Turk

The External Page

Needs to have a few important things:
• Javascript (or other) code for extracting the values of your input variables out of the URL.
• Javascript (or other) code for accessing the Assignment ID and for posting the worker’s responses to MTurk.

This is all included in the externalpage.htm file in the samples/external_page directory of the command line tools installation.

The example external page is very helpful, but poorly commented.

Page 38: Getting Started with Mechanical Turk

External Web Page: Javascript code for extracting URL parameters.

Page 39: Getting Started with Mechanical Turk

External Web Page: Javascript code for using extracted URL parameters.

This part is very important! The worker must accept the HIT before being able to complete it. Be sure to include this (or something like it) in your external page.
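The slide's Javascript is not reproduced in this transcript, so here is the logic of the check sketched in Python instead. It assumes, as I understand MTurk's behavior, that a previewing worker's URL carries the placeholder ASSIGNMENT_ID_NOT_AVAILABLE in the assignmentId parameter; verify both names against the sample externalpage.htm. The function name hit_accepted is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder MTurk is understood to send while a worker is only previewing;
# verify against the sample externalpage.htm.
PREVIEW_ID = "ASSIGNMENT_ID_NOT_AVAILABLE"


def hit_accepted(url):
    """Return True only if the URL carries a real assignment ID."""
    values = parse_qs(urlsplit(url).query).get("assignmentId", [])
    return bool(values) and values[0] != PREVIEW_ID


if __name__ == "__main__":
    base = "http://www.csee.ogi.edu/page.html?id1=s001"
    print(hit_accepted(base + "&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE"))
    print(hit_accepted(base + "&assignmentId=ABC123"))
```

In the page itself, the equivalent check would typically disable the submit button and show a "please accept the HIT first" message when the function returns False.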

Page 40: Getting Started with Mechanical Turk

Command Line Tools: Sandbox

• Good idea to try out your experiments in the sandbox.
• Sandbox lets you see exactly how your HIT will look to potential workers.

1. In your bin/mturk.properties file, comment out this line:
#service_url=http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester

and uncomment this line:
service_url=http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester

2. In your external html page, replace references to
http://www.mturk.com/mturk/externalSubmit

with
http://workersandbox.mturk.com/mturk/externalSubmit
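If you switch between sandbox and production often, the submit-URL swap in step 2 can be scripted. A minimal sketch, assuming Python; the four URLs are copied from this slide, while the ENDPOINTS table and the use_sandbox helper are made up for illustration.

```python
# All four URLs are copied from the slide above.
ENDPOINTS = {
    "production": {
        "service_url": "http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester",
        "submit_url": "http://www.mturk.com/mturk/externalSubmit",
    },
    "sandbox": {
        "service_url": "http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester",
        "submit_url": "http://workersandbox.mturk.com/mturk/externalSubmit",
    },
}


def use_sandbox(page_html):
    """Point an external page's submit form at the sandbox."""
    return page_html.replace(
        ENDPOINTS["production"]["submit_url"], ENDPOINTS["sandbox"]["submit_url"]
    )


if __name__ == "__main__":
    form = '<form action="http://www.mturk.com/mturk/externalSubmit">'
    print(use_sandbox(form))
```

Remember to reverse both changes (the properties file and the page) before running the real experiment.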

Page 41: Getting Started with Mechanical Turk

Lots of Other Topics

• Using command line tools to interact more closely with workers, design ways of determining who is a good worker and recruiting those workers, banning specific workers.

• Using the Amazon Mechanical Turk SDK.

• Practical concerns: What kinds of projects can you do with Mechanical Turk? Are some projects better carried out with traditional methods?

• How much money do we save using Mechanical Turk? Sometimes it might be cheaper and easier to use a few carefully chosen local workers, or even people currently employed at OGI.