crowdsourcing complexity: lessons for software engineering

Post on 26-Feb-2016

Crowdsourcing Complexity: Lessons for Software Engineering

Lydia Chilton
2 June 2014, ICSE Crowdsourcing Workshop

Clarification: Human Computation

Mechanical Turk

Microtasks:

2007: JavaScript Calculator

Evolution of Complexity in Human Computation

Task Decomposition: Cascade & Frenzy

Evolution of Complexity

1. Collective Intelligence

1906: 787 aggregated votes averaged 1197 lbs. Actual answer: 1198 lbs.

1. Collective Intelligence

Principles:
– Small tasks
– Intuitive tasks
– Independent answers
– Simple aggregation

Application: - ESP Game
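The "independent answers, simple aggregation" principle can be sketched in a few lines of Python. This is only an illustration: the guesses below are invented (they are not the 1906 data), and `aggregate_estimates` is a hypothetical helper name.

```python
from statistics import mean, median


def aggregate_estimates(guesses):
    """Combine independent numeric guesses into simple crowd estimates."""
    if not guesses:
        raise ValueError("need at least one guess")
    return {"mean": mean(guesses), "median": median(guesses)}


# Invented guesses (in lbs), for illustration only.
guesses = [1100, 1250, 1180, 1300, 1150]
result = aggregate_estimates(guesses)
print(result)
```

Each guess is made without seeing the others, so no coordination logic is needed; the aggregation step is a single call.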

2. Iterative Workflows

Collective Intelligence

work → improve → vote → improve → vote

2. Iterative Workflows

Principles:
– Use fresh eyes
– Vote to ensure improvement

Application:
- Bug finding

“given enough eyeballs, all bugs are shallow”
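An improve-then-vote loop might look roughly like the sketch below. Everything in it is an assumption made for illustration: the workers and voters are simulated by plain functions, and the names `iterate`, `majority_vote`, and `score` are hypothetical.

```python
def majority_vote(voters):
    """Build a vote function that accepts a candidate on a strict majority."""
    def vote(current, candidate):
        yes = sum(1 for voter in voters if voter(current, candidate))
        return yes * 2 > len(voters)
    return vote


def iterate(initial, improvers, vote):
    """Let one fresh worker per round improve the text; keep it only if voted better."""
    current = initial
    history = [initial]
    for improve in improvers:
        candidate = improve(current)
        if vote(current, candidate):
            current = candidate
        history.append(current)
    return current, history


def score(text):
    # Toy quality measure (an assumption): longer is better, typos are penalized.
    return len(text) - 10 * text.count("teh")


# Simulated workers: a typo fix, a destructive edit, and an added detail.
improvers = [
    lambda text: text.replace("teh", "the"),
    lambda text: text[:5],
    lambda text: text + " It fails on empty input.",
]
voters = [lambda cur, cand: score(cand) > score(cur)] * 3

final, history = iterate("Bug in teh parser.", improvers, majority_vote(voters))
print(final)
```

The second worker's destructive edit is voted down, so the text carried into the third round is still the corrected version from the first.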

3. Psychological Boundaries

3. Psychological Boundaries

Applications:
• Manager / programmer
• Writer / editor
• Write code / test code
• Addition / subtraction

Principle:
– Task switching is hard
– Natural boundaries for tasks

4. Task Decomposition

Legion:Scribe Real Time Audio Captioning on MTurk

4. Task Decomposition

Principles:
– Must be able to break apart tasks AND put them back together.
– Complex aggregation
– Hint: Solve backwards. Find what people can do, and build up from there.

5. Worker Choice

Mobi: Trip Planning on MTurk with an open UI.

5. Worker Choice

Applications:
• Trip planning
• Conference time table
• Conference session-making

Principles:
- Giving workers freedom relieves requesters’ burden of task decomposition.
- Workers feel more involved and empowered.
- BUT complex interface that is difficult to scale.

6. Learning and Doing

6. Learning and Doing

Applications:
• Peer assessment
• Do grading assignments before you do your own assignment
• Task Feedback

Principles:
- Teaching workers makes them better.
- How long will they stay?

Lessons for Software Engineering

1. Propose and vote.
2. Find natural psychological boundaries between tasks.
3. Find the tasks people can do, then assemble them using complex aggregation techniques.
4. Teach.

221 + 473

-221 + 473

Evolution of Complexity in Human Computation

Task Decomposition: Cascade & Frenzy

Task decomposition is the key to crowdsourcing software engineering

Lydia Chilton (UW), Greg Little (oDesk), Darren Edge (MSR Asia), Dan Weld (UW), James Landay (UW)

Cascade: Crowdsourcing Taxonomy Creation

Problem

• 1000 eGovernment suggestions
• 50 top product reviews
• 100 employee feedback comments
• 1000 answers to “Why did you decide to major in Computer Science?”

Machines can’t analyze it

People don’t have time to analyze it
1. time consuming
2. overwhelming
3. no right answer

Solution

Solution: Crowdsourced Taxonomies

Toy Application: Colors

Initial Prototypes

Problems:
1. The hierarchy grows and becomes overwhelming
2. Workers have to decide what to do

Lesson: Break up the task more

Iterative Improvement

Initial Approach 2: Category Comparison

Problem: Without context, it’s hard to judge relationships
• flying vs. flights
• TSA liquids vs. removing liquids
• Packing vs. what to bring

Lesson: Don’t compare abstractions to abstractions. Instead, compare data to abstractions.

Use Lesson #3

Find the tasks people can do. Assemble them using complex aggregation techniques.

Generate Labels → Select Best Labels → Categorize

Cascade Algorithm

1. Generate Labels: for a subset of items
2. Select Best Labels: yields {good labels}
3. Categorize: for all items, for all good labels

Example labels: Blue, Light Blue, Green, Greenish, Red, Gold

Then recurse
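The generate, select, categorize loop can be sketched as below. This is a simplified illustration under stated assumptions, not the Cascade implementation: workers are simulated by the hypothetical `suggest` and `fits` functions, "select best" is approximated by counting how often a label was suggested, and the recursion is assumed to run on items that no good label covered yet.

```python
from collections import Counter


def cascade(items, sample_size, suggest, fits, min_votes=2, max_rounds=5):
    """Repeat generate -> select best -> categorize until every item is covered."""
    categories = {}
    remaining = list(items)
    for _ in range(max_rounds):
        if not remaining:
            break
        # Generate labels: ask for suggestions on a subset of the items only.
        suggestions = Counter()
        for item in remaining[:sample_size]:
            suggestions.update(suggest(item))
        # Select best labels: keep labels that enough workers proposed.
        good = sorted(
            label for label, votes in suggestions.items()
            if votes >= min_votes and label not in categories
        )
        if not good:
            break
        # Categorize: check every item against every good label.
        for label in good:
            categories[label] = [item for item in items if fits(item, label)]
        covered = {item for members in categories.values() for item in members}
        # Recurse on the items that still have no category.
        remaining = [item for item in items if item not in covered]
    return categories


# Simulated worker judgments (invented data).
TRUE_LABELS = {
    "navy": {"Blue"},
    "sky": {"Blue", "Light Blue"},
    "lime": {"Green"},
    "forest": {"Green"},
    "crimson": {"Red"},
    "scarlet": {"Red"},
}
taxonomy = cascade(
    list(TRUE_LABELS),
    sample_size=4,
    suggest=lambda item: sorted(TRUE_LABELS[item]),
    fits=lambda item, label: label in TRUE_LABELS[item],
)
print(taxonomy)
```

In this toy run the first round sees only four items and finds Blue and Green; the second round runs on the two uncovered items and finds Red. Light Blue is suggested once and never passes the selection threshold.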

Aggregate Data into Taxonomy

Input labels: Blue, Light Blue, Green, Greenish, Red, Gold

Checks: redundant, nested, singletons

Output taxonomy:
Blue:
Light Blue:
Green:
Other:
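One way to turn categorized data into a taxonomy is sketched below. The rules are assumptions chosen to mirror the three checks named on the slide: labels with identical item sets count as redundant, a label whose items are a strict subset of another's counts as nested, and labels with a single item are folded into "Other". A real system would likely use softer thresholds; `build_taxonomy` and the item names are hypothetical.

```python
def build_taxonomy(categories):
    """categories maps each label to the set of items workers placed under it."""
    # Singletons: labels that cover at most one item go to "Other".
    other = set()
    kept = {}
    for label, members in categories.items():
        if len(members) <= 1:
            other |= set(members)
        else:
            kept[label] = set(members)
    # Redundant: labels with identical item sets collapse into the first name.
    unique = {}
    for label in sorted(kept):
        if kept[label] not in unique.values():
            unique[label] = kept[label]
    # Nested: a strict subset of another label becomes that label's child.
    parent = {}
    for label, members in unique.items():
        supersets = [p for p, m in unique.items() if members < m]
        parent[label] = (
            min(supersets, key=lambda p: len(unique[p])) if supersets else None
        )
    return parent, sorted(other)


# Invented categorization results for the color example.
categories = {
    "Blue": {"navy", "sky", "baby blue", "teal"},
    "Light Blue": {"sky", "baby blue"},
    "Green": {"lime", "forest"},
    "Greenish": {"lime", "forest"},
    "Red": {"crimson"},
    "Gold": {"amber"},
}
parent, other = build_taxonomy(categories)
print(parent, other)
```

Here `parent` maps each surviving label to its parent label (or `None` for a top-level category), which reproduces the Blue, Light Blue, Green, Other shape of the slide.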

Cascade Results: 100 Colors

How can we get a global picture from workers who see only subsets of the data?

Propose, Vote, Test
1. Workers have good heuristics.
2. Let them propose categories.
3. Vote on categories to weed out bad ones.
4. Test the heuristics by verifying them on data.

Propose Vote Test

Lesson

Propose, Vote, Test.

Deploy Cascade to Real Needs

1. CHI 2013 Program Committee: Organize 430 accepted papers to help session making
2. 40 CrowdCamp Hack-a-thon Participants: Organize 100 hack-a-thon ideas to help organize teams

• Patina: Dynamic Heatmaps for Visualizing Application Usage
• Effects of Visualization and Note-Taking on Sensemaking and Analysis
• Contextifier: Automatic Generation of Messaged Visualizations
• Interactive Horizon Graphs: Improving the Compact Visualization of Multiple Time Series
• Quantity Estimation in Visualizations of Tagged Text
• Motif Simplification: Improving Network Visualization Readability with Fan, Connector, and Clique Glyphs
• Evaluation of Alternative Glyph Designs for Time Series Data in a Small Multiple Setting
• Individual User Characteristics and Information Visualization: Connecting the Dots through Eye Tracking
• "Without the Clutter of Unimportant Words": Descriptive Keyphrases for Text Visualization

• Direct Space-Time Trajectory Control for Visual Media Editing
• Your eyes will go out of the face: Adaptation for virtual eyes in video see-through HMDs
• Swifter: Improved Online Video Scrubbing
• Direct Manipulation Video Navigation in 3D
• NoteVideo: Facilitating Navigation of Blackboard-style Lecture Videos
• Ownership and Control of Point of View in Remote Assistance
• EyeContext: Recognition of High-level Contextual Cues from Human Visual Behaviour
• Your eyes will go out of the face: Adaptation for virtual eyes in video see-through HMDs
• Still Looking: Investigating Seamless Gaze-supported Selection, Positioning, and Manipulation of Distant Targets
• Individual User Characteristics and Information Visualization: Connecting the Dots through Eye Tracking
• Quantity Estimation in Visualizations of Tagged Text

430 CHI Papers: Good Results, but…

• Visualization (19)
• evaluating infovis (9)
• text (2)
• video (6)
• visualizing time data (5)
• gaze (4)
• gaze tracking (3)
• user requirements (3)
• color schemes (2)

“Don’t treat me like a Turker.”

“I just want to see all the data”

Lesson

Authority and Responsibility should be aligned.

Frenzy: Collaborative Data Organization for Creating Conference Sessions

Lydia Chilton (UW), Juho Kim (MIT), Paul Andre (CMU), Felicia Cordeiro (UW), James Landay (Cornell?), Dan Weld (UW), Steven Dow (CMU), Rob Miller (MIT), Haoqi Zhang (NW)

Groupware

Creating conference sessions is a social process.
Grudin: Social processes are often guided by personalities, tradition, and convention.
Challenge: support the process without seeking to replace these behaviors.
Challenge: remain flexible and do not impose rigid structures.

DEMO

Light-weight contributions: Label, Vote, Categorize

2-Stage Workflow

Stage 1: Set-up, Collect Meta Data
• 60 PC members
• Low authority
• Low responsibility
Goal: Collect data: labels, votes

Stage 2: Session Making
• 11 PC members
• High authority
• High responsibility
Goal: Session-Making

Results

Sessions created in record-setting 88 minutes.

Lessons for Software Engineering

1. Propose and vote.
2. Find natural psychological boundaries between tasks.
3. Find the tasks people can do, then assemble them using complex aggregation techniques.
4. Teach.
5. Propose, vote, test.
6. Align authority and responsibility.

