Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do
A Crowdsourcing Platform for Personalized Human Intelligence Task Assignment Based on Social Networks
Djellel E. Difallah, Gianluca Demartini, Philippe Cudré-Mauroux
eXascale Infolab, University of Fribourg, Switzerland
15th May 2013, WWW 2013, Rio de Janeiro, Brazil
Crowdsourcing
• Exploit human intelligence to solve tasks that are simple for humans but complex for machines
• Examples: Wikipedia, reCAPTCHA, Duolingo
• Incentives: financial, fun, visibility
Motivation
• The pull methodology is suboptimal: workers pick tasks themselves, so the actual workers on a task only partially overlap with the workers who would be effective at it
[Diagram: Venn diagram of effective workers vs. actual workers; the goal is to maximize their overlap]
Motivation
• The push methodology is a task-to-worker recommender system: the platform assigns each task to the workers best suited for it
Contribution and Claim
• Pick-A-Crowd: a system architecture that matches tasks to workers based on:
– The worker's social profile
– The task context
• Claim: workers can provide higher-quality answers on tasks they relate to
Worker Social Profiling
"You Are What You Like"
Problem Definition (1) – The Human Intelligence Task (HIT)
• Example HIT types: image tagging, data collection, survey, categorization
• A batch of tasks consists of:
– Title
– Batch instruction
– Specific task instruction*
– Task data: text, options, additional data (image, URL)
– List of categories*
Problem Definition (2) – The Worker
• Traditional AMT profile: completed HITs (e.g., 256), approval rate (e.g., 96%), qualification types, generic qualifications
• Social profile: the set of pages the worker likes, each with title, category, description, feed, etc.
Problem Definition (3) – Task-to-Worker Matching
• Input: a batch of tasks (title, batch instruction, specific task instructions*, task data: text, options, additional data such as an image or URL; list of categories*) and each worker's liked pages (title, category, description, feed, etc.)
• Step 1: Task-to-Page matching function (category, expert finding, or semantic)
• Step 2: Worker ranking
Matching Models (1/3) – Category Based
• The requester provides a list of categories related to the batch
• We select the subset of pages whose category is in the batch's category list
• Rank the workers by the number of liked pages in that subset
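The category-based ranking above can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual code: the `worker_likes` structure and the page dicts are assumptions made for the example.

```python
from collections import Counter

def rank_workers_by_category(batch_categories, worker_likes):
    """Rank workers by how many of their liked pages fall into
    one of the batch's categories.

    batch_categories: set of category names supplied by the requester
    worker_likes: dict mapping worker id -> list of liked pages,
                  where each page is a dict with a 'category' field
                  (illustrative schema, not the real one)
    """
    scores = Counter()
    for worker, pages in worker_likes.items():
        scores[worker] = sum(1 for p in pages
                             if p["category"] in batch_categories)
    # Highest score first: the most relevant workers for the batch
    return [w for w, _ in scores.most_common()]

likes = {
    "alice": [{"category": "Soccer"}, {"category": "Movies"}],
    "bob":   [{"category": "Music"}],
}
print(rank_workers_by_category({"Soccer", "Movies"}, likes))
# ['alice', 'bob']
```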
Matching Models (2/3) – Expert Finding
• Build an inverted index over the pages' titles and descriptions
• Use the title/description of the tasks as a keyword query against the inverted index to retrieve a subset of pages
• Rank the workers by the number of liked pages in that subset
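The inverted-index step can be sketched as follows. This is a toy illustration with naive whitespace tokenization; the real system's indexing and ranking are more sophisticated, and the data here is invented.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Map each term to the set of page ids whose title or
    description contains it. `pages`: dict of page id -> text."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index

def matching_pages(index, query):
    """Return ids of pages matching any keyword of the query."""
    hits = set()
    for term in query.lower().split():
        hits |= index.get(term, set())
    return hits

pages = {1: "FC Barcelona official page",
         2: "Classical music lovers"}
index = build_inverted_index(pages)
print(matching_pages(index, "Barcelona soccer"))  # {1}
```

Workers would then be ranked by how many of their liked pages appear in the returned subset, as on the slide.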
Matching Models (3/3) – Semantic Based
• Link the task context to an external knowledge base (e.g., DBpedia)
• Exploit the underlying graph structure to compute HIT-to-page similarity
– Assumption: a worker who likes a page is able to answer questions about related entities
– Assumption: a worker who likes a page is able to answer questions about entities of the same type
• Rank the workers by the number of liked pages in the subset
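The type-similarity assumption can be illustrated with a toy sketch. The knowledge-base lookup is mocked with a hand-written dict here; a real implementation would query DBpedia's entity types instead.

```python
# Hypothetical stand-in for DBpedia type lookups (illustrative only).
ENTITY_TYPES = {
    "Lionel Messi": "SoccerPlayer",
    "Xavi": "SoccerPlayer",
    "Mozart": "Composer",
}

def type_similar(entity_a, entity_b):
    """Two entities are type-similar if they share a KB type."""
    return (entity_a in ENTITY_TYPES
            and ENTITY_TYPES.get(entity_a) == ENTITY_TYPES.get(entity_b))

def relevant_pages(task_entity, liked_pages):
    """Keep the liked pages whose entity has the same type as the
    entity the task is about."""
    return [p for p in liked_pages if type_similar(task_entity, p)]

# A task about Messi matches a worker who likes another soccer player
print(relevant_pages("Lionel Messi", ["Xavi", "Mozart"]))  # ['Xavi']
```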
[Diagram: matching a HIT to Facebook pages via three measures: similarity, relatedness, and type-similarity]
Pick-A-Crowd Architecture
Experimental Evaluation
• The Facebook app OpenTurk implements part of the Pick-A-Crowd architecture:
– More than 170 registered workers participated
– Over 12k pages crawled
• Covered both multiple-choice and open-ended questions:
– 50 images, each with a multiple-choice question and 5 candidate answers (Soccer, Actors, Music, Authors, Movies, Animes)
– 20 open-ended questions related to one topic (Cricket)
OpenTurk app
[Screenshot of the OpenTurk Facebook app]
Evaluation – Correlation between crowd accuracy and the number of relevant likes (Category Based)
[Plot: worker precision vs. number of relevant likes]
Evaluation (Baseline) – Amazon Mechanical Turk (AMT)
• AMT 3 = majority vote of 3 workers
• AMT 5 = majority vote of 5 workers
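The majority-vote baseline aggregates several workers' answers per task. A minimal sketch (the answer values are made up for illustration):

```python
from collections import Counter

def majority_vote(answers):
    """Aggregate worker answers for one task by majority vote.
    Ties keep first-seen order, since most_common's sort is stable."""
    return Counter(answers).most_common(1)[0][0]

# AMT 3: three workers answer the same multiple-choice question
print(majority_vote(["Messi", "Ronaldo", "Messi"]))  # Messi
```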
Evaluation – HIT Assignment Models
[Chart: accuracy of the category-based approach]
Evaluation – HIT Assignment Models
[Chart: accuracy of the expert-finding approach, using title/instruction content]
Evaluation – HIT Assignment Models
[Chart: accuracy of the semantic-based approach, using type relatedness]
Evaluation – Comparison with Mechanical Turk
[Chart: answer quality, AMT vs. Pick-A-Crowd]
Conclusions and Future Work
• Pull vs. push methodologies in crowdsourcing
• Pick-A-Crowd: a system architecture with task-to-worker recommendation
• An experimental comparison with AMT shows a consistent quality improvement: "workers know what they like"
• Future work: exploit more of workers' social activity, and handle content-less tasks
Next Step
• We are building a crowdsourcing platform for the research community
• Pre-register at:
www.openturk.com
Thank You!