Human Computation and Crowdsourcing: Survey and Taxonomy Uichin Lee Sept. 7, 2011


Human Computation and Crowdsourcing: Survey and Taxonomy

Uichin Lee, Sept. 7, 2011


Human Computation: A Survey and Taxonomy of a Growing Field

Alexander J. Quinn, Benjamin B. Bederson, CHI 2011


Human Computation

• Computer scientists (in the artificial intelligence field) have been trying to emulate human-like abilities, e.g., language, visual processing, reasoning

• Alan Turing wrote in 1950: “The idea behind digital computers may be explained by saying that these machines are intended to carry out any operations which could be done by a human computer.”

• L. von Ahn wrote a doctoral thesis about human computation in 2005

• The field is now thriving: business, art, R&D, HCI, databases, artificial intelligence, etc.


Definition of Human Computation

• Dates back to 1938 in philosophy and psychology literature; 1950 in computer science literature (by Turing)

• Modern usage inspired by von Ahn’s 2005 dissertation titled “Human Computation” and the work leading to it
  – “…a paradigm for utilizing human processing power to solve problems that computers cannot yet solve.”


Definition of Human Computation

• “…the idea of using human effort to perform tasks that computers cannot yet perform, usually in an enjoyable manner.” (Law, von Ahn 2009)

• “…a new research area that studies the process of channeling the vast internet population to perform tasks or provide data towards solving difficult problems that no known efficient computer algorithms can yet solve” (Chandrasekar, et al., 2010)

• “…a technique that makes use of human abilities for computation to solve problems.” (Yuen, Chen, King, 2009)

• “…a technique to let humans solve tasks, which cannot be solved by computers.” (Schall, Truong, Dustdar, 2008)

• “A computational process that involves humans in certain steps…” (Yang, et al., 2008)

• “…systems of computers and large numbers of humans that work together in order to solve problems that could not be solved by either computers or humans alone” (Quinn, Bederson, 2009)

• “…a new area of research that studies how to build systems, such as simple casual games, to collect annotations from human users.” (Law, et al., 2009)


Related Ideas

• Crowdsourcing

• Social computing

• Data mining

• Collective intelligence


Crowdsourcing

• “Crowdsourcing is the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call.” (Jeff Howe)

• Human computation replaces computers with humans, whereas crowdsourcing replaces traditional human workers with members of the public
  – HC: replacement of computers with humans
  – CS: replacement of insourced workers with crowdsourced workers

• Some crowdsourcing tasks can be considered human computation tasks
  – Hiring crowdsourced workers for translation jobs:
  – Machine translation (fast, but low quality) vs. human translation (slow, high quality)


Social Computing

• Wikipedia:
  – “… supporting any sort of social behavior in or through computational systems” (e.g., blogs, email, IM, SNS, wikis, social bookmarking)
  – “… supporting computations that are carried out by groups of people” (e.g., collaborative filtering, online auctions, prediction markets, reputation systems)

• Some other definitions:
  – “… applications and services that facilitate collective action and social interaction online with rich exchange of multimedia information and evolution of aggregate knowledge…” (Parameswaran, Whinston, 2007)
  – “… the interplay between persons' social behaviors and their interactions with computing technologies” (Dryer, Eisbach, Ark, 1999)


Data Mining

• “Data mining is defined broadly as the application of specific algorithms for extracting patterns from data.” (Fayyad, Piatetsky-Shapiro, Smyth, 1996)

• While data mining deals with human-created data, it does not involve human computation
  – Google PageRank “only” uses human-created data (links)


Collective Intelligence

• Overarching notion: large groups of loosely organized people can accomplish great things working together
  – Traditional study focused on “decision making capabilities by a large group of people”

• Taxonomical “genome” of collective intelligence
  – “… groups of individuals doing things collectively that seem intelligent” (Malone, 2009)

• Collective intelligence generally encompasses human computation and social computing


Relationship Diagram

[Figure: diagram relating Collective Intelligence, Data Mining, Crowdsourcing, Social Computing, and Human Computation]


Classifying Human Computation

• Motivation
  – What motivates people to perform HC?

• Human skill
  – What kinds of human skills do HC tasks require?

• Aggregation
  – How to combine the results of HC tasks?

• Quality control
  – How to control the quality of the results of HC tasks?

• Processing order of different roles
  – Roles (requester, worker, computer)

• Task-request cardinality
  – Requester vs. worker cardinality


Motivation

• Pay (financial rewards): Mechanical Turk (online labor marketplace), ChaCha (mobile Q&A), LiveOps (a distributed call center)

• Altruism (just helping other people for good): helpfindjim.com (Jim Gray), Naver KiN, Yahoo! Answers

• Enjoyment (fun): Games With A Purpose (GWAP), http://www.gwap.com: ESP Game, Tag-a-Tune

• Reputation (recognition): volunteer translators at childrenslibrary.org, Naver KiN, Yahoo! Answers

• Implicit work: reCAPTCHA


Quality Control

• Output agreement: ESP Game (a game for labeling images); an answer is accepted if the pair of players agree on the same answer

• Input agreement: Tag-a-Tune; two players each listen to an input (music) that may or may not be the same. They describe the music to each other and try to decide whether they are listening to the same music or different music

• Economic models: when money is a motivating factor, some economic models can be used to elicit quality answers (e.g., a game-theoretic model of the worker’s rating to reduce the incentive to cheat)

• Defensive task design: design tasks so that it is difficult to cheat (e.g., comprehension questions)

• Redundancy: each task is given to multiple people to separate the wheat from the chaff

• Statistical filtering: filter or aggregate the data in some way that removes the effects of irrelevant work

• Multilevel review: one set of workers does the work; a second set reviews the results and rates the quality (e.g., Soylent: find-fix-verify)

• Automatic check: fold.it (protein folding game); answers are hard to find but easy to check by computer

• Reputation system: workers are motivated to provide quality answers by a reputation scoring system; Mechanical Turk, Naver KiN, etc.

• Expert check: a trusted expert skims or cross-checks results for relevance and apparent accuracy
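As a rough illustration of two of the quality control methods listed above, output agreement and redundancy, the following Python sketch accepts a label only when two workers independently agree, and resolves a redundantly assigned task by majority vote. The function names, the text normalization, and the `min_share` threshold are assumptions made for this example; they are not taken from the ESP Game or any other system named in the slides.

```python
from collections import Counter
from typing import Optional, Sequence


def output_agreement(answer_a: str, answer_b: str) -> Optional[str]:
    """Accept an answer only if two independent workers give the same one."""
    a, b = answer_a.strip().lower(), answer_b.strip().lower()
    return a if a and a == b else None


def majority_vote(answers: Sequence[str], min_share: float = 0.5) -> Optional[str]:
    """Redundancy: give one task to several workers and keep the most common
    answer, but only if its share of the votes is strictly above min_share
    (a hypothetical threshold chosen for this sketch)."""
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if not cleaned:
        return None
    answer, count = Counter(cleaned).most_common(1)[0]
    return answer if count / len(cleaned) > min_share else None


if __name__ == "__main__":
    print(output_agreement("Dog", "dog "))       # agreement after normalization
    print(output_agreement("dog", "puppy"))      # no agreement, nothing accepted
    print(majority_vote(["cat", "cat", "dog"]))  # two of three workers agree
    print(majority_vote(["cat", "dog"]))         # tie, nothing accepted
```

Requiring a strict majority means a tie yields no answer; a real system might instead send the task to more workers.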


Aggregation

• Collection (to build a knowledge base): artificial intelligence research; building a large database of common-sense facts (e.g., people can’t brush their hair with a table). Examples: ESP Game, reCAPTCHA, FACTory, Verbosity, etc.

• Wisdom of crowds (statistical processing of data): the average guess of ordinary people can be very close to the actual outcome; e.g., Ask500people, News Futures, Iowa Electronic Markets

• Search: a large number of volunteers sift through photos or videos, searching for some desired scientific phenomenon, person, or object; e.g., helpfindjim.com, Stardust@home project

• Iterative improvement: giving a worker the answers of previous workers to elicit better answers; e.g., MonoTrans

• Active learning: classifier training; selects the samples that could potentially give the best training benefit and submits them for manual annotation

• Genetic algorithm (search/optimization): Free Knowledge Exchange, PicBreeder

• None (if an independent task is performed): VizWiz (a mobile app that lets a blind user take a photo and ask a question)
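The "wisdom of crowds" form of aggregation can be sketched in a few lines of Python: collect independent numeric guesses and summarize them statistically. This is only a minimal sketch; the function name `aggregate_guesses` and the sample guesses are invented for illustration and do not come from Ask500people or any other system mentioned above. The median is reported alongside the mean because it is less sensitive to a single extreme guess.

```python
from statistics import mean, median
from typing import Dict, Sequence


def aggregate_guesses(guesses: Sequence[float]) -> Dict[str, float]:
    """Combine independent numeric guesses into crowd estimates."""
    if not guesses:
        raise ValueError("at least one guess is required")
    return {"mean": mean(guesses), "median": median(guesses)}


if __name__ == "__main__":
    # Invented data: five guesses, one of them far off.
    guesses = [70.0, 90.0, 100.0, 110.0, 400.0]
    estimates = aggregate_guesses(guesses)
    print(estimates["mean"])    # pulled upward by the outlier
    print(estimates["median"])  # stays at the middle guess
```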


Human Skills, Processing Order, Task-Request Cardinality

Human Skills

• Visual recognition: ESP Game

• Language understanding: Soylent

• Basic human communication: ChaCha

Processing Order

• Computer → Worker (>> Requester): reCAPTCHA

• Worker (player) → Requester → Computer (aggregation): ESP Game (image labeling)

• Computer → Worker → Requester → Computer: Cyc infers a large number of common-sense facts; FACTory is a GWAP in which workers (players) solve problems; Cyc performs the aggregation

• Requester → Worker: Mechanical Turk

Task-Request Cardinality

• One-to-one (one worker to one task): ChaCha

• Many-to-many (many workers to many tasks): ESP Game

• Many-to-one (many workers to one task): helpfindjim.com (Jim Gray)

• Few-to-one (few workers to one task): VizWiz
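One way to make the six classification dimensions concrete is to record each system as a small data record and query it by dimension. The Python sketch below is an illustration only: the class `HCSystem`, the field names, and the helper `find_systems` are hypothetical, and a field is left as `None` wherever the slides give no value for that system.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class HCSystem:
    """A human computation system described along the six dimensions.
    None means the slides list no value for that dimension."""
    name: str
    motivation: Optional[str] = None
    quality_control: Optional[str] = None
    aggregation: Optional[str] = None
    human_skill: Optional[str] = None
    processing_order: Optional[Tuple[str, ...]] = None
    cardinality: Optional[str] = None


# Values copied from the example columns of the slides.
SYSTEMS = [
    HCSystem("ESP Game", motivation="enjoyment",
             quality_control="output agreement", aggregation="collection",
             human_skill="visual recognition",
             processing_order=("worker", "requester", "computer"),
             cardinality="many-to-many"),
    HCSystem("ChaCha", motivation="pay",
             human_skill="basic human communication",
             cardinality="one-to-one"),
    HCSystem("Mechanical Turk", motivation="pay",
             quality_control="reputation system",
             processing_order=("requester", "worker")),
    HCSystem("VizWiz", aggregation="none", cardinality="few-to-one"),
    HCSystem("helpfindjim.com", motivation="altruism",
             aggregation="search", cardinality="many-to-one"),
]


def find_systems(systems: Sequence[HCSystem], dimension: str, value: object) -> List[str]:
    """Return the names of systems whose given dimension equals value."""
    return [s.name for s in systems if getattr(s, dimension) == value]


if __name__ == "__main__":
    print(find_systems(SYSTEMS, "motivation", "pay"))
    print(find_systems(SYSTEMS, "cardinality", "many-to-many"))
```

Keeping unknown dimensions as `None` rather than guessing a value keeps the records faithful to what the taxonomy tables actually state.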


Crowdsourcing Systems on the World-Wide Web

AnHai Doan, Raghu Ramakrishnan, Alon Y. Halevy

Communications of the ACM, Vol. 54, No. 4, pages 86-96, 2011


Crowdsourcing Systems (CS)

• Defining crowdsourcing systems: tricky
  – Explicit vs. implicit collaboration to solve something?

• CS system “enlists a crowd to help solve a problem defined by the system owners”

• Addressing the following issues:
  – How to recruit and retain users?
  – What contributions can users make?
  – How to combine user contributions to solve the target problem?
  – How to evaluate users and their contributions?


Explicit Crowdsourcing Systems


Implicit Crowdsourcing Systems


Summary

• Definition of human computation and crowdsourcing

• Relationship with other related issues

• Classifying human computation and crowdsourcing systems
  – Motivation, human skill, aggregation, quality control, processing order, task-request cardinality
  – Nature of collaboration, architecture, recruitment, human skill