The CrowdSearch framework


Upload: cubrik-project

Post on 19-Jun-2015


Page 1: The CrowdSearch framework

A Framework for Crowdsourced Multimedia Processing and Querying

Alessandro Bozzon, Ilio Catallo, Eleonora Ciceri, Piero Fraternali, Davide Martinenghi, Marco Tagliasacchi

Page 2: The CrowdSearch framework

CUbRIK Project

CUbRIK is a research project financed by the European Union

Goals:
• Advance the architecture of multimedia search
• Exploit the human contribution in multimedia search
• Use open-source components provided by the community
• Start up a search business ecosystem

http://www.cubrikproject.eu/

Page 3: The CrowdSearch framework

Humans in Multimedia Information Retrieval

Problem: the uncertainty of analysis algorithms leads to low-confidence results and conflicting opinions on automatically extracted features

Solution: humans have superior capacity for understanding the content of audiovisual material

State of the art: humans replace automatic feature extraction processes (human annotations)

Our contribution: integration of human judgment and algorithms

Goal: improve the performance of multimedia content processing

Page 4: The CrowdSearch framework

Example of CUbRIK Human-enhanced computation: Trademark Logo Detection

Problem statement: identifying occurrences of trademark logos in a video collection through keyword-based queries
• Special case of the classic problem of object recognition

Use case: a professional user wants to retrieve all the occurrences of logos in a large collection of video clips

Applications: rating effectiveness of advertising, subliminal advertising detection, automatic annotation, trademark violation detection

Page 5: The CrowdSearch framework

Trademark Logo Detection: problems in automatic logo detection

Problems in automatic logo detection:
• Object recognition is affected by the quality of the input set of images
• Uncertain matches, i.e., the ones with a low matching score, may not contain the searched logo

Page 6: The CrowdSearch framework

Trademark Logo Detection: contribution of human computation

Contribution of human computation:
• Filter the input logos, eliminating the irrelevant ones
• Segment the input logos
• Validate the matching results

Page 7: The CrowdSearch framework

Trademark Logo Detection: pipeline

[Figure: logo detection pipeline]

Page 8: The CrowdSearch framework

The CrowdSearch framework for HC task management

Page 9: The CrowdSearch framework

CrowdSearch framework in the Logo detection application

Types of tasks:
• Automatic tasks
• Crowd tasks: tasks that are executed by an open-ended community of performers

Page 10: The CrowdSearch framework

Community of Performers

The application is deployed as a Facebook application

Seed community: Information Technology department of Politecnico di Milano

Task propagation: each user in the seed community can propagate tasks through the social networks

Page 11: The CrowdSearch framework

Design of “Validate Logo Images”

The “LIKE” task variant requires choosing the relevant logos among a set of unfiltered images

The “ADD” task variant requires adding new relevant image URLs

[Form mock-up: prompt “Please add new relevant logos”, input field “URL…”, button “Send”]

Page 12: The CrowdSearch framework

People to task matching & Task Assignment

Execution criteria: constraints of task execution
• Time budget for the experiment

Content Affinity criteria: query on a representation of the users’ capacities
• Current state: manual selection of users
• Future work: geocultural affinity

Questions are dispatched to the crowd according to the user's experience in answering questions
• Expert user: a user that has already answered three questions
• New users answer “LIKE” questions
• Expert users answer “LIKE” + “ADD” questions
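The experience-based dispatch rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Performer` record and the `allowed_variants` function are hypothetical names, not part of the CrowdSearch framework; only the three-answer threshold and the “LIKE” / “ADD” variants come from the slide.

```python
from dataclasses import dataclass
from typing import List

# From the slide: a user counts as an expert after answering three questions.
EXPERT_THRESHOLD = 3


@dataclass
class Performer:
    """Hypothetical record of a crowd member (illustrative only)."""
    name: str
    answered: int = 0  # number of questions answered so far

    @property
    def is_expert(self) -> bool:
        return self.answered >= EXPERT_THRESHOLD


def allowed_variants(performer: Performer) -> List[str]:
    """Return the task variants that may be dispatched to this performer."""
    if performer.is_expert:
        return ["LIKE", "ADD"]  # experts receive both variants
    return ["LIKE"]             # new users only validate existing logos


if __name__ == "__main__":
    for p in (Performer("new user"), Performer("expert user", answered=3)):
        print(p.name, allowed_variants(p))
```

Keeping the threshold in a single constant makes it easy to experiment with stricter or looser definitions of expertise.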

Page 13: The CrowdSearch framework

Task execution

[Screenshots: “LIKE” task variant and “ADD” task variant]

Page 14: The CrowdSearch framework

Output aggregation

“LIKE” task variants: the top-5 rated logos are selected as relevant logos

“ADD” task variants: new images are fed back to the LIKE tasks
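A minimal sketch of the “LIKE” aggregation step follows. It assumes that a logo's rating is simply the number of likes it collected and that ties are broken by image identifier; neither detail is stated on the slide, and `top_rated_logos` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable, List


def top_rated_logos(likes: Iterable[str], k: int = 5) -> List[str]:
    """Select the k most-liked logo images.

    `likes` holds one image identifier per collected "LIKE" answer.
    Assumption: rating = like count; ties are broken alphabetically
    so that the result is deterministic.
    """
    counts = Counter(likes)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [image for image, _ in ranked[:k]]


if __name__ == "__main__":
    answers = ["logo_a", "logo_b", "logo_a", "logo_c", "logo_b", "logo_a"]
    print(top_rated_logos(answers, k=2))
```

Images contributed through “ADD” tasks would simply be appended to the candidate set shown in later “LIKE” tasks, so they enter the same ranking.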

Page 15: The CrowdSearch framework

Experimental evaluation

Three experimental settings:
• No human intervention
• Logo validation performed by two domain experts
• Inclusion of the actual crowd knowledge

Crowd involvement:
• 40 people involved
• 50 task instances generated
• 70 collected answers

Page 16: The CrowdSearch framework

Experimental evaluation

[Figure: precision/recall plot for Aleve, Chunky and Shout, comparing the No Crowd, Experts and Crowd settings]
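For readers unfamiliar with the two metrics in the plot, the sketch below computes precision and recall with their standard definitions. The function name and the sample frame identifiers are made up for illustration and are not data from the experiment.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Standard precision and recall over sets of item identifiers.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    An empty denominator yields 0.0 by convention here.
    """
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical frames returned by the detector vs. frames that truly show the logo.
    retrieved = {"f1", "f2", "f3", "f4"}
    relevant = {"f1", "f2", "f5"}
    print(precision_recall(retrieved, relevant))
```

Adding more candidate logo images tends to raise recall, while irrelevant additions lower precision.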

Page 17: The CrowdSearch framework

Experimental evaluation

[Figure: precision/recall plot for Aleve, Chunky and Shout, comparing the No Crowd, Experts and Crowd settings]

Precision decreases

Reasons for the wrong inclusion:
• Geographical location of the users
• Expertise of the involved users

Page 18: The CrowdSearch framework

Experimental evaluation

[Figure: precision/recall plot for Aleve, Chunky and Shout, comparing the No Crowd, Experts and Crowd settings]

Precision decreases:
• Similarity between two logos in the data set

Page 19: The CrowdSearch framework

Future directions

Task design:
• Implement new task types (tag / comment / like / add / modify…)
• Partition large task instances into several smaller instances dispatched to multiple users

Task assignment: study how to associate the most suitable request with the most appropriate user
• Implement a ranking function on the worker pool, based on the expertise, geocultural information and past work history of the performers

Task execution: multiple heterogeneous platforms (Facebook, LinkedIn, Twitter, stand-alone application)

More use cases:
• Breaking news
• Fashion trend