The CrowdSearch framework
1. A Framework for Crowdsourced Multimedia Processing and Querying
Alessandro Bozzon, Ilio Catallo, Eleonora Ciceri, Piero Fraternali, Davide Martinenghi, Marco Tagliasacchi
2. CUbRIK Project
CUbRIK is a research project financed by the European Union.
Goals:
• Advance the architecture of multimedia search
• Exploit the human contribution in multimedia search
• Use open-source components provided by the community
• Start up a search business ecosystem
http://www.cubrikproject.eu/
3. Humans in Multimedia Information Retrieval
Problem: the uncertainty of analysis algorithms leads to low-confidence results and conflicting opinions on automatically extracted features
Solution: humans have a superior capacity for understanding the content of audiovisual material
State of the art: humans replace automatic feature extraction processes (human annotations)
Our contribution: integration of human judgment and algorithms
Goal: improve the performance of multimedia content processing
4. Example of CUbRIK human-enhanced computation: Trademark Logo Detection
Problem statement: identifying occurrences of trademark logos in a video collection through keyword-based queries; a special case of the classic object recognition problem
Use case: a professional user wants to retrieve all the occurrences of logos in a large collection of video clips
Applications: rating the effectiveness of advertising, subliminal advertising detection, automatic annotation, trademark violation detection
5. Trademark Logo Detection: problems in automatic logo detection
• Object recognition is affected by the quality of the input set of images
• Uncertain matches, i.e., those with a low matching score, may not contain the searched logo (see the sketch below)
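To make the split concrete, here is a minimal sketch of partitioning automatic matching results by score; the threshold value and all names are hypothetical, assuming scores normalized to [0, 1]:

    # Partition automatic matching results by confidence: matches with a
    # low score are uncertain and become candidates for crowd validation.
    # The threshold is a hypothetical choice, not taken from the slides.
    UNCERTAINTY_THRESHOLD = 0.6

    def partition_matches(matches):
        """matches: list of (frame_id, logo_id, score) tuples, score in [0, 1]."""
        certain, uncertain = [], []
        for frame_id, logo_id, score in matches:
            bucket = certain if score >= UNCERTAINTY_THRESHOLD else uncertain
            bucket.append((frame_id, logo_id, score))
        return certain, uncertain

    # Example: the uncertain matches would be packaged into crowd tasks.
    certain, uncertain = partition_matches(
        [("f1", "aleve", 0.92), ("f2", "shout", 0.41), ("f3", "chunky", 0.58)])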
6. Trademark Logo Detection: contribution of human computation
Human computation can:
• Filter the input logos, eliminating the irrelevant ones
• Segment the input logos
• Validate the matching results
8. The CrowdSearch framework for HC task management
9. CrowdSearch framework in the logo detection application
[Diagram: a problem-solving process composed of tasks; some are executed automatically, others are dispatched to the crowd]
Types of tasks:
• Automatic tasks
• Crowd tasks: tasks that are executed by an open-ended community of performers (sketched below)
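A minimal sketch of the two task types, under the assumption (a hypothetical interface, not the framework's actual API) that automatic tasks wrap an algorithm while crowd tasks accumulate performers' answers:

    # Hypothetical sketch of the two task types in a CrowdSearch-style process.
    class Task:
        def run(self):
            raise NotImplementedError

    class AutomaticTask(Task):
        """A task executed by an algorithm, e.g., logo matching."""
        def __init__(self, algorithm, data):
            self.algorithm, self.data = algorithm, data
        def run(self):
            return self.algorithm(self.data)

    class CrowdTask(Task):
        """A task executed by an open-ended community of performers."""
        def __init__(self, question):
            self.question = question
            self.answers = []
        def collect(self, answer):
            # Called once per performer response.
            self.answers.append(answer)
        def run(self):
            return self.answers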
10. Community of Performers
[Diagram: the community as a graph with content elements, content edges (e.g., IS-A, part-of), performer-to-content edges (e.g., topical group membership), and performer-to-performer edges (e.g., friendship, weak ties)]
The application is deployed as a Facebook application
Seed community: the Information Technology department of Politecnico di Milano
Task propagation: each user in the seed community can propagate tasks through the social networks (see the sketch below)
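Task propagation can be sketched as a bounded spread over the performer graph starting from the seed community; the graph, user names, and hop limit below are all hypothetical:

    from collections import deque

    # Hypothetical performer graph: edges are friendships / weak ties.
    social_graph = {
        "ada": ["bob", "carl"],
        "bob": ["dora"],
        "carl": [],
        "dora": ["eve"],
        "eve": [],
    }

    def propagate(task_id, seed_community, graph, max_hops=2):
        """Spread a task from the seed community through the social network."""
        reached = set(seed_community)
        frontier = deque((user, 0) for user in seed_community)
        while frontier:
            user, hops = frontier.popleft()
            print(f"task {task_id} offered to {user}")
            if hops < max_hops:
                for friend in graph.get(user, []):
                    if friend not in reached:
                        reached.add(friend)
                        frontier.append((friend, hops + 1))
        return reached

    propagate("validate-logo-42", ["ada"], social_graph)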
11. Design of "Validate Logo Images"
The "LIKE" task variant asks performers to choose the relevant logos among a set of unfiltered images
The "ADD" task variant asks performers to contribute new relevant image URLs
[Mock-up: "Please add new relevant logos", a URL input field, and a Send button]
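A sketch of the two variants above as simple data structures; the field names and example URLs are hypothetical:

    from dataclasses import dataclass, field

    @dataclass
    class LikeTask:
        """'LIKE' variant: choose relevant logos among unfiltered images."""
        candidate_images: list                       # image URLs shown to the performer
        liked: list = field(default_factory=list)   # URLs the performer marked relevant

    @dataclass
    class AddTask:
        """'ADD' variant: contribute URLs of new relevant logo images."""
        logo_name: str
        added_urls: list = field(default_factory=list)

    # Example instances for a "Validate Logo Images" run.
    like = LikeTask(candidate_images=["http://example.org/a.png",
                                      "http://example.org/b.png"])
    add = AddTask(logo_name="Aleve")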
12. People-to-task matching & Task assignment
People-to-task matching: task deployment criteria
• Execution criteria: constraints on task execution, e.g., the time budget for the experiment
• Content affinity criteria: a query on a representation of the users' capacities (current state: manual selection of users; future work: geocultural affinity)
Task assignment: questions are dispatched to the crowd according to the users' experience in answering questions (sketched below)
• Expert user: a user who has already answered three questions
• New users answer "LIKE" questions; expert users answer "LIKE" + "ADD" questions
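The assignment rule stated above is small enough to sketch directly; only the three-answer threshold comes from the slides, the function name is hypothetical:

    EXPERT_THRESHOLD = 3  # answered questions needed to qualify as an expert

    def eligible_task_variants(answered_questions: int) -> list:
        """New users get only 'LIKE' tasks; experts also get 'ADD' tasks."""
        if answered_questions >= EXPERT_THRESHOLD:
            return ["LIKE", "ADD"]
        return ["LIKE"]

    assert eligible_task_variants(0) == ["LIKE"]
    assert eligible_task_variants(3) == ["LIKE", "ADD"]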
14. Output aggregation
• "LIKE" task variants: the top-5 rated logos are selected as the relevant logos
• "ADD" task variants: new images are fed back to the "LIKE" tasks (both rules are sketched below)
[Diagram: task execution produces task outputs, which are aggregated into the final result]
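A sketch of the aggregation step, assuming answers arrive as flat lists of image URLs; the top-5 rule and the feedback of added images come from the slides, the data layout is hypothetical:

    from collections import Counter

    def aggregate_likes(like_answers, k=5):
        """like_answers: iterable of image URLs marked relevant by performers.
        Returns the top-k rated logos, which are selected as relevant."""
        return [url for url, _ in Counter(like_answers).most_common(k)]

    def feed_back_added(added_urls, like_candidate_pool):
        """Images from 'ADD' tasks become candidates for future 'LIKE' tasks."""
        for url in added_urls:
            if url not in like_candidate_pool:
                like_candidate_pool.append(url)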
15. Experimental evaluation
Three experimental settings:
• No human intervention
• Logo validation performed by two domain experts
• Inclusion of the actual crowd knowledge
Crowd involvement: 40 people involved, 50 task instances generated, 70 collected answers
17. Experimental evaluation
[Precision-recall plot comparing the No Crowd, Crowd, and Experts settings on the Aleve, Chunky, and Shout logos]
Precision decreases with crowd involvement; reasons for the wrong inclusions:
• Geographical location of the users
• Expertise of the involved users
18. Experimental evaluation
[Precision-recall plot, same settings and logos]
Precision decreases due to the similarity between two logos in the data set
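For reference, the precision and recall values plotted in the two evaluations can be computed as follows from the returned matches and the ground-truth occurrences (the example sets are illustrative, not the experimental data):

    def precision_recall(returned: set, relevant: set):
        """precision = |returned ∩ relevant| / |returned|
        recall    = |returned ∩ relevant| / |relevant|"""
        hits = len(returned & relevant)
        precision = hits / len(returned) if returned else 0.0
        recall = hits / len(relevant) if relevant else 0.0
        return precision, recall

    # Illustrative example: 3 matches returned, 4 true occurrences.
    p, r = precision_recall({"m1", "m2", "m3"}, {"m1", "m2", "m4", "m5"})
    # p == 0.666..., r == 0.5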
19. Future directions
Task design:
• Implement new task types (tag / comment / like / add / modify…)
• Partition large task instances into several smaller instances dispatched to multiple users
Task assignment: study how to associate the most suitable request with the most appropriate user
• Implement a ranking function on the worker pool, based on the expertise, geocultural information, and past work history of the performers
Task execution: support multiple heterogeneous platforms (Facebook, LinkedIn, Twitter, stand-alone application)
More use cases:
• Breaking news
• Fashion trends