CROWDSEARCHER
Marco Brambilla, Stefano Ceri,
Andrea Mauri, Riccardo Volonterio
Politecnico di Milano
Dipartimento di Elettronica, Informazione e BioIngegneria
Crowdsearcher 1
Crowd-based Applications
• Emerging crowd-based applications:
• opinion mining
• localized information gathering
• marketing campaigns
• expert response gathering
• General structure:
• the requestor poses some questions
• a wide set of responders are in charge of providing answers
(typically unknown to the requestor)
• the system organizes a response collection campaign
• Include crowdsourcing and crowdsearching
The “system” is a wide concept
• Crowd-based applications may use social networks and Q&A
websites in addition to crowdsourcing platforms
• Our approach: a coordination engine which keeps an overall
control on the application deployment and execution
CrowdSearcher
• Combines a conceptual framework, a specification
paradigm and a reactive execution control environment
• Supports designing, deploying, and monitoring
applications on top of crowd-based systems
• Design is top-down, platform-independent
• Deployment turns declarative specifications into platform-specific
implementations which include social networks and crowdsourcing
platforms
• Monitoring provides reactive control, which guarantees
applications’ adaptation and interoperability
• Developed in the context of Search Computing
(SeCo, ERC Advanced Grant, 2008-2013)
An example of crowd-based application:
crowd-search
• People do not trust web search completely
• Want to get direct feedback from people
• Expect recommendations, insights, opinions, reassurance
Crowd-searching after conventional
search
• From search results to friends' and experts' feedback
[Figure: initial query, search system, human search system, social platforms]
Example: Find your next job (exploration)
Example: Find your job (social invitation)
Selected data items
can be transferred
to the crowd question
Find your job (response submission)
Crowdsearcher results (in the loop)
Deployment alternatives
• Multi-platform deployment
[Figure: deployment alternatives. Labels: embedded application (embedding), external application (API), standalone application, social/crowd platform with native behaviours, community/crowd, generated query template]
Deployment: search on a social network
• Multi-platform deployment
From social workers to communities
• Issues and problems
• Motivation of the responders
• Intensity of social activity of the asker
• Topic appropriateness
• Timing of the post (hour of the day, day of the
week)
• Context and language barrier
THE MODEL AND THE PROCESS
The Design Process
• A simple task design and deployment process, based on specific data structures
• created using model-driven transformations
• driven by the task specification
• Task Specification: task operations, objects, and performers
• Task Planning: work distribution
• Control Specification: task control policies
Task Specification
• Which are the input objects of the crowd interaction?
• Do they have a schema (record of named and typed fields)?
• Which operations should the crowd perform?
• Like, label, comment, add new instances, verify/modify data, order, etc.
• Who are the performers of the task? How should they be
selected? And invited?
• e.g. push vs pull model
• Which quality criteria should be used for deciding the task
outcome?
• e.g., majority weighting, with/without spam detection
• Which platforms should be used? Which execution
interface should be used?
Operations
• In a Task, performers are required to execute logical operations on input objects
• e.g. Locate the faces of the people appearing in the following 5 images
• CrowdSearcher offers pre-defined operation types:
• Like: Ask a performer to express a preference (true/false)
• e.g. Do you like this picture?
• Comment: Ask a performer to write a description / summary / evaluation
• e.g. Can you summarize the following text using your own words?
• Tag: Ask a performer to annotate an object with a set of tags
• e.g. How would you label the following image?
• Classify: Ask a performer to classify an object within a closed set of alternatives
• e.g. Would you classify this tweet as pro-right, pro-left, or neutral?
• Add: Ask a performer to add a new object conforming to the specified schema
• e.g. Can you list the name and address of good restaurants near Politecnico di Milano?
• Modify: Ask a performer to verify/modify the content of one or more input objects
• e.g. Is this wine from Cinque Terre? If not, where does it come from?
• Order: Ask a performer to order the input objects
• e.g. Order the following books according to your taste
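As an illustration only, the seven operation types above could be modelled as an enumeration plus a small record describing one operation. The class names, the `options` field, and the `validate` helper are assumptions made for this sketch; they are not taken from the CrowdSearcher implementation.

```python
from dataclasses import dataclass
from enum import Enum


class OperationType(Enum):
    """The seven pre-defined operation types listed on the slide."""
    LIKE = "like"
    COMMENT = "comment"
    TAG = "tag"
    CLASSIFY = "classify"
    ADD = "add"
    MODIFY = "modify"
    ORDER = "order"


@dataclass
class Operation:
    """One logical operation on input objects (hypothetical shape)."""
    type: OperationType
    prompt: str
    options: tuple = ()  # closed set of alternatives, used by CLASSIFY only

    def validate(self, answer):
        """Return True if `answer` has the shape this operation type expects."""
        if self.type is OperationType.LIKE:
            return isinstance(answer, bool)
        if self.type is OperationType.CLASSIFY:
            return answer in self.options
        if self.type in (OperationType.TAG, OperationType.ORDER):
            return isinstance(answer, list)
        # COMMENT, ADD, MODIFY carry free-form content; not checked in this sketch
        return answer is not None


classify_tweet = Operation(
    OperationType.CLASSIFY,
    "Would you classify this tweet as pro-right, pro-left, or neutral?",
    options=("pro-right", "pro-left", "neutral"),
)
like_picture = Operation(OperationType.LIKE, "Do you like this picture?")
```

Here `classify_tweet.validate("neutral")` accepts the answer, while a value outside the closed set is rejected.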
Task planning
Typical problems:
• Task structuring: the task is too complex or too critical to
be executed as a single operation.
• Task splitting: the input data collection is too large to be
presented to a user.
• Task routing: a query can be distributed according to the
values of some attribute of the collection.
Micro Tasks
• The actual unit of interaction with a performer.
• Mapping of objects to Micro Tasks:
• How many objects in each MicroTask?
• Which objects should appear in each MicroTask?
• How often an object should appear in MicroTasks?
• Which objects cannot appear together?
• Should objects be presented always in some order?
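One possible answer to the first three mapping questions can be sketched as follows, assuming a fixed number of objects per MicroTask and a fixed redundancy per object. The function name `plan_micro_tasks` and its parameters are hypothetical; constraints such as "objects that cannot appear together" are not handled here.

```python
def plan_micro_tasks(objects, per_task, redundancy):
    """Split `objects` into MicroTasks of at most `per_task` objects.

    Each object appears exactly `redundancy` times overall. Because the
    object list is repeated cyclically and cut into consecutive chunks,
    no object appears twice in the same MicroTask as long as the objects
    are distinct and per_task <= len(objects).
    """
    objects = list(objects)
    if per_task < 1 or per_task > len(objects):
        raise ValueError("per_task must be between 1 and len(objects)")
    sequence = objects * redundancy
    return [sequence[i:i + per_task] for i in range(0, len(sequence), per_task)]


micro_tasks = plan_micro_tasks(["a", "b", "c", "d", "e"], per_task=2, redundancy=3)
```

The last MicroTask may be shorter than the others; a real planner would probably pad it or merge it with its neighbour.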
Assignment Strategy
• Given a set of MicroTasks, which performers are
assigned to them?
• Pull vs Push:
• Pull: The performer chooses
• Push: The performer is chosen
• Online vs offline
• Micro Tasks dynamically assigned to performers
• First come / First served
• Based on performer’s performance
• MicroTasks statically assigned to performers
• Based on performers’ priority
• Based on matching
Invitation Strategy
• The process of inviting performers to perform Micro Tasks
• Can use very different mechanisms
• Essential in order to generate the appropriate performer reaction / reward.
• Examples:
• Send an email to a mailing list
• Publish a HIT on Mechanical Turk
• Create a new challenge in your game
• Publish a post/tweet on your social network profile
• Publish a post/tweet on your friends' profile
Steps in Crowd-based Application Design
1) Task Design
2) Object and Performer Design
3) Micro Task Design
Step 1. Task Design
Step 2: Object and Performer Design
Step 3: MicroTask Design
Complete Meta-Model
Design Tool: Screenshot
Application instantiation (for Italian Politics)
• Given the picture and name of a politician, specify his/her political
affiliation
• No time limit
• Performers are encouraged to look up online
• 2 sets of rules
• Majority Evaluation
• Spammer Detection
REACTIVITY AND
MULTIPLATFORM
Crowd Control is tough…
• There are several aspects that make crowd
engineering complicated
• Task design, planning, assignment
• Workers discovery, assessment, engagement
• Controlling crowdsourcing tasks is a
fundamental issue
• Cost
• Time
• Quality
• Need for higher-level abstractions and tools
Reactive Crowdsourcing
• A conceptual framework for controlling the execution of
crowd-based computations. Based on:
• Control Marts
• Active Rules
• Classical forms of controls:
• Majority control (to close object computations)
• Quality control (to check that quality constraints are met)
• Spam detection (to detect / eliminate some performers)
• Multi-platform adaptation (to change the deployment platform)
• Social adaptation (to change the community of performers)
Why Active Rules?
• Ease of Use: control is easily expressible
• Simple formalism, simple computation
• Power: arbitrarily complex controls are supported
• Extensibility mechanisms
• Automation: active rules can be system-generated
• Well-defined semantics
• Flexibility: localized impact of changes on the rules set
• Control isolation
• Known formal properties descending from known theory
• Termination, confluence
Control Mart
• Data structure for controlling application execution, inspired by data
marts (for data warehousing); content is automatically built from task
specification & planning
• Central entity: MicroTask Object Execution
• Dimensions: Task / Operations, Performer, Object
Auxiliary Structures
• Object: tracking object responses
• Performer: tracking performer behavior (e.g. spammers)
• Task: tracking task status
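A minimal sketch of how the control mart and its auxiliary structures might look as plain records. All class and field names are assumptions for illustration, loosely inspired by the attribute names that appear in the rule examples of this deck (`ClassifiedParty`, `Rep`, `Dem`, `compObj`); the actual schema is generated from the task specification and is not shown in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MicroTaskObjectExecution:
    """Central entity: one performer's answer on one object in one MicroTask."""
    task_id: str
    micro_task_id: str
    performer_id: str
    object_id: str
    classified_party: Optional[str] = None  # filled in when the performer answers


@dataclass
class ObjectControl:
    """Auxiliary structure tracking the responses collected for one object."""
    object_id: str
    task_id: str
    evaluations: int = 0
    rep: int = 0
    dem: int = 0
    current_answer: Optional[str] = None


@dataclass
class PerformerControl:
    """Auxiliary structure tracking performer behaviour (e.g. spammers)."""
    performer_id: str
    evaluations: int = 0
    wrong: int = 0
    spammer: bool = False


@dataclass
class TaskControl:
    """Auxiliary structure tracking task status."""
    task_id: str
    completed_objects: int = 0


def record_answer(execution, object_control):
    """Fold one execution row into the object's control record."""
    object_control.evaluations += 1
    if execution.classified_party == "Republican":
        object_control.rep += 1
    elif execution.classified_party == "Democrat":
        object_control.dem += 1


oc = ObjectControl(object_id="o1", task_id="t1")
record_answer(MicroTaskObjectExecution("t1", "m1", "p1", "o1", "Republican"), oc)
```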
Active Rules Language
• Active rules are expressed on the previous data
structures
• Event-Condition-Action paradigm
• Events: data updates / timer
• ROW-level granularity
• OLD → before state of a row
• NEW → after state of a row
• Condition: a predicate that must be satisfied (e.g. conditions on
control mart attributes)
• Actions: updates on data structures (e.g. change attribute value,
create new instances), special functions (e.g. replan)
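The event-condition-action paradigm described above can be sketched with a very small in-memory engine. This is an assumption-laden illustration, not the CrowdSearcher rule engine: `Engine`, `Rule`, and `count_republican_vote` are hypothetical names, tables are plain dictionaries, and the event fires on any update of a table rather than on a specific attribute. The example rule counts 'Republican' classifications per object.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    table: str                  # e: UPDATE FOR <table>
    condition: Callable[..., bool]   # c: predicate over OLD and NEW rows
    action: Callable[..., None]      # a: updates on data structures


class Engine:
    def __init__(self):
        self.tables = {}   # table name -> {key: row dict}
        self.rules = []

    def update(self, table, key, **changes):
        """Apply a row-level update, then fire every matching rule."""
        rows = self.tables.setdefault(table, {})
        old = dict(rows[key]) if key in rows else None   # OLD: before state
        new = {**(old or {}), **changes}                 # NEW: after state
        rows[key] = new
        for rule in self.rules:   # rules are considered in registration order
            if rule.table == table and rule.condition(old, new):
                rule.action(self, old, new)


def count_republican_vote(engine, old, new):
    """Action: increment the evaluation counter of the classified object."""
    control = engine.tables.get("ObjectControl", {}).get(new["oID"], {})
    engine.update("ObjectControl", new["oID"], Eval=control.get("Eval", 0) + 1)


engine = Engine()
engine.rules.append(Rule(
    table="MicroTaskObjectExecution",
    condition=lambda old, new: new.get("ClassifiedParty") == "Republican",
    action=count_republican_vote,
))
engine.update("MicroTaskObjectExecution", ("p1", "o1"), oID="o1", ClassifiedParty="Republican")
engine.update("MicroTaskObjectExecution", ("p2", "o1"), oID="o1", ClassifiedParty="Democrat")
```

Because actions call `update` themselves, one rule can trigger another, which is why termination becomes a property worth checking.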
Rule Example 1
e: UPDATE FOR μTaskObjectExecution[ClassifiedParty]
c: NEW.ClassifiedParty == 'Republican'
a: SET ObjectControl[oID == NEW.oID].#Eval += 1
Rule Example 2
e: UPDATE FOR ObjectControl
c: (NEW.Rep == 2) or (NEW.Dem == 2)
a: SET Politician[oid == NEW.oid].classifiedParty = NEW.CurAnswer,
   SET TaskControl[tID == NEW.tID].compObj += 1
Rule Programming Best Practice
• We define three classes of rules
• Control rules: modifying the control tables;
• Result rules: modifying the dimension tables (object, performer, task);
• Top-to-bottom, left-to-right evaluation
• Guaranteed termination
• Execution rules: modifying the execution table, either directly or through re-planning
• Termination must be proven (rule precedence graph has cycles)
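The termination argument can be illustrated with a standard check from active-database theory: build a graph in which rule A points to rule B when A's action writes a table whose updates trigger B, and look for cycles. An acyclic graph is sufficient for termination; a cycle does not prove non-termination, only that a separate proof is needed. The rule and table names below are hypothetical, and each rule is reduced to the table it listens on and the tables it writes.

```python
def triggering_graph(rules):
    """rules: name -> (event_table, set of tables written by the action)."""
    graph = {}
    for name, (_event, writes) in rules.items():
        graph[name] = [
            other for other, (other_event, _w) in rules.items()
            if other_event in writes
        ]
    return graph


def has_cycle(graph):
    """Depth-first search with three colours; True if any cycle exists."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# Control and result rules only: writes flow in one direction.
rules = {
    "control_rule": ("Execution", {"ObjectControl"}),
    "result_rule": ("ObjectControl", {"Politician"}),
}
acyclic = not has_cycle(triggering_graph(rules))

# Adding an execution rule that writes back to the execution table closes a loop.
rules_with_execution = dict(rules, execution_rule=("ObjectControl", {"Execution"}))
cyclic = has_cycle(triggering_graph(rules_with_execution))
```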
EXPERIMENTS
Crowdsearcher Experiment 1
• Goal: Test engagement on social networks
• Some 150 users
• Two classes of experiments:
• Random questions on fixed topics: interests (e.g. restaurants in the vicinity of Politecnico), famous 2011 songs, or top-quality EU soccer teams
• Questions manually submitted by the users
• Different invitation strategies:
• Random invitation
• Explicit selection of responders by the asker
• Outcome
• 175 like and insert queries
• 1536 invitations to friends
• 230 answers
• 95 questions (~55%) got at least one answer
Manual and Random Questions
Interest / Rewarding Factor
• Manually written and assigned questions consistently receive more responses over time
Query Type
• Engagement depends on the difficulty of the task
• Like vs. Add tasks:
Comparison of Execution Platforms
• Facebook vs. Doodle
Posting Time
• Facebook vs. Doodle
Crowdsearcher Experiment 2
• GOAL: demonstrate the flexibility and expressive power
of reactive crowdsourcing
• 3 experiments, focused on Italian politicians
• Parties: Human Computation → affiliation classification
• Law: Game With a Purpose → guess the convicted politician
• Order: Pure Game → hot or not
• 1 week (November 2012)
• 284 distinct performers
• Recruited through public mailing lists and social networks
announcements
• 3500 Micro Tasks
Politician Affiliation
• Given the picture and name of a politician, specify his/her political
affiliation
• No time limit
• Performers are encouraged to look up online
• 2 sets of rules
• Majority Evaluation
• Spammer Detection
Results – Majority Evaluation (1/3)
• 30 objects; object redundancy = 9
• Final object classification as simple majority after 7 evaluations
Results – Majority Evaluation (2/3)
• Final object classification as total majority after 3 evaluations
• Otherwise, re-plan of 4 additional evaluations, then simple majority at 7
Results – Majority Evaluation (3/3)
• Final object classification as total majority after 3 evaluations
• Otherwise, simple majority at 5 or at 7 (with re-plan)
Results – Spammer Detection (1/2)
• New rule for spammer detection without ground truth
• Performer correctness on final majority; spammer if > 50% wrong classifications
Results – Spammer Detection (2/2)
• New rule for spammer detection without ground truth
• Performer correctness on current majority; spammer if > 50% wrong classifications
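The two control policies compared in these results can be sketched in a few lines. This is an illustration under stated assumptions, not the rules used in the experiment: "total majority" is read here as unanimity among the first three answers, ties in the simple majority are broken by first occurrence, and the function names `decide` and `find_spammers` are hypothetical.

```python
from collections import Counter


def decide(answers, early=3, final=7):
    """Return the decided label, or None if more evaluations are needed.

    Close early when the first `early` answers agree (assumed meaning of
    'total majority'); otherwise take the simple majority of the first
    `final` answers once they are available.
    """
    if len(answers) >= early and len(set(answers[:early])) == 1:
        return answers[0]
    if len(answers) >= final:
        return Counter(answers[:final]).most_common(1)[0][0]
    return None


def find_spammers(answers_by_performer, majority_labels, threshold=0.5):
    """Flag performers whose share of answers disagreeing with the majority
    exceeds `threshold`. No ground truth is used, only the majority label."""
    flagged = set()
    for performer, answers in answers_by_performer.items():
        judged = [(obj, a) for obj, a in answers.items() if obj in majority_labels]
        wrong = sum(1 for obj, a in judged if a != majority_labels[obj])
        if judged and wrong / len(judged) > threshold:
            flagged.add(performer)
    return flagged


early_close = decide(["A", "A", "A"])
needs_more = decide(["A", "B", "A"])
late_close = decide(["A", "B", "A", "B", "B", "A", "A"])
spammers = find_spammers(
    {"p1": {"o1": "A", "o2": "B"}, "p2": {"o1": "B", "o2": "A"}},
    {"o1": "A", "o2": "B"},
)
```

Passing the final labels gives the "final majority" variant; passing the labels decided so far gives the "current majority" variant.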
EXPERT FINDING IN
CROWDSEARCHER
Problem
• Ranking the members of a social group according
to the level of knowledge that they have about a
given topic
• Application: crowd selection (for Crowd Searching
or Sourcing)
• Available data
• User profile
• Behavioral traces that users leave behind through their social activities
Considered Features
• User Profiles
• Plus Linked Web Pages
• Social Relationships
• Facebook Friendship
• Twitter mutual following relationship
• LinkedIn Connections
• Resource Containers
• Groups, Facebook Pages
• Linked Pages
• Users who are followed by a given user are resource containers
• Resources
• Material published in resource containers
Feature Organization Meta-Model
Example (Facebook)
Example (Twitter)
Resource Distance
• Objects in social graph organized according to their
distance with respect to the user profile
• Why? Privacy, Computational Cost, Platform Access Constraints
Distance  Resource
0         Expert Candidate Profile
1         Expert Candidate owns/creates/annotates Resource
          Expert Candidate relatedTo Resource Container
          Expert Candidate follows UserProfile
2         Expert Candidate follows UserProfile relatedTo Resource Container
          Expert Candidate relatedTo Resource Container contains Resource
          Expert Candidate follows UserProfile owns/creates/annotates Resource
          Expert Candidate follows UserProfile follows UserProfile
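Grouping resources by distance from the candidate's profile amounts to a bounded breadth-first traversal of the social graph. The sketch below assumes the graph is already available as an adjacency mapping in which every relation (owns, relatedTo, follows, contains) is an untyped edge; the node names and the function `resources_by_distance` are hypothetical.

```python
from collections import deque


def resources_by_distance(graph, candidate, max_distance=2):
    """Return {distance: [nodes]} for all nodes within `max_distance` hops."""
    distance = {candidate: 0}
    queue = deque([candidate])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the cut-off
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    levels = {d: [] for d in range(max_distance + 1)}
    for node, d in distance.items():
        levels[d].append(node)
    return levels


graph = {
    "alice": ["alice_post", "group_a", "bob"],      # owns, relatedTo, follows
    "group_a": ["group_post"],                      # contains
    "bob": ["bob_post", "group_b", "carol"],        # owns, relatedTo, follows
    "carol": ["carol_post"],                        # three hops from alice
}
levels = resources_by_distance(graph, "alice")
```

Stopping at distance 2 reflects the privacy, cost, and platform-access reasons given on the slide: `carol_post` is never fetched.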
Distance interpretation
Resource Processing
• Extraction from Social Network APIs
• Extraction of Text from linked Web Pages
• Alchemy Text Extraction APIs
• Language Identification
• Text Processing
• Sanitization, tokenization, stopword removal, lemmatization
• Entity Extraction and Disambiguation
• TagMe
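The text-processing step of this pipeline can be approximated with the standard library alone. The sketch covers only sanitization, tokenization, and stopword removal; lemmatization, language identification, and entity extraction rely on external services or libraries in the original work and are left out. The stopword list is a tiny illustrative sample, and the token pattern keeps ASCII letters and digits only, which is an assumption that would not suit non-English text.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a per-language list.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "in", "to", "is"})


def preprocess(text):
    """Sanitize, tokenize, and remove stopwords from a resource's text."""
    text = re.sub(r"https?://\S+", " ", text)   # sanitization: drop URLs
    text = re.sub(r"<[^>]+>", " ", text)        # sanitization: drop HTML tags
    tokens = re.findall(r"[a-z0-9]+", text.lower())   # tokenization
    return [t for t in tokens if t not in STOPWORDS]  # stopword removal


tokens = preprocess("The <b>best</b> pizza in Milan! http://example.com/x")
```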
Dataset
• 7 kinds of expertise
• Computer Engineering, Location, Movies & TV, Music, Science, Sport, Technology & Videogames
• 40 volunteer users (on Facebook, Twitter, and LinkedIn)
• 330,000 resources (70% with URL to external resources)
• Ground truth created through self-assessment
• For each expertise need, vote on a 7-point Likert scale
• EXPERTS → expertise above average
Metrics
• We obtain lists of candidate experts and assess them
against the ground truth, using:
• For precision:
• Mean Average Precision (MAP)
• 11-Point Interpolated Average Precision (11-P)
• For ranking:
• Mean Reciprocal Rank (MRR) – for the first value
• Normalized Discounted Cumulative Gain (nDCG) – for more values, can be set @N for the first N values
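For reference, the ranking metrics named above can be computed as follows, using their textbook definitions. The inputs are assumptions for the sketch: `ranked` is the list of candidate experts returned for one expertise need, `relevant` is the set of users marked as experts in the ground truth, and `gains` maps users to a graded relevance such as a self-assessed score. MAP and MRR are the means of `average_precision` and `reciprocal_rank` over all expertise needs.

```python
import math


def average_precision(ranked, relevant):
    """Mean of precision values taken at the rank of each relevant item."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at(ranked, gains, n):
    """Normalized DCG over the first `n` positions."""
    dcg = sum(gains.get(item, 0) / math.log2(rank + 1)
              for rank, item in enumerate(ranked[:n], start=1))
    ideal = sorted(gains.values(), reverse=True)[:n]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


ranked = ["u1", "u2", "u3", "u4"]
ap = average_precision(ranked, {"u1", "u3"})      # (1/1 + 2/3) / 2
rr = reciprocal_rank(ranked, {"u1", "u3"})
ndcg = ndcg_at(ranked, {"u1": 3, "u3": 2}, n=4)
```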
Metrics improve with resources
• But it comes with a cost
Friendship Relationship not useful
• Inspecting friends' resources does not improve metrics!
Social Network Analysis
• Comparison of the results obtained with all the social networks together, or separately with Facebook, Twitter, and LinkedIn.
Main Results
• Profiles are less effective than level-1 resources
• Resources produced by others help in describing each individual’s
expertise
• Twitter is the most effective social network for expertise
matching – sometimes it outperforms the other social
networks
• Twitter most effective in Computer Engineering, Science, Technology &
Games, Sport
• Facebook effective in Locations, Sport, Movies & TV, Music
• LinkedIn never very helpful in locating expertise
CONCLUSIONS
Summary
• Results
• An integrated framework for crowdsourcing task design and control
• Well-structured control rules with guarantees of termination
• Support for cross-platform crowd interoperability
• A working prototype: crowdsearcher.search-computing.org
• Forthcoming
• Publication of Web Interface + API
• Support of declarative options for automatic rule generation
• Integration with more social networks and human computation
platforms
• Providing vertical solutions for specific markets
• More applications and experiments (e.g. in Expo 2015)
QUESTIONS?
Crowdsearcher 97

Mais conteúdo relacionado

Destaque

It Only Takes a Minute
It Only Takes a MinuteIt Only Takes a Minute
It Only Takes a Minuteelliottofhook
 
Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...
Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...
Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...Marco Brambilla
 
Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...
Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...
Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...Social Media for Nonprofits
 
Fundchange Koodonation Social Media for Charities and Non-Profits
Fundchange Koodonation Social Media for Charities and Non-ProfitsFundchange Koodonation Social Media for Charities and Non-Profits
Fundchange Koodonation Social Media for Charities and Non-ProfitsIdeavibes | Paul Dombowsky
 
CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...
CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...
CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...erinleebrady
 
Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)
Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)
Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)Frezzy Vinson
 
Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014
Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014
Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014Nima Dokoohaki
 
We Are Social's Guide To Building A Connected Strategy
We Are Social's Guide To Building A Connected StrategyWe Are Social's Guide To Building A Connected Strategy
We Are Social's Guide To Building A Connected StrategyWe Are Social Singapore
 
Social Recommender Systems
Social Recommender SystemsSocial Recommender Systems
Social Recommender Systemsguest77b0cd12
 
Crowd Agents: Interactive Crowd-Powered Systems in the Real World
Crowd Agents:  Interactive Crowd-Powered Systems in the Real WorldCrowd Agents:  Interactive Crowd-Powered Systems in the Real World
Crowd Agents: Interactive Crowd-Powered Systems in the Real WorldJeffrey Bigham
 

Destaque (11)

It Only Takes a Minute
It Only Takes a MinuteIt Only Takes a Minute
It Only Takes a Minute
 
Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...
Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...
Answering Search Queries with CrowdSearcher: a crowdsourcing and social netwo...
 
Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...
Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...
Robert Rosenthal - Social Media & the 3Rs: Content Strategy Basics for Engagi...
 
Fundchange Koodonation Social Media for Charities and Non-Profits
Fundchange Koodonation Social Media for Charities and Non-ProfitsFundchange Koodonation Social Media for Charities and Non-Profits
Fundchange Koodonation Social Media for Charities and Non-Profits
 
CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...
CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...
CSCW 2013 - Investigating the Appropriateness of Social Network Question Aski...
 
Masters thesis defense talk
Masters thesis defense talkMasters thesis defense talk
Masters thesis defense talk
 
Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)
Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)
Introduction to the Social Dimension of Education (gamilla, vinson, sabelo)
 
Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014
Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014
Building Recommendation Systems on Social Data @KTH - FutureFriday - March 2014
 
We Are Social's Guide To Building A Connected Strategy
We Are Social's Guide To Building A Connected StrategyWe Are Social's Guide To Building A Connected Strategy
We Are Social's Guide To Building A Connected Strategy
 
Social Recommender Systems
Social Recommender SystemsSocial Recommender Systems
Social Recommender Systems
 
Crowd Agents: Interactive Crowd-Powered Systems in the Real World
Crowd Agents:  Interactive Crowd-Powered Systems in the Real WorldCrowd Agents:  Interactive Crowd-Powered Systems in the Real World
Crowd Agents: Interactive Crowd-Powered Systems in the Real World
 

Semelhante a CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013

How Does the USA Today Network Provide Its Readers With Meaningful Content? -...
How Does the USA Today Network Provide Its Readers With Meaningful Content? -...How Does the USA Today Network Provide Its Readers With Meaningful Content? -...
How Does the USA Today Network Provide Its Readers With Meaningful Content? -...Lucidworks
 
REQUIREMENT ENGINEERING
REQUIREMENT ENGINEERINGREQUIREMENT ENGINEERING
REQUIREMENT ENGINEERINGSaqib Raza
 
Model driven development and code generation of software systems
Model driven development and code generation of software systemsModel driven development and code generation of software systems
Model driven development and code generation of software systemsMarco Brambilla
 
Introduction to SAD.pptx
Introduction to SAD.pptxIntroduction to SAD.pptx
Introduction to SAD.pptxazida3
 
Recommender Systems @ Scale - PyData 2019
Recommender Systems @ Scale - PyData 2019Recommender Systems @ Scale - PyData 2019
Recommender Systems @ Scale - PyData 2019Sonya Liberman
 
Software Project Management Presentation Final
Software Project Management Presentation FinalSoftware Project Management Presentation Final
Software Project Management Presentation FinalMinhas Kamal
 
Get it right the first time through cheap and easy DIY usability testing
Get it right the first time through cheap and easy DIY usability testingGet it right the first time through cheap and easy DIY usability testing
Get it right the first time through cheap and easy DIY usability testingDesignHammer
 
Get it right the first time through cheap and easy DIY usability testing
Get it right the first time through cheap and easy DIY usability testingGet it right the first time through cheap and easy DIY usability testing
Get it right the first time through cheap and easy DIY usability testingDavid Minton
 
2.Basic Introduction of SDLC Phases and explanation of SDLC Models (1).ppt
2.Basic Introduction of SDLC Phases and explanation of SDLC Models (1).ppt2.Basic Introduction of SDLC Phases and explanation of SDLC Models (1).ppt
2.Basic Introduction of SDLC Phases and explanation of SDLC Models (1).pptabhishekgoyal29250
 
2.Basic Introduction of SDLC Phases and explanation of SDLC Models.ppt
2.Basic Introduction of SDLC Phases and explanation of SDLC Models.ppt2.Basic Introduction of SDLC Phases and explanation of SDLC Models.ppt
2.Basic Introduction of SDLC Phases and explanation of SDLC Models.pptTanuYadav844527
 
software requirement
software requirement software requirement
software requirement nimmik4u
 
CIS375 Interaction Designs Chapter15
CIS375 Interaction Designs Chapter15CIS375 Interaction Designs Chapter15
CIS375 Interaction Designs Chapter15Dr. Ahmed Al Zaidy
 
From Exploration to Construction
 - How to Support the Complex Dynamics of In...
From Exploration to Construction
 - How to Support the Complex Dynamics of In...From Exploration to Construction
 - How to Support the Complex Dynamics of In...
From Exploration to Construction
 - How to Support the Complex Dynamics of In...TimelessFuture
 
How to Build Winning Products by Microsoft Sr. Product Manager
How to Build Winning Products by Microsoft Sr. Product ManagerHow to Build Winning Products by Microsoft Sr. Product Manager
How to Build Winning Products by Microsoft Sr. Product ManagerProduct School
 
Introduction to SAD.pptx
Introduction to SAD.pptxIntroduction to SAD.pptx
Introduction to SAD.pptxazida3
 
Documented Requirements are not Useless After All!
Documented Requirements are not Useless After All!Documented Requirements are not Useless After All!
Documented Requirements are not Useless After All!Lionel Briand
 

Semelhante a CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013 (20)

How Does the USA Today Network Provide Its Readers With Meaningful Content? -...
How Does the USA Today Network Provide Its Readers With Meaningful Content? -...How Does the USA Today Network Provide Its Readers With Meaningful Content? -...



CrowdSearcher: Reactive and Multiplatform Crowdsourcing. Keynote speech at the DBCrowd 2013 workshop @ VLDB 2013

  • 1. CROWDSEARCHER Marco Brambilla, Stefano Ceri, Andrea Mauri, Riccardo Volonterio Politecnico di Milano Dipartimento di Elettronica, Informazione e BioIngegneria Crowdsearcher 1
  • 2. Crowd-based Applications
      • Emerging crowd-based applications:
          • opinion mining
          • localized information gathering
          • marketing campaigns
          • expert response gathering
      • General structure:
          • the requestor poses some questions
          • a wide set of responders (typically unknown to the requestor) is in charge of providing answers
          • the system organizes a response collection campaign
      • Include crowdsourcing and crowdsearching
  • 3. The “system” is a wide concept
      • Crowd-based applications may use social networks and Q&A websites in addition to crowdsourcing platforms
      • Our approach: a coordination engine which keeps overall control of the application deployment and execution
      • [Figure labels: CrowdSearcher, API, Access]
  • 4. CrowdSearcher
      • Combines a conceptual framework, a specification paradigm and a reactive execution control environment
      • Supports designing, deploying, and monitoring applications on top of crowd-based systems
          • Design is top-down, platform-independent
          • Deployment turns declarative specifications into platform-specific implementations which include social networks and crowdsourcing platforms
          • Monitoring provides reactive control, which guarantees applications’ adaptation and interoperability
      • Developed in the context of Search Computing (SeCo, ERC Advanced Grant, 2008-2013)
  • 5. An example of crowd-based application: crowd-search
      • People do not trust web search completely
      • Want to get direct feedback from people
      • Expect recommendations, insights, opinions, reassurance
  • 6. Crowd-searching after conventional search
      • From search results to feedback from friends and experts
      • [Figure labels: initial query, Search System, Human Search System, Social Platform]
  • 7. Example: Find your next job (exploration)
  • 8. Example: Find your job (social invitation)
  • 9. Example: Find your job (social invitation): selected data items can be transferred to the crowd question
  • 10. Find your job (response submission)
  • 11. Crowdsearcher results (in the loop)
  • 12. Deployment alternatives
      • Multi-platform deployment
      • [Figure labels: Embedded application, External application, Standalone application, Social / Crowd platform, Native behaviours, API, Embedding, Community / Crowd, Generated query template, Native]
  • 13. Deployment: search on a social network • Multi-platform deployment
  • 14. Deployment: search on the social network • Multi-platform deployment
  • 15. Deployment: search on the social network • Multi-platform deployment
  • 16. Deployment: search on the social network • Multi-platform deployment
  • 17. From social workers to communities
      • Issues and problems:
          • Motivation of the responders
          • Intensity of social activity of the asker
          • Topic appropriateness
          • Timing of the post (hour of the day, day of the week)
          • Context and language barrier
  • 19. The Design Process
      • A simple task design and deployment process, based on specific data structures
          • created using model-driven transformations
          • driven by the task specification
      • Task Specification: task operations, objects, and performers
      • Task Planning: work distribution
      • Control Specification: task control policies
  • 20. Task Specification
      • Which are the input objects of the crowd interaction?
          • Do they have a schema (record of named and typed fields)?
      • Which operations should the crowd perform?
          • Like, label, comment, add new instances, verify/modify data, order, etc.
      • Who are the performers of the task? How should they be selected? And invited?
          • e.g. push vs pull model
      • Which quality criteria should be used for deciding the task outcome?
          • e.g. majority weighting, with/without spam detection
      • Which platforms should be used? Which execution interface should be used?
  • 21. Operations
      • In a Task, performers are required to execute logical operations on input objects
          • e.g. Locate the faces of the people appearing in the following 5 images
      • CrowdSearcher offers pre-defined operation types:
          • Like: ask a performer to express a preference (true/false)
              • e.g. Do you like this picture?
          • Comment: ask a performer to write a description / summary / evaluation
              • e.g. Can you summarize the following text using your own words?
          • Tag: ask a performer to annotate an object with a set of tags
              • e.g. How would you label the following image?
          • Classify: ask a performer to classify an object within a closed set of alternatives
              • e.g. Would you classify this tweet as pro-right, pro-left, or neutral?
          • Add: ask a performer to add a new object conforming to the specified schema
              • e.g. Can you list the name and address of good restaurants near Politecnico di Milano?
          • Modify: ask a performer to verify/modify the content of one or more input objects
              • e.g. Is this wine from Cinque Terre? If not, where does it come from?
          • Order: ask a performer to order the input objects
              • e.g. Order the following books according to your taste
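The operation types on this slide can be sketched as a small data model. This is only an illustration: the class names `OperationType` and `Operation`, the `validate` helper, and the answer shapes it checks are assumptions, not part of the CrowdSearcher API.

```python
from dataclasses import dataclass, field
from enum import Enum


class OperationType(Enum):
    """The seven pre-defined operation types listed on the slide."""
    LIKE = "like"
    COMMENT = "comment"
    TAG = "tag"
    CLASSIFY = "classify"
    ADD = "add"
    MODIFY = "modify"
    ORDER = "order"


@dataclass
class Operation:
    """Hypothetical container for one logical operation of a task."""
    op_type: OperationType
    prompt: str
    # Only meaningful for CLASSIFY: the closed set of alternatives.
    alternatives: list = field(default_factory=list)

    def validate(self, answer):
        """Return True if `answer` has the shape this operation expects."""
        if self.op_type is OperationType.LIKE:
            return isinstance(answer, bool)
        if self.op_type is OperationType.CLASSIFY:
            return answer in self.alternatives
        if self.op_type is OperationType.TAG:
            return isinstance(answer, (list, set)) and all(
                isinstance(tag, str) for tag in answer
            )
        if self.op_type is OperationType.COMMENT:
            return isinstance(answer, str) and answer.strip() != ""
        # ADD, MODIFY and ORDER depend on the object schema, not modelled here.
        return True


classify = Operation(
    OperationType.CLASSIFY,
    "Would you classify this tweet as pro-right, pro-left, or neutral?",
    ["pro-right", "pro-left", "neutral"],
)
like = Operation(OperationType.LIKE, "Do you like this picture?")
```

Keeping the operation type as an enum makes the closed set of pre-defined operations explicit, while schema-dependent checks (Add, Modify, Order) are left out of this sketch.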
  • 22. Task planning
      • Typical problems:
          • Task structuring: the task is too complex or too critical to be executed as a single operation.
          • Task splitting: the input data collection is too large to be presented to a user.
          • Task routing: a query can be distributed according to the values of some attribute of the collection.
  • 23. Micro Tasks
      • The actual unit of interaction with a performer.
      • Mapping of objects to Micro Tasks:
          • How many objects in each MicroTask?
          • Which objects should appear in each MicroTask?
          • How often should an object appear in MicroTasks?
          • Which objects cannot appear together?
          • Should objects always be presented in some order?
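One possible answer to the first three mapping questions is sketched below: every object appears a fixed number of times (its redundancy) and each MicroTask holds a fixed number of objects. The function name `plan_micro_tasks` and its parameters are hypothetical, and the input objects are assumed to be distinct; the slides do not prescribe this particular strategy.

```python
def plan_micro_tasks(objects, objects_per_task, redundancy):
    """Split `objects` into micro tasks so each object appears `redundancy` times.

    Assumes the objects are distinct. Because a micro task never holds more
    objects than there are distinct objects, no object is repeated inside
    the same micro task.
    """
    objects = list(objects)
    if not 1 <= objects_per_task <= len(objects):
        raise ValueError("objects_per_task must be between 1 and len(objects)")
    if redundancy < 1:
        raise ValueError("redundancy must be at least 1")
    # Repeat the whole collection `redundancy` times, then cut it into chunks.
    sequence = objects * redundancy
    return [
        sequence[start:start + objects_per_task]
        for start in range(0, len(sequence), objects_per_task)
    ]


micro_tasks = plan_micro_tasks(["o1", "o2", "o3", "o4", "o5"], 2, 3)
```

The last MicroTask may be smaller than the others when the total number of object slots is not a multiple of `objects_per_task`. Constraints such as "objects that cannot appear together" or a fixed presentation order would need additional logic.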
  • 24. Assignment Strategy
      • Given a set of MicroTasks, which performers are assigned to them?
      • Pull vs Push:
          • Pull: the performer chooses
          • Push: the performer is chosen
      • Online vs offline:
          • Online: MicroTasks dynamically assigned to performers
              • First come / first served
              • Based on performer’s performance
          • Offline: MicroTasks statically assigned to performers
              • Based on performers’ priority
              • Based on matching
  • 25. Invitation Strategy
      • The process of inviting performers to perform Micro Tasks
      • Can use very different mechanisms
      • Essential in order to generate the appropriate performer reaction / reward
      • Examples:
          • Send an email to a mailing list
          • Publish a HIT on Mechanical Turk
          • Create a new challenge in your game
          • Publish a post/tweet on your social network profile
          • Publish a post/tweet on your friends' profile
  • 26. Steps in Crowd-based Application Design
      1) Task Design
      2) Object and Performer Design
      3) Micro Task Design
  • 27. Step 1: Task Design
  • 28. Step 2: Object and Performer Design
  • 29. Step 3: MicroTask Design
  • 32. Application instantiation (for Italian Politics)
      • Given the picture and name of a politician, specify his/her political affiliation
          • No time limit
          • Performers are encouraged to look up online
      • 2 sets of rules
          • Majority Evaluation
          • Spammer Detection
  • 35. Crowd Control is tough…
      • There are several aspects that make crowd engineering complicated
          • Task design, planning, assignment
          • Workers discovery, assessment, engagement
      • Controlling crowdsourcing tasks is a fundamental issue
          • Cost
          • Time
          • Quality
      • Need for higher-level abstractions and tools
  • 36. Reactive Crowdsourcing
      • A conceptual framework for controlling the execution of crowd-based computations. Based on:
          • Control Marts
          • Active Rules
      • Classical forms of controls:
          • Majority control (to close object computations)
          • Quality control (to check that quality constraints are met)
          • Spam detection (to detect / eliminate some performers)
          • Multi-platform adaptation (to change the deployment platform)
          • Social adaptation (to change the community of performers)
  • 37. Why Active Rules?
      • Ease of use: control is easily expressible
          • Simple formalism, simple computation
      • Power: arbitrarily complex controls are supported
          • Extensibility mechanisms
      • Automation: active rules can be system-generated
          • Well-defined semantics
      • Flexibility: localized impact of changes on the rule set
          • Control isolation
      • Known formal properties descending from known theory
          • Termination, confluence
  • 38. Control Mart
      • Data structure for controlling application execution, inspired by data marts (for data warehousing); content is automatically built from task specification & planning
      • Central entity: MicroTask Object Execution
      • Dimensions: Task / Operations, Performer, Object
  • 39. Auxiliary Structures
      • Object: tracking object responses
      • Performer: tracking performer behavior (e.g. spammers)
      • Task: tracking task status
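The control mart and its auxiliary structures can be pictured as a handful of record types. The sketch below is an assumption-laden illustration: the class and attribute names are invented for readability (only counters such as the number of evaluations and of completed objects are hinted at by the rule examples in the slides), and `build_control_mart` is a hypothetical helper showing how content might be derived from task planning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MicroTaskObjectExecution:
    """Central entity: one answer slot for one object inside one micro task."""
    micro_task_id: str
    task_id: str
    object_id: str
    performer_id: Optional[str] = None  # unknown until a performer answers
    answer: Optional[str] = None


@dataclass
class ObjectControl:
    """Auxiliary structure tracking the responses collected for an object."""
    object_id: str
    task_id: str
    evaluations: int = 0
    current_answer: Optional[str] = None


@dataclass
class PerformerControl:
    """Auxiliary structure tracking performer behaviour (e.g. spammers)."""
    performer_id: str
    evaluations: int = 0
    wrong: int = 0


@dataclass
class TaskControl:
    """Auxiliary structure tracking the overall task status."""
    task_id: str
    completed_objects: int = 0


def build_control_mart(task_id, micro_tasks):
    """Derive control rows from planning output.

    `micro_tasks` maps a micro task id to the list of object ids it contains.
    """
    executions = [
        MicroTaskObjectExecution(mt_id, task_id, obj)
        for mt_id, objs in micro_tasks.items()
        for obj in objs
    ]
    object_controls = {
        obj: ObjectControl(obj, task_id)
        for objs in micro_tasks.values()
        for obj in objs
    }
    return executions, object_controls, TaskControl(task_id)


executions, object_controls, task_control = build_control_mart(
    "t1", {"mt1": ["o1", "o2"], "mt2": ["o2", "o3"]}
)
```

`PerformerControl` rows are not created up front in this sketch, since performers are only known once they start answering.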
  • 43. Active Rules Language
      • Active rules are expressed on the previous data structures
      • Event-Condition-Action paradigm
      • Events: data updates / timer
          • ROW-level granularity
          • OLD → before state of a row
          • NEW → after state of a row
      • Condition: a predicate that must be satisfied (e.g. conditions on control mart attributes)
      • Actions: updates on data structures (e.g. change attribute value, create new instances), special functions (e.g. replan)
      • Example:
          e: UPDATE FOR μTaskObjectExecution[ClassifiedParty]
          c: NEW.ClassifiedParty == 'Republican'
          a: SET ObjectControl[oID == NEW.oID].#Eval += 1
  • 44. Rule Example 1
          e: UPDATE FOR μTaskObjectExecution[ClassifiedParty]
          c: NEW.ClassifiedParty == 'Republican'
          a: SET ObjectControl[oID == NEW.oID].#Eval += 1
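To make the Event-Condition-Action reading of Rule Example 1 concrete, here is a minimal in-memory sketch. It is not the CrowdSearcher rule engine: the `Rule` and `Engine` classes, the dictionary-based tables, and the attribute name `Eval` (standing in for `#Eval`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    table: str       # event: table whose row updates trigger the rule
    attribute: str   # event: attribute that must be among the updated ones
    condition: Callable[[dict, dict], bool]            # predicate over (OLD, NEW)
    action: Callable[["Engine", dict, dict], None]     # effect given (engine, OLD, NEW)


class Engine:
    """Toy row-level ECA engine over dictionaries (illustrative only)."""

    def __init__(self):
        self.tables = {}  # table name -> {row key: row dict}
        self.rules = []

    def update(self, table, key, **changes):
        """Apply a row-level update, then fire every rule whose event matches."""
        rows = self.tables.setdefault(table, {})
        old = dict(rows.get(key, {}))   # OLD: state before the update
        new = {**old, **changes}        # NEW: state after the update
        rows[key] = new
        for rule in self.rules:
            if (rule.table == table and rule.attribute in changes
                    and rule.condition(old, new)):
                rule.action(self, old, new)


def count_evaluation(engine, old, new):
    """a: SET ObjectControl[oID == NEW.oID].#Eval += 1"""
    current = engine.tables.get("ObjectControl", {}).get(new["oID"], {})
    engine.update("ObjectControl", new["oID"], Eval=current.get("Eval", 0) + 1)


engine = Engine()
engine.rules.append(Rule(
    table="MicroTaskObjectExecution",
    attribute="ClassifiedParty",
    condition=lambda old, new: new.get("ClassifiedParty") == "Republican",
    action=count_evaluation,
))

# Two performers answer on the same object; only the first satisfies the condition.
engine.update("MicroTaskObjectExecution", ("mt1", "o1"),
              oID="o1", ClassifiedParty="Republican")
engine.update("MicroTaskObjectExecution", ("mt2", "o1"),
              oID="o1", ClassifiedParty="Democrat")
```

Because actions are themselves updates, one rule can trigger another; this is why termination becomes a concern once rules can cascade.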
  • 47. Rule Example 2
          e: UPDATE FOR ObjectControl
          c: (NEW.Rep == 2) or (NEW.Dem == 2)
          a: SET Politician[oid == NEW.oid].classifiedParty = NEW.CurAnswer,
             SET TaskControl[tID == NEW.tID].compObj += 1
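Rule Example 2 closes an object once one party has collected two votes. A rough Python rendering follows, assuming the control tables are plain dictionaries keyed by `oID` and `tID`; the function name and the `threshold` parameter are hypothetical.

```python
def on_object_control_update(new_row, politicians, task_control, threshold=2):
    """Mimic Rule Example 2 for one updated ObjectControl row (NEW).

    c: (NEW.Rep == 2) or (NEW.Dem == 2)
    a: store the current answer on the politician and count the object as completed.
    Returns True when the rule fired.
    """
    if new_row["Rep"] == threshold or new_row["Dem"] == threshold:
        politicians[new_row["oID"]]["classifiedParty"] = new_row["CurAnswer"]
        task_control[new_row["tID"]]["compObj"] += 1
        return True
    return False


politicians = {"o1": {"name": "politician-1", "classifiedParty": None}}
task_control = {"t1": {"compObj": 0}}

# First vote: no party has reached the threshold yet.
fired_first = on_object_control_update(
    {"oID": "o1", "tID": "t1", "Rep": 1, "Dem": 0, "CurAnswer": "Republican"},
    politicians, task_control,
)
# Second agreeing vote: the object is closed.
fired_second = on_object_control_update(
    {"oID": "o1", "tID": "t1", "Rep": 2, "Dem": 0, "CurAnswer": "Republican"},
    politicians, task_control,
)
```

Note that this sketch would fire again if the same row were updated later while one counter still equals the threshold; a real deployment would presumably mark the object as closed to avoid counting it twice.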
  • 54. Rule Programming Best Practice
      • We define three classes of rules
          • Control rules: modifying the control tables
          • Result rules: modifying the dimension tables (object, performer, task)
      • Top-to-bottom, left-to-right evaluation
      • Guaranteed termination
  • 55. Rule Programming Best Practice
      • We define three classes of rules
          • Control rules: modifying the control tables
          • Result rules: modifying the dimension tables (object, performer, task)
          • Execution rules: modifying the execution table, either directly or through re-planning
      • Termination must be proven (rule precedence graph has cycles)
  • 57. Crowdsearcher Experiment 1
      • Goal: test engagement on social networks
      • Some 150 users
      • Two classes of experiments:
          • Random questions on fixed topics: interests (e.g. restaurants in the vicinity of Politecnico), famous 2011 songs, or top-quality EU soccer teams
          • Questions manually submitted by the users
      • Different invitation strategies:
          • Random invitation
          • Explicit selection of responders by the asker
      • Outcome
          • 175 like and insert queries
          • 1536 invitations to friends
          • 230 answers
          • 95 questions (~55%) got at least one answer
  • 58. Manual and Random Questions
  • 59. Interest / Rewarding Factor
      • Manually written and assigned questions consistently receive more responses over time
  • 60. Query Type
      • Engagement depends on the difficulty of the task
      • Like vs. Add tasks
  • 61. Comparison of Execution Platforms
      • Facebook vs. Doodle
  • 62. Posting Time
      • Facebook vs. Doodle
  • 63. Crowdsearcher Experiment 2
      • Goal: demonstrate the flexibility and expressive power of reactive crowdsourcing
      • 3 experiments, focused on Italian politicians
          • Parties: Human Computation → affiliation classification
          • Law: Game With a Purpose → guess the convicted politician
          • Order: Pure Game → hot or not
      • 1 week (November 2012)
      • 284 distinct performers
          • Recruited through public mailing lists and social network announcements
      • 3500 Micro Tasks
  • 64. Politician Affiliation
      • Given the picture and name of a politician, specify his/her political affiliation
          • No time limit
          • Performers are encouraged to look up online
      • 2 sets of rules
          • Majority Evaluation
          • Spammer Detection
  • 65. Results – Majority Evaluation (1/3)
      • 30 objects; object redundancy = 9
      • Final object classification as simple majority after 7 evaluations
  • 66. Results – Majority Evaluation (2/3)
      • Final object classification as total majority after 3 evaluations
      • Otherwise, re-plan of 4 additional evaluations, then simple majority at 7
  • 67. Results – Majority Evaluation (3/3)
      • Final object classification as total majority after 3 evaluations
      • Otherwise, simple majority at 5 or at 7 (with replan)
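The second majority strategy (close at 3 evaluations if everyone agrees, otherwise re-plan 4 more and close at 7) can be sketched as a small decision function. Two readings are assumed here and are not confirmed by the slides: "total majority" is taken to mean a unanimous answer, and "simple majority" is taken to mean the most frequent answer, with ties resolved arbitrarily. The function name `decide` and its return values are hypothetical.

```python
from collections import Counter


def decide(answers, first_check=3, extra=4):
    """Decide what to do with an object given the answers collected so far.

    Returns one of:
      ("closed", label)   the object classification is final
      ("replan", n)       plan n additional evaluations
      ("wait", None)      keep collecting answers
    """
    n = len(answers)
    if n == 0:
        return ("wait", None)
    label, top = Counter(answers).most_common(1)[0]
    if n == first_check:
        if top == first_check:          # assumed meaning of "total majority"
            return ("closed", label)
        return ("replan", extra)        # 3 + 4 = 7 evaluations in total
    if n >= first_check + extra:
        return ("closed", label)        # assumed meaning of "simple majority"
    return ("wait", None)
```

For example, `decide(["PD", "PD", "PD"])` closes the object immediately, while `decide(["PD", "PdL", "PD"])` asks for four more evaluations. The third strategy on the slides would add an intermediate check at 5 evaluations, which is not modelled here.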
  • 68. Results – Spammer Detection_1/2
      • New rule for spammer detection without ground truth
      • Performer correctness on final majority
      • Spammer if > 50% wrong classifications
  • 69. Results – Spammer Detection_1/2
      • New rule for spammer detection without ground truth
      • Performer correctness on current majority
      • Spammer if > 50% wrong classifications
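The spammer rule (no ground truth, correctness measured against the majority, more than 50% wrong answers) can be approximated offline as follows. This is a batch sketch rather than an active rule: the function name `detect_spammers` and the tuple layout of `executions` are assumptions, and the majority per object is simply the most frequent answer seen so far.

```python
from collections import Counter, defaultdict


def detect_spammers(executions, threshold=0.5):
    """Flag performers whose share of answers disagreeing with the majority exceeds `threshold`.

    `executions` is an iterable of (performer_id, object_id, answer) tuples.
    """
    executions = list(executions)

    # Majority answer per object, computed from all answers collected so far.
    answers_by_object = defaultdict(list)
    for _, obj, answer in executions:
        answers_by_object[obj].append(answer)
    majority = {
        obj: Counter(answers).most_common(1)[0][0]
        for obj, answers in answers_by_object.items()
    }

    # Count total and wrong answers per performer.
    total = Counter()
    wrong = Counter()
    for performer, obj, answer in executions:
        total[performer] += 1
        if answer != majority[obj]:
            wrong[performer] += 1

    return {p for p in total if wrong[p] / total[p] > threshold}


sample = [
    ("p1", "o1", "A"), ("p2", "o1", "A"), ("p3", "o1", "B"),
    ("p1", "o2", "B"), ("p2", "o2", "B"), ("p3", "o2", "A"),
    ("p1", "o3", "A"), ("p2", "o3", "A"), ("p3", "o3", "B"),
]
spammers = detect_spammers(sample)
```

Evaluating against the current majority allows early detection while the task is running; evaluating against the final majority requires waiting until objects are closed.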
  • 71. Problem
      • Ranking the members of a social group according to the level of knowledge that they have about a given topic
      • Application: crowd selection (for Crowd Searching or Sourcing)
      • Available data
          • User profile
          • Behavioral trace that users leave behind through their social activities
  • 72. Considered Features
      • User Profiles
          • Plus linked Web pages
      • Social Relationships
          • Facebook friendship
          • Twitter mutual following relationship
          • LinkedIn connections
      • Resource Containers
          • Groups, Facebook Pages
          • Linked pages
          • Users who are followed by a given user are resource containers
      • Resources
          • Material published in resource containers
  • 76. Resource Distance
      • Objects in the social graph are organized according to their distance from the user profile
      • Why? Privacy, computational cost, platform access constraints
      • Distance 0:
          • Expert Candidate Profile
      • Distance 1:
          • Expert Candidate owns/create/annotates Resource
          • Expert Candidate relatedTo Resource Container
          • Expert Candidate follows UserProfile
      • Distance 2:
          • Expert Candidate follows UserProfile relatedTo Resource Container
          • Expert Candidate relatedTo Resource Container contains Resource
          • Expert Candidate follows UserProfile owns/create/annotates Resource
          • Expert Candidate follows UserProfile follows UserProfile
  • 77. Distance interpretation (same distance table as slide 76)
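If the social graph is viewed as nodes connected by relationships (owns, relatedTo, follows, contains), the distance of a resource is the number of hops from the expert candidate. The breadth-first sketch below illustrates that reading; the adjacency-dictionary format, the node names, and the function `resources_within` are assumptions made for the example.

```python
from collections import deque


def resources_within(graph, candidate, max_distance):
    """Return {node: distance} for every node within `max_distance` hops of `candidate`.

    `graph` maps a node to the nodes it is directly related to
    (owned resources, related containers, followed profiles, ...).
    """
    distances = {candidate: 0}
    queue = deque([candidate])
    while queue:
        node = queue.popleft()
        if distances[node] == max_distance:
            continue  # do not expand beyond the allowed distance
        for neighbour in graph.get(node, []):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


toy_graph = {
    "alice": ["tweet1", "groupA", "bob"],   # distance 1 from alice
    "groupA": ["post9"],                    # container contains a resource
    "bob": ["tweet7", "carol"],             # followed profile and what it reaches
    "carol": ["tweet8"],                    # would be distance 3, so excluded
}
reachable = resources_within(toy_graph, "alice", 2)
```

Capping the traversal at a small distance reflects the motivations given on the slide: privacy, computational cost, and platform access constraints.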
  • 78. Resource Processing
      • Extraction from Social Network APIs
      • Extraction of text from linked Web pages
          • Alchemy Text Extraction APIs
      • Language identification
      • Text processing
          • Sanitization, tokenization, stopword removal, lemmatization
      • Entity extraction and disambiguation
          • TagMe
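The text-processing step can be illustrated with the standard library alone. The sketch covers only sanitization, tokenization, and stopword removal; lemmatization, language identification, and entity extraction rely on external tools and are deliberately left out. The `STOPWORDS` list and the `preprocess` function are illustrative assumptions, not the pipeline used in the experiments.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a per-language list.
STOPWORDS = frozenset({
    "a", "an", "the", "and", "or", "of", "in", "on", "to", "is", "are", "for", "with",
})


def preprocess(text):
    """Sanitize, tokenize, and remove stopwords from a resource's text."""
    text = re.sub(r"https?://\S+", " ", text)   # sanitization: drop URLs
    text = re.sub(r"<[^>]+>", " ", text)        # sanitization: drop HTML tags
    tokens = re.findall(r"[a-z0-9]+", text.lower())  # tokenization
    return [token for token in tokens if token not in STOPWORDS]


tokens = preprocess("<b>The</b> best restaurants in Milano: http://example.com/x")
```

The resulting token list is what later stages, such as entity extraction, would consume.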
  • 79. Dataset
      • 7 kinds of expertise
          • Computer Engineering, Location, Movies & TV, Music, Science, Sport, Technology & Videogames
      • 40 volunteer users (on Facebook, Twitter and LinkedIn)
      • 330,000 resources (70% with URL to external resources)
      • Ground truth created through self-assessment
          • For each expertise need, vote on a 7-point Likert scale
          • EXPERTS → expertise above average
  • 80. Metrics
      • We obtain lists of candidate experts and assess them against the ground truth, using:
      • For precision:
          • Mean Average Precision (MAP)
          • 11-Point Interpolated Average Precision (11-P)
      • For ranking:
          • Mean Reciprocal Rank (MRR): for the first value
          • Normalized Discounted Cumulative Gain (nDCG): for more values, can be set @N for the first N values
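For reference, the standard textbook definitions of three of these metrics can be written in a few lines. The sketch computes the per-query quantities (average precision, reciprocal rank, nDCG@N); MAP and MRR are their means over all queries. Function names and the example data are illustrative, and the exact variants used in the experiments are not stated in the slides.

```python
import math


def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k at which a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at(ranked, gains, n):
    """Normalized discounted cumulative gain over the first n results.

    `gains` maps an item to its graded relevance; missing items count as 0.
    """
    dcg = sum(
        gains.get(item, 0) / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:n], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:n]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


def mean(values):
    """Mean over queries, used to obtain MAP and MRR."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


ranked = ["u3", "u1", "u2", "u4"]   # candidate experts, best first
experts = {"u1", "u2"}              # ground-truth experts for this query
ap = average_precision(ranked, experts)
rr = reciprocal_rank(ranked, experts)
ndcg = ndcg_at(ranked, {"u1": 1, "u2": 1}, 4)
```

In this example the first expert appears at rank 2, so the reciprocal rank is 0.5, and the average precision is (1/2 + 2/3) / 2.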
  • 81. Metrics improve with resources
      • But this comes at a cost
  • 82. Friendship relationship not useful
      • Inspecting friends’ resources does not improve metrics!
  • 83. Social Network Analysis
      • A comparison of the results obtained with all the social networks together, or separately with Facebook, Twitter, and LinkedIn
  • 84. Main Results
      • Profiles are less effective than level-1 resources
          • Resources produced by others help in describing each individual’s expertise
      • Twitter is the most effective social network for expertise matching; sometimes it outperforms the other social networks
          • Twitter is most effective in Computer Engineering, Science, Technology & Games, Sport
      • Facebook is effective in Locations, Sport, Movies & TV, Music
      • LinkedIn is never very helpful in locating expertise
  • 86. Summary
      • Results
          • An integrated framework for crowdsourcing task design and control
          • Well-structured control rules with guarantees of termination
          • Support for cross-platform crowd interoperability
          • A working prototype → crowdsearcher.search-computing.org
      • Forthcoming
          • Publication of Web Interface + API
          • Support of declarative options for automatic rule generation
          • Integration with more social networks and human computation platforms
          • Providing vertical solutions for specific markets
          • More applications and experiments (e.g. in Expo 2015)