1. Mechanical Cheat
Spamming Schemes and Adversarial
Techniques on Crowdsourcing Platforms
Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux
University of Fribourg, Switzerland
2. Popularity and Monetary Incentives
Micro-task crowdsourcing is growing in popularity.
~500k registered workers on AMT
~200k HITs available (April 2012)
~$20k of rewards (April 2012)
4. Some Experimental Results:
Entity Link Selection (ZenCrowd – WWW 2012)
Evidence of participation by dishonest workers, who spend
less time, complete more tasks, and achieve lower quality.
5. Dishonest Answers on Crowdsourcing
Platforms
We define a dishonest answer in a crowdsourcing context as an
answer that has been either:
Randomly posted.
Artificially generated.
Duplicated from another source.
6. How can requesters perform quality
control?
Go over all the submissions?
Blindly accept all submissions?
Use selection and filtering algorithms.
7. Anti-adversarial techniques
Pre-selection and dissuasion
Use built-in controls (e.g., acceptance rate)
Task design
Qualification test
Post-processing
Task repetition and aggregation (see the sketch after this list)
Test questions
Machine learning (e.g., probabilistic network in ZenCrowd)
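For illustration, here is a minimal Python sketch of the post-processing idea (hypothetical data structures and an assumed 75% accuracy threshold; not the actual ZenCrowd implementation): workers who fail the test questions are dropped, and the remaining answers of each repeated task are aggregated by majority vote.

from collections import Counter

def filter_and_aggregate(answers, gold, min_gold_accuracy=0.75):
    # answers: {task_id: [(worker_id, answer), ...]}
    # gold:    {test_task_id: correct_answer}  (the hidden test questions)
    correct, seen = Counter(), Counter()
    for task_id, submissions in answers.items():
        if task_id in gold:
            for worker_id, answer in submissions:
                seen[worker_id] += 1
                correct[worker_id] += int(answer == gold[task_id])
    # Keep only workers who answered test questions well enough.
    trusted = {w for w in seen if correct[w] / seen[w] >= min_gold_accuracy}

    # Majority vote over the trusted answers of the remaining (real) tasks.
    results = {}
    for task_id, submissions in answers.items():
        if task_id in gold:
            continue
        votes = Counter(a for w, a in submissions if w in trusted)
        results[task_id] = votes.most_common(1)[0][0] if votes else None
    return results

ZenCrowd itself goes further and models worker reliability in a probabilistic network rather than applying a hard threshold.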
9. Countering adversarial techniques
Individual attacks
Random Answers
Target tasks designed with monetary incentive
Counter with test questions (see the back-of-the-envelope check below)
Automated Answers
Target tasks with simple submission mechanism
Counter with test questions (especially CAPTCHAs)
Semi-Automated Answers
Target easy HITs achievable with some AI.
Can pass easy-to-answer test questions
Can detect CAPTCHAs and forward them to a human.
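A quick back-of-the-envelope check shows why test questions counter purely random answers: the chance of passing n independent test questions with k options each is (1/k)^n. A tiny Python sketch:

def pass_probability(num_test_questions, options_per_question):
    # A purely random clicker answers each test question independently.
    return (1.0 / options_per_question) ** num_test_questions

print(pass_probability(5, 4))  # (1/4)^5 ≈ 0.00098, i.e. roughly 0.1%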
10. Countering adversarial techniques
Group attacks
Agree on Answers
Target naïve aggregation schemes like majority vote.
May discard valid answers!
Counter by shuffling the options (see the sketch after this slide)
Answer Sharing
Target repeated tasks
Counter by creating multiple batches
Artificial Clones
Target repeated tasks
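The option-shuffling counter mentioned above can be sketched as follows (Python, hypothetical helper names): each assignment shows the options of a multiple-choice task in its own random order, so colluding workers cannot simply agree on a position such as "always pick the second option"; the stored permutation maps the clicked position back to the canonical option.

import random

def shuffled_assignment(options, seed=None):
    # Return the options in a per-assignment random order, plus the permutation.
    rng = random.Random(seed)
    order = list(range(len(options)))
    rng.shuffle(order)
    shown = [options[i] for i in order]
    return shown, order  # keep `order` to decode the worker's answer later

def decode_answer(order, picked_position):
    # Map the position the worker clicked back to the canonical option index.
    return order[picked_position]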
11. Conclusions and future work
We claim that some quality control tools are insufficient to
counter resourceful spammers.
Combine multiple techniques for post-filtering.
Crowdsourcing platforms should provide more tools.
Evaluation of future filtering algorithms must be repeatable
and generic.
Crowdsourcing benchmark.
12. Conclusions and future work
Benchmark proposal
A collection of tasks with multiple-choice options
Each task is repeated multiple times
Unpublished expert judgments for all the tasks
Publish answers completed in a controlled environment with the
following categories of workers:
Honest workers
Random clicks
Semi-automated programs
Organized groups
Post-filtering methods are evaluated based on their ability to achieve
high precision scores (a minimal scoring sketch follows below).
Other parameters could include the money spent, etc.
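As a minimal sketch of the evaluation step (Python, hypothetical benchmark format): the precision of a post-filtering method is the fraction of the tasks it answers that agree with the unpublished expert judgments.

def precision(aggregated, expert):
    # aggregated / expert: {task_id: answer}; only tasks the method answered count.
    answered = [t for t in aggregated if aggregated[t] is not None]
    if not answered:
        return 0.0
    return sum(aggregated[t] == expert[t] for t in answered) / len(answered)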
If you are a task requester, you would prefer to "hire" honest workers rather than automated programs or dishonest workers. MTurk, for instance, does not offer any guarantee of that; furthermore, it encourages requesters to pay well, fairly, and quickly. Besides, if one has a large number of tasks, one will likely never go through all the submissions. How do task requesters ensure quality, then? Go over all the submissions? Blindly accept all? Use a filtering algorithm?
Many researchers have looked at this particular issue and proposed solutions. We can mainly distinguish two approaches: 1) cheater dissuasion and pre-selection, and 2) post-processing.
Note that there is no evidence of the existence of such organized groups.
Conclusion and future work: We tried to review some quality control tools and look at them through spammers' eyes. By claiming that the available quality control tools are insufficient, we are mainly stressing that spammers are resourceful. So what does it take to build a bulletproof crowdsourcing platform or filtering scheme? One solution does not fit all.