Mechanical Cheat
Spamming Schemes and Adversarial Techniques on Crowdsourcing Platforms
Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux
University of Fribourg, Switzerland
Popularity and Monetary Incentives
• Micro-task crowdsourcing is growing in popularity.
• ~500k registered workers on AMT (Amazon Mechanical Turk)
• ~200k HITs available (April 2012)
• ~$20k in rewards (April 2012)
Spam could be a threat for crowdsourcing
Some Experimental Results:
Entity Link Selection (ZenCrowd, WWW 2012)
• Evidence of participation by dishonest workers, who spend less time, complete more tasks, and achieve lower quality.
Dishonest Answers on Crowdsourcing Platforms
• We define a dishonest answer in a crowdsourcing context as an answer that has been either:
  - Randomly posted.
  - Artificially generated.
  - Duplicated from another source.
How can requesters perform quality control?
• Go over all the submissions?
• Blindly accept all submissions?
• Use selection and filtering algorithms.
Anti-adversarial techniques
• Pre-selection and dissuasion
  - Use built-in controls (e.g., acceptance rate)
  - Task design
  - Qualification test
• Post-processing
  - Task repetition and aggregation
  - Test questions
  - Machine learning (e.g., probabilistic network in ZenCrowd)
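
Task repetition and aggregation can be illustrated with a minimal sketch. It assumes each task is answered by several workers and that answers are combined by a plain majority vote. The function name and data layout are hypothetical and are not taken from ZenCrowd or AMT.

```python
from collections import Counter

def majority_vote(answers_by_task):
    """Return the most frequent answer for each task.

    answers_by_task maps a task id to a list of (worker_id, answer) pairs.
    Ties go to the answer that was submitted first.
    """
    result = {}
    for task_id, submissions in answers_by_task.items():
        counts = Counter(answer for _, answer in submissions)
        result[task_id] = counts.most_common(1)[0][0]
    return result

# Hypothetical data: two tasks, each repeated three times.
answers = {
    "t1": [("w1", "A"), ("w2", "A"), ("w3", "B")],
    "t2": [("w1", "C"), ("w2", "B"), ("w3", "C")],
}
aggregated = majority_vote(answers)
print(aggregated)
```

Majority vote is only a baseline here: it treats every worker as equally reliable, so it is better combined with other filters.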
Countering adversarial techniques
Organization
Countering adversarial techniques
Individual attacks
• Random answers
  - Target tasks designed with a monetary incentive
  - Countered with test questions
• Automated answers
  - Target tasks with a simple submission mechanism
  - Countered with test questions (especially CAPTCHAs)
• Semi-automated answers
  - Target easy HITs achievable with some AI
  - Can pass easy-to-answer test questions
  - Can detect CAPTCHAs and forward them to a human
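
A test-question filter might look like the sketch below. It assumes the requester hides a few questions with known answers among the real tasks and keeps only workers whose accuracy on them reaches a threshold. The 0.7 threshold, the function names, and the data are illustrative assumptions.

```python
def gold_accuracy(worker_answers, gold):
    """Fraction of test questions the worker answered correctly.

    Returns None when the worker saw no test question.
    """
    graded = [q for q in gold if q in worker_answers]
    if not graded:
        return None
    correct = sum(worker_answers[q] == gold[q] for q in graded)
    return correct / len(graded)

def trusted_workers(submissions, gold, min_accuracy=0.7):
    """Return the ids of workers who pass the test questions."""
    trusted = set()
    for worker_id, answers in submissions.items():
        accuracy = gold_accuracy(answers, gold)
        if accuracy is not None and accuracy >= min_accuracy:
            trusted.add(worker_id)
    return trusted

# Hypothetical data: g1 and g2 are test questions, t1 is a real task.
gold = {"g1": "A", "g2": "B"}
submissions = {
    "honest": {"g1": "A", "g2": "B", "t1": "C"},
    "random": {"g1": "B", "g2": "B", "t1": "A"},
    "bot": {"g1": "C", "g2": "C", "t1": "C"},
}
trusted = trusted_workers(submissions, gold)
print(trusted)
```

As the slide notes, a semi-automated attacker that answers easy test questions correctly would still pass this filter, so the test questions need to be hard to automate.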
Countering adversarial techniques
Group attacks
• Agreeing on answers
  - Targets naïve aggregation schemes such as majority vote
  - May cause valid answers to be discarded!
  - Countered by shuffling the options
• Answer sharing
  - Targets repeated tasks
  - Countered by creating multiple batches
• Artificial clones
  - Target repeated tasks
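
Option shuffling can be sketched as follows, assuming the colluding group agrees on an option position (for example "always click the first choice") rather than on the option's content. Each (task, worker) pair gets its own order, and the clicked position is mapped back to the underlying option before aggregation. The helper names and the `secret` parameter are hypothetical.

```python
import random

def shuffled_options(options, task_id, worker_id, secret="requester-secret"):
    """Return the options in an order specific to this task and worker."""
    # Seeding with a string makes the order reproducible, so the requester
    # can recompute it later instead of storing every permutation.
    rng = random.Random(f"{secret}:{task_id}:{worker_id}")
    order = list(options)
    rng.shuffle(order)
    return order

def canonical_answer(options, task_id, worker_id, clicked_position,
                     secret="requester-secret"):
    """Map the position a worker clicked back to the underlying option."""
    order = shuffled_options(options, task_id, worker_id, secret)
    return order[clicked_position]

# Hypothetical task with four options shown to two workers.
options = ["Paris", "Lyon", "Geneva", "Bern"]
shown_to_w1 = shuffled_options(options, "t1", "w1")
shown_to_w2 = shuffled_options(options, "t1", "w2")
first_click_w1 = canonical_answer(options, "t1", "w1", 0)
print(shown_to_w1, shown_to_w2, first_click_w1)
```

Shuffling does not help when the group shares the text of the answer itself, which is why the slide lists multiple batches as a separate countermeasure.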
Conclusions and future work
• We claim that some quality control tools are ineffective against resourceful spammers.
• Combine multiple techniques for post-filtering.
• Crowdsourcing platforms should provide more tools.
• Evaluation of future filtering algorithms must be repeatable and generic.
  - Crowdsourcing benchmark.
Conclusions and future work
Benchmark proposal
• A collection of tasks with multiple-choice options
• Each task is repeated multiple times
• Unpublished expert judgment for all the tasks
• Published answers, completed in a controlled environment, from the following categories of workers:
  - Honest workers
  - Random clicks
  - Semi-automated program
  - Organized group
• Post-filtering methods are evaluated on their ability to achieve a high precision score.
• Other parameters could include the money spent, etc.
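
One way such a benchmark could score a post-filtering method is sketched below. The slides do not define precision, so this assumes it means the fraction of answers kept by the filter that agree with the expert judgment. All names and data are hypothetical.

```python
def precision_after_filter(answers, expert, keep):
    """Precision of the answers that survive a post-filter.

    answers: list of (worker_id, task_id, answer) tuples.
    expert:  dict mapping task_id to the expert judgment.
    keep:    function taking a worker_id and returning True to keep it.
    """
    kept = [(w, t, a) for (w, t, a) in answers if keep(w)]
    if not kept:
        return 0.0
    correct = sum(a == expert[t] for _, t, a in kept)
    return correct / len(kept)

# Hypothetical benchmark data with three worker categories.
expert = {"t1": "A", "t2": "B"}
answers = [
    ("honest", "t1", "A"), ("honest", "t2", "B"),
    ("random", "t1", "B"), ("random", "t2", "B"),
    ("bot", "t1", "A"), ("bot", "t2", "A"),
]

baseline = precision_after_filter(answers, expert, keep=lambda w: True)
filtered = precision_after_filter(answers, expert, keep=lambda w: w != "bot")
print(baseline, filtered)
```

A fuller evaluation would also report how many correct answers the filter discards, and the other parameters the slide mentions, such as money spent.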
Discussion
Q&A

Editor's Notes

  1. If you are a task requester, you would prefer to "hire" honest workers rather than automated programs or dishonest workers. MTurk, for instance, does not offer any guarantee of that; furthermore, it encourages requesters to pay well, fairly, and quickly. Besides, with a large number of tasks, one will likely never go through all the submissions. How do task requesters ensure quality, then? - Go over all the submissions? - Blindly accept all? - Use a filtering algorithm?
  2. Many researchers have looked at this particular issue and proposed solutions. We can mainly distinguish two approaches: (1) cheater dissuasion and pre-selection, and (2) post-processing.
  3. Note that there is no evidence that such groups exist.
  4. Conclusion and future work: We reviewed some quality control tools and looked at them through a spammer's eyes. By claiming that the available quality control tools are insufficient, we are mainly stressing that spammers are resourceful. So what does it take to build a bulletproof crowdsourcing platform or filtering scheme? One solution does not fit all.