
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning

Presentation at Google Booth @ NeurIPS2020


  1. CATS4ML: Crowdsourcing Adverse Test Sets for ML. Welcome! cats4ml.humancomputation.com
  2. Your Google Meet View ● Click on the three dots in the lower corner of your Google Meet screen. ● Select “Change layout.” ● Choose your preferred layout: ○ Sidebar View ○ Spotlight View ○ Tiled View ● Select “Turn on captions” for closed captioning.
  3. Google at NeurIPS 2020: Build for everyone. Fill out our interest form to hear about events and opportunities at goo.gle/neurips-booth-form
  4. TAKE HOME MESSAGE ● Data is the compass for AI: AI advances where there is data. ● Data quality must be addressed in AI practices, especially in the way we evaluate AI. ● Improving the evaluation of AI must consider ways to measure variance and capture bias, to bring us one step closer to data excellence. ● To address variance in AI evaluation, we propose a number of novel metrics for reliability, significance (metrology for AI), and disagreement (CrowdTruth). ● To address bias in AI evaluation, we propose a novel method for crowdsourcing adverse test sets for ML models (CATS4ML).
  6. The Life of AI Data: “It exists!” → “It is bigger!” → “It is better!” But before it got better ... reactive data improvement.
  7. The Life of AI Data: “It exists!” → “It is bigger!” → “It is better!” To reach here we need proactive data improvement.
  8. Your AI model is as good as your evaluation data … but is your evaluation data missing relevant examples?
  9. Your AI model is as good as your evaluation data … but is your evaluation data missing relevant examples? How can we find such examples, especially if they are AI blindspots (i.e. unknown unknowns)?
  10. CATS4ML Challenge (Crowdsourcing Adverse Test Sets for ML) offers a crowdsourced solution for finding blindspots of your AI models. Inspired by prior research in human computation: ● Beat the Machine: Challenging Humans to Find a Predictive Model's “Unknown Unknowns” (Attenberg, Ipeirotis, Provost, 2014) ● Vulnerability Reward Program (Google)
  11. CATS4ML Challenge (Crowdsourcing Adverse Test Sets for ML) offers a crowdsourced red team for finding blindspots of your AI models.
  12. In this first version of the CATS4ML challenge, participants will discover AI blindspots in the Open Images Dataset. Lipstick? Airplane? Car? Construction worker? Thanksgiving? These AI blindspots are real images with visual patterns that confuse AI models in ways humans might find meaningful.
  13. The Challenge Process (diagram): challenge participants; submission of image-label pairs (e.g. image 1, label 1: lipstick; image 2, label 2: thanksgiving); labels comparison vs. the Open Images dataset; human-in-the-loop verification; submission scoring.
  17. The challenge will run until 30 April 2021.
  18. Join us now! ● Join the data challenge individually or form a team. ● Discover interesting AI blindspots in the Open Images Dataset and contribute these to the challenge. ● Be part of this research effort. ● Winning contributions will be promoted at the next CrowdCamp 2021.
  19. The Team: Praveen Paritosh, Ka Wong, Lora Aroyo, Devi Krishna.
  20. Share your voice about ML data! We are inviting ML professionals to participate in a short survey to learn about your challenges and needs with data. Scan to participate now!
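
The challenge process named on slide 13 (submission of image-label pairs, labels comparison against the Open Images dataset, human-in-the-loop verification, submission scoring) could be sketched roughly as below. This is only an illustration under stated assumptions, not the organizers' actual scoring rule, which the slides do not specify. The names `majority_vote` and `score_submission`, the boolean "does this label apply?" encoding, and the one-point-per-blindspot scoring are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a submitted pair is (image_id, label).
Pair = Tuple[str, str]


def majority_vote(votes: List[bool]) -> bool:
    """Return True only if strictly more than half of the verifiers
    say the label applies to the image (assumed verification rule)."""
    return sum(votes) * 2 > len(votes)


def score_submission(
    pairs: List[Pair],
    dataset_says: Dict[Pair, bool],
    verifier_votes: Dict[Pair, List[bool]],
) -> int:
    """Count submitted pairs where human verification disagrees with
    what the dataset records for that image-label pair.

    Assumption: each such disagreement is one discovered blindspot
    and is worth one point. Duplicates are counted once, and pairs
    with no dataset entry or no verifier votes are skipped.
    """
    score = 0
    for pair in set(pairs):
        if pair not in dataset_says or pair not in verifier_votes:
            continue
        human_verdict = majority_vote(verifier_votes[pair])
        if human_verdict != dataset_says[pair]:
            score += 1
    return score


# Made-up example data, for illustration only.
pairs = [("image1", "lipstick"), ("image2", "thanksgiving"), ("image1", "lipstick")]
dataset_says = {("image1", "lipstick"): True, ("image2", "thanksgiving"): True}
verifier_votes = {
    ("image1", "lipstick"): [False, False, True],
    ("image2", "thanksgiving"): [True, True, False],
}
print(score_submission(pairs, dataset_says, verifier_votes))
```

In this made-up example only the first pair counts: the verifiers reject "lipstick" for image1 while the dataset accepts it, whereas for image2 humans and dataset agree.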
