ISSUES IN AI, POLICING AND
CRIMINAL JUSTICE:
BIAS
Dr Janet Bastiman
@yssybyl
janet.bastiman@story-stream.com
STORY-STREAM.COM
What is AI?
"Any system that makes a decision that appears to be intelligent
from specific inputs” John McCarthy (1955)
AI -> Machine Learning -> Deep Learning -> (G)AI
Visualizing and Understanding Convolutional Networks,
Zeiler and Fergus 2013
https://arxiv.org/pdf/1311.2901.pdf
Deep if there's more than one stage
of non-linear feature transformation
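To make that criterion concrete, here is a minimal NumPy sketch (random weights, purely illustrative, all names hypothetical): the input passes through two successive non-linear feature transformations, so the model counts as "deep" under this definition.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)  # non-linear activation

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # 4 samples, 3 input features

W1 = rng.normal(size=(3, 8))
W2 = rng.normal(size=(8, 5))
w_out = rng.normal(size=(5, 1))

h1 = relu(x @ W1)   # first stage of non-linear feature transformation
h2 = relu(h1 @ W2)  # second stage -- more than one, hence "deep"
y = h2 @ w_out      # linear read-out producing the decision
print(y.shape)      # (4, 1)
```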
What is Bias?
"an unwarranted correlation between input variables and output classification"
"erroneous assumptions in the learning algorithm resulting
in missing relevant relationships" - bias (underfitting)
"noise is modelled rather than the valid outputs" - variance (overfitting)
Under vs Over
Poor data is also a problem
Algorithm too simple
[Image slide: two model outputs, captioned "Prediction: Cow" and "Prediction: Horse"]
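A minimal sketch of the underfitting/overfitting trade-off, assuming scikit-learn is available (synthetic data, illustrative polynomial degrees): a degree-1 model is too simple and underfits the signal, while a degree-15 model chases the noise and overfits.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.1, 30)  # noisy signal
X_test = np.linspace(0, 1, 100).reshape(-1, 1)
y_true = np.sin(2 * np.pi * X_test).ravel()

for degree in (1, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    err = mean_squared_error(y_true, model.predict(X_test))
    # degree 1 underfits (high bias); degree 15 overfits (high variance)
    print(f"degree {degree:2d}: test MSE = {err:.3f}")
```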
Industry problems
Nothing is (externally) peer reviewed
IP is in the training data, network architecture and test set - hidden
Results are cherry-picked / exaggerated
This might look fine but would be embarrassing and potentially
contract ending for a brand – how disastrous could this be for a
decision on a person’s life?
Machine Washing
• Overcomplicating AI deliberately to promote its abilities
• Removes the desire to question it, as it sounds too difficult
• Pretending something is AI when it’s not
We cannot have transparency when the general public are deliberately misled
with statistics in the interest of sensationalism.
Bias in data gathering
• Ignorance of the problem
– like building a Lego kit without the picture or instructions
• Asking biased questions to lead the model
– presupposing the answer
• Limiting the data to a set that supports the hypothesis
Data choices
Must be:
• representative of real world data
• varied
• sufficient
• able to define the predictions well
There will always be exceptions - a good model will handle these sensibly.
Ignorance of the problem space will lead to poor models
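As a small illustration of the representativeness point (plain Python; the group labels and proportions are hypothetical), one basic check is to compare group frequencies in the training data against the population the model will serve:

```python
from collections import Counter

# Hypothetical figures, for illustration only.
population = {"group_a": 0.55, "group_b": 0.35, "group_c": 0.10}
training_labels = ["group_a"] * 820 + ["group_b"] * 150 + ["group_c"] * 30

counts = Counter(training_labels)
total = sum(counts.values())
for group, target in population.items():
    observed = counts[group] / total
    flag = "UNDER-REPRESENTED" if observed < 0.8 * target else "ok"
    print(f"{group}: {observed:.1%} of data vs {target:.1%} of population -> {flag}")
```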
Data choices ...
Biased data will result in biased models
E.g. Oxbridge entry is inherently biased:
• state school candidates are:
• less likely to apply
• more likely to apply to oversubscribed courses
Hence there is an observed bias towards privately educated students
getting places.
Racial discrepancies are exacerbated when this inherent data bias is not
understood
COMPAS: Is there bias? If so, where?
• Only looked at individuals arrested for crimes
• Exact algorithms and training data unknown
• 137-point questionnaire feeding the risk score
• Questions appear unsuitable
• Testing showed a significant correlation between risk scores and
reoffending
• Black individuals were given higher risk scores than white
individuals for the same crimes
Was it biased?
COMPAS: Self-Validation
• 5575 individuals
• 50.2% Black, 42.0% White, 7.8% Other
• 86.3% male
• Had been assessed by COMPAS for risk of reoffence and were
then monitored over 12 months.
• Over half the individuals scored low (1-3)
• Switches between percentages, percentage changes and per-category
percentages to show the results in the best light
http://criminology.fsu.edu/wp-content/uploads/Validation-of-the-COMPAS-Risk-Assessment-Classification-Instrument.pdf
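To illustrate that last point (the figures below are invented, not taken from the validation report), the same result can be quoted as a raw percentage, a percentage-point change, or a relative percentage change, each sounding very different:

```python
# Hypothetical reoffence rates for one risk category.
before, after = 0.10, 0.15

print(f"after: {after:.0%} reoffended")                           # "15% reoffended"
print(f"change: {(after - before) * 100:.0f} percentage points")  # "5 percentage points"
print(f"relative change: {(after - before) / before:.0%}")        # "50% increase"
```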
COMPAS: Self-Validation
COMPAS: ProPublica Study
• Same individuals
• Concluded that there was fundamental bias: the false positive rate for
Black defendants categorised as high risk was twice that for
white defendants
Two conflicting statistical studies…
https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
Bias-free but skewed population?
• Chouldechova (2016) reviewed both
• For every risk score s, the test is fair (well calibrated) if the probability of
reoffending is the same irrespective of group membership:
• P(Y = 1 | S = s, R = b) = P(Y = 1 | S = s, R = w)
• COMPAS adheres well to this fairness condition
• However, FNR and FPR are linked through the prevalence p and the
positive predictive value PPV:
• FPR = p/(1 − p) × (1 − PPV)/PPV × (1 − FNR)
• If recidivism is not equal between the two groups then a fair test
score cannot have equal FPR and FNR across the two groups
• Either the predictions are unbiased (calibrated) or the errors are balanced across groups – not both
https://arxiv.org/pdf/1610.07524.pdf
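A short numeric check of this identity (Python; the values are hypothetical): hold PPV and FNR equal across two groups, as calibration-style fairness allows, and the differing base rates force differing false positive rates.

```python
def fpr(prevalence, ppv, fnr):
    # Chouldechova's identity: FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)
    return prevalence / (1 - prevalence) * (1 - ppv) / ppv * (1 - fnr)

ppv, fnr = 0.60, 0.35  # hypothetical, held equal for both groups
for group, p in (("group_b", 0.50), ("group_w", 0.40)):
    print(f"{group}: base rate {p:.0%} -> FPR = {fpr(p, ppv, fnr):.1%}")
# With equal PPV and FNR, unequal base rates force unequal FPRs.
```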
Inherent trade offs
• Kleinberg et al (2016)
• Is statistical parity possible? Or should we strive for balance between
classes, so that the chance of making a mistake does not depend
on group membership?
• Also determined independently that you could not balance
unbiased predictions with unbiased errors.
• They define more complex feature vectors σ and avoid the case
where σ is undefined or incomplete for some individuals
• Rigorous proof that you cannot balance all sides simultaneously
https://arxiv.org/pdf/1609.05807.pdf
Making data fair?
• Hardt et al (2016)
• Ignore all protected attributes? Ineffective due to redundant
encodings and other connected attributes
• Demographic parity? If the probability within a class varies you
cannot have positive parity and error parity at the same time
• Propose a new criterion ("equalised odds") – aiming to equalise both true
positive and false positive rates across groups
• “fairer” is still subjective
https://ttic.uchicago.edu/~nati/Publications/HardtPriceSrebro2016.pdf
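A sketch of how such a criterion might be audited in practice (Python; the labels and predictions are invented): equalised odds asks that both the true positive rate and the false positive rate match across groups.

```python
import numpy as np

def rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels and predictions."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return tp / (tp + fn), fp / (fp + tn)

# Hypothetical audit data: true outcomes and model decisions per group.
groups = {
    "group_a": (np.array([1, 1, 1, 0, 0, 0, 0, 0]),
                np.array([1, 1, 0, 1, 0, 0, 0, 0])),
    "group_b": (np.array([1, 1, 1, 1, 0, 0, 0, 0]),
                np.array([1, 1, 1, 0, 1, 1, 0, 0])),
}
for name, (y_true, y_pred) in groups.items():
    tpr, fpr = rates(y_true, y_pred)
    print(f"{name}: TPR = {tpr:.2f}, FPR = {fpr:.2f}")
# Equalised odds requires both rates to match across groups;
# here group_b has a higher FPR, so the criterion is violated.
```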
Summary
• Cannot maximise predictive success and equalise error across
unbalanced classes
• Several approaches – which is best depends on the goal of the algorithm
• Input data needs to be unbiased
"All models are wrong, but some are useful" – George E. P. Box. How wrong do
they have to be to stop being useful?
Human "gut feel" decisions are accepted, and when challenged they are
considered acceptable as long as some of the contributing factors can be explained.
Are we holding AI predictive models to an unachievably high standard that
we do not apply to humans?