
Transparency in ML and AI (humble views from a concerned academic)



A mini-survey on the issue of transparency in AI and ML in light of the GDPR -- with a modest proposal



  1. Dr. Paolo Missier, School of Computing, Newcastle University. "Transparency in ML and AI (humble views from a concerned academic)". Innovation Opportunity of the GDPR for AI and ML, Digital Catapult, London, March 2nd, 2018.
  2. My current favourite book. How much of Big Data is My Data? Is Data the problem? Or the algorithms? Or how much we trust them? Is there a problem at all?
  3. What matters? Decisions made by processes based on algorithmically-generated knowledge: Knowledge-Generating Systems (KGS)
     • automatically filtering job applicants
     • approving loans or other credit
     • approving access to benefits schemes
     • predicting insurance risk levels
     • user profiling for policing purposes and to predict the risk of criminal recidivism
     • identifying health risk factors
     • …
  4. GDPR and algorithmic decision making. Profiling is "any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person". Thus profiling should be construed as a subset of processing, under two conditions: the processing is automated, and the processing is for the purposes of evaluation. Article 22 (Automated individual decision-making, including profiling), paragraph 1 (see figure 1) prohibits any "decision based solely on automated processing, including profiling" which "significantly affects" a data subject. It stands to reason that an algorithm can only be explained if the trained model can be articulated and understood by a human. It is reasonable to suppose that any adequate explanation would provide an account of how input features relate to predictions:
     - Is the model more or less likely to recommend a loan if the applicant is a minority?
     - Which features play the largest role in prediction?
     B. Goodman and S. Flaxman, "European Union regulations on algorithmic decision-making and a 'right to explanation'," Proc. 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), Jun. 2016.
  5. Heads up on the key questions:
     • [To what extent, at what level] should lay people be educated about algorithmic decision making?
     • What mechanisms would you propose to engender trust in algorithmic decision making?
     • With regards to trust and transparency, what should Computer Science researchers focus on?
     • What kind of inter-disciplinary research do you see?
  6. Recidivism Prediction Instruments (RPI)
     • Increasingly popular within the criminal justice system
     • Used or considered for use in pre-trial decision-making (USA)
     Social debate and scholarly arguments…
     Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks. 2016. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
     "Black defendants who did not recidivate over a two-year period were nearly twice as likely to be misclassified as higher risk compared to their white counterparts (45 percent vs. 23 percent). White defendants who re-offended within the next two years were mistakenly labeled low risk almost twice as often as black re-offenders (48 percent vs. 28 percent)."
     A. Chouldechova, "Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments," Big Data, vol. 5, no. 2, pp. 153–163, Jun. 2017: "In this paper we show that the differences in false positive and false negative rates cited as evidence of racial bias in the ProPublica article are a direct consequence of applying an instrument that is free from predictive bias to a population in which recidivism prevalence differs across groups."
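     To make the quoted disparity concrete, the sketch below computes group-wise false positive and false negative rates. The data here is synthetic, not the COMPAS dataset analysed by ProPublica.

```python
import numpy as np

def group_error_rates(y_true, y_pred, group):
    """Compute false positive and false negative rates per group.

    y_true: 1 = re-offended within two years, 0 = did not
    y_pred: 1 = classified higher risk, 0 = classified lower risk
    group : group label per defendant
    """
    rates = {}
    for g in np.unique(group):
        m = group == g
        fp = np.sum((y_pred == 1) & (y_true == 0) & m)  # misclassified as higher risk
        fn = np.sum((y_pred == 0) & (y_true == 1) & m)  # mistakenly labeled low risk
        negatives = np.sum((y_true == 0) & m)
        positives = np.sum((y_true == 1) & m)
        rates[g] = {"FPR": fp / negatives, "FNR": fn / positives}
    return rates

# Toy illustration with synthetic labels (not real data):
rng = np.random.default_rng(0)
group = rng.choice(["A", "B"], size=1000)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
print(group_error_rates(y_true, y_pred, group))
```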
  7. Opacity. J. Burrell, "How the machine 'thinks': Understanding opacity in machine learning algorithms," Big Data Soc., vol. 3, no. 1, 2016. Three forms of opacity:
     1. Intentional corporate or state secrecy, institutional self-protection
     2. Opacity as technical illiteracy: writing (and reading) code is a specialist skill
        • One proposed response is to make code available for scrutiny, through regulatory means if necessary
     3. Mismatch between the mathematical optimization in high dimensionality characteristic of machine learning and the demands of human-scale reasoning and styles of semantic interpretation
     "Ultimately partnerships between legal scholars, social scientists, domain experts, along with computer scientists may chip away at these challenging questions of fairness in classification in light of the barrier of opacity."
  8. But, is research focusing on the right problems? Research and innovation: react to threats, spot opportunities…
  9. To recall… Predictive Data Analytics (Learning)
  10. Interpretability (of machine learning models). Z. C. Lipton, "The Mythos of Model Interpretability," Proc. 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), Jun. 2016.
      - Transparency
        - Are features understandable?
        - Which features are more important?
      - Post hoc interpretability
        - Natural language explanations
        - Visualisations of models
        - Explanations by example: "this tumor is classified as malignant because to the model it looks a lot like these other tumors"
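      One of the post-hoc strategies listed above, explanation by example, can be sketched as a nearest-neighbour lookup in the model's representation space. The `embed` function below is a hypothetical stand-in for whatever representation the trained model exposes.

```python
import numpy as np

def explain_by_example(x, X_train, embed, k=3):
    """Return the indices of the k training instances most similar to x
    in the model's representation space; these serve as the explanation
    ("this tumour looks a lot like these other tumours")."""
    z = embed(x)                               # embedding of the instance to explain
    Z = np.array([embed(xi) for xi in X_train])
    dists = np.linalg.norm(Z - z, axis=1)      # distance to every training instance
    return np.argsort(dists)[:k]
```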
  11. "Why Should I Trust You?" M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why Should I Trust You?': Explaining the Predictions of Any Classifier," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), 2016, pp. 1135–1144. Interpretability of model predictions has become a hot research topic in Machine Learning. "If the users do not trust a model or a prediction, they will not use it." "By 'explaining a prediction', we mean presenting textual or visual artifacts that provide qualitative understanding of the relationship between the instance's components and the model's prediction."
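      The approach of Ribeiro et al. can be illustrated with a minimal local-surrogate sketch: perturb the instance, query the black-box model, and fit a proximity-weighted linear model whose coefficients act as the explanation. This is a simplified sketch for tabular data, not the authors' LIME implementation; `predict_proba` stands in for any black-box probability function.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_explanation(predict_proba, x, n_samples=500, scale=0.5):
    """Fit a proximity-weighted linear surrogate around instance x.

    predict_proba: black-box function returning class probabilities
    Returns one coefficient per feature: its local contribution.
    """
    rng = np.random.default_rng(0)
    # 1. Perturb the instance in its neighbourhood
    X_pert = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    # 2. Query the black box on the perturbed points
    y = predict_proba(X_pert)[:, 1]
    # 3. Weight samples by proximity to x
    w = np.exp(-np.linalg.norm(X_pert - x, axis=1) ** 2 / (2 * scale ** 2))
    # 4. Fit an interpretable (linear) surrogate
    surrogate = Ridge(alpha=1.0).fit(X_pert, y, sample_weight=w)
    return surrogate.coef_
```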
  12. Explaining image classification. M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why Should I Trust You?': Explaining the Predictions of Any Classifier," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), 2016, pp. 1135–1144.
  13. Features: few, high level. SVM classifier, 94% accuracy… but questionable!
  14. Features. Volume: how many features contribute to the prediction? Meaning: how suitable are the features for human interpretation?
      • Raw (low-level, non-semantic) signals such as image pixels
      • Deep learning
      • Visualisation: occlusion test
      • Cases: object recognition and medical diagnosis
      • Many features (thousands is too many)
      • Few, high-level features: is this the only option?
  15. Occlusion test for CNNs. Kermany et al., "Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning," Cell, 2018. Zeiler et al., "Visualizing and Understanding Convolutional Networks," ECCV 2014.
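      The occlusion test referenced above can be sketched as follows: slide a grey patch over the image and record how much the predicted probability of the target class drops; regions with a large drop are the ones the network relies on. The `predict` function is a hypothetical stand-in for the trained CNN, assumed to map a batch of images (values in [0, 1]) to class probabilities.

```python
import numpy as np

def occlusion_map(image, predict, target_class, patch=16, stride=8):
    """Heatmap of prediction drop when each image region is occluded."""
    h, w, _ = image.shape
    base = predict(image[None])[0, target_class]        # unoccluded score
    heat = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = 0.5     # grey patch
            heat[i, j] = base - predict(occluded[None])[0, target_class]
    return heat   # high values = regions the prediction depends on
```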
  16. Attribute Learning: layer for semantic attributes. Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar, "Describable Visual Attributes for Face Verification and Image Search," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 33, no. 10, pp. 1962–1977, October 2011.
  17. Can we control inferences made about us? Facebook's (and many other marketing companies') problem: personal characteristics are often hard to observe because of lack of data or privacy restrictions. Solution: firms and governments increasingly depend on statistical inferences drawn from available information. Goal of the research:
      - How to give online users transparency into why certain inferences are made about them by statistical models
      - How to inhibit those inferences by hiding ("cloaking") certain personal information from inference
      D. Chen, S. P. Fraiberger, R. Moakler, and F. Provost, "Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals," in 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), 2016, pp. 21–25.
      "Privacy invasions via statistical inferences are at least as troublesome as privacy invasions based on revealing personal data."
  18. "Cloaking". Which "evidence" in the input feature vectors is critical to make an accurate prediction? Evidence counterfactual: "what would the model have done if this evidence hadn't been present?" Not an easy problem! User 1: greatly affected. User 2: unaffected.
  19. Cloakability. How many Facebook "Likes" should be "cloaked" to inhibit a prediction? [Chart: cloaking effort (number of Likes to be removed) per predicted trait.]
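      Assuming, for illustration, a linear model over binary "Like" features, cloaking effort can be estimated greedily: remove the present Likes with the largest positive contribution until the score falls below the decision threshold. This is a simplification of the evidence-counterfactual idea in Chen et al., not their implementation.

```python
import numpy as np

def cloaking_effort(likes, coef, intercept, threshold=0.0):
    """Greedily remove ("cloak") the most incriminating Likes until the
    linear score drops below the threshold; return how many were removed.

    likes: binary vector of Likes; coef, intercept: linear model parameters.
    """
    likes = likes.astype(float).copy()
    removed = 0
    score = intercept + coef @ likes
    order = np.argsort(-coef)                       # most predictive Likes first
    for idx in order:
        if score <= threshold:
            break
        if likes[idx] == 1 and coef[idx] > 0:
            likes[idx] = 0                          # cloak this Like
            score -= coef[idx]
            removed += 1
    return removed if score <= threshold else None  # None = cannot be cloaked
```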
  20. AI Guardians. A. Etzioni and O. Etzioni, "Designing AI Systems That Obey Our Laws and Values," Commun. ACM, vol. 59, no. 9, pp. 29–31, Aug. 2016. Operational AI systems (for example, self-driving cars) need to obey both the law of the land and our values. Why do we need oversight systems?
      - AI systems learn continuously → they change over time
      - AI systems are becoming opaque ("black boxes") to human beings
      - AI-guided systems have increasing autonomy: they make choices "on their own"
      "A major mission for AI is to develop in the near future such AI oversight systems": auditors, monitors, enforcers, ethics bots!
  21. AI accountability – your next Pal? Asked where AI systems are weak today, Veloso (*) says they should be more transparent: "They need to explain themselves: why did they do this, why did they do that, why did they detect this, why did they recommend that? Accountability is absolutely necessary."
      (*) Manuela Veloso, head of the Machine Learning Department at Carnegie Mellon University.
      Gary Anthes. 2017. Artificial intelligence poised to ride a new wave. Commun. ACM 60, 7 (June 2017), 19–21. DOI: https://doi.org/10.1145/3088342
      IBM's Witbrock echoes the call for humanism in AI: "It's an embodiment of a human dream of having a patient, helpful, collaborative kind of companion."
  22. A personal view. Hypothesis: it is technically practical to provide a limited and IP-preserving degree of transparency by surrounding and augmenting a black-box KGS with metadata that describes the nature of its input, training and test data, and can therefore be used to automatically generate explanations that can be understood by lay persons. Knowledge-Generating Systems (KGS)… It's the meta-data, stupid (*)
      (*) https://en.wikipedia.org/wiki/It%27s_the_economy,_stupid
  23. Something new to try, perhaps? [Fig. 1: proposed architecture. Two example KGSs (e.g. pensions, health), each drawing on its own background (big) data, publish limited profiles (a descriptive summary of the background data and a high-level characterisation of the algorithm) under a disclosure policy to a secure ledger (blockchain). An infomediary explanation service combines these KGS profiles with a shared vocabulary and metadata model, user data contributions, and the users' instances and classifications to deliver contextualised classifications and support an informed co-decision process with users.]
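      As one concrete reading of the hypothesis, a KGS "limited profile" could simply be a structured metadata record published to the ledger. The field names below are illustrative assumptions, not part of any existing standard or of the proposal itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class KGSProfile:
    """Metadata published by a Knowledge-Generating System to a secure
    ledger, from which lay-person explanations could be generated."""
    kgs_id: str                         # hypothetical identifier for the KGS
    purpose: str                        # what decisions the KGS supports
    algorithm_family: str               # high-level characterisation, not the IP
    training_data_summary: str          # descriptive summary of background data
    input_features: List[str] = field(default_factory=list)
    disclosure_policy: str = "on-request"

# Example record for an imaginary pensions KGS:
profile = KGSProfile(
    kgs_id="kgs-1-pensions",
    purpose="eligibility assessment for pension benefits",
    algorithm_family="gradient-boosted decision trees",
    training_data_summary="10 years of anonymised claims records",
    input_features=["age", "contribution history", "employment status"],
)
```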
  24. References (to take home)
      • Gary Anthes. 2017. Artificial intelligence poised to ride a new wave. Commun. ACM 60, 7 (June 2017), 19–21. DOI: https://doi.org/10.1145/3088342
      • J. Burrell, "How the machine 'thinks': Understanding opacity in machine learning algorithms," Big Data Soc., vol. 3, no. 1, 2016.
      • R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, "Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission," in KDD, 2015.
      • D. Chen, S. P. Fraiberger, R. Moakler, and F. Provost, "Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals," in 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), 2016, pp. 21–25.
      • A. Chouldechova, "Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments," Big Data, vol. 5, no. 2, pp. 153–163, Jun. 2017.
      • A. Etzioni and O. Etzioni, "Designing AI Systems That Obey Our Laws and Values," Commun. ACM, vol. 59, no. 9, pp. 29–31, Aug. 2016.
      • B. Goodman and S. Flaxman, "European Union regulations on algorithmic decision-making and a 'right to explanation'," Proc. 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), Jun. 2016.
      • N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, "Describable Visual Attributes for Face Verification and Image Search," IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), vol. 33, no. 10, pp. 1962–1977, Oct. 2011.
      • Z. C. Lipton, "The Mythos of Model Interpretability," Proc. 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), Jun. 2016.
      • M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why Should I Trust You?': Explaining the Predictions of Any Classifier," in Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), 2016, pp. 1135–1144.
      • Zeiler et al., "Visualizing and Understanding Convolutional Networks," in ECCV, 2014.
  25. Questions to you:
      • [To what extent, at what level] should lay people be educated about algorithmic decision making?
      • What mechanisms would you propose to engender trust in algorithmic decision making?
      • With regards to trust and transparency, what should Computer Science researchers focus on?
      • What kind of inter-disciplinary research do you see?
  26. Scenarios. What kind of explanations would you request / expect / accept?
      • My application for benefits has been denied but I am not sure why
      • My insurance premium is higher than my partner's, and it's not clear why
      • My work performance has been deemed unsatisfactory, but I don't see why
      • [Can you suggest other scenarios close to your experience?]
