This work is strongly influenced by Zachary Lipton's The Mythos of Model Interpretability. Essentially, we argue that as long as there is no consensus or formal standardisation of what people mean by interpretability, pragmatic and influential progress in this direction will be held back. At the time of the presentation there was no consensus in the literature on validation metrics, datasets, or methodologies for evaluating and comparing interpretability methods. We strongly emphasised the need for an axiomatic and formal approach, relating it to earlier efforts on interpretability in fuzzy systems, in order to encourage the healthy habit of thinking in terms of formal definitions and standardisation.
A Categorisation of Post-hoc Explanations for Predictive Models
1. Wednesday, 27th March
A Categorisation of Post-hoc Explanations
for Predictive Models
John Mitros
University College Dublin
ioannis.mitros@insight-centre.org
2. Outline
• Introduction
• Overview of interpretability/explainability
• Post-hoc approaches for interpretability
• Common themes, connecting ideas, general picture
• Not an exhaustive survey of the entire body of literature
• Open challenges and possible future directions
• Examples and use cases
• Recent approaches
4. Explainable vs. Interpretable
• Explainable ML:
• Post-hoc analysis of black box models
• Interpretable ML:
• Intrinsically interpretable, a.k.a. transparent
Rudin, C. & Ertekin, Ş. Math. Prog. Comp. (2018) 10: 659. https://doi.org/10.1007/s12532-018-0143-8
5. Interpretability
• It is inherently a multifaceted notion whose meaning changes according to the different applicability scenarios
• Interpretability needs to answer what the model has learned and why it
came to that conclusion
• Definition of interpretability:
• “interpretability is the degree to which a human can understand the cause of a
decision” (Miller 2017)
6. Interpretability
• Definition of interpretability:
• “interpretability is the degree to which a machine can explain the cause of a decision
into coherent logical arguments”
• inherently it involves a bijective process from input to output and vice versa, where the intermediate steps are transparent to the end user (a toy sketch follows this slide)
if f : x → y then f⁻¹ : y → x
• logical fallacies should be avoided
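To make the bijective reading concrete, here is a toy Python sketch (not from the slides) of a transparent, invertible model: a strictly monotone affine map whose inverse is explicit, so each step from input to output can be traced back. The parameters a and b are hand-picked placeholders.

```python
import numpy as np

# A toy transparent, bijective "model": a strictly monotone affine map.
# Every step from input to output can be inverted exactly, so the mapping
# can be traced in both directions, as the definition above requires.
a, b = 2.0, -1.0                      # hypothetical, hand-picked parameters

def f(x):
    return a * x + b                  # f : x -> y

def f_inv(y):
    return (y - b) / a                # f^(-1) : y -> x

x = np.array([0.5, 1.0, 3.0])
y = f(x)
assert np.allclose(f_inv(y), x)       # the round trip recovers the input
```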
7. Scope of Interpretability
Lipton, Z. C. 2016. The Mythos of Model Interpretability. ICML Workshop on Human Interpretability in Machine Learning
(WHI 2016), New York, NY
10. Examples of Post-hoc Explanations
Chen, C.; Li, O.; Barnett, A.; Su, J.; and Rudin, C. 2018. This looks like that: Deep learning for interpretable image recognition. ICML
Group A: What has the model learned? (holistic or modular level)
Model Specific
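The core idea can be caricatured in a few lines of NumPy (a hedged sketch, not the authors' architecture): embed a test example and explain its prediction via the most similar learned prototype. The prototype embeddings and class assignments below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins for embeddings of learned class prototypes
prototypes = rng.normal(size=(5, 16))          # 5 prototypes, 16-dim
prototype_class = np.array([0, 0, 1, 1, 2])    # class each prototype belongs to

def explain(z):
    """Return the prototype most similar to embedding z ('this looks like that')."""
    dist = np.linalg.norm(prototypes - z, axis=1)
    k = int(np.argmin(dist))
    return k, prototype_class[k], dist[k]

z = rng.normal(size=16)                        # embedding of a test image
proto_id, cls, d = explain(z)
print(f"looks like prototype {proto_id} (class {cls}), distance {d:.2f}")
```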
11. Examples of Post-hoc Explanations
Rudin, C., and Ertekin, Ş. 2018. Learning customized and optimized lists of rules with mathematical programming. Mathematical Programming Computation 10(4):659–702
Group A: What has the model learned? (holistic or modular level)
Model Specific
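As an illustration of what a learned rule list looks like at prediction time (a minimal sketch; the rules below are hypothetical, not produced by the authors' mathematical-programming optimiser): rules are checked in order, the first matching rule fires, and that rule is itself the explanation.

```python
# A minimal sketch of an ordered rule list: the first matching rule fires.
# The rules here are hypothetical placeholders, not learned ones.
rules = [
    (lambda x: x["age"] < 25 and x["priors"] > 2, "high risk"),
    (lambda x: x["priors"] == 0,                  "low risk"),
]
default = "medium risk"

def predict(x):
    for condition, label in rules:
        if condition(x):
            return label       # the fired rule *is* the explanation
    return default

print(predict({"age": 22, "priors": 3}))   # -> high risk
```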
12. Examples of Post-hoc Explanations
Montavon, G.; Lapuschkin, S.; Binder, A.; Samek, W.; and Müller, K.-R. 2017. Explaining nonlinear classification decisions with deep Taylor decomposition. Pattern Recognition 65:211–222.
Group A: What has the model learned? (holistic or modular level)
Model Specific
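Deep Taylor decomposition itself takes more machinery, so here is a hedged sketch of a simpler relative in the same relevance-attribution family, gradient × input, on a tiny hand-built ReLU network. The weights are random placeholders, and this is not the method of Montavon et al.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 3 -> 4
w2, b2 = rng.normal(size=4), 0.0                # output layer: 4 -> 1

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)            # ReLU activations
    return w2 @ h + b2, h

x = np.array([1.0, -2.0, 0.5])
y, h = forward(x)

# For a piecewise-linear net the gradient is exact on the active region:
mask = (h > 0).astype(float)                    # which ReLU units are active
grad = (w2 * mask) @ W1                         # dy/dx, shape (3,)
relevance = grad * x                            # gradient x input attribution
print(relevance)
```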
13. Examples of Post-hoc Explanations
Ribeiro, M. T.; Singh, S.; and Guestrin, C. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. ACM KDD
Group A: What has the model learned? (holistic or modular level)
Model Agnostic
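LIME has an open-source implementation, so a runnable sketch is possible (assuming the lime and scikit-learn packages are installed; the random forest and the iris data are merely convenient stand-ins for any black-box model):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

iris = load_iris()
model = RandomForestClassifier(random_state=0).fit(iris.data, iris.target)

explainer = LimeTabularExplainer(
    iris.data,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
)
# Fit a local, interpretable surrogate around a single instance
exp = explainer.explain_instance(iris.data[0], model.predict_proba, num_features=4)
print(exp.as_list())    # feature contributions near this instance
```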
14. Examples of Post-hoc Explanations
Ribeiro, M. T.; Singh, S.; and Guestrin, C. 2018. Anchors: High-Precision Model-Agnostic Explanations. AAAI Press 32:1527–
Group A: What has the model learned? (holistic or modular level)
Model Agnostic
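Without pulling in the authors' library, the idea behind an anchor can be sketched directly: a rule "anchors" a prediction if the prediction rarely changes when the non-anchored features are perturbed. Everything below (the model, the instance, the candidate anchor) is a hypothetical illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(X):
    """Hypothetical black-box classifier."""
    return (X[:, 0] + X[:, 1] > 1.0).astype(int)

x = np.array([0.9, 0.8, 0.1])
anchor = [0, 1]                 # candidate rule: hold features 0 and 1 fixed
pred = model(x[None])[0]

# Estimate the anchor's precision: how often the prediction is unchanged
# when non-anchored features are perturbed while the anchor holds.
samples = rng.uniform(0, 1, size=(1000, 3))
samples[:, anchor] = x[anchor]
precision = (model(samples) == pred).mean()
print(f"anchor precision ~ {precision:.2f}")   # high precision -> good anchor
```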
15. Examples of Post-hoc Explanations
Henelius, A.; Puolamäki, K.; and Ukkonen, A. 2017. Interpreting Classifiers through Attribute Interactions in Datasets. ICML
Group A: What has the model learned? (holistic or modular level)
Model Agnostic
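A hedged sketch of the underlying test (loosely following Henelius et al.; the model and data below are hypothetical): permute each attribute group independently across samples that share a predicted class, then check how often the model's predictions survive. Groupings that keep interacting attributes together preserve fidelity; groupings that split them do not.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(X):
    # Hypothetical black box in which features 0 and 1 interact
    return (X[:, 0] * X[:, 1] > 0.25).astype(int)

X = rng.uniform(0, 1, size=(2000, 3))
baseline = model(X)

def fidelity(groups):
    """Permute each attribute group independently across rows sharing a
    predicted class, then measure how many predictions survive."""
    Xp = X.copy()
    for g in groups:
        for c in (0, 1):
            idx = np.where(baseline == c)[0]
            Xp[np.ix_(idx, g)] = X[np.ix_(rng.permutation(idx), g)]
    return (model(Xp) == baseline).mean()

print(fidelity([[0, 1], [2]]))    # interacting features together: fidelity 1.0
print(fidelity([[0], [1], [2]]))  # interaction split apart: fidelity drops
```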
16. General Concepts & Methods
• Rule Sets
• Sensitivity Analysis (a minimal sketch follows this list)
• Inductive Logic Programming
• Recently:
• Counterfactuals
• Adversarial approaches
• Game theory
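Of the methods above, sensitivity analysis is the easiest to show in a few lines: perturb one input at a time and measure the effect on the output. A minimal one-at-a-time finite-difference sketch (the model f below is a hypothetical stand-in):

```python
import numpy as np

def sensitivity(f, x, eps=1e-4):
    """One-at-a-time sensitivity analysis: finite-difference effect of each
    input dimension on the model output around the point x."""
    base = f(x)
    return np.array([(f(x + eps * e) - base) / eps for e in np.eye(len(x))])

# Hypothetical model: feature 0 matters much more than feature 1
f = lambda x: 3.0 * x[0] + 0.1 * x[1] ** 2
print(sensitivity(f, np.array([1.0, 2.0])))   # ~ [3.0, 0.4]
```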
19. Open Challenges
• No formally agreed-upon definition
• The notion of interpretability seems to be an ill-defined term
• Having an agreed-upon definition avoids reinventing the wheel
• Easier to build upon and contribute to prior work
• Rigorous, agreed-upon evaluation metrics
• Clear distinction between human-based and machine-based evaluation metrics
• Provide a clear picture of what is working and what needs improvement
20. Open Challenges
• Stochastic nature of the models: different random seeds lead to different outcomes for the same models (a minimal seed-variance sketch follows this slide)
• Henderson, P.; Islam, R.; Bachman, P.; Pineau, J. Deep Reinforcement Learning That Matters, AAAI 2018
• Models are built on assumptions: y = f(x)
• When do they break, and how?
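This is easy to reproduce on a small scale (a hedged sketch using scikit-learn; the dataset and architecture are arbitrary): train the same model several times, varying only the seed, and look at the spread of test accuracy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Identical architecture and data; only the random seed changes.
scores = [
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=seed)
    .fit(X_tr, y_tr).score(X_te, y_te)
    for seed in range(10)
]
print(f"accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```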
21. Open Challenges
• Humans are great storytellers/story makers
• Memory championships → method of loci
• Humans often create stories from small indications, which they then rely upon to build explanations
• These explanations might bear no relation to the actual underlying model
• How do we avoid specific cognitive biases?
• Framing effect
• Focusing effect
• Illusory correlation
24. Open Challenges
• Saliency maps can be misleading (Olah et al., 2018)
• Models are uncalibrated (a minimal calibration-error sketch follows this slide)
• Need for more transparent approaches
• Bringing in another model to interpret the existing one
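Miscalibration can be quantified with the expected calibration error used by Naeini et al. (2015) and Guo et al. (2017) from the reference list; a minimal NumPy sketch with toy numbers:

```python
import numpy as np

def expected_calibration_error(confidence, correct, n_bins=10):
    """ECE as in Naeini et al. (2015) / Guo et al. (2017): bin predictions by
    confidence and average the |confidence - accuracy| gap per bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(confidence[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Toy example: overconfident predictions -> non-zero ECE
conf = np.array([0.95, 0.9, 0.9, 0.85, 0.8])
correct = np.array([1, 0, 1, 0, 1])     # whether each prediction was right
print(expected_calibration_error(conf, correct))
```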
25. References
1. Bodenhofer, Ulrich and Bauer, Peter. Towards an Axiomatic Treatment of
Interpretability, Fuzzy Systems, 2000.
2. Olah, Chris and Satyanarayan, Arvind. The Building Blocks of Interpretability, Distill.pub,
2018.
3. Zadrozny, Bianca and Elkan, Charles. Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In ICML, pp. 609–616, 2001.
4. Zadrozny, Bianca and Elkan, Charles. Transforming classifier scores into accurate
multiclass probability estimates. In KDD, pp. 694–699, 2002.
5. Naeini, Mahdi Pakdaman, Cooper, Gregory F, and Hauskrecht, Milos. Obtaining well calibrated probabilities using Bayesian binning. In AAAI, pp. 2901, 2015.
6. Platt, John et al. Probabilistic outputs for support vector machines and comparisons to
regularized likelihood methods. Advances in large margin classifiers, 10(3): 61–74, 1999.
7. Guo, Chuan and Pleiss, Geoff and Sun, Yu and Weinberger, Kilian Q. On Calibration of
Modern Neural Networks. In ICML 2017.