The document discusses the importance of interpretability and explainability in machine learning models. It provides examples of how "black box" algorithms can have harmful and unsafe outcomes when used without understanding how they work. It advocates for techniques that allow humans to explore model predictions, understand how variables contribute to outcomes, and debug models when needed. These types of interpretable machine learning approaches will change how predictive models are developed and used.
8. Cathy O'Neil: The era of blind faith in big data must end
• “You don’t see a lot of skepticism,” she says. “The algorithms are like shiny new toys that we can’t resist using. We trust them so much that we project meaning on to them.”
• Ultimately algorithms, according to O’Neil, reinforce discrimination and widen inequality, “using people’s fear and trust of mathematics to prevent them from asking questions”.
https://www.theguardian.com/books/2016/oct/27/cathy-oneil-weapons-of-math-destruction-algorithms-big-data
Why do we need explanations for complex models?
24. Input: a 4-year-old passenger from 1st class who paid 72 for the ticket.
What is the contribution of each variable to the final odds?
(model: Random Forest)
Random Forest prediction: 0.422
iBreakDown: Uncertainty of Model Explanations for Non-additive Predictive Models. Alicja Gosiewska, Przemyslaw Biecek (2019). https://arxiv.org/abs/1903.11420v1
25. Additive attributions of a model prediction via a sequence of conditionings: at each step the expected model prediction is conditioned on one more variable, and the value added by variable l is the change in this expectation at its position in the sequence; the final attribution of each variable is its increment.
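The sequence-of-conditionings idea above can be sketched in a few lines of plain Python. The dataset, the scoring function, and the variable order below are all hypothetical stand-ins (the slides use a trained Random Forest on Titanic data); the point is only the mechanics: condition on one more variable at a time and record how the expected prediction changes.

```python
import statistics

# Hypothetical toy "dataset": each row is a dict of features.
data = [
    {"age": 4,  "class": 1, "fare": 72},
    {"age": 30, "class": 2, "fare": 20},
    {"age": 50, "class": 3, "fare": 8},
    {"age": 8,  "class": 1, "fare": 60},
    {"age": 40, "class": 3, "fare": 10},
]

def model(row):
    # Hypothetical scoring function standing in for a trained model.
    score = 0.5
    score += 0.2 if row["class"] == 1 else -0.1
    score += 0.15 if row["age"] < 16 else 0.0
    score += 0.001 * row["fare"]
    return min(max(score, 0.0), 1.0)

def expected_prediction(fixed):
    """E[f(X) | X_S = fixed]: average the model over the data
    with the conditioned variables replaced by fixed values."""
    return statistics.mean(model({**row, **fixed}) for row in data)

def break_down(instance, order):
    """Attribute the prediction to variables, conditioning one at a time."""
    attributions, fixed = {}, {}
    prev = expected_prediction(fixed)      # mean prediction (baseline)
    for var in order:
        fixed[var] = instance[var]
        cur = expected_prediction(fixed)
        attributions[var] = cur - prev     # value added by this variable
        prev = cur
    return attributions

instance = {"age": 4, "class": 1, "fare": 72}
attr = break_down(instance, ["class", "age", "fare"])
```

By construction the increments telescope, so the attributions sum exactly to the instance's prediction minus the mean prediction; for non-additive models the increments depend on the chosen order, which is what iBreakDown's uncertainty analysis quantifies.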
26. Conditional distributions, read from top to bottom.
34. What If?
Input: a 42-year-old passenger from 1st class who paid 72 for the ticket.
A logistic regression model predicts a 0.32 probability of survival.
What would happen if….
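A what-if (ceteris paribus) profile simply re-evaluates the model while one variable changes and all others stay fixed. A minimal sketch, assuming a hand-written logistic model with illustrative coefficients (not the fitted model from the slides, so the numbers will not match the 0.32 above):

```python
import math

# Hypothetical logistic-regression coefficients, for illustration only.
INTERCEPT = -1.2
COEF = {"age": -0.02, "class1": 1.1, "fare": 0.005}

def predict_survival(age, is_first_class, fare):
    z = (INTERCEPT + COEF["age"] * age
         + COEF["class1"] * is_first_class + COEF["fare"] * fare)
    return 1.0 / (1.0 + math.exp(-z))

# The observed instance: 42 years old, 1st class, fare 72.
base = predict_survival(42, 1, 72)

# Ceteris paribus: vary age only, keep class and fare fixed.
what_if = {age: predict_survival(age, 1, 72) for age in range(0, 81, 10)}
```

Plotting `what_if` against age gives the ceteris-paribus curve through the instance; the same loop over class or fare answers the other "what would happen if…" questions.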
41. Defaults (package defaults (Def.P) and optimal defaults (Def.O)), tunability of the hyperparameters with the package defaults (Tun.P) and our optimal defaults (Tun.O) as reference, and tuning-space quantiles (q0.05 and q0.95) for different parameters of the algorithms.
42. Let’s focus on a single dataset: 334.
For a selected class of models (here: Random Forest) we can learn how the model performance depends on the hyperparameters.
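The performance-vs-hyperparameter idea can be sketched without any ML library. The example below is a hypothetical stand-in for the slides' Random Forest study: a tiny k-nearest-neighbours classifier on synthetic data, where sweeping the hyperparameter k and recording test accuracy shows the same kind of performance profile.

```python
import random

random.seed(0)

# Hypothetical synthetic data: two overlapping 1-D classes.
train = ([(random.gauss(0, 1), 0) for _ in range(50)]
         + [(random.gauss(2, 1), 1) for _ in range(50)])
test = ([(random.gauss(0, 1), 0) for _ in range(50)]
        + [(random.gauss(2, 1), 1) for _ in range(50)])

def knn_predict(x, k):
    # Majority vote among the k nearest training points.
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes * 2 > k else 0

def accuracy(k):
    hits = sum(knn_predict(x, k) == y for x, y in test)
    return hits / len(test)

# Performance as a function of the hyperparameter k.
perf = {k: accuracy(k) for k in (1, 3, 5, 15, 51)}
```

Plotting `perf` over k is the one-dimensional analogue of the hyperparameter-dependence plots on the slides; for a Random Forest the same sweep would run over e.g. the number of trees or minimum node size.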
51. AutoIML
Use a good black-box model (e.g. one trained with AutoML) and extract an interpretable model from it.
52. Preliminary results for the FICO data: xgboost is used as a surrogate to construct a logistic regression model.
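The slides don't spell out the extraction procedure, but the common recipe is distillation: probe the black box, then fit the interpretable model to its predictions rather than to the raw labels. A minimal sketch under those assumptions, with a hand-written function standing in for the xgboost/AutoML black box and a logistic surrogate fitted by plain gradient descent:

```python
import math
import random

random.seed(1)

def black_box(x1, x2):
    # Hypothetical black-box model (stands in for xgboost / AutoML output).
    return 1.0 / (1.0 + math.exp(-(1.5 * x1 - x2 + 0.5 * x1 * x2)))

# Probe the black box on random inputs; its outputs become soft targets.
X = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(500)]
y = [black_box(a, b) for a, b in X]

# Fit a logistic-regression surrogate to the black-box predictions
# by batch gradient descent on cross-entropy with soft targets.
w1 = w2 = b = 0.0
lr = 0.5
for _ in range(300):
    g1 = g2 = gb = 0.0
    for (a, c), t in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-(w1 * a + w2 * c + b)))
        d = p - t
        g1 += d * a
        g2 += d * c
        gb += d
    n = len(X)
    w1 -= lr * g1 / n
    w2 -= lr * g2 / n
    b -= lr * gb / n
```

The fitted weights recover the black box's main effects (the interaction term averages out), giving an interpretable approximation whose fidelity can then be audited against the original model.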
54. MDP :: Model Development Process
Phases: problem formulation, crisp modelling, fine tuning, maintaining (iterations P1, C1, C2, F1, F2, M1, M2, M3 over time).
Tasks: data acquisition, data cleaning, data exploration, data validation, data preparation, data understanding, sample selection, feature engineering, feature selection, model selection, parameter tuning, model validation, model assembly, model audit, model benchmarking, model deployment, model delivery, documentation, communication.
Techniques for explanation and exploration will change the way we build predictive models.
55. IML in R: DALEX, iml, mlr3vis(?), …
IML in python: ELI5, skater, xai, SHAP, lime, …
Other tools: H2O, …
56. An Introduction to Machine Learning Interpretability
Navdeep Gill, Patrick Hall
https://www.h2o.ai/oreilly-mli-booklet-2019/
Interpretable Machine Learning
Christoph Molnar
https://christophm.github.io/interpretable-ml-book/
Predictive Models: Explore, Explain, and Debug
Przemyslaw Biecek and Tomasz Burzykowski
https://pbiecek.github.io/PM_VEE/