TDC2018FLN | Machine Learning Track - Explainable Machine Learning

1. Explainable Machine Learning Gabriel Cypriano TDC 2018 Floripa

2. Creditas

3. Do we need explanations for predictions?

4. What if we use linear models?
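A linear model is easy to explain because each prediction splits exactly into one additive term per feature. A minimal sketch, in which the intercept, coefficients, and instance values are all made up for illustration (only the feature names are borrowed from slide 7):

```python
# Hypothetical linear model: prediction = intercept + sum(coef * value).
intercept = 20.0
coefs = {"RM": 4.0, "LSTAT": -0.5, "NOX": -10.0}
instance = {"RM": 6.0, "LSTAT": 4.0, "NOX": 0.5}

# Each feature's contribution to this one prediction is coef * value.
contributions = {name: coefs[name] * instance[name] for name in coefs}
prediction = intercept + sum(contributions.values())

# List the features by the size of their effect on this prediction.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>6}: {value:+.2f}")
print(f"prediction = {prediction:.2f}")
```

The same "baseline plus per-feature terms" shape is what the tree-based explanations later in the deck try to recover for non-linear models.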

5. But... what if we can train a Random Forest that performs much better?

6. Random Forest Feature Importances

7. Decision Paths Predictions for: RM = 3.1, LSTAT = 4.5, NOX = 0.54, DIS = 2.6 http://blog.datadive.net/interpreting-random-forests

18. treeinterpreter: Interpreting predictions with Decision Paths
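The decision-path idea can be sketched without any library: walk the instance down the tree and credit each change in node value to the feature that was split on, so that the prediction equals the root value (the bias) plus the sum of the contributions. The tree below, its thresholds, and its node values are hypothetical; only the instance values come from slide 7. For scikit-learn models, the treeinterpreter package returns the same kind of decomposition (prediction, bias, contributions).

```python
# Hypothetical regression tree. "value" is the mean training target at a node.
tree = {
    "value": 22.5, "feature": "RM", "threshold": 6.5,
    "left": {
        "value": 19.5, "feature": "LSTAT", "threshold": 10.0,
        "left": {"value": 23.0},
        "right": {"value": 15.0},
    },
    "right": {
        "value": 35.0, "feature": "NOX", "threshold": 0.6,
        "left": {"value": 38.0},
        "right": {"value": 28.0},
    },
}


def explain(tree, instance):
    """Return (bias, contributions) such that prediction == bias + sum(contributions)."""
    bias = tree["value"]
    contributions = {}
    node = tree
    while "feature" in node:  # leaves have no "feature" key
        feature = node["feature"]
        go_left = instance[feature] <= node["threshold"]
        child = node["left"] if go_left else node["right"]
        # Credit the change in node value to the feature used for this split.
        delta = child["value"] - node["value"]
        contributions[feature] = contributions.get(feature, 0.0) + delta
        node = child
    return bias, contributions


instance = {"RM": 3.1, "LSTAT": 4.5, "NOX": 0.54, "DIS": 2.6}
bias, contributions = explain(tree, instance)
prediction = bias + sum(contributions.values())
print(bias, contributions, prediction)
```

For a Random Forest, the same decomposition is computed per tree and then averaged across trees.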

23. treeinterpreter: Explaining the difference between 2 datasets with Decision Paths

27. treeinterpreter: What about interactions between features?

28. treeinterpreter: Classification on the Iris dataset

32. treeinterpreter / Pivotal: Explanations with dataviz

33. treeinterpreter / Pivotal: Contribution vs. Feature Value (1 Decision Tree)

34. treeinterpreter / Pivotal: Contribution vs. Feature Value (Random Forest)

35. treeinterpreter / Pivotal: Contribution vs. Feature Value (single Decision Tree)

36. treeinterpreter / Pivotal: Contribution vs. Feature Value (Random Forest)

37. treeinterpreter / Pivotal: Classification, Violin Plot of Feature Contributions (for the “Infant” class)

38. treeinterpreter / Pivotal: Classification, Contributions to each class vs. Feature Value

39. How do we use this with Boosted Trees (e.g., XGBoost, LightGBM)? Instead of averaging the contributions of the trees, we only need to sum them. Available in the following package: ● ELI5
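The only difference between the two ensemble types is the final aggregation step. A minimal sketch, assuming the per-tree contributions for one instance have already been computed from the decision paths; the numbers and the `combine` helper are hypothetical, and the feature names are Titanic columns chosen for illustration:

```python
# Hypothetical per-tree contributions for a single instance.
# A feature missing from a tree's dict was not on that tree's decision path.
per_tree = [
    {"Sex": 0.5, "Age": -0.25},
    {"Sex": 0.25, "Fare": 0.5},
    {"Sex": 0.75, "Age": -0.75},
    {"Sex": 0.5, "Fare": 0.5},
]


def combine(per_tree, how):
    """Combine per-tree contributions: 'mean' for bagging, 'sum' for boosting."""
    totals = {}
    for contributions in per_tree:
        for feature, value in contributions.items():
            totals[feature] = totals.get(feature, 0.0) + value
    if how == "sum":
        return totals
    if how == "mean":
        # Divide by the number of trees, not by how often the feature appears.
        return {feature: total / len(per_tree) for feature, total in totals.items()}
    raise ValueError(f"unknown aggregation: {how!r}")


random_forest_style = combine(per_tree, "mean")  # trees vote, so average
boosting_style = combine(per_tree, "sum")        # trees add up, so sum
print(random_forest_style)
print(boosting_style)
```

Summing is the natural choice for boosting because the ensemble output is itself the sum of the tree outputs, whereas a Random Forest prediction is their average.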

40. ELI5: XGBoost Feature Importances (Titanic dataset)

41. ELI5: XGBoost predictions on the Titanic dataset

42. Model-agnostic explanations, e.g., for models that are not tree-based

43. Lime ● Local approximations ● Model-agnostic ● Can select a set of representative instances for which to display explanations
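The "local approximation" idea can be illustrated in one dimension: perturb the instance, query the black-box model, weight the perturbed points by proximity, and fit a simple linear model whose slope serves as the explanation. This is a simplified sketch and not the lime package itself; the `black_box` function, the deterministic perturbation grid (Lime samples randomly), and the kernel width are assumptions made for illustration.

```python
import math


def black_box(x):
    """Stand-in for any opaque model; here a simple nonlinear function."""
    return x ** 2


def local_linear_explanation(predict, x0, width=0.5, steps=10):
    """Fit a proximity-weighted line to `predict` around x0; return (slope, intercept)."""
    # Perturb the instance on a symmetric grid around x0.
    offsets = [width * k / steps for k in range(-steps, steps + 1)]
    xs = [x0 + d for d in offsets]
    ys = [predict(x) for x in xs]
    # Gaussian-style kernel: perturbations closer to x0 get larger weights.
    ws = [math.exp(-((d / width) ** 2)) for d in offsets]

    # Closed-form weighted least squares for a single feature.
    total = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / total
    y_mean = sum(w * y for w, y in zip(ws, ys)) / total
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    return slope, y_mean - slope * x_mean


slope_at_3, _ = local_linear_explanation(black_box, x0=3.0)
slope_at_minus_1, _ = local_linear_explanation(black_box, x0=-1.0)
print(f"local effect of x near 3:  {slope_at_3:+.2f}")
print(f"local effect of x near -1: {slope_at_minus_1:+.2f}")
```

The two slopes have opposite signs, which is the point of a local explanation: the same feature can push the prediction up for one instance and down for another, even though the surrogate only ever queries the model's outputs.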

44. Lime: Explanation

45. Lime: uses superpixels to explain object recognition in images

46. Lime: object recognition in images

47. Lime for Natural Language Processing

48. More use cases Amazon, Netflix

49. More use cases ● Understand whether the model learns from the right features / overfits on specific features ● Identify data leakage ● Dataset shift (training data differs from test data) ● Pneumonia/asthma case ● Stripe case Amazon, Netflix

54. ● Useful not only when things are not working well ● Different costs for different types of error

55. References ● Interpreting Random Forests ● Random forest interpretation with scikit-learn ● Random forest interpretation – conditional feature contributions ● Interpreting Decision Trees and Random Forests ● XGBoost Decision Paths ● Explaining XGBoost predictions on the Titanic dataset ● “Why Should I Trust You?” Explaining the Predictions of Any Classifier

56. References (podcasts) ● TWiML: Exploring Black Box Predictions with Sam Ritchie ● TWiML: Carlos Guestrin – Explaining the Predictions of Machine Learning Models ● Data Skeptic: Marco Ribeiro - Trusting Machine Learning Models With Lime

57. Tools ● treeinterpreter ● Lime ● ELI5

58. Thank you! gabrielcs.me vagas.creditas.com.br
