© British Telecommunications plc 2019
Dr Detlef Nauck
Chief Research Scientist for Data Science
Applied Research
BT
Test Driven Machine Learning
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/tdd-ml
Presented at QCon London
www.qconlondon.com
Cartoon: Analytics, Advanced Analytics, Machine Learning (https://www.instagram.com/ground.control.toons)
AI – ML – DS Taxonomy
So what are AI, Machine Learning and Data Science?
AI
“Artificial intelligence is that activity devoted to making machines intelligent, and intelligence is that quality that
enables an entity to function appropriately and with foresight in its environment.”
Nils J. Nilsson, The Quest for Artificial Intelligence: A History of Ideas and Achievements
(Cambridge, UK: Cambridge University Press, 2010).
Machine Learning
Field in Computer Science and AI that looks for algorithms that can automatically improve their
performance at a task without explicit programming but by observing relevant data.
Data Science
Data Science uses statistics and machine learning for (data) analytics – the scientific process of turning data
into insight for making better decisions.
(INFORMS - The Institute for Operations Research and the Management Sciences)
The Ethical Dimension of AI, ML and Data Science
What should AI and Data Science be used for, what should they not be
used for, and what is their impact on society, for example on
privacy and the automation of jobs?
GDPR
For example, no black-box models if decisions
significantly affect a person. Organisations must
be able to explain their algorithmic decision
making under certain conditions.
Bias – (unfair) under / over-representation of
subgroups in your data
Bias in data and models not only makes models
perform worse, it can also damage a brand.
http://www.wsj.com/articles/wisconsin-supreme-court-to-rule-on-predictive-algorithms-used-in-sentencing-1465119008
https://www.wired.com/story/how-coders-are-fighting-bias-in-facial-recognition-software/
https://www.theguardian.com/technology/2018/aug/29/coding-algorithms-frankenalgos-program-danger
A Report from the Coalface: Naïve Bayes – From R to Spark ML
A team built a naïve Bayes classifier in R (e1071 package)
using numerical and categorical features.
The team then had to rebuild the model in Spark ML, which can
only use categorical features encoded as numbers.
The team interpreted this as meaning that any numbers are fine, i.e.
that numerical features can be used too.
The team converted the categorical features into
numbers and kept the numerical features as they were.
Fortunately, there were negative values in the numerical
features, which the Spark ML method refused to accept.
Things can get tricky when you switch data analysis
methods and environments without proper testing.
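An explicit test on the model inputs could have caught the mishap above earlier. The following is a minimal sketch in plain Python, not Spark ML code. It assumes a hypothetical rule that every feature handed to a categorical-only learner must be a non-negative integer code; the function name and the sample rows are invented for illustration.

```python
def invalid_category_codes(rows):
    """Return (row_index, column_index, value) for every value that is
    not a non-negative integer code (assumed rule, see lead-in)."""
    problems = []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or value < 0:
                problems.append((i, j, value))
    return problems


# Hypothetical feature rows after "converting everything to numbers".
rows = [
    [0, 2, 1],     # properly encoded categories
    [1, 0, 3.7],   # a numerical feature that was kept as it was
    [2, 1, -4],    # a negative numerical value
]
problems = invalid_category_codes(rows)
print(problems)
```

Run before model building, a check like this reports the offending cells instead of relying on the learner to reject them.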
A Report from the Coalface: Naïve Bayes in R
A Report from the Coalface: Naïve Bayes in Spark ML
Some Learning Algorithms Complain about Wrong Data
Machine Learning can work really well – but it doesn’t always
Image recognition with deep networks is statistically impressive, but individually unreliable
a young boy is holding
a baseball bat
(Produced by a Deep Network trained for image recognition and automatic image labelling)
Source: Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan (Google): Show and Tell: A Neural Image Caption Generator (https://arxiv.org/pdf/1411.4555.pdf)
https://xkcd.com/1838/
Challenges in Analytics (especially Machine Learning)
Data Provenance
Where does the data come from, how was it collected and can I
trust it?
Data Quality
Is the data error free?
Data Bias
Does the data accurately reflect the population/situation I want
to model? What is missing?
Model Bias
Driven by unconscious bias or data bias – does my model treat
everybody fairly?
Model Comprehension
Why are the outputs of my (black box) models the way they are?
Ethics
Can I use this data? Do decisions made by models affect people?
Is there discriminatory bias? …
The Process of Building Models from Data
"CRISP-DM Process Diagram" by Kenneth Jensen - Own work. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:CRISP-DM_Process_Diagram.png#/media/File:CRISP-DM_Process_Diagram.png
CRISP-DM: Cross Industry Standard Process for Data Mining
Data Wrangling: includes data access, exploratory data analysis and data transformation. 70%-90% of project effort.
95% of failed projects fail at deployment.
Analytics, Machine Learning: building and testing models.
Reality check – does it solve the business problem?
Perception Problem about Machine Learning
The Data Mine
The AI & Machine Learning Office
This is where you would like to work
This is where you will spend most of your time
Test Driven Analytics (TDA) – Learn from Software Development
Test-driven development (TDD) is an advanced technique
of using automated unit tests to drive the design of
software and force decoupling of dependencies. The
result of using this practice is a comprehensive suite of
unit tests that can be run at any time to provide feedback
that the software is still working. (Microsoft: Guidelines for
test-driven development)
How do you make sure that your analysis
is correct?
• Is your data any good?
• Have you validated your model?
• Do decisions taken have the expected
business impact?
Follow Best Practice for AI, ML and Data Science – Test Everything!
Diagram, Build: Data Repository, Data wrangling, Feature generation, Model building; each stage has versioning and testing.
Diagram, In Production: Model deployment, Model application (prediction), Decision making; with logging and testing.
The numbers 1 to 7 in the diagram refer to the test scenarios below.
Test Scenarios
1. Data quality and completeness
2. Data errors and consistency
3. Feature quality
4. Code errors, model quality
5. Model scalability
6. Model quality, feature consistency
7. Control groups, business impact
Testing Your Model: Cross-Validation
In a nutshell:
• Partition your data into k random subsets of equal size
• Use k-1 subsets to train your model and the remaining subset to test (validate) it
• Repeat k times, i.e. use each subset for testing once
• Then, ideally, repeat the whole procedure n times, i.e. do this on n different partitions.
In practice
• k=10 or k=5, and n=1
• Aim for k=10 and n>1
Why are we doing this?
• We want to know how strongly the selection of training data impacts model building
• We want an estimate of how the model might perform on future data
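The partitioning scheme above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not a library API: the function names, the seed handling and the toy sizes (20 rows, k=5, n=2) are made up, and model training itself is left out.

```python
import random


def k_fold_indices(n_rows, k, seed):
    """Randomly partition the row indices 0..n_rows-1 into k folds of
    (nearly) equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cross_validation(n_rows, k=10, n=1, seed=0):
    """Yield (repetition, fold, train_idx, test_idx) for n repetitions
    of k-fold cross-validation, each on a different random partition."""
    for rep in range(n):
        folds = k_fold_indices(n_rows, k, seed + rep)
        for f, test_idx in enumerate(folds):
            # Train on the other k-1 folds, test on the held-out fold.
            train_idx = [i for g, fold in enumerate(folds) if g != f
                         for i in fold]
            yield rep, f, train_idx, test_idx


splits = list(repeated_cross_validation(n_rows=20, k=5, n=2))
print(len(splits))  # one entry per fold and repetition
```

Within one repetition every row is held out for testing exactly once and used for training in the other k-1 folds.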
Cross-Validation: What to look for
Compare the performance statistics of training and test (validation) sets
• You are looking for the training performance to be better (but only slightly) than the test performance
• You are looking for consistent figures across all folds
• If the training performance is much better than the test performance, the model is overfitting
• If the test performance is much better than the training performance, something is wrong
(e.g. data leakage)
• If the figures vary a lot, you probably do not have enough data, or you have the wrong data
Check the model structure in each fold
• You want the structure (selected features, discovered rules, …) to ideally be the same across all folds
• Different model structures mean that data selection is influencing the model build
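The checks above can be turned into a small automated test. The sketch below assumes one (training score, test score) pair per fold, with higher scores being better. The thresholds `max_gap` and `max_spread` are arbitrary illustrative values, not recommendations, and all names are hypothetical.

```python
from statistics import mean, pstdev


def diagnose_folds(scores, max_gap=0.05, max_spread=0.05):
    """scores: list of (train_score, test_score), one pair per fold.
    Returns a list of warnings; an empty list means nothing was flagged."""
    train = [s[0] for s in scores]
    test = [s[1] for s in scores]
    gap = mean(train) - mean(test)
    findings = []
    if gap > max_gap:
        findings.append("possible overfitting")
    if gap < -max_gap:
        findings.append("test better than training: check for data leakage")
    if pstdev(test) > max_spread:
        findings.append("inconsistent folds: not enough data or the wrong data")
    return findings


def same_structure(feature_sets):
    """True if every fold selected the same set of features."""
    return len({frozenset(f) for f in feature_sets}) <= 1


healthy = diagnose_folds([(0.91, 0.89), (0.90, 0.88), (0.92, 0.90)])
overfit = diagnose_folds([(0.99, 0.70), (0.98, 0.72), (0.99, 0.71)])
leaky = diagnose_folds([(0.70, 0.95), (0.71, 0.96), (0.69, 0.94)])
print(healthy, overfit, leaky)
```

Comparing feature sets is only one way to compare model structure; rule-based models would need a comparison of the discovered rules instead.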
The Most Misused Term in Data Science, AI & Machine Learning
Avoid using it!
Performance of Binary Classifiers
Confusion Matrix (rows: Data, columns: Model)

               | Model: Yes (1)                     | Model: No (0)                       | ∑
Data: Yes (1)  | True Positive (TP)                 | False Negative (FN – Type II error) | Condition Positive (CP)
Data: No (0)   | False Positive (FP – Type I error) | True Negative (TN)                  | Condition Negative (CN)
∑              | Predicted Condition Positive (PCP) | Predicted Condition Negative (PCN)  | Total Sample (N)

All Data: N = (TP + FN + FP + TN) = (CP + CN) = (PCP + PCN)
Base Rate (Prevalence): CP / N
Accuracy: (TP + TN) / N
Error: (FP + FN) / N
True Positive Rate (Recall, Detection Rate, Sensitivity): TP / (TP + FN)
False Positive Rate (Fall Out): FP / (FP + TN)
True Negative Rate (Specificity): TN / (FP + TN) = 1 – FPR
Positive Predictive Value (Precision): TP / (TP + FP)
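The formulas above translate directly into code. This is a minimal sketch with invented counts; it assumes every denominator is non-zero, which a real implementation would have to check.

```python
def binary_metrics(tp, fn, fp, tn):
    """Performance statistics of a binary classifier, computed from the
    four cells of the confusion matrix."""
    n = tp + fn + fp + tn
    return {
        "base_rate": (tp + fn) / n,
        "accuracy": (tp + tn) / n,
        "error": (fp + fn) / n,
        "recall": tp / (tp + fn),
        "fall_out": fp / (fp + tn),
        "specificity": tn / (fp + tn),
        "precision": tp / (tp + fp),
    }


# Invented example: 1,000 cases, of which only 10 are positive.
m = binary_metrics(tp=5, fn=5, fp=10, tn=980)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With a base rate of 1% the accuracy is 98.5%, yet the model finds only half of the positive cases and two out of three positive predictions are wrong.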
Test Your Data!
While you pre-process your data or while you conduct
exploratory data analysis you should gain some insight into
properties of the data.
Think about which properties are essential and which you
may want to check or guarantee before you conduct any
further analysis.
Write some code to demonstrate that your data is correct.
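What such code might look like, as a minimal sketch in plain Python. The customer table, its column names and the expected properties (unique IDs, plausible ages, known tariffs) are invented for illustration.

```python
def check_customers(rows):
    """Raise AssertionError if the (hypothetical) customer table violates
    one of the properties we rely on in later analysis."""
    assert len(rows) > 0, "data set is empty"
    ids = [r["customer_id"] for r in rows]
    assert len(ids) == len(set(ids)), "customer_id is not unique"
    for r in rows:
        assert r["age"] is not None, f"missing age for customer {r['customer_id']}"
        assert 0 <= r["age"] <= 120, f"implausible age {r['age']}"
        assert r["tariff"] in {"basic", "plus", "premium"}, f"unknown tariff {r['tariff']}"
    return True


good_rows = [
    {"customer_id": 1, "age": 34, "tariff": "basic"},
    {"customer_id": 2, "age": 58, "tariff": "premium"},
]
bad_rows = [
    {"customer_id": 1, "age": 34, "tariff": "basic"},
    {"customer_id": 2, "age": 230, "tariff": "plus"},
]

good_ok = check_customers(good_rows)
try:
    check_customers(bad_rows)
    bad_message = None
except AssertionError as err:
    bad_message = str(err)
print(good_ok, bad_message)
```

Python skips `assert` statements when run with the `-O` flag, so checks that must always run are better raised as explicit exceptions.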
assertr – verify() tests conditions on a data frame
… but if it fails it cannot point to the offending value
assertr – assert() tests a predicate on a list of columns
The function is internally using the select() function from dplyr
assertr – insist() dynamically determines a predicate function
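The difference between a fixed predicate and a dynamically determined one can be sketched in Python. This is an analogue written for illustration, not the assertr API: the function names are hypothetical (the name `within_n_sds` is borrowed from assertr and re-implemented here), and the data are invented.

```python
from statistics import mean, stdev


def offending_rows(rows, predicate, columns):
    """Apply a fixed predicate to every value in the given columns and
    return (row_index, column, value) for each failure."""
    return [(i, c, r[c]) for i, r in enumerate(rows)
            for c in columns if not predicate(r[c])]


def within_n_sds(n):
    """Predicate generator: given a column's values, build a predicate that
    accepts values within n standard deviations of the column mean."""
    def make_predicate(values):
        m, s = mean(values), stdev(values)
        return lambda x: m - n * s <= x <= m + n * s
    return make_predicate


def offending_rows_dynamic(rows, predicate_generator, columns):
    """Derive one predicate per column from that column's data, then check."""
    failures = []
    for c in columns:
        predicate = predicate_generator([r[c] for r in rows])
        failures += offending_rows(rows, predicate, [c])
    return failures


rows = [{"speed": v} for v in [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]]
static_failures = offending_rows(rows, lambda x: x >= 0, ["speed"])
dynamic_failures = offending_rows_dynamic(rows, within_n_sds(2), ["speed"])
print(static_failures, dynamic_failures)
```

Because the checks work per value, a failure can be traced to the offending row, which a single condition on the whole data frame cannot do.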
R Packages for Assertive Programming
Tony Fischetti's assertr
(https://github.com/tonyfischetti/assertr)
Hadley Wickham's assertthat
(https://github.com/hadley/assertthat)
Richard Cotton’s assertive
(https://bitbucket.org/richierocks/assertive)
Michel Lang's checkmate
(https://github.com/mllg/checkmate)
Stefan Bache's ensurer
(https://github.com/smbache/ensurer)
Gaston Sanchez's tester
(https://github.com/gastonstat/tester)
Python – TDDA by Nick Radcliffe (http://tdda.info)
Research Trends in Data Science: Supporting Tools
Quilt: data registry, data versioning and checks (http://quiltdata.com)
FeatureHub: register features and discover features created by others (https://github.com/HDI-Project/FeatureHub)
ModelDB: manage ML models (https://mitdbg.github.io/modeldb/)
TopNotch: quality controlling large scale data sets (https://github.com/blackrock/TopNotch)
MLflow: manage ML lifecycle (https://mlflow.org/)
Categories: Data Quality, Feature Crafting, Model Management
Future of AI: Explainable AI (XAI)
AI systems that will have to explain to us what they’re doing and why.
Think explainability by design.
Model-based design could be a way.
(see also http://www.darpa.mil/program/explainable-artificial-intelligence)
http://explainableai.com
http://www.bbc.co.uk/news/technology-43514566
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
tdd-ml

Mais conteúdo relacionado

Mais procurados

Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!
Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!
Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!Applitools
 
Test Automation - Principles and Practices
Test Automation - Principles and PracticesTest Automation - Principles and Practices
Test Automation - Principles and PracticesAnand Bagmar
 
ML-Ops how to bring your data science to production
ML-Ops  how to bring your data science to productionML-Ops  how to bring your data science to production
ML-Ops how to bring your data science to productionHerman Wu
 
Automation Testing using Selenium
Automation Testing using SeleniumAutomation Testing using Selenium
Automation Testing using SeleniumNaresh Chintalcheru
 
automation testing benefits
automation testing benefitsautomation testing benefits
automation testing benefitsnazeer pasha
 
Automated vs manual testing
Automated vs manual testingAutomated vs manual testing
Automated vs manual testingKanoah
 
Designing APIs with OpenAPI Spec
Designing APIs with OpenAPI SpecDesigning APIs with OpenAPI Spec
Designing APIs with OpenAPI SpecAdam Paxton
 
Kubernetes Selenium Grid
Kubernetes Selenium GridKubernetes Selenium Grid
Kubernetes Selenium GridAmrit pal singh
 
How to test an AI application
How to test an AI applicationHow to test an AI application
How to test an AI applicationKari Kakkonen
 
DevOps Overview
DevOps OverviewDevOps Overview
DevOps OverviewSagar Mody
 
API Security Survey
API Security SurveyAPI Security Survey
API Security SurveyImperva
 
Test Environment Management
Test Environment ManagementTest Environment Management
Test Environment ManagementKanoah
 
Introduction to Azure DevOps
Introduction to Azure DevOpsIntroduction to Azure DevOps
Introduction to Azure DevOpsLorenzo Barbieri
 
SDET approach for Agile Testing
SDET approach for Agile TestingSDET approach for Agile Testing
SDET approach for Agile TestingGopikrishna Kannan
 

Mais procurados (20)

Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!
Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!
Cypress, Playwright, Selenium, or WebdriverIO? Let the Engineers Speak!
 
Test Automation - Principles and Practices
Test Automation - Principles and PracticesTest Automation - Principles and Practices
Test Automation - Principles and Practices
 
ML-Ops how to bring your data science to production
ML-Ops  how to bring your data science to productionML-Ops  how to bring your data science to production
ML-Ops how to bring your data science to production
 
Automation Testing using Selenium
Automation Testing using SeleniumAutomation Testing using Selenium
Automation Testing using Selenium
 
DevOps introduction
DevOps introductionDevOps introduction
DevOps introduction
 
automation testing benefits
automation testing benefitsautomation testing benefits
automation testing benefits
 
Automated vs manual testing
Automated vs manual testingAutomated vs manual testing
Automated vs manual testing
 
Intro to Azure DevOps
Intro to Azure DevOpsIntro to Azure DevOps
Intro to Azure DevOps
 
Designing APIs with OpenAPI Spec
Designing APIs with OpenAPI SpecDesigning APIs with OpenAPI Spec
Designing APIs with OpenAPI Spec
 
Kubernetes Selenium Grid
Kubernetes Selenium GridKubernetes Selenium Grid
Kubernetes Selenium Grid
 
Automation Testing
Automation TestingAutomation Testing
Automation Testing
 
How to test an AI application
How to test an AI applicationHow to test an AI application
How to test an AI application
 
DevOps - Transforming the Traditional SDLC
DevOps - Transforming the Traditional SDLCDevOps - Transforming the Traditional SDLC
DevOps - Transforming the Traditional SDLC
 
Automation With A Tool Demo
Automation With A Tool DemoAutomation With A Tool Demo
Automation With A Tool Demo
 
DevOps Overview
DevOps OverviewDevOps Overview
DevOps Overview
 
Selenium ppt
Selenium pptSelenium ppt
Selenium ppt
 
API Security Survey
API Security SurveyAPI Security Survey
API Security Survey
 
Test Environment Management
Test Environment ManagementTest Environment Management
Test Environment Management
 
Introduction to Azure DevOps
Introduction to Azure DevOpsIntroduction to Azure DevOps
Introduction to Azure DevOps
 
SDET approach for Agile Testing
SDET approach for Agile TestingSDET approach for Agile Testing
SDET approach for Agile Testing
 

Semelhante a Test-Driven Machine Learning

Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...
Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...
Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...Jon Mead
 
Why many data science projects fail
Why many data science projects fail Why many data science projects fail
Why many data science projects fail Omnia Safaan
 
Machine Learning for Finance Master Class
Machine Learning for Finance Master Class Machine Learning for Finance Master Class
Machine Learning for Finance Master Class QuantUniversity
 
Towards the Industrialization of AI
Towards the Industrialization of AITowards the Industrialization of AI
Towards the Industrialization of AIHui Lei
 
Top machine learning trends for 2022 and beyond
Top machine learning trends for 2022 and beyondTop machine learning trends for 2022 and beyond
Top machine learning trends for 2022 and beyondArpitGautam20
 
Technology Driven Process Powerpoint Presentation Slides
Technology Driven Process Powerpoint Presentation SlidesTechnology Driven Process Powerpoint Presentation Slides
Technology Driven Process Powerpoint Presentation SlidesSlideTeam
 
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...Amazon Web Services Korea
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
 
Making better use of Data and AI in Industry 4.0
Making better use of Data and AI in Industry 4.0Making better use of Data and AI in Industry 4.0
Making better use of Data and AI in Industry 4.0Albert Y. C. Chen
 
Technology Driven Process PowerPoint Presentation Slides
Technology Driven Process PowerPoint Presentation Slides Technology Driven Process PowerPoint Presentation Slides
Technology Driven Process PowerPoint Presentation Slides SlideTeam
 
IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer IBM
 
How Does Data Create Economic Value_ Foundations For Valuation Models.pdf
How Does Data Create Economic Value_ Foundations For Valuation Models.pdfHow Does Data Create Economic Value_ Foundations For Valuation Models.pdf
How Does Data Create Economic Value_ Foundations For Valuation Models.pdfDr. Anke Nestler
 
IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era.  IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era. Benoit Marolleau
 
Test data automation: delivering quality data at speed
Test data automation: delivering quality data at speedTest data automation: delivering quality data at speed
Test data automation: delivering quality data at speedCuriosity Software Ireland
 
Bi PowerPoint Presentation Slides
Bi PowerPoint Presentation SlidesBi PowerPoint Presentation Slides
Bi PowerPoint Presentation SlidesSlideTeam
 
NI Automated Test Outlook 2016
NI Automated Test Outlook 2016NI Automated Test Outlook 2016
NI Automated Test Outlook 2016Hank Lydick
 
STS. Smarter devices. Smarter test systems.
STS. Smarter devices. Smarter test systems.STS. Smarter devices. Smarter test systems.
STS. Smarter devices. Smarter test systems.Hank Lydick
 
Pydata Chicago - work hard once
Pydata Chicago - work hard oncePydata Chicago - work hard once
Pydata Chicago - work hard onceJi Dong
 
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...Edge AI and Vision Alliance
 

Semelhante a Test-Driven Machine Learning (20)

Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...
Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...
Machine Learning: Addressing the Disillusionment to Bring Actual Business Ben...
 
Why many data science projects fail
Why many data science projects fail Why many data science projects fail
Why many data science projects fail
 
Machine Learning for Finance Master Class
Machine Learning for Finance Master Class Machine Learning for Finance Master Class
Machine Learning for Finance Master Class
 
Towards the Industrialization of AI
Towards the Industrialization of AITowards the Industrialization of AI
Towards the Industrialization of AI
 
Top machine learning trends for 2022 and beyond
Top machine learning trends for 2022 and beyondTop machine learning trends for 2022 and beyond
Top machine learning trends for 2022 and beyond
 
Technology Driven Process Powerpoint Presentation Slides
Technology Driven Process Powerpoint Presentation SlidesTechnology Driven Process Powerpoint Presentation Slides
Technology Driven Process Powerpoint Presentation Slides
 
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data Science
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data Science
 
Making better use of Data and AI in Industry 4.0
Making better use of Data and AI in Industry 4.0Making better use of Data and AI in Industry 4.0
Making better use of Data and AI in Industry 4.0
 
Technology Driven Process PowerPoint Presentation Slides
Technology Driven Process PowerPoint Presentation Slides Technology Driven Process PowerPoint Presentation Slides
Technology Driven Process PowerPoint Presentation Slides
 
IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer
 
How Does Data Create Economic Value_ Foundations For Valuation Models.pdf
How Does Data Create Economic Value_ Foundations For Valuation Models.pdfHow Does Data Create Economic Value_ Foundations For Valuation Models.pdf
How Does Data Create Economic Value_ Foundations For Valuation Models.pdf
 
IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era.  IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era.
 
Test data automation: delivering quality data at speed
Test data automation: delivering quality data at speedTest data automation: delivering quality data at speed
Test data automation: delivering quality data at speed
 
Bi PowerPoint Presentation Slides
Bi PowerPoint Presentation SlidesBi PowerPoint Presentation Slides
Bi PowerPoint Presentation Slides
 
NI Automated Test Outlook 2016
NI Automated Test Outlook 2016NI Automated Test Outlook 2016
NI Automated Test Outlook 2016
 
STS. Smarter devices. Smarter test systems.
STS. Smarter devices. Smarter test systems.STS. Smarter devices. Smarter test systems.
STS. Smarter devices. Smarter test systems.
 
Pydata Chicago - work hard once
Pydata Chicago - work hard oncePydata Chicago - work hard once
Pydata Chicago - work hard once
 
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
 

Mais de C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 

Mais de C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video

Test-Driven Machine Learning

  • 1. Test Driven Machine Learning. Dr Detlef Nauck, Chief Research Scientist for Data Science, Applied Research, BT. © British Telecommunications plc 2019
  • 2. InfoQ.com: News & Community Site. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/tdd-ml • Over 1,000,000 software developers, architects and CTOs read the site worldwide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon London www.qconlondon.com
  • 4. Analytics, Advanced Analytics, Machine Learning https://www.instagram.com/ground.control.toons
  • 5. AI – ML – DS Taxonomy
  • 6. So what are AI, Machine Learning and Data Science?
      AI: “Artificial intelligence is that activity devoted to making machines intelligent, and intelligence is that quality that enables an entity to function appropriately and with foresight in its environment.” Nils J. Nilsson, The Quest for Artificial Intelligence: A History of Ideas and Achievements (Cambridge, UK: Cambridge University Press, 2010).
      Machine Learning: A field in computer science and AI that looks for algorithms that can automatically improve their performance at a task, not through explicit programming but by observing relevant data.
      Data Science: Data Science uses statistics and machine learning for (data) analytics, the scientific process of turning data into insight for making better decisions. (INFORMS - The Institute for Operations Research and the Management Sciences)
  • 7. The Ethical Dimension of AI, ML and Data Science
      What should AI and Data Science be used or not used for, and what is the impact on society, such as on privacy and the automation of jobs?
      GDPR: For example, no black-box models if decisions significantly affect a person. Organisations must be able to explain their algorithmic decision making in certain conditions.
      Bias: (unfair) under- or over-representation of subgroups in your data. Bias in data and models not only makes models perform worse, it can also damage a brand.
      http://www.wsj.com/articles/wisconsin-supreme-court-to-rule-on-predictive-algorithms-used-in-sentencing-1465119008
      https://www.wired.com/story/how-coders-are-fighting-bias-in-facial-recognition-software/
  • 8. https://www.theguardian.com/technology/2018/aug/29/coding-algorithms-frankenalgos-program-danger
  • 9. A Report from the Coalface: Naïve Bayes – From R to Spark ML
      A team builds a naïve Bayes classifier in R (e1071 package) using numerical and categorical features.
      The team then had to rebuild the model in Spark ML, which can only use categorical features encoded as numbers.
      The team interpreted this as meaning that any numbers are fine, i.e. that numerical features can be used too.
      The team converted the categorical features into numbers and kept the numerical features as they were.
      Fortunately, the numerical features contained negative values, which the Spark ML method refused to accept.
      Things can get tricky when you switch data analysis methods and environments without proper testing.
  • 10. A Report from the Coalface: Naïve Bayes in R
  • 11. A Report from the Coalface: Naïve Bayes in Spark ML
  • 12. Some Learning Algorithms Complain about Wrong Data
  • 13. Machine Learning can work really well – but it doesn’t always
      Image recognition with deep networks is statistically impressive, but individually unreliable.
      Example caption: “a young boy is holding a baseball bat” (produced by a deep network trained for image recognition and automatic image labelling).
      Source: Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan (Google): Show and Tell: A Neural Image Caption Generator (https://arxiv.org/pdf/1411.4555.pdf)
  • 14. https://xkcd.com/1838/
  • 15. Challenges in Analytics (especially Machine Learning)
      - Data Provenance: Where does the data come from, how was it collected and can I trust it?
      - Data Quality: Is the data error free?
      - Data Bias: Does the data accurately reflect the population/situation I want to model? What is missing?
      - Model Bias: Driven by unconscious bias or data bias. Does my model treat everybody fairly?
      - Model Comprehension: Why are the outputs of my (black box) models the way they are?
      - Ethics: Can I use this data? Do decisions made by models affect people? Is there discriminatory bias? …
  • 16. The Process of Building Models from Data
      Cross Industry Standard Process for Data Mining (CRISP-DM)
      - Data Wrangling: includes data access, exploratory data analysis and data transformation. 70%-90% of project effort.
      - Analytics, Machine Learning: building and testing models.
      - Reality check: does it solve the business problem?
      - 95% of failed projects fail at deployment.
      Figure: "CRISP-DM Process Diagram" by Kenneth Jensen - Own work. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:CRISP-DM_Process_Diagram.png#/media/File:CRISP-DM_Process_Diagram.png
  • 17. Perception Problem about Machine Learning
      The AI & Machine Learning Office: this is where you would like to work.
      The Data Mine: this is where you will spend most of your time.
  • 18. Test Driven Analytics (TDA) – Learn from Software Development
      Test-driven development (TDD) is an advanced technique of using automated unit tests to drive the design of software and force decoupling of dependencies. The result of using this practice is a comprehensive suite of unit tests that can be run at any time to provide feedback that the software is still working. (Microsoft: Guidelines for test-driven development)
      How do you make sure that your analysis is correct?
      - Is your data any good?
      - Have you validated your model?
      - Do decisions taken have the expected business impact?
  • 19. Follow Best Practice for AI, ML and Data Science – Test Everything!
      [Diagram: Build stages (Data Repository, Data wrangling, Feature generation, Model building), each with versioning and testing; In Production stages (Model deployment, Model application (prediction), Decision making), with testing and logging. Numbered markers 1 to 7 on the diagram refer to the test scenarios below.]
      Test Scenarios
      1. Data quality and completeness
      2. Data errors and consistency
      3. Feature quality
      4. Code errors, model quality
      5. Model scalability
      6. Model quality, feature consistency
      7. Control groups, business impact
  • 20. Testing Your Model: Cross-Validation
      In a nutshell:
      - Partition your data into k random subsets of equal size
      - Use k-1 subsets to train your model and the remaining subset to test (validate) it
      - Repeat k times, so that each subset is used for testing exactly once
      - Then, ideally, repeat the whole procedure n times, i.e. do this on n different partitions
      In practice:
      - Commonly k=10 or k=5, and n=1
      - Aim for k=10 and n>1
      Why are we doing this?
      - We want to know how strongly the selection of training data impacts model building
      - We want an estimate of how the model might perform on future data
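
The cross-validation procedure on slide 20 can be sketched in a few lines of plain Python. This is a minimal illustration, not the speaker's code: the function names, the `fit`/`score` callables and the toy majority-class "model" are all assumptions made for the example, and folds are only nearly equal in size when the row count is not divisible by k.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed):
    """Shuffle row indices and split them into k folds of (nearly) equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cross_validation(X, y, fit, score, k=10, n=1, seed=0):
    """Run k-fold cross-validation on n different random partitions.

    Returns one (train_score, test_score) pair per fold and repetition.
    `fit` and `score` are caller-supplied (hypothetical) callables.
    """
    results = []
    for repetition in range(n):
        folds = k_fold_indices(len(X), k, seed + repetition)
        for held_out in range(k):
            test_idx = folds[held_out]
            # Train on the other k-1 folds.
            train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
            X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]
            model = fit(X_train, y_train)
            results.append((score(model, X_train, y_train), score(model, X_test, y_test)))
    return results


# Toy stand-ins so the sketch runs: the "model" is just the majority class.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def accuracy(model, X_eval, y_eval):
    return mean(1.0 if label == model else 0.0 for label in y_eval)
```

Calling `repeated_cross_validation(X, y, fit_majority, accuracy, k=5, n=2)` yields ten score pairs, which can then be compared fold by fold.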
  • 21. Cross-Validation: What to look for
      Compare the performance statistics of training and test (validation) sets:
      - You are looking for the training performance to be better (but only slightly) than the test performance
      - You are looking for consistent figures across all folds
      - If training performance is much better than test performance, the model is overfitting
      - If the test performance is much better than the training performance, something is wrong (e.g. data leakage)
      - If the figures vary a lot, you probably do not have enough data, or you have the wrong data
      Check the model structure in each fold:
      - You want the structure (selected features, discovered rules, …) to ideally be the same across all folds
      - Different model structures mean that data selection is influencing the model build
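
The checks on slide 21 can be turned into an automated test over per-fold scores. The sketch below is one possible reading: it assumes scores where higher is better, and the two tolerance values are arbitrary placeholders that would need to be chosen per project. It does not cover the model-structure check.

```python
from statistics import mean, pstdev


def diagnose_folds(fold_scores, gap_tolerance=0.05, spread_tolerance=0.05):
    """Flag warning signs in a list of (train_score, test_score) pairs.

    Both tolerances are illustrative assumptions, not recommended values.
    """
    train = [t for t, _ in fold_scores]
    test = [v for _, v in fold_scores]
    gap = mean(train) - mean(test)
    findings = []
    if gap > gap_tolerance:
        findings.append("possible overfitting: training is much better than test")
    if gap < -gap_tolerance:
        findings.append("suspicious: test is much better than training (check for data leakage)")
    if pstdev(test) > spread_tolerance:
        findings.append("unstable: test scores vary a lot across folds")
    return findings
```

An empty result means none of the three warning signs was triggered; it does not prove the model is sound.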
  • 22. The Most Misused Term in Data Science, AI & Machine Learning. Avoid using it!
  • 23. Performance of Binary Classifiers
      Confusion matrix (rows: data, columns: model):
                      | Model: Yes (1)                     | Model: No (0)                      | ∑
        Data: Yes (1) | True Positive (TP)                 | False Negative (FN, Type II error) | Condition Positive (CP)
        Data: No (0)  | False Positive (FP, Type I error)  | True Negative (TN)                 | Condition Negative (CN)
        ∑             | Predicted Condition Positive (PCP) | Predicted Condition Negative (PCN) | Total Sample (N)
      All Data: N = TP + FN + FP + TN = CP + CN = PCP + PCN
      Base Rate (Prevalence): CP / N
      Accuracy: (TP + TN) / N
      Error: (FP + FN) / N
      True Positive Rate (Recall, Detection Rate, Sensitivity): TP / (TP + FN)
      False Positive Rate (Fall Out): FP / (FP + TN)
      True Negative Rate (Specificity): TN / (FP + TN) = 1 – FPR
      Positive Predictive Value (Precision): TP / (TP + FP)
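
The formulas on slide 23 translate directly into code. The function below is a small sketch that assumes all denominators are non-zero; the counts in the usage note are made up for illustration.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute the slide's performance statistics from confusion-matrix counts.

    Assumes every denominator is non-zero (no empty class or prediction group).
    """
    n = tp + fn + fp + tn
    return {
        "base_rate": (tp + fn) / n,            # CP / N
        "accuracy": (tp + tn) / n,
        "error": (fp + fn) / n,
        "recall": tp / (tp + fn),              # true positive rate
        "false_positive_rate": fp / (fp + tn),
        "specificity": tn / (fp + tn),         # 1 - false positive rate
        "precision": tp / (tp + fp),           # positive predictive value
    }
```

With hypothetical counts `binary_metrics(tp=5, fn=5, fp=10, tn=980)`, the base rate is 1%, accuracy comes out at 0.985, yet recall is only 0.5 and precision one third, which shows why a single headline figure can mislead on imbalanced data.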
  • 24. Test Your Data! While you pre-process your data or while you conduct exploratory data analysis you should gain some insight into properties of the data. Think about which properties are essential and which you may want to check or guarantee before you conduct any further analysis. Write some code to demonstrate that your data is correct.
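
As a hedged illustration of "write some code to demonstrate that your data is correct", the sketch below checks a few properties of a small table held as a list of dictionaries. The column names (`customer_id`, `age`, `segment`), the allowed values and the age range are invented for the example; real checks would come out of your own exploratory analysis.

```python
ALLOWED_SEGMENTS = {"consumer", "business"}  # hypothetical category values


def find_data_problems(rows):
    """Return human-readable descriptions of every violated data property."""
    if not rows:
        return ["data set is empty"]
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        age = row.get("age")
        if age is None:
            problems.append(f"row {i}: age is missing")
        elif not 0 <= age <= 120:
            problems.append(f"row {i}: age {age} is out of range")
        if row.get("segment") not in ALLOWED_SEGMENTS:
            problems.append(f"row {i}: unknown segment {row.get('segment')!r}")
        if row.get("customer_id") in seen_ids:
            problems.append(f"row {i}: duplicate customer_id {row.get('customer_id')!r}")
        seen_ids.add(row.get("customer_id"))
    return problems
```

A pipeline step can then start with `assert not find_data_problems(rows)` so that bad data stops the analysis before any model is built.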
  • 25. assertr – verify() tests conditions on a data frame
  • 26. … but if it fails it cannot point to the offending value
  • 27. assertr – assert() tests a predicate on a list of columns. Internally, the function uses the select() function from dplyr
  • 28. assertr – insist() dynamically determines a predicate function
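
The idea behind a dynamically determined predicate can be sketched in Python: the acceptable range is derived from the column itself, and each value is then tested against it, so a failure can name the offending value. This is a loose Python analogue written for illustration, not assertr's API; the function names simply echo the R ones.

```python
from statistics import mean, stdev


def within_n_sds(n):
    """Return a predicate generator: given a column, build a range check
    of mean +/- n sample standard deviations computed from that column."""
    def make_predicate(values):
        centre, spread = mean(values), stdev(values)
        lower, upper = centre - n * spread, centre + n * spread
        return lambda value: lower <= value <= upper
    return make_predicate


def insist(values, predicate_generator):
    """Derive the predicate from the data, then list (index, value) violations."""
    predicate = predicate_generator(values)
    return [(i, v) for i, v in enumerate(values) if not predicate(v)]
```

For a hypothetical column of twenty readings near 10 plus a single reading of 100, `insist(column, within_n_sds(3))` reports only the position and value of the outlier.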
  • 29. R Packages for Assertive Programming
      - Tony Fischetti's assertr (https://github.com/tonyfischetti/assertr)
      - Hadley Wickham's assertthat (https://github.com/hadley/assertthat)
      - Richard Cotton’s assertive (https://bitbucket.org/richierocks/assertive)
      - Michel Lang's checkmate (https://github.com/mllg/checkmate)
      - Stefan Bache's ensurer (https://github.com/smbache/ensurer)
      - Gaston Sanchez's tester (https://github.com/gastonstat/tester)
  • 30. Python – TDDA by Nick Radcliffe (http://tdda.info)
  • 31. Research Trends in Data Science: Supporting Tools
      Data Quality:
      - Quilt Data Registry: data versioning and checks (http://quiltdata.com)
      - TopNotch: quality controlling large scale data sets (https://github.com/blackrock/TopNotch)
      Feature Crafting:
      - FeatureHub: register features and discover features created by others (https://github.com/HDI-Project/FeatureHub)
      Model Management:
      - ModelDB: manage ML models (https://mitdbg.github.io/modeldb/)
      - MLflow: manage the ML lifecycle (https://mlflow.org/)
  • 32. Future of AI: Explainable AI (XAI)
      AI systems that will have to explain to us what they’re doing and why. Think explainability by design. Model-based design could be a way. (See also http://www.darpa.mil/program/explainable-artificial-intelligence)
      http://explainableai.com
      http://www.bbc.co.uk/news/technology-43514566
  • 33.
  • 34. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/tdd-ml