SlideShare uma empresa Scribd logo
1 de 36
Baixar para ler offline
Customer Churn Analytics
using Microsoft R Open
Malaysia R User Group Meet Up
16th February 2017
Poo Kuan Hoong
https://github.com/kuanhoong/churn-r
Disclaimer: The views and opinions expressed in this slides are those of
the author and do not necessarily reflect the official policy or position
of Nielsen Malaysia. Examples of analysis performed within this slides
are only examples. They should not be utilized in real-world analytic
products as they are based only on very limited and dated open source
information. Assumptions made within the analysis are not reflective of
the position of Nielsen Malaysia.
Agenda
• Introduction
• Customer Churn Analytics
• Machine Learning Framework
• Microsoft R Open and Visual Studio
• Model Performance Comparison
• Demo
Malaysia R User Group (MyRUG)
• The Malaysia R User Group (MyRUG) was formed on June 2016.
• It is a diverse group that come together to discuss anything related to
the R programming language.
• The main aim of MyRUG is to provide members ranging from
beginners to R professionals and experts to share and learn about R
programming and gain competency as well as share new ideas or
knowledge.
https://www.meetup.com/MY-RUserGroup/
https://www.facebook.com/rusergroupmalaysia/
Introduction
• Customer churn can be defined
simply as the rate at which a
company is losing its customers
• Imagine the business as a bucket
with holes, the water flowing from
the top is the growth rate, while the
holes at the bottom is churn
• While a certain level of churn is
unavoidable, it is important to keep
it under control, as high churn rate
can potentially kill your business
Churn analytics
• Predicting who will switch mobile operator
Customer churn - who do customers change
operators?
• The top 3 reasons why
subscribers change providers:
• They want a new handset
• They believe they pay too
much for calls/data
• Providers do not offer
additional loyalty benefits
Data Collection
Data Preprocessing
Attributes selection
• Attribute 1
• Attribute 2
• Attribute 3
Algorithm
Training Model Score Model
Apply Data
/Test Data
Predicting Output
Initialization Step Learn Step Apply Step
Machine Learning Framework
Correlation Matrix
• correlation matrix, which
is used to investigate the
dependence between
multiple variables at the
same time.
Microsoft R Open
• Microsoft R Open, formerly known as Revolution R Open (RRO), is
the enhanced distribution of R from Microsoft Corporation.
• It is a complete open source platform for statistical analysis and
data science.
Key enhancement
• Multi-threaded math libraries that brings multi-threaded
computations to R.
• A high-performance default CRAN repository that provide a
consistent and static set of packages to all Microsoft R Open users.
• The checkpoint package that make it easy to share R code and
replicate results using specific R package versions.
R Tools for Visual Studio
• Turn Visual Studio into a powerful R development environment
• Download R Tools for Visual Studio
R Tools for Visual Studio
• Visual Studio IDE
• Intellisense
• Enhanced multi-threaded math libs, cluster scale computing, and a
high performance CRAN repo with checkpoint capabilities.
• Learn more about R Tools from here:
https://microsoft.github.io/RTVS-docs/
Data Collection
Data Preprocessing
Attributes selection
• Attribute 1
• Attribute 2
• Attribute 3
Algorithm
Training Model Score Model
Apply Data
/Test Data
Predicting Output
Initialization Step Learn Step Apply Step
Machine Learning Framework
Data Preprocessing
• Assign missing values
as zero
• Detect outliers
• Remove unwanted
variables
• Recode variables
• Convert categorical
variables
Data Collection
Data Preprocessing
Attributes selection
• Attribute 1
• Attribute 2
• Attribute 3
Algorithm
Training Model Score Model
Apply Data
/Test Data
Predicting Output
Initialization Step Learn Step Apply Step
Machine Learning Framework
Features selection
• The process of selecting a subset of relevant features (variables,
predictors) for use in model construction.
• Feature selection techniques are used for three reasons:
• simplification of models to make them easier to interpret by researchers/users,
• shorter training times,
• enhanced generalization by reducing overfitting
Correlation Matrix
Models Performance Comparison
• Logistic Regression
• is a regression model where the dependent variable (DV) is categorical.
• Support Vector Machine
• SVM is a supervised learning model with associated learning algorithms that
analyze data used for classification and regression analysis.
• RandomForest
• is an ensemble learning method for classification, regression and other tasks,
that operate by constructing a multitude of decision trees at training time and
outputting the class that is the mode of the classes (classification) or mean
prediction (regression) of the individual trees.
Data Collection
Data Preprocessing
Attributes selection
• Attribute 1
• Attribute 2
• Attribute 3
Algorithm
Training Model Score Model
Apply Data
/Test Data
Predicting Output
Initialization Step Learn Step Apply Step
Machine Learning Framework
Training Model and Algorithm
• Split the data set into 80:20
using library(caret)
• Apply the algorithms: GLM,
SVM and RF
Data Collection
Data Preprocessing
Attributes selection
• Attribute 1
• Attribute 2
• Attribute 3
Algorithm
Training Model Score Model
Apply Data
/Test Data
Predicting Output
Initialization Step Learn Step Apply Step
Machine Learning Framework
Score Model
• Confusion Matrix: a table that is often used to describe the
performance of a classification model (or "classifier") on a set of test
data for which the true values are known.
• true positives (TP): These are cases in which we predicted yes (they have the
disease), and they do churn.
• true negatives (TN): We predicted no, and they don't churn.
• false positives (FP): We predicted yes, but they don't actually churn. (Also
known as a "Type I error.")
• false negatives (FN): We predicted no, but they actually do churn. (Also
known as a "Type II error.")
Confusion Matrix: Generalized Linear Model
(glm)
n=1407
Predicted:
NO
Predicted:
YES
Actual:
NO
TN = 919
(0.653)
FP = 115
(0.082)
1034
Actual:
YES
FN = 167
(0.119)
TP = 206
(0.146)
373
1086 321
Confusion Matrix: Support Vector Machine
(SVM)
n=1407
Predicted:
NO
Predicted:
YES
Actual:
NO
TN= 929
(0.660)
FP= 105
(0.075)
1034
Actual:
YES
FN= 183
(0.130)
TP= 190
(0.135)
373
1112 295
Confusion Matrix: RandomForest
n=1407
Predicted:
NO
Predicted:
YES
Actual:
NO
TN= 939
(0.667)
FP= 95
(0.068)
1034
Actual:
YES
FN= 181
(0.129)
TP= 192
(0.136)
373
1120 287
Receiver Operating Characteristic (ROC) curve
• ROC curve is a graphical plot that illustrates the performance of a
binary classifier system as its discrimination threshold is varied. The
curve is created by plotting the true positive rate (TPR) against the
false positive rate (FPR) at various threshold settings.
Models comparison
• ROC illustrates the
performance of a binary
classifier system as its
discrimination threshold is
varied.
Microsoft R Open vs R
Data Collection
Data Preprocessing
Attributes selection
• Attribute 1
• Attribute 2
• Attribute 3
Algorithm
Training Model Score Model
Apply Data
/Test Data
Predicting Output
Initialization Step Learn Step Apply Step
Machine Learning Framework
Predict test data
• Based on the training
model, select the best
model to be used for test
data prediction
Thanks!
Questions?
@kuanhoong
https://www.linkedin.com/in/kuanhoong
kuanhoong@gmail.com
DEMO

Mais conteúdo relacionado

Mais procurados

Churn prediction
Churn predictionChurn prediction
Churn predictionGigi Lino
 
Data analytics telecom churn final ppt
Data analytics telecom churn final ppt Data analytics telecom churn final ppt
Data analytics telecom churn final ppt Gunvansh Khanna
 
IRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET Journal
 
Data mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetData mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetRohan Choksi
 
Exploratory Data Analysis for Biotechnology and Pharmaceutical Sciences
Exploratory Data Analysis for Biotechnology and Pharmaceutical SciencesExploratory Data Analysis for Biotechnology and Pharmaceutical Sciences
Exploratory Data Analysis for Biotechnology and Pharmaceutical SciencesParag Shah
 
Prediction of potential customers for term deposit
Prediction of potential customers for term depositPrediction of potential customers for term deposit
Prediction of potential customers for term depositPranov Mishra
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptxAniket Patil
 
Customer Acquisition Management Process Complete PowerPoint Deck With Slides
Customer Acquisition Management Process Complete PowerPoint Deck With Slides Customer Acquisition Management Process Complete PowerPoint Deck With Slides
Customer Acquisition Management Process Complete PowerPoint Deck With Slides SlideTeam
 
Telco churn presentation
Telco churn presentationTelco churn presentation
Telco churn presentationAditya Bahl
 
Customer Segmentation using Clustering
Customer Segmentation using ClusteringCustomer Segmentation using Clustering
Customer Segmentation using ClusteringDessy Amirudin
 
Sales Analytics Using Power BI
Sales Analytics Using Power BISales Analytics Using Power BI
Sales Analytics Using Power BINetwoven Inc.
 
Customer churn classification using machine learning techniques
Customer churn classification using machine learning techniquesCustomer churn classification using machine learning techniques
Customer churn classification using machine learning techniquesSindhujanDhayalan
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Md. Main Uddin Rony
 
ABM Master Class: Market Segmentation for B2B Success
ABM Master Class: Market Segmentation for B2B SuccessABM Master Class: Market Segmentation for B2B Success
ABM Master Class: Market Segmentation for B2B SuccessDemandbase
 
Predictive Analytics - An Overview
Predictive Analytics - An OverviewPredictive Analytics - An Overview
Predictive Analytics - An OverviewMachinePulse
 

Mais procurados (20)

Churn prediction
Churn predictionChurn prediction
Churn prediction
 
Data analytics telecom churn final ppt
Data analytics telecom churn final ppt Data analytics telecom churn final ppt
Data analytics telecom churn final ppt
 
IRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom Industry
 
Telecom Churn Prediction
Telecom Churn PredictionTelecom Churn Prediction
Telecom Churn Prediction
 
Data mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetData mining and analysis of customer churn dataset
Data mining and analysis of customer churn dataset
 
Exploratory Data Analysis for Biotechnology and Pharmaceutical Sciences
Exploratory Data Analysis for Biotechnology and Pharmaceutical SciencesExploratory Data Analysis for Biotechnology and Pharmaceutical Sciences
Exploratory Data Analysis for Biotechnology and Pharmaceutical Sciences
 
Prediction of potential customers for term deposit
Prediction of potential customers for term depositPrediction of potential customers for term deposit
Prediction of potential customers for term deposit
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
 
Telecom customer churn prediction
Telecom customer churn predictionTelecom customer churn prediction
Telecom customer churn prediction
 
Customer Acquisition Management Process Complete PowerPoint Deck With Slides
Customer Acquisition Management Process Complete PowerPoint Deck With Slides Customer Acquisition Management Process Complete PowerPoint Deck With Slides
Customer Acquisition Management Process Complete PowerPoint Deck With Slides
 
Telco churn presentation
Telco churn presentationTelco churn presentation
Telco churn presentation
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Customer Segmentation using Clustering
Customer Segmentation using ClusteringCustomer Segmentation using Clustering
Customer Segmentation using Clustering
 
Campaign management
Campaign managementCampaign management
Campaign management
 
Sales Analytics Using Power BI
Sales Analytics Using Power BISales Analytics Using Power BI
Sales Analytics Using Power BI
 
Customer churn classification using machine learning techniques
Customer churn classification using machine learning techniquesCustomer churn classification using machine learning techniques
Customer churn classification using machine learning techniques
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
 
ABM Master Class: Market Segmentation for B2B Success
ABM Master Class: Market Segmentation for B2B SuccessABM Master Class: Market Segmentation for B2B Success
ABM Master Class: Market Segmentation for B2B Success
 
Churn modelling
Churn modellingChurn modelling
Churn modelling
 
Predictive Analytics - An Overview
Predictive Analytics - An OverviewPredictive Analytics - An Overview
Predictive Analytics - An Overview
 

Semelhante a Customer Churn Prediction using R and Machine Learning

Net campus2015 antimomusone
Net campus2015 antimomusoneNet campus2015 antimomusone
Net campus2015 antimomusoneDotNetCampus
 
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATAPREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATADotNetCampus
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaDatabricks
 
Building largescalepredictionsystemv1
Building largescalepredictionsystemv1Building largescalepredictionsystemv1
Building largescalepredictionsystemv1arthi v
 
Identifying and classifying unknown Network Disruption
Identifying and classifying unknown Network DisruptionIdentifying and classifying unknown Network Disruption
Identifying and classifying unknown Network Disruptionjagan477830
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptxDr.Shweta
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptxnarmeen11
 
Understanding Mahout classification documentation
Understanding Mahout  classification documentationUnderstanding Mahout  classification documentation
Understanding Mahout classification documentationNaveen Kumar
 
credit card fraud detection
credit card fraud detectioncredit card fraud detection
credit card fraud detectionjagan477830
 
Nose Dive into Apache Spark ML
Nose Dive into Apache Spark MLNose Dive into Apache Spark ML
Nose Dive into Apache Spark MLAhmet Bulut
 
AI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxAI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxkprasad8
 
Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 antimo musone
 
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Sagar Deogirkar
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerDatabricks
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAjaved75
 
Testing of Object-Oriented Software
Testing of Object-Oriented SoftwareTesting of Object-Oriented Software
Testing of Object-Oriented SoftwarePraveen Penumathsa
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkIvo Andreev
 

Semelhante a Customer Churn Prediction using R and Machine Learning (20)

Rapid Miner
Rapid MinerRapid Miner
Rapid Miner
 
Net campus2015 antimomusone
Net campus2015 antimomusoneNet campus2015 antimomusone
Net campus2015 antimomusone
 
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATAPREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
 
Random Forest Decision Tree.pptx
Random Forest Decision Tree.pptxRandom Forest Decision Tree.pptx
Random Forest Decision Tree.pptx
 
Data Science and Analysis.pptx
Data Science and Analysis.pptxData Science and Analysis.pptx
Data Science and Analysis.pptx
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at Humana
 
Building largescalepredictionsystemv1
Building largescalepredictionsystemv1Building largescalepredictionsystemv1
Building largescalepredictionsystemv1
 
Identifying and classifying unknown Network Disruption
Identifying and classifying unknown Network DisruptionIdentifying and classifying unknown Network Disruption
Identifying and classifying unknown Network Disruption
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptx
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
 
Understanding Mahout classification documentation
Understanding Mahout  classification documentationUnderstanding Mahout  classification documentation
Understanding Mahout classification documentation
 
credit card fraud detection
credit card fraud detectioncredit card fraud detection
credit card fraud detection
 
Nose Dive into Apache Spark ML
Nose Dive into Apache Spark MLNose Dive into Apache Spark ML
Nose Dive into Apache Spark ML
 
AI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxAI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptx
 
Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015
 
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATA
 
Testing of Object-Oriented Software
Testing of Object-Oriented SoftwareTesting of Object-Oriented Software
Testing of Object-Oriented Software
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 

Mais de Poo Kuan Hoong

Build an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBMBuild an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBMPoo Kuan Hoong
 
Tensor flow 2.0 what's new
Tensor flow 2.0  what's newTensor flow 2.0  what's new
Tensor flow 2.0 what's newPoo Kuan Hoong
 
The future outlook and the path to be Data Scientist
The future outlook and the path to be Data ScientistThe future outlook and the path to be Data Scientist
The future outlook and the path to be Data ScientistPoo Kuan Hoong
 
Data Driven Organization and Data Commercialization
Data Driven Organization and Data CommercializationData Driven Organization and Data Commercialization
Data Driven Organization and Data CommercializationPoo Kuan Hoong
 
TensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewTensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewPoo Kuan Hoong
 
Explore and Have Fun with TensorFlow: Transfer Learning
Explore and Have Fun with TensorFlow: Transfer LearningExplore and Have Fun with TensorFlow: Transfer Learning
Explore and Have Fun with TensorFlow: Transfer LearningPoo Kuan Hoong
 
Explore and have fun with TensorFlow: An introductory to TensorFlow
Explore and have fun with TensorFlow: An introductory	to TensorFlowExplore and have fun with TensorFlow: An introductory	to TensorFlow
Explore and have fun with TensorFlow: An introductory to TensorFlowPoo Kuan Hoong
 
The path to be a Data Scientist
The path to be a Data ScientistThe path to be a Data Scientist
The path to be a Data ScientistPoo Kuan Hoong
 
Deep Learning with Microsoft R Open
Deep Learning with Microsoft R OpenDeep Learning with Microsoft R Open
Deep Learning with Microsoft R OpenPoo Kuan Hoong
 
Microsoft APAC Machine Learning & Data Science Community Bootcamp
Microsoft APAC Machine Learning & Data Science Community BootcampMicrosoft APAC Machine Learning & Data Science Community Bootcamp
Microsoft APAC Machine Learning & Data Science Community BootcampPoo Kuan Hoong
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with RPoo Kuan Hoong
 
The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientistPoo Kuan Hoong
 
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A PrimerMDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A PrimerPoo Kuan Hoong
 
Big Data Malaysia - A Primer on Deep Learning
Big Data Malaysia - A Primer on Deep LearningBig Data Malaysia - A Primer on Deep Learning
Big Data Malaysia - A Primer on Deep LearningPoo Kuan Hoong
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RPoo Kuan Hoong
 
An Introduction to Deep Learning
An Introduction to Deep LearningAn Introduction to Deep Learning
An Introduction to Deep LearningPoo Kuan Hoong
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big dataPoo Kuan Hoong
 
DSRLab seminar Introduction to deep learning
DSRLab seminar   Introduction to deep learningDSRLab seminar   Introduction to deep learning
DSRLab seminar Introduction to deep learningPoo Kuan Hoong
 
Context Aware Road Traffic Speech Information System from Social Media
Context Aware Road Traffic Speech Information System from Social MediaContext Aware Road Traffic Speech Information System from Social Media
Context Aware Road Traffic Speech Information System from Social MediaPoo Kuan Hoong
 

Mais de Poo Kuan Hoong (20)

Build an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBMBuild an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBM
 
Tensor flow 2.0 what's new
Tensor flow 2.0  what's newTensor flow 2.0  what's new
Tensor flow 2.0 what's new
 
The future outlook and the path to be Data Scientist
The future outlook and the path to be Data ScientistThe future outlook and the path to be Data Scientist
The future outlook and the path to be Data Scientist
 
Data Driven Organization and Data Commercialization
Data Driven Organization and Data CommercializationData Driven Organization and Data Commercialization
Data Driven Organization and Data Commercialization
 
TensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewTensorFlow and Keras: An Overview
TensorFlow and Keras: An Overview
 
Explore and Have Fun with TensorFlow: Transfer Learning
Explore and Have Fun with TensorFlow: Transfer LearningExplore and Have Fun with TensorFlow: Transfer Learning
Explore and Have Fun with TensorFlow: Transfer Learning
 
Deep Learning with R
Deep Learning with RDeep Learning with R
Deep Learning with R
 
Explore and have fun with TensorFlow: An introductory to TensorFlow
Explore and have fun with TensorFlow: An introductory	to TensorFlowExplore and have fun with TensorFlow: An introductory	to TensorFlow
Explore and have fun with TensorFlow: An introductory to TensorFlow
 
The path to be a Data Scientist
The path to be a Data ScientistThe path to be a Data Scientist
The path to be a Data Scientist
 
Deep Learning with Microsoft R Open
Deep Learning with Microsoft R OpenDeep Learning with Microsoft R Open
Deep Learning with Microsoft R Open
 
Microsoft APAC Machine Learning & Data Science Community Bootcamp
Microsoft APAC Machine Learning & Data Science Community BootcampMicrosoft APAC Machine Learning & Data Science Community Bootcamp
Microsoft APAC Machine Learning & Data Science Community Bootcamp
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
 
The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientist
 
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A PrimerMDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
 
Big Data Malaysia - A Primer on Deep Learning
Big Data Malaysia - A Primer on Deep LearningBig Data Malaysia - A Primer on Deep Learning
Big Data Malaysia - A Primer on Deep Learning
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with R
 
An Introduction to Deep Learning
An Introduction to Deep LearningAn Introduction to Deep Learning
An Introduction to Deep Learning
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 
DSRLab seminar Introduction to deep learning
DSRLab seminar   Introduction to deep learningDSRLab seminar   Introduction to deep learning
DSRLab seminar Introduction to deep learning
 
Context Aware Road Traffic Speech Information System from Social Media
Context Aware Road Traffic Speech Information System from Social MediaContext Aware Road Traffic Speech Information System from Social Media
Context Aware Road Traffic Speech Information System from Social Media
 

Último

CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 

Último (20)

꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 

Customer Churn Prediction using R and Machine Learning

  • 1. Customer Churn Analytics using Microsoft R Open Malaysia R User Group Meet Up 16th February 2017 Poo Kuan Hoong https://github.com/kuanhoong/churn-r
  • 2. Disclaimer: The views and opinions expressed in this slides are those of the author and do not necessarily reflect the official policy or position of Nielsen Malaysia. Examples of analysis performed within this slides are only examples. They should not be utilized in real-world analytic products as they are based only on very limited and dated open source information. Assumptions made within the analysis are not reflective of the position of Nielsen Malaysia.
  • 3. Agenda • Introduction • Customer Churn Analytics • Machine Learning Framework • Microsoft R Open and Visual Studio • Model Performance Comparison • Demo
  • 4. Malaysia R User Group (MyRUG) • The Malaysia R User Group (MyRUG) was formed on June 2016. • It is a diverse group that come together to discuss anything related to the R programming language. • The main aim of MyRUG is to provide members ranging from beginners to R professionals and experts to share and learn about R programming and gain competency as well as share new ideas or knowledge.
  • 7.
  • 8. Introduction • Customer churn can be defined simply as the rate at which a company is losing its customers • Imagine the business as a bucket with holes, the water flowing from the top is the growth rate, while the holes at the bottom is churn • While a certain level of churn is unavoidable, it is important to keep it under control, as high churn rate can potentially kill your business
  • 9.
  • 10. Churn analytics • Predicting who will switch mobile operator
  • 11. Customer churn - who do customers change operators? • The top 3 reasons why subscribers change providers: • They want a new handset • They believe they pay too much for calls/data • Providers do not offer additional loyalty benefits
  • 12. Data Collection Data Preprocessing Attributes selection • Attribute 1 • Attribute 2 • Attribute 3 Algorithm Training Model Score Model Apply Data /Test Data Predicting Output Initialization Step Learn Step Apply Step Machine Learning Framework
  • 13. Correlation Matrix • correlation matrix, which is used to investigate the dependence between multiple variables at the same time.
  • 14. Microsoft R Open • Microsoft R Open, formerly known as Revolution R Open (RRO), is the enhanced distribution of R from Microsoft Corporation. • It is a complete open source platform for statistical analysis and data science. Key enhancement • Multi-threaded math libraries that brings multi-threaded computations to R. • A high-performance default CRAN repository that provide a consistent and static set of packages to all Microsoft R Open users. • The checkpoint package that make it easy to share R code and replicate results using specific R package versions.
  • 15. R Tools for Visual Studio • Turn Visual Studio into a powerful R development environment • Download R Tools for Visual Studio
  • 16. R Tools for Visual Studio • Visual Studio IDE • Intellisense • Enhanced multi-threaded math libs, cluster scale computing, and a high performance CRAN repo with checkpoint capabilities. • Learn more about R Tools from here: https://microsoft.github.io/RTVS-docs/
  • 17. Data Collection Data Preprocessing Attributes selection • Attribute 1 • Attribute 2 • Attribute 3 Algorithm Training Model Score Model Apply Data /Test Data Predicting Output Initialization Step Learn Step Apply Step Machine Learning Framework
  • 18. Data Preprocessing • Assign missing values as zero • Detect outliers • Remove unwanted variables • Recode variables • Convert categorical variables
  • 19. Data Collection Data Preprocessing Attributes selection • Attribute 1 • Attribute 2 • Attribute 3 Algorithm Training Model Score Model Apply Data /Test Data Predicting Output Initialization Step Learn Step Apply Step Machine Learning Framework
  • 20. Features selection • The process of selecting a subset of relevant features (variables, predictors) for use in model construction. • Feature selection techniques are used for three reasons: • simplification of models to make them easier to interpret by researchers/users, • shorter training times, • enhanced generalization by reducing overfitting
  • 22. Models Performance Comparison • Logistic Regression • is a regression model where the dependent variable (DV) is categorical. • Support Vector Machine • SVM is a supervised learning model with associated learning algorithms that analyze data used for classification and regression analysis. • RandomForest • is an ensemble learning method for classification, regression and other tasks, that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees.
  • 23. Data Collection Data Preprocessing Attributes selection • Attribute 1 • Attribute 2 • Attribute 3 Algorithm Training Model Score Model Apply Data /Test Data Predicting Output Initialization Step Learn Step Apply Step Machine Learning Framework
  • 24. Training Model and Algorithm • Split the data set into 80:20 using library(caret) • Apply the algorithms: GLM, SVM and RF
  • 25. Data Collection Data Preprocessing Attributes selection • Attribute 1 • Attribute 2 • Attribute 3 Algorithm Training Model Score Model Apply Data /Test Data Predicting Output Initialization Step Learn Step Apply Step Machine Learning Framework
  • 26. Score Model • Confusion Matrix: a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. • true positives (TP): These are cases in which we predicted yes (they have the disease), and they do churn. • true negatives (TN): We predicted no, and they don't churn. • false positives (FP): We predicted yes, but they don't actually churn. (Also known as a "Type I error.") • false negatives (FN): We predicted no, but they actually do churn. (Also known as a "Type II error.")
  • 27. Confusion Matrix: Generalized Linear Model (glm) n=1407 Predicted: NO Predicted: YES Actual: NO TN = 919 (0.653) FP = 115 (0.082) 1034 Actual: YES FN = 167 (0.119) TP = 206 (0.146) 373 1086 321
  • 28. Confusion Matrix: Support Vector Machine (SVM) n=1407 Predicted: NO Predicted: YES Actual: NO TN= 929 (0.660) FP= 105 (0.075) 1034 Actual: YES FN= 183 (0.130) TP= 190 (0.135) 373 1112 295
  • 29. Confusion Matrix: RandomForest n=1407 Predicted: NO Predicted: YES Actual: NO TN= 939 (0.667) FP= 95 (0.068) 1034 Actual: YES FN= 181 (0.129) TP= 192 (0.136) 373 1120 287
  • 30. Receiver Operating Characteristic (ROC) curve • ROC curve is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. The curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.
  • 31. Models comparison • ROC illustrates the performance of a binary classifier system as its discrimination threshold is varied.
  • 33. Data Collection Data Preprocessing Attributes selection • Attribute 1 • Attribute 2 • Attribute 3 Algorithm Training Model Score Model Apply Data /Test Data Predicting Output Initialization Step Learn Step Apply Step Machine Learning Framework
  • 34. Predict test data • Based on the training model, select the best model to be used for test data prediction
  • 36. DEMO