SlideShare a Scribd company logo
1 of 21
Download to read offline
N. Scott Cardell, Mikhail Golovnya, Dan Steinberg
                  Salford Systems
        http://www.salford-systems.com
                     June 2003
   Churn, the loss of a customer to a competitor, is a
    problem for any provider of a subscription service or
    recurring purchasable
    ◦ Costs of customer acquisition and win-back can be high
    ◦ Best if churn can be prevented by preemptive action or selection
      of customers less likely to churn

   Churn is especially important to mobile phone service
    providers given the ease with which a subscriber can
    switch services

   The NCR Teradata center for CRM at Duke identified churn
    prediction as a modeling topic deserving serious study

   A major mobile provider offered data for an international
    modeling accuracy and targeted marketing competition
   Data was provided for 100,000 customers with at least 6 months
    of service history, stratified into a roughly equal number of
    churners and non-churners

   Objective was to predict probability of loss of a customer 30-60
    days into the future

   Historical information provided in the form of
    ◦ Type and price of handset and recency of change/upgrade

    ◦ Total revenue and recurring charges

    ◦ Call behavior: statistics describing completed calls, failed calls, voice
      and data calls, call forwarding, customer care calls, directory info

    ◦ Statistics included mean and range for at least 3 months, last 6
      months, and lifetime

    ◦ Demographic and geographical information, including familiar Acxiom
      style variables and census-derived neighborhood summaries.
   Competition defined a sharply-defined task: churn within
    a specific window for existing customers of a minimum
    duration

   Challenge was defined in a way to avoid complications of
    censoring that could require survival analysis models

   Each customer history was already summarized

   Data quality was good

   Vast majority of analytical effort could be devoted to
    development of an accurate predictive model of a binary
    outcome
Data Set   Measure      TreeNet    Single    2nd Best   Avg. Std
                        Ensemble   TreeNet

 Current   Top Decile     2.90        2.88      2.80       2.14
              Lift                                        (.536)

 Current      Gini        .409        .403      .370       .269
                                                          (.096)

  Future   Top Decile     3.01        2.99      2.74       2.09
              Lift                                        (.585)

  Future      Gini        .400        .403      .361       .261
                                                          (.098)
   Single TreeNet model always better than 2nd best
    entry in field

   Ensemble of TreeNets slightly better 3 out of 4
    times

   Best entries substantially better than the average

   In broad telecommunications markets the added
    accuracy and lift of TreeNet models over
    alternatives could easily translate into millions of
    dollars of revenue per year
   A modest amount of data preprocessing was undertaken
    to repair and extend original data

   Some missing values could be recorded to “0”

   Select non-missing values were recorded to missing

   Experiments with missing value handling were conducted,
    including the addition of missing value indicators to the
    data
    ◦ CART imputation
    ◦ “All missings together” strategies in decision trees

       Missings in a separate node
       Missings go with non-missing high values
       Missings go with non-missing low values
   TreeNet was key to winning the tournament
    ◦ Provided considerably greater accuracy and top decile lift than any
      other modeling method we tried

   A new technology, different than standard boosting,
    developed by Stanford University Professor Jerome Friedman

   Based on the CART® decision tree and thus inherits these
    characteristics:
    ◦ Automatic feature selection

    ◦ Invariant with respect to order-preserving transforms of
      predictors

    ◦ Immune to outliers

    ◦ Built-in methods for handling missing values
   Based on optimizing an objective function

    ◦ e.g: Likelihood function or sum of squared errors

   Objective function expressed in terms of a target
    function of the data

    ◦ The target function is fit as a nonparametric function of
      the data

    ◦ The fit optimizes the objective function

   Large number of small decision trees used to
    form the nonparametric estimate
   Current implementation allows:

    ◦ Binary classification

    ◦ Multinomial classification

    ◦ Least-squares regression

    ◦ Least-absolute-deviation regression

    ◦ M-regression (Huber loss function)

   Other objective functions are possible
   (Insert equation)

   The dependent variable, y, is coded (-1,+1)

   The target function, F(x), is ½ the log-odds ratio

   F is initialized to the log odds on the full training
    data set

   (Insert equation)
    ◦ Equivalent to fitting data to a constant.
   Do not use all training data in any one iteration
    ◦ Randomly sample from training data (we used a 50%
      sample)

   Compute log-likelihood gradient for each
    observation
    ◦ (insert equation)

   Build a K-node tree to predict G(y,x)
    ◦ K=9 gave the best cross-validated results

    ◦ Important that trees be much smaller than the size of an
      optimal single CART tree
   Let (insert equations)

   Update formula (insert equation)

   Repeat until T trees grown

   Select the value of m≤T that produces the
    best fit to the test data
   Compute Ymn, a single Newton-Raphson step for
    Bmn

   (insert equation)

   Use only a small fraction, p of, Ymn(Bmn=Pymn)

   Apply the update formula

   (insert equation)
   P is called the learning rate, T is the number of trees
    grown

   The product pT is the total learning
    ◦ Holding pT constant, smaller p usually improves model fit to test
      data, but can require many trees

   Reducing the learning rate tends to slowly increase the
    optimal amount of total learning

   Very low learning rates can require many trees

   Our CHURN models used values of p from 0.01 to 0.001

   We used total learning of between 6 and 30
   All the models used to score the data for the entries used
    9-node trees

   Our final models used the following three combinations:
    ◦ (p=.001; T=6000; pT=6)
    ◦ (p=.005; T=2500; pT=12.5)
    ◦ (p=.01; T=3000; pT=30)

   One entry was a single TreeNet model (p=.01; T=3000;
    pT=30)
    ◦ In this range all models had almost identical results on test data

    ◦ The scores were highly correlated (r≥.97)

    ◦ Within this range, a higher pT was the most important factor

    ◦ For models with pT=6, the smaller the learning rate the better
   (insert table)
   (insert graphs)
   (insert graph)
   (insert graph)
   Friedman, J.H. (1999). Stochastic gradient
    boosting. Stanford: Statistics Department,
    Stanford University.
   Friedman, J.H. (1999). Greedy function
    approximation: a gradient boosting machine.
    Stanford: Statistics Department, Stanford
    University.
   Salford Systems (2002) TreeNet™ 1.0 Stochastic
    Gradient Boosting. San Diego, CA.
   Steinberg, D., Cardell, N.S., and Golovnya, M.
    (2003) Stochastic Gradient Boosting and
    Restrained Learning. Salford Systems discussion
    paper.

More Related Content

What's hot

Improve Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsImprove Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsSalford Systems
 
Automation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep LearningAutomation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep LearningPranov Mishra
 
Customer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in TelecomCustomer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in TelecomChris Chen
 
Machine learning algorithms and business use cases
Machine learning algorithms and business use casesMachine learning algorithms and business use cases
Machine learning algorithms and business use casesSridhar Ratakonda
 
Machine learning algorithms
Machine learning algorithmsMachine learning algorithms
Machine learning algorithmsShalitha Suranga
 
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Alejandro Correa Bahnsen, PhD
 
Applied machine learning: Insurance
Applied machine learning: InsuranceApplied machine learning: Insurance
Applied machine learning: InsuranceGregg Barrett
 
Customer Churn Analysis and Prediction
Customer Churn Analysis and PredictionCustomer Churn Analysis and Prediction
Customer Churn Analysis and PredictionSOUMIT KAR
 
Data mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetData mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetRohan Choksi
 
Prediction of potential customers for term deposit
Prediction of potential customers for term depositPrediction of potential customers for term deposit
Prediction of potential customers for term depositPranov Mishra
 
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Sagar Deogirkar
 
IRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET Journal
 
Introduction to Random Forest
Introduction to Random Forest Introduction to Random Forest
Introduction to Random Forest Rupak Roy
 
Customer churn prediction for telecom data set.
Customer churn prediction for telecom data set.Customer churn prediction for telecom data set.
Customer churn prediction for telecom data set.Kuldeep Mahani
 
User Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayUser Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayAhmed Hassan
 
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...ijaia
 
IRJET- Analyzing Voting Results using Influence Matrix
IRJET- Analyzing Voting Results using Influence MatrixIRJET- Analyzing Voting Results using Influence Matrix
IRJET- Analyzing Voting Results using Influence MatrixIRJET Journal
 
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...IRJET Journal
 

What's hot (20)

Improve Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsImprove Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForests
 
Automation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep LearningAutomation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep Learning
 
Customer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in TelecomCustomer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in Telecom
 
Machine learning algorithms and business use cases
Machine learning algorithms and business use casesMachine learning algorithms and business use cases
Machine learning algorithms and business use cases
 
Machine learning algorithms
Machine learning algorithmsMachine learning algorithms
Machine learning algorithms
 
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
 
Applied machine learning: Insurance
Applied machine learning: InsuranceApplied machine learning: Insurance
Applied machine learning: Insurance
 
Customer Churn Analysis and Prediction
Customer Churn Analysis and PredictionCustomer Churn Analysis and Prediction
Customer Churn Analysis and Prediction
 
DTH Case Study
DTH Case StudyDTH Case Study
DTH Case Study
 
Data mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetData mining and analysis of customer churn dataset
Data mining and analysis of customer churn dataset
 
Prediction of potential customers for term deposit
Prediction of potential customers for term depositPrediction of potential customers for term deposit
Prediction of potential customers for term deposit
 
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
 
IRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom Industry
 
Introduction to Random Forest
Introduction to Random Forest Introduction to Random Forest
Introduction to Random Forest
 
AI Algorithms
AI AlgorithmsAI Algorithms
AI Algorithms
 
Customer churn prediction for telecom data set.
Customer churn prediction for telecom data set.Customer churn prediction for telecom data set.
Customer churn prediction for telecom data set.
 
User Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayUser Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-Play
 
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...
 
IRJET- Analyzing Voting Results using Influence Matrix
IRJET- Analyzing Voting Results using Influence MatrixIRJET- Analyzing Voting Results using Influence Matrix
IRJET- Analyzing Voting Results using Influence Matrix
 
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
 

Similar to Churn Modeling For Mobile Telecommunications

Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Salford Systems
 
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...Salford Systems
 
Predict Backorder on a supply chain data for an Organization
Predict Backorder on a supply chain data for an OrganizationPredict Backorder on a supply chain data for an Organization
Predict Backorder on a supply chain data for an OrganizationPiyush Srivastava
 
Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)Abhimanyu Dwivedi
 
unit 5 decision tree2.pptx
unit 5 decision tree2.pptxunit 5 decision tree2.pptx
unit 5 decision tree2.pptxssuser5c580e1
 
churn_detection.pptx
churn_detection.pptxchurn_detection.pptx
churn_detection.pptxDhanuDhanu49
 
Kaggle Gold Medal Case Study
Kaggle Gold Medal Case StudyKaggle Gold Medal Case Study
Kaggle Gold Medal Case StudyAlon Bochman, CFA
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET Journal
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET Journal
 
Data Mining in Market Research
Data Mining in Market ResearchData Mining in Market Research
Data Mining in Market Researchbutest
 
Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Researchkevinlan
 
Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Researchjim
 
MIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaMIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaRahul Bhatia
 
Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...
Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...
Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...pandavaTirumala
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Derek Kane
 
Lesson 6 measures of central tendency
Lesson 6 measures of central tendencyLesson 6 measures of central tendency
Lesson 6 measures of central tendencynurun2010
 
NEURAL Network Design Training
NEURAL Network Design  TrainingNEURAL Network Design  Training
NEURAL Network Design TrainingESCOM
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision treeKrish_ver2
 

Similar to Churn Modeling For Mobile Telecommunications (20)

Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications
 
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
 
Predict Backorder on a supply chain data for an Organization
Predict Backorder on a supply chain data for an OrganizationPredict Backorder on a supply chain data for an Organization
Predict Backorder on a supply chain data for an Organization
 
Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)
 
unit 5 decision tree2.pptx
unit 5 decision tree2.pptxunit 5 decision tree2.pptx
unit 5 decision tree2.pptx
 
churn_detection.pptx
churn_detection.pptxchurn_detection.pptx
churn_detection.pptx
 
TELECOMMUNICATION (2).pptx
TELECOMMUNICATION (2).pptxTELECOMMUNICATION (2).pptx
TELECOMMUNICATION (2).pptx
 
Kaggle Gold Medal Case Study
Kaggle Gold Medal Case StudyKaggle Gold Medal Case Study
Kaggle Gold Medal Case Study
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms Comparison
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms Comparison
 
Data Mining in Market Research
Data Mining in Market ResearchData Mining in Market Research
Data Mining in Market Research
 
Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Research
 
Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Research
 
MIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaMIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_Bhatia
 
Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...
Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...
Detection Of Fraudlent Behavior In Water Consumption Using A Data Mining Base...
 
pradeep ppt final.pptx
pradeep ppt final.pptxpradeep ppt final.pptx
pradeep ppt final.pptx
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 
Lesson 6 measures of central tendency
Lesson 6 measures of central tendencyLesson 6 measures of central tendency
Lesson 6 measures of central tendency
 
NEURAL Network Design Training
NEURAL Network Design  TrainingNEURAL Network Design  Training
NEURAL Network Design Training
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision tree
 

More from Salford Systems

Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4Salford Systems
 
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Salford Systems
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningSalford Systems
 
Introduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerIntroduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerSalford Systems
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like YouSalford Systems
 
Statistically Significant Quotes To Remember
Statistically Significant Quotes To RememberStatistically Significant Quotes To Remember
Statistically Significant Quotes To RememberSalford Systems
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetSalford Systems
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideSalford Systems
 
Evolution of regression ols to gps to mars
Evolution of regression   ols to gps to marsEvolution of regression   ols to gps to mars
Evolution of regression ols to gps to marsSalford Systems
 
Data Mining for Higher Education
Data Mining for Higher EducationData Mining for Higher Education
Data Mining for Higher EducationSalford Systems
 
Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingSalford Systems
 
Molecular data mining tool advances in hiv
Molecular data mining tool  advances in hivMolecular data mining tool  advances in hiv
Molecular data mining tool advances in hivSalford Systems
 
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees:  A Winning CombinationTreeNet Tree Ensembles & CART Decision Trees:  A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees: A Winning CombinationSalford Systems
 
SPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSalford Systems
 
Hybrid cart logit model 1998
Hybrid cart logit model 1998Hybrid cart logit model 1998
Hybrid cart logit model 1998Salford Systems
 
Session Logs Tutorial for SPM
Session Logs Tutorial for SPMSession Logs Tutorial for SPM
Session Logs Tutorial for SPMSalford Systems
 
Some of the new features in SPM 7
Some of the new features in SPM 7Some of the new features in SPM 7
Some of the new features in SPM 7Salford Systems
 
TreeNet Overview - Updated October 2012
TreeNet Overview  - Updated October 2012TreeNet Overview  - Updated October 2012
TreeNet Overview - Updated October 2012Salford Systems
 
TreeNet Tree Ensembles and CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles and CART  Decision Trees:  A Winning CombinationTreeNet Tree Ensembles and CART  Decision Trees:  A Winning Combination
TreeNet Tree Ensembles and CART Decision Trees: A Winning CombinationSalford Systems
 

More from Salford Systems (20)

Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4
 
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data Mining
 
Introduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerIntroduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele Cutler
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You
 
Statistically Significant Quotes To Remember
Statistically Significant Quotes To RememberStatistically Significant Quotes To Remember
Statistically Significant Quotes To Remember
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example Dataset
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User Guide
 
Evolution of regression ols to gps to mars
Evolution of regression   ols to gps to marsEvolution of regression   ols to gps to mars
Evolution of regression ols to gps to mars
 
Data Mining for Higher Education
Data Mining for Higher EducationData Mining for Higher Education
Data Mining for Higher Education
 
Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modeling
 
Molecular data mining tool advances in hiv
Molecular data mining tool  advances in hivMolecular data mining tool  advances in hiv
Molecular data mining tool advances in hiv
 
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees:  A Winning CombinationTreeNet Tree Ensembles & CART Decision Trees:  A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
 
SPM v7.0 Feature Matrix
SPM v7.0 Feature MatrixSPM v7.0 Feature Matrix
SPM v7.0 Feature Matrix
 
SPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARS
 
Hybrid cart logit model 1998
Hybrid cart logit model 1998Hybrid cart logit model 1998
Hybrid cart logit model 1998
 
Session Logs Tutorial for SPM
Session Logs Tutorial for SPMSession Logs Tutorial for SPM
Session Logs Tutorial for SPM
 
Some of the new features in SPM 7
Some of the new features in SPM 7Some of the new features in SPM 7
Some of the new features in SPM 7
 
TreeNet Overview - Updated October 2012
TreeNet Overview  - Updated October 2012TreeNet Overview  - Updated October 2012
TreeNet Overview - Updated October 2012
 
TreeNet Tree Ensembles and CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles and CART  Decision Trees:  A Winning CombinationTreeNet Tree Ensembles and CART  Decision Trees:  A Winning Combination
TreeNet Tree Ensembles and CART Decision Trees: A Winning Combination
 

Recently uploaded

Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 

Recently uploaded (20)

Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 

Churn Modeling For Mobile Telecommunications

  • 1. N. Scott Cardell, Mikhail Golovnya, Dan Steinberg Salford Systems http://www.salford-systems.com June 2003
  • 2. Churn, the loss of a customer to a competitor, is a problem for any provider of a subscription service or recurring purchasable ◦ Costs of customer acquisition and win-back can be high ◦ Best if churn can be prevented by preemptive action or selection of customers less likely to churn  Churn is especially important to mobile phone service providers given the ease with which a subscriber can switch services  The NCR Teradata center for CRM at Duke identified churn prediction as a modeling topic deserving serious study  A major mobile provider offered data for an international modeling accuracy and targeted marketing competition
  • 3. Data was provided for 100,000 customers with at least 6 months of service history, stratified into a roughly equal number of churners and non-churners  Objective was to predict probability of loss of a customer 30-60 days into the future  Historical information provided in the form of ◦ Type and price of handset and recency of change/upgrade ◦ Total revenue and recurring charges ◦ Call behavior: statistics describing completed calls, failed calls, voice and data calls, call forwarding, customer care calls, directory info ◦ Statistics included mean and range for at least 3 months, last 6 months, and lifetime ◦ Demographic and geographical information, including familiar Acxiom style variables and census-derived neighborhood summaries.
  • 4. Competition defined a sharply-defined task: churn within a specific window for existing customers of a minimum duration  Challenge was defined in a way to avoid complications of censoring that could require survival analysis models  Each customer history was already summarized  Data quality was good  Vast majority of analytical effort could be devoted to development of an accurate predictive model of a binary outcome
  • 5. Data Set Measure TreeNet Single 2nd Best Avg. Std Ensemble TreeNet Current Top Decile 2.90 2.88 2.80 2.14 Lift (.536) Current Gini .409 .403 .370 .269 (.096) Future Top Decile 3.01 2.99 2.74 2.09 Lift (.585) Future Gini .400 .403 .361 .261 (.098)
  • 6. Single TreeNet model always better than 2nd best entry in field  Ensemble of TreeNets slightly better 3 out of 4 times  Best entries substantially better than the average  In broad telecommunications markets the added accuracy and lift of TreeNet models over alternatives could easily translate into millions of dollars of revenue per year
  • 7. A modest amount of data preprocessing was undertaken to repair and extend original data  Some missing values could be recorded to “0”  Select non-missing values were recorded to missing  Experiments with missing value handling were conducted, including the addition of missing value indicators to the data ◦ CART imputation ◦ “All missings together” strategies in decision trees  Missings in a separate node  Missings go with non-missing high values  Missings go with non-missing low values
  • 8. TreeNet was key to winning the tournament ◦ Provided considerably greater accuracy and top decile lift than any other modeling method we tried  A new technology, different than standard boosting, developed by Stanford University Professor Jerome Friedman  Based on the CART® decision tree and thus inherits these characteristics: ◦ Automatic feature selection ◦ Invariant with respect to order-preserving transforms of predictors ◦ Immune to outliers ◦ Built-in methods for handling missing values
  • 9. Based on optimizing an objective function ◦ e.g: Likelihood function or sum of squared errors  Objective function expressed in terms of a target function of the data ◦ The target function is fit as a nonparametric function of the data ◦ The fit optimizes the objective function  Large number of small decision trees used to form the nonparametric estimate
  • 10. Current implementation allows: ◦ Binary classification ◦ Multinomial classification ◦ Least-squares regression ◦ Least-absolute-deviation regression ◦ M-regression (Huber loss function)  Other objective functions are possible
  • 11. (Insert equation)  The dependent variable, y, is coded (-1,+1)  The target function, F(x), is ½ the log-odds ratio  F is initialized to the log odds on the full training data set  (Insert equation) ◦ Equivalent to fitting data to a constant.
  • 12. Do not use all training data in any one iteration ◦ Randomly sample from training data (we used a 50% sample)  Compute log-likelihood gradient for each observation ◦ (insert equation)  Build a K-node tree to predict G(y,x) ◦ K=9 gave the best cross-validated results ◦ Important that trees be much smaller than the size of an optimal single CART tree
  • 13. Let (insert equations)  Update formula (insert equation)  Repeat until T trees grown  Select the value of m≤T that produces the best fit to the test data
  • 14. Compute Ymn, a single Newton-Raphson step for Bmn  (insert equation)  Use only a small fraction, p of, Ymn(Bmn=Pymn)  Apply the update formula  (insert equation)
  • 15. P is called the learning rate, T is the number of trees grown  The product pT is the total learning ◦ Holding pT constant, smaller p usually improves model fit to test data, but can require many trees  Reducing the learning rate tends to slowly increase the optimal amount of total learning  Very low learning rates can require many trees  Our CHURN models used values of p from 0.01 to 0.001  We used total learning of between 6 and 30
  • 16. All the models used to score the data for the entries used 9-node trees  Our final models used the following three combinations: ◦ (p=.001; T=6000; pT=6) ◦ (p=.005; T=2500; pT=12.5) ◦ (p=.01; T=3000; pT=30)  One entry was a single TreeNet model (p=.01; T=3000; pT=30) ◦ In this range all models had almost identical results on test data ◦ The scores were highly correlated (r≥.97) ◦ Within this range, a higher pT was the most important factor ◦ For models with pT=6, the smaller the learning rate the better
  • 17. (insert table)
  • 18. (insert graphs)
  • 19. (insert graph)
  • 20. (insert graph)
  • 21. Friedman, J.H. (1999). Stochastic gradient boosting. Stanford: Statistics Department, Stanford University.  Friedman, J.H. (1999). Greedy function approximation: a gradient boosting machine. Stanford: Statistics Department, Stanford University.  Salford Systems (2002) TreeNet™ 1.0 Stochastic Gradient Boosting. San Diego, CA.  Steinberg, D., Cardell, N.S., and Golovnya, M. (2003) Stochastic Gradient Boosting and Restrained Learning. Salford Systems discussion paper.