Predictive Analytics
Advanced Techniques in Data Mining

Sara Venturina



                      Copyright © 2011, SAS Institute Inc. All rights reserved.
Agenda
• What is predictive analytics?

• Predictive Analytics Process

• Data Preparation techniques

• Modeling Techniques

• Model Monitoring techniques




What is Predictive Analytics?
Different levels of analytics, in ascending order:
• Standard reports
• Ad hoc reports
• Query drilldown (or OLAP)
• Alerts
• Statistical analysis
• Forecasting
• Predictive modeling
• Optimization




What is Predictive Analytics?
Unfortunately, there is no “magic” involved!

• Uses data from different source tables
• Applies various data transformation techniques
• Rests on statistical theory as its foundation
• Needs software to manage all of this

Business/commercial analytics (as opposed to research analytics) is trickier,
because you need to balance theory with realistic application.


Predictive Analytics Process


A cycle of five stages:
1. Defining Objectives
2. Data Preparation
3. Modeling
4. Deployment
5. Model Monitoring (which feeds back into Defining Objectives)




Data Preparation Techniques
• Possible data sources
• Data transformation techniques
• Deriving “behavioral” information
• Data quality check before modeling




Data Preparation Techniques
Possible data sources
• Data warehouse/ data marts
• Operational systems, e.g. transaction systems, billing,
  call center data, etc.
• External data, e.g. survey data, campaign data, data from
  external agencies, etc.

For external data, make sure the information is consistently available.




Data Preparation Techniques
Data transformation techniques
• Entity-level information
• Indicator variables
   • Are values skewed towards one level?

• Categorization/grouping of values
   • Are there too many levels?
   • Are there values that rarely occur?

• Binning of continuous variables
• Benchmarking information, e.g. industry benchmarks
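
The transformations listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the threshold of 70.0, the minimum count of 2, and the choice of equal-width bins are all hypothetical and are not taken from the slides.

```python
from collections import Counter

def indicator(values, condition):
    """Return a 0/1 indicator variable for each value."""
    return [1 if condition(v) else 0 for v in values]

def dominant_share(values):
    """Share of records in the most frequent level (a quick skew check)."""
    return Counter(values).most_common(1)[0][1] / len(values)

def group_rare_levels(values, min_count=2, other_label="OTHER"):
    """Collapse levels that occur fewer than min_count times into one level."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]

def bin_equal_width(values, n_bins=3):
    """Assign each continuous value to one of n_bins equal-width bins (0-based)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 when all values are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical customer attributes (not from the slides)
plans = ["gold", "basic", "basic", "silver", "basic", "gold", "platinum"]
monthly_spend = [10.0, 25.0, 40.0, 55.0, 70.0, 85.0, 100.0]

has_high_spend = indicator(monthly_spend, lambda x: x >= 70.0)
skew = dominant_share(has_high_spend)          # share of the most common 0/1 level
plan_grouped = group_rare_levels(plans, min_count=2)
spend_bin = bin_equal_width(monthly_spend, n_bins=3)
```

An indicator whose dominant share is close to 1 carries little information, which is the concern behind the "skewed towards one level" question.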

Data Preparation Techniques
Deriving “behavioral” information using several time periods
• Average behavior over the last X time periods
• Measures of variation
   • Standard deviation
   • Coefficient of variation
   • Deviation from the mean

• Measures of trend
   • Ratio of 1 vs. 3 and 3 vs. 6 time periods
   • Proportion of current vs. average of the last X time periods
   • Slope of the regression line
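
As a rough sketch, the measures above could be derived from one customer's history as follows. Several choices here are assumptions rather than anything the slides specify: the history is ordered oldest first, "1 vs. 3" is read as the latest period divided by the average of the last three, "3 vs. 6" as the average of the last three divided by the average of the last six, and the population standard deviation is used.

```python
import statistics

def behavioral_features(history):
    """Derive behavioral features from per-period values (oldest first).

    Assumes at least 6 periods and non-zero averages.
    """
    mean = statistics.mean(history)
    std = statistics.pstdev(history)
    current = history[-1]
    last3 = statistics.mean(history[-3:])
    last6 = statistics.mean(history[-6:])

    # Least-squares slope of value against period index 0..n-1
    n = len(history)
    x_mean = (n - 1) / 2
    sxy = sum((i - x_mean) * (y - mean) for i, y in enumerate(history))
    sxx = sum((i - x_mean) ** 2 for i in range(n))

    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,                      # coefficient of variation
        "deviation_from_mean": current - mean,
        "ratio_1_vs_3": current / last3,
        "ratio_3_vs_6": last3 / last6,
        "current_vs_average": current / mean,
        "slope": sxy / sxx,
    }

usage = [120, 110, 100, 90, 80, 70]  # hypothetical monthly usage, steadily declining
features = behavioral_features(usage)
```

For this declining series the slope is negative and both ratios fall below 1, which is the kind of trend signal these derived variables are meant to expose.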



Data Preparation Techniques
Data quality check before modeling
• Generation of summary statistics of derived variables
• Random checking
• Correct imputation of missing values
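
A minimal sketch of two of these checks, summary statistics and imputation, is shown below. The variable name and values are hypothetical, and median imputation is only one possible choice; the slides do not say which imputation method is appropriate.

```python
import statistics

def summarize(values):
    """Summary statistics for one derived variable; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "n": len(values),
        "n_missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }

def impute_median(values):
    """Replace missing values with the median of the observed values."""
    median = statistics.median([v for v in values if v is not None])
    return [median if v is None else v for v in values]

# Hypothetical derived variable; 480 months of tenure looks like a data error
tenure_months = [3, None, 12, 48, None, 7, 480]
summary = summarize(tenure_months)
imputed = impute_median(tenure_months)
```

Here the maximum and the gap between mean and median point to an outlier worth checking by hand before any model is fitted.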




Modeling Techniques
• Use of SAS Enterprise Miner
• Ensemble modeling outside of SAS
• Base SAS modeling, e.g. for a categorical target, survival analysis, etc.




Modeling Techniques
Use of SAS Enterprise Miner




For initial/basic modeling, use Decision Tree and Regression.
Neural networks can be used to provide diagnostic insights.



Modeling Techniques
Ensemble modeling in and out of SAS EM
Ensemble model based on the following models:

Model      Type         Weightage
Model 1    Decision     0.4
Model 2    Regression   0.6
Model 3    Regression   0.4
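
One common way to combine models outside of SAS EM is a weighted average of their scores. The sketch below assumes that is what the weightage column means; the slides do not state the combination rule. Because the listed weights sum to 1.4 rather than 1, the sketch divides by the total weight so the result stays on the same scale as the inputs. The per-model scores are made up for illustration.

```python
def weighted_ensemble(scores_by_model, weights):
    """Combine per-model scores into one score per record using normalized weights."""
    total = sum(weights.values())
    n_records = len(next(iter(scores_by_model.values())))
    return [
        sum(weights[m] * scores_by_model[m][i] for m in weights) / total
        for i in range(n_records)
    ]

# Weights as listed in the table; model keys are hypothetical names
weights = {"model_1": 0.4, "model_2": 0.6, "model_3": 0.4}

# Hypothetical predicted probabilities for two records
scores = {
    "model_1": [0.8, 0.2],
    "model_2": [0.6, 0.1],
    "model_3": [0.7, 0.3],
}

ensemble = weighted_ensemble(scores, weights)
```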




Modeling Techniques
Base SAS modeling
• Categorical data modeling, e.g.:
    • PROC CATMOD/GENMOD
    • PROC SURVEYLOGISTIC
• Survival analysis:
    • PROC LIFEREG
    • PROC LIFETEST
    • PROC PHREG

Base SAS modeling requires more familiarity with the underlying statistical concepts.



Model Monitoring Techniques
• Comparing actual vs predicted
• Scored base analysis:
   • Variable distribution analysis
   • Predicted Score distribution




Model Monitoring
Monitoring of model assessment charts, e.g.:
• A chart that compares the effectiveness of running a model versus selecting randomly
• A chart that measures what percentage of all churners are in the scoring list
  (e.g. the top 10% of scores captured 40% of actual churners)

Other model assessment statistics can be computed, such as hit rate,
Gini coefficient, etc.
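
The assessment measures described above can be computed directly from scored records once actual outcomes are known. The following is a sketch under assumptions: the function names and the toy data are hypothetical, "hit rate" is taken to mean the share of actual churners within the selected top group, and the Gini coefficient is computed as 2 × AUC − 1 by comparing every churner/non-churner pair.

```python
def ranked_actuals(scores, actuals):
    """Actual outcomes (1 = churner) ordered from highest to lowest score."""
    pairs = sorted(zip(scores, actuals), key=lambda p: p[0], reverse=True)
    return [a for _, a in pairs]

def top_count(n, top_fraction):
    """Number of records in the top-scoring group (at least one)."""
    return max(1, round(n * top_fraction))

def captured_response(scores, actuals, top_fraction=0.1):
    """Share of all actual churners found in the top-scoring group."""
    top = ranked_actuals(scores, actuals)[:top_count(len(actuals), top_fraction)]
    return sum(top) / sum(actuals)

def hit_rate(scores, actuals, top_fraction=0.1):
    """Share of the top-scoring group that actually churned."""
    top = ranked_actuals(scores, actuals)[:top_count(len(actuals), top_fraction)]
    return sum(top) / len(top)

def lift(scores, actuals, top_fraction=0.1):
    """Hit rate in the top group relative to the overall churn rate (random selection)."""
    return hit_rate(scores, actuals, top_fraction) / (sum(actuals) / len(actuals))

def gini(scores, actuals):
    """Gini coefficient as 2 * AUC - 1, with AUC from pairwise comparisons."""
    pos = [s for s, a in zip(scores, actuals) if a == 1]
    neg = [s for s, a in zip(scores, actuals) if a == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return 2 * wins / (len(pos) * len(neg)) - 1

# Hypothetical scored base of 10 customers with known outcomes
scores = [0.95, 0.90, 0.80, 0.75, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
actuals = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

captured_top20 = captured_response(scores, actuals, top_fraction=0.2)
lift_top20 = lift(scores, actuals, top_fraction=0.2)
gini_value = gini(scores, actuals)
```

Tracking these values for each scoring run makes a drop in model performance visible as soon as actual outcomes arrive.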



Model Monitoring (cont’d)
Scored base analysis:
• Variable distribution analysis




Model Monitoring (cont’d)
Scored base analysis:
• Predicted Score distribution
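
One way to quantify a shift in the predicted score distribution (or in any input variable scaled to 0..1) is the Population Stability Index. The slides do not name this statistic; it is used here only as an illustrative choice, and the bin count, the equal-width bins, and the sample scores are assumptions.

```python
import math

def bin_shares(scores, n_bins=5):
    """Share of scores falling in each equal-width bin over [0, 1]."""
    counts = [0] * n_bins
    for s in scores:
        counts[min(int(s * n_bins), n_bins - 1)] += 1
    return [c / len(scores) for c in counts]

def population_stability_index(expected, actual, n_bins=5, eps=1e-6):
    """PSI = sum((a - e) * ln(a / e)) over bins; eps guards against empty bins."""
    e_shares = bin_shares(expected, n_bins)
    a_shares = bin_shares(actual, n_bins)
    return sum(
        (a - e) * math.log(max(a, eps) / max(e, eps))
        for e, a in zip(e_shares, a_shares)
    )

# Hypothetical scores at model development time and in a later scoring run
development = [0.1, 0.3, 0.5, 0.7, 0.9]
current_same = [0.1, 0.3, 0.5, 0.7, 0.9]
current_shifted = [0.5, 0.7, 0.7, 0.9, 0.9]

psi_same = population_stability_index(development, current_same)
psi_shifted = population_stability_index(development, current_shifted)
```

A value of zero means the two distributions match bin for bin; larger values indicate a bigger shift and a stronger reason to revisit the model.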




Predictive Analytics as an Iterative Process


A cycle of five stages:
1. Defining Objectives
2. Data Preparation
3. Modeling
4. Deployment
5. Model Monitoring (which feeds back into Defining Objectives)




Questions?




Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Predictive Analytics: Advanced techniques in data mining

  • 1. Predictive Analytics: Advanced Techniques in Data Mining. Sara Venturina. Copyright © 2011, SAS Institute Inc. All rights reserved.
  • 2. Agenda
      • What is predictive analytics?
      • Predictive Analytics Process
      • Data Preparation techniques
      • Modeling Techniques
      • Model Monitoring techniques
  • 3. What is Predictive Analytics? Different levels of analytics (diagram, lowest to highest): Standard reports → Ad hoc reports → Query drilldown (or OLAP) → Alerts → Statistical analysis → Forecasting → Predictive modeling → Optimization
  • 4. What is Predictive Analytics? Unfortunately, there is no “magic” involved!
      • Use of data from different source tables
      • Utilizing various data transformation techniques
      • Employing statistical theories as foundation
      • Will need software to manage this
    Focus on business/commercial (as opposed to research) analytics is trickier, as you need to balance the theories with realistic application.
  • 5. Predictive Analytics Process (cycle diagram): Defining Objectives → Data Preparation → Modeling → Deployment → Model Monitoring → back to Defining Objectives
  • 6. Data Preparation Techniques
      • Possible data sources
      • Data transformation techniques
      • Deriving “behavioral” information
      • Data quality check before modeling
  • 7. Data Preparation Techniques: Possible data sources
      • Data warehouse / data marts
      • Operational systems, e.g. transaction systems, billing, call center data, etc.
      • External data, e.g. survey data, campaign, data from external agencies, etc.
    For external data, make sure the information is consistently available.
  • 8. Data Preparation Techniques: Data transformation techniques
      • Entity-level information
      • Indicator variables
          • Are values skewed towards one level?
      • Categorization/grouping of values
          • Are there too many levels of values?
          • Are there values that rarely occur?
      • Binning of continuous variables
      • Benchmarking information, e.g. industry benchmarking
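The transformations listed on this slide can be sketched in a few lines. The deck itself works in SAS and shows no code, so the following is only an illustrative Python stand-in: the function names, the `min_count` threshold, the `OTHER` label, and the sample data are all assumptions made for the example.

```python
from collections import Counter

def indicator(values, level):
    """Indicator variable: 1 where the value equals `level`, else 0."""
    return [1 if v == level else 0 for v in values]

def group_rare_levels(values, min_count=2, other="OTHER"):
    """Collapse levels that occur fewer than `min_count` times into one group."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]

def bin_continuous(values, edges, labels):
    """Bin a continuous variable; `edges` are inclusive upper bounds, and
    `labels` has one more entry than `edges` (the last bin is open-ended)."""
    binned = []
    for v in values:
        for edge, label in zip(edges, labels):
            if v <= edge:
                binned.append(label)
                break
        else:
            binned.append(labels[-1])
    return binned

# Hypothetical customer attributes
plans = ["prepaid", "postpaid", "prepaid", "corporate", "prepaid", "postpaid"]
monthly_spend = [120, 450, 80, 2300, 60, 510]

print(indicator(plans, "prepaid"))
print(group_rare_levels(plans))
print(bin_continuous(monthly_spend, edges=[100, 500], labels=["low", "mid", "high"]))
```

Checking the mean of an indicator column is a quick way to answer the slide's question about values skewed towards one level: a mean very close to 0 or 1 suggests the variable carries little information.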
  • 9. Data Preparation Techniques: Deriving “behavioral” information using several time periods
      • Average behavior over the last X time periods
      • Measures of variation
          • Standard deviation
          • Coefficient of variation
          • Deviation from the mean
      • Measures of trend information
          • Ratio of 1 vs 3, 3 vs 6 time periods
          • Proportion of current vs average of last X time periods
          • Slope of regression line
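As a rough illustration of these behavioral measures, the sketch below derives them from one customer's history of six periods. It is written in Python rather than SAS, and several choices are assumptions rather than anything the slide specifies: the "1 vs 3" ratio is read as the latest period over the average of the last three, "3 vs 6" as the average of the last three over the average of the last six, and the standard deviation is the population form.

```python
from statistics import mean, pstdev

def behavioral_features(history):
    """Derive behavioral variables from a list of per-period values,
    ordered oldest to newest (at least 6 periods, non-zero averages)."""
    avg = mean(history)
    sd = pstdev(history)
    current = history[-1]
    n = len(history)
    # Least-squares slope of value against period index 0..n-1
    x_bar = (n - 1) / 2
    s_xy = sum((i - x_bar) * (y - avg) for i, y in enumerate(history))
    s_xx = sum((i - x_bar) ** 2 for i in range(n))
    return {
        "average": avg,
        "std_dev": sd,
        "coef_of_variation": sd / avg,
        "deviation_from_mean": current - avg,
        "ratio_1_vs_3": current / mean(history[-3:]),
        "ratio_3_vs_6": mean(history[-3:]) / mean(history[-6:]),
        "current_vs_average": current / avg,
        "slope": s_xy / s_xx,
    }

# Hypothetical monthly usage for one customer, oldest first
features = behavioral_features([100, 100, 100, 80, 60, 40])
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

For this declining history the ratios fall below 1 and the slope is negative, which is the kind of trend signal the slide is aiming at.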
  • 10. Data Preparation Techniques: Data quality check before modeling
      • Generation of summary statistics of derived variables
      • Random checking
      • Correct imputation of missing values
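A minimal sketch of two of these checks, assuming missing values are represented as `None` and that median imputation is acceptable for the variable in question (the slide does not say which imputation method to use). The function names are hypothetical.

```python
from statistics import median

def summarize(values):
    """Summary statistics for one derived variable, counting missing values."""
    present = [v for v in values if v is not None]
    return {
        "n": len(values),
        "n_missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "median": median(present),
    }

def impute_median(values):
    """Replace missing values with the median and keep a missing-value flag."""
    fill = median([v for v in values if v is not None])
    imputed = [fill if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return imputed, was_missing

tenure_months = [10, None, 30, 50, None]  # hypothetical derived variable
print(summarize(tenure_months))
print(impute_median(tenure_months))
```

Keeping the flag alongside the imputed column lets a model distinguish genuinely observed values from filled-in ones.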
  • 11. Modeling Techniques
      • Use of SAS Enterprise Miner
      • Ensemble modeling outside of SAS
      • Base SAS modeling, e.g. for categorical target, survival analysis, etc.
  • 12. Modeling Techniques: Use of SAS Enterprise Miner. For initial/basic modeling, use Decision Tree and Regression. Neural networks can be used to provide diagnostic insights.
  • 13. Modeling Techniques: Ensemble modeling in and out of SAS EM. Ensemble models based on the following models, with weightage:
      • Model 1 (Decision): 0.4
      • Model 2 (Regression): 0.6
      • Model 3 (Regression): 0.4
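The slide does not say how the weightages are combined. One plausible reading is a weighted average of the component model scores; because the three weights listed total 1.4 rather than 1, the sketch below normalizes by their sum. Both the combination rule and the sample scores are assumptions made for illustration.

```python
def ensemble_score(scores, weights):
    """Weighted average of per-model scores, normalized by the total weight."""
    total_weight = sum(weights.values())
    return sum(weights[m] * scores[m] for m in weights) / total_weight

# Weightages as listed on the slide
weights = {"model_1": 0.4, "model_2": 0.6, "model_3": 0.4}

# Hypothetical predicted probabilities for one customer
scores = {"model_1": 0.2, "model_2": 0.5, "model_3": 0.8}

print(round(ensemble_score(scores, weights), 4))
```

Normalizing keeps the combined value on the same 0 to 1 scale as the component scores, which matters if the ensemble output is later compared against a probability cutoff.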
  • 14. Modeling Techniques: Base SAS modeling
      • Categorical data modeling, e.g.
          • PROC CATMOD/GENMOD
          • PROC SURVEYLOGISTIC
      • Survival analysis:
          • PROC LIFEREG
          • PROC LIFETEST
          • PROC PHREG
    Base SAS modeling requires more familiarity with underlying statistical concepts.
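To give a feel for the statistical concepts behind the survival procedures, the sketch below computes a Kaplan-Meier (product-limit) survival curve, the kind of nonparametric estimate PROC LIFETEST produces. It is a plain Python illustration, not SAS code, and the function name and sample data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: time until the event or until censoring, per subject
    events:    1 if the event (e.g. churn) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e)
        n_leaving = sum(1 for d in durations if d == t)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Subjects with an event or censoring at t leave the risk set
        at_risk -= n_leaving
    return curve

# Hypothetical customer tenures in months; 0 means still active (censored)
months = [2, 3, 3, 5, 8]
churned = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(months, churned):
    print(t, round(s, 3))
```

Censored customers still count in the risk set up to the time they were last observed, which is what separates survival analysis from simply averaging observed lifetimes.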
  • 15. Model Monitoring Techniques
      • Comparing actual vs predicted
      • Scored base analysis:
          • Variable distribution analysis
          • Predicted score distribution
  • 16. Model Monitoring: Monitoring of model assessment charts, i.e. charts that:
      • measure what percentage of all churners are in the scoring list (e.g. the top 10% of scores captured 40% of actual churners)
      • compare the effectiveness of running a model versus selecting randomly
    Other model assessment statistics can be computed, such as hit rate, Gini coefficient, etc.
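Both quantities described on this slide can be computed from a scored list once actual outcomes are known. The sketch below is a Python illustration with hypothetical names and data; it assumes lift is defined as the share of churners captured divided by the share of the base selected, which is the expected capture rate under random selection.

```python
def captured_and_lift(scores, actuals, top_fraction=0.1):
    """Share of all actual churners found in the top-scored fraction,
    and the lift of that selection over choosing customers at random."""
    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    captured = sum(actual for _, actual in ranked[:k]) / sum(actuals)
    # Random selection of k customers captures k/n of churners on average
    lift = captured / (k / len(ranked))
    return captured, lift

# Hypothetical scored base of 10 customers, 1 = actually churned
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
actuals = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

captured, lift = captured_and_lift(scores, actuals, top_fraction=0.2)
print(f"captured: {captured:.0%}, lift: {lift:.1f}")
```

Under this definition, the slide's example of the top 10% of scores capturing 40% of churners corresponds to a lift of 0.40 / 0.10 = 4.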
  • 17. Model Monitoring (cont’d): Scored base analysis, i.e. variable distribution analysis
  • 18. Model Monitoring (cont’d): Scored base analysis, i.e. predicted score distribution
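The slides do not name a statistic for comparing distributions between the model-building sample and the currently scored base. One commonly used choice, swapped in here purely as an illustration, is the population stability index (PSI), which applies equally to an input variable or to the predicted score. The bin edges, the `eps` floor for empty bins, and the sample scores are assumptions.

```python
import math

def psi(expected, actual, edges, eps=1e-6):
    """Population stability index between two samples of one variable.

    expected: values from the reference (model-building) sample
    actual:   values from the currently scored base
    edges:    ascending bin boundaries shared by both samples
    """
    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(1 for e in edges if v > e)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    e_shares = bin_shares(expected)
    a_shares = bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))

# Hypothetical predicted scores at build time and in the latest scoring run
build_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
latest_scores = [0.1, 0.6, 0.7, 0.8, 0.9, 0.9, 0.95, 0.99]

print(round(psi(build_scores, build_scores, edges=[0.5]), 4))
print(round(psi(build_scores, latest_scores, edges=[0.5]), 4))
```

A value of zero means the two distributions match bin for bin; larger values indicate a bigger shift and are a prompt to investigate the inputs or rebuild the model.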
  • 19. Predictive Analytics as an Iterative Process (cycle diagram): Defining Objectives → Data Preparation → Modeling → Deployment → Model Monitoring → back to Defining Objectives
  • 20. Questions?