Common Data Driven Mistakes
Promotable Presentation
Jeanette Shutay, PhD
Senior Director, Advanced Analytics
February 5, 2020
HAVI | Confidential & Proprietary | 2/13/2020 | 2
Current Professional Contributions
• Advanced Analytics Center of Excellence Lead at HAVI
• Adjunct Professor for NCU School of Technology
Academic Preparation
• BA in Psychology
• MA in Developmental Psychology
• PhD in Research Methodology
• Student in AIP program
• Student to start in GIS program
Personal Interests
• Family & pets
• Volunteering
• I love animals!!
• Jogging & yoga
• Sports
My son Brett (16)
My son Brendon (12)
Data Solutions Lifecycle
Key Concepts
• Stakeholders
• Value proposition
• Data quality & characteristics
• Interaction effects / complex
relationships
• Getting to causality
• Constraints
• Scaling
Defining the problem
- Have you correctly and thoroughly defined the problem?
• Engage domain experts early and maintain continual engagement
- Estimate and consider the value proposition associated with solving the problem
- Identify the key performance indicators and any drivers of interest
- Operationally define all variables and indicators to be measured or observed
• Start with a priori hypotheses based on subject matter expertise and
industry/academic literature
- Brainstorm with key stakeholders & involve people with diverse backgrounds & views
- Generate hypotheses to test prior to specifying data requirements
- Review and align on all assumptions
• Document business requirements
Specifying the data requirements
- Do your data meet all requirements?
• Granularity & Cadence
- Do you need daily level data, location level data, etc.?
- When are decisions made? Every day, every week?
• Representativeness & Fidelity
- How generalizable are the cases you are studying to the problem as a whole?
• If doing a POC or small pilot, do you have a representative set of cases?
• Are you working with cases that have a high probability of treatment fidelity?
- Example: If testing a new customer experience program, are the stores that you are using
as part of the POC going to implement the program as intended? A low fidelity situation
can do more harm than not testing at all.
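A minimal sketch of matching data granularity to decision cadence, assuming decisions are made weekly while the data arrive daily. The function name `aggregate_to_weekly` and the sales figures are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate_to_weekly(daily_sales):
    """Sum (date, value) records into ISO-week totals keyed by (year, week)."""
    weekly = defaultdict(float)
    for day, value in daily_sales:
        iso_year, iso_week, _ = day.isocalendar()
        weekly[(iso_year, iso_week)] += value
    return dict(weekly)


# Hypothetical data: 14 days of sales for one location, 10 units per day.
start = date(2020, 1, 6)  # a Monday
daily_sales = [(start + timedelta(days=i), 10.0) for i in range(14)]

weekly_sales = aggregate_to_weekly(daily_sales)
print(weekly_sales)
```

If decisions were made daily instead, the daily records would be used as they are; the point is only that the level of aggregation should follow the decision, not the other way around.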
Data preparation
- Normalizing Variables
• In many cases, you will need to standardize your variables before analysis
- Using z-scores is a good way to avoid data mistakes in modeling
- Identifying and Managing Anomalies or Outliers
• Sometimes anomalies are what you are interested in
• When anomalies or outliers are problematic, consider dropping those cases
or imputing, but watch out for errors in this approach
- Example: Some values may appear as outliers in time series data with high seasonality
- Model assumptions
• Ensure that the characteristics of your data, and the problem you are trying
to solve, align with the model you are implementing
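The z-score and seasonality points above can be sketched together. The example below is an illustration under stated assumptions, not a recommended pipeline: the monthly sales series is hypothetical (December is always high), and the threshold of 2 standard deviations is an arbitrary choice.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flag_outliers(values, threshold=2.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    return [abs(z) > threshold for z in z_scores(values)]


# Hypothetical data: three years of monthly sales with a December peak.
months = list(range(1, 13)) * 3
sales = []
for year_offset in (0.0, 2.0, 4.0):
    for month in range(1, 13):
        base = 200.0 if month == 12 else 100.0
        sales.append(base + year_offset)

# Naive approach: one z-score over the whole series flags every December.
global_flags = flag_outliers(sales)

# Seasonality-aware approach: compare each month only with the same month
# in other years, so the recurring December peak is no longer an "outlier".
seasonal_flags = [False] * len(sales)
for month in range(1, 13):
    idx = [i for i, m in enumerate(months) if m == month]
    flags = flag_outliers([sales[i] for i in idx])
    for i, flagged in zip(idx, flags):
        seasonal_flags[i] = flagged

print(sum(global_flags), sum(seasonal_flags))
```

Dropping or imputing the globally flagged points here would erase the real seasonal peak, which is the kind of error the slide warns about.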
Data exploration
- Go beyond univariate exploratory data analysis (EDA)
• Explore interaction effects
Conclusion: There is no difference
between green and yellow feeders.
Conclusion: There is a difference
between green and yellow feeders.
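A small sketch of how both conclusions can come from the same data. The observations and the second factor (`location`) are hypothetical; the original charts are not available, so this only illustrates the general pattern of an interaction effect hiding behind equal overall averages.

```python
from statistics import mean

# Hypothetical observations: (feeder_color, location, bird_visits).
observations = [
    ("green", "shade", 8), ("green", "shade", 10),
    ("green", "sun", 2), ("green", "sun", 4),
    ("yellow", "shade", 2), ("yellow", "shade", 4),
    ("yellow", "sun", 8), ("yellow", "sun", 10),
]


def mean_visits(rows, color, location=None):
    """Average visits for a color, optionally within a single location."""
    selected = [
        visits for c, loc, visits in rows
        if c == color and (location is None or loc == location)
    ]
    return mean(selected)


# Univariate view: the two colors look identical.
overall = {c: mean_visits(observations, c) for c in ("green", "yellow")}

# Interaction view: the effect of color depends on location.
by_location = {
    (c, loc): mean_visits(observations, c, loc)
    for c in ("green", "yellow")
    for loc in ("shade", "sun")
}

print(overall)
print(by_location)
```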
Causality & spurious relationships
- Causality - three conditions must exist:
• X and Y must be correlated
• X must precede Y in time
• All other rival causes must be ruled out (e.g., internal validity)
- Beware of Spurious Relationships & Rival Causes
• Example 1: You launch a promotion in March. You believe the success of your promotion
(increased sales) is due to your marketing campaign, but it is a result of a third-party cause
(increased consumer buying power due to tax refunds)
• Example 2: You launch a new crime watch campaign in December and you
see a significant decrease in crime month-over-month. The true cause is seasonality.
• Solution: Design your campaign to minimize potential rival causes. This is where including
the SME is critical.
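One simple way to check Example 2 for a seasonal rival cause is to compare the month-over-month change in the campaign year with the same change a year earlier. The sketch below uses hypothetical incident counts, and it addresses only seasonality; other rival causes would still need a control group or subject matter input.

```python
# Hypothetical monthly incident counts keyed by (year, month).
# The campaign is assumed to launch in December 2019.
incidents = {
    (2018, 11): 120, (2018, 12): 100,
    (2019, 11): 118, (2019, 12): 99,
}


def pct_change(before, after):
    """Percentage change from one period to the next."""
    return (after - before) / before * 100.0


# Change observed when the campaign launched.
campaign_change = pct_change(incidents[(2019, 11)], incidents[(2019, 12)])

# Same months one year earlier, with no campaign: the seasonal baseline.
baseline_change = pct_change(incidents[(2018, 11)], incidents[(2018, 12)])

# What remains after removing the seasonal baseline.
excess_change = campaign_change - baseline_change

print(round(campaign_change, 1), round(baseline_change, 1), round(excess_change, 1))
```

In this made-up series the large December drop also appears in the year without a campaign, so almost none of it can be credited to the campaign.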
- Look for Suppressor Effects
• Example: You launch an employee training
program. You compare their performance at
the end of the program to the general
employee population. You find that those in
the training program had lower performance
ratings than the general population.
• Problem: You didn’t consider pre-existing
differences. You find out that those who were
selected for the program were low
performers.
• Solution: Use deltas (change from baseline to
post) and/or include control variables in your
model (prior performance, demographics, etc.).
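The delta-based solution can be sketched as follows. The individual ratings are hypothetical and are not the values from the slide's chart; they only reproduce the pattern described, where participants start lower and finish lower but improve more.

```python
from statistics import mean

# Hypothetical (baseline, final) ratings on a 5-point scale.
training = [(2.0, 3.0), (1.5, 2.9), (2.5, 3.4)]
general = [(3.5, 3.6), (3.8, 3.8), (3.2, 3.4)]


def mean_final(pairs):
    """Average end-of-period rating."""
    return mean(final for _, final in pairs)


def mean_delta(pairs):
    """Average change from baseline to final."""
    return mean(final - baseline for baseline, final in pairs)


# Naive comparison of final ratings: training looks worse.
final_gap = mean_final(training) - mean_final(general)

# Comparison of deltas: training shows the larger improvement.
delta_gap = mean_delta(training) - mean_delta(general)

print(round(final_gap, 2), round(delta_gap, 2))
```

A regression with prior performance as a control variable is the other option named above; the delta comparison is simply the smaller of the two to show.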
[Figure: Suppressor Effects. Employee performance rating (5-point scale) at baseline and final performance, training participants vs. general population. Plotted values: 1.7, 3.2, 3.5, 3.6.]
Time to value & diminishing returns
- Progress vs. Perfection
• Time to value is an important factor to consider. It is better to provide
something for the business to work with and continually improve than to wait
until you reach perfection before sharing with the business
- Data & Analytics ROI
• Know when improving the model and/or adding more external data no longer
yields the return on investment
- Cost-to-benefit analysis
- Assess forecastability
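A rough sketch of a cost-to-benefit check for successive model improvements. All names and figures are hypothetical, including the assumption that each point of error reduction can be given a single monetary value; in practice that value would come from the business.

```python
def worthwhile_iterations(iterations, value_per_point):
    """Return labels of iterations kept before returns stop covering cost.

    iterations: list of (label, incremental_cost, error_reduction_points).
    value_per_point: assumed monetary value of one point of error reduction.
    """
    kept = []
    for label, cost, gain in iterations:
        if gain * value_per_point < cost:
            break  # this step costs more than it returns
        kept.append(label)
    return kept


# Hypothetical improvement plan with diminishing gains.
plan = [
    ("baseline model", 10_000, 8.0),
    ("add promotions calendar", 15_000, 3.0),
    ("add external demographic data", 20_000, 0.5),
]

kept = worthwhile_iterations(plan, value_per_point=6_000)
print(kept)
```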
- Are there specific constraints that might impact your approach?
• Example 1: Can’t recommend an alcoholic beverage, even if customer is likely to buy
• Example 2: Must use interpretable models
• Example 3: Must include non-significant promotions for simulation purposes
- Do you need to scale your solution?
• If you need to scale your solution, try to prototype within the same ecosystem (e.g., Azure,
Python, Spark, etc.) in which you plan to scale.
- Many times results do not replicate when using different software or platforms
- Avoid using data for modeling that is not available at decision time
• Example: weather data or other data for which you have history but no future values at decision time
- Avoid data leakage when building models
• Don’t commingle model training data with model validation data
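A minimal sketch of one way to keep training and validation data apart for time-ordered data: split chronologically, then fit any preprocessing (here, standardization) on the training rows only. The function names and the toy series are hypothetical.

```python
from statistics import mean, pstdev


def chronological_split(rows, train_fraction=0.8):
    """Split (time_index, value) rows so validation is strictly later."""
    ordered = sorted(rows, key=lambda r: r[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def fit_scaler(values):
    """Learn mean and standard deviation from training values only."""
    mu, sigma = mean(values), pstdev(values)
    return mu, (sigma if sigma > 0 else 1.0)


def apply_scaler(values, scaler):
    """Standardize values with previously learned parameters."""
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]


# Hypothetical series: (time_index, feature_value).
rows = [(t, float(10 + t)) for t in range(10)]

train, valid = chronological_split(rows)

# The scaler never sees validation values, so nothing leaks backwards.
scaler = fit_scaler([x for _, x in train])
train_scaled = apply_scaler([x for _, x in train], scaler)
valid_scaled = apply_scaler([x for _, x in valid], scaler)

print(len(train), len(valid))
```

Fitting the scaler on all rows before splitting would quietly pass information from the validation period into training, which is a common form of leakage.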
Other important considerations
QUESTIONS?

Common Data Driven Mistakes with HAVI's Sr. Director of Advanced Analytics

  • 1. Common Data Driven Mistakes Promotable Presentation Jeanette Shutay, PhD Senior Director, Advanced Analytics February 5, 2020
  • 2. HAVI | Confidential & Proprietary | 2/13/2020 | 2 Current Professional Contributions • Advanced Analytics Center of Excellence Lead at HAVI • Adjunct Professor for NCU School of Technology Academic Preparation • BA in Psychology • MA in Developmental Psychology • PhD in Research Methodology • Student in AIP program • Student to start in GIS program Personal Interests • Family & pets • Volunteering • I love animals!! • Jogging & yoga • Sports My son Brett (16) My son Brendon (12)
  • 3. HAVI | Confidential & Proprietary | 2/13/2020 | 3 Data Solutions Lifecycle Key Concepts • Stakeholders • Value proposition • Data quality & characteristics • Interaction effects / complex relationships • Getting to causality • Constraints • Scaling
  • 4. HAVI | Confidential & Proprietary | 2/13/2020 | 4 Defining the problem - Have you correctly and thoroughly defined the problem? • Engage domain experts early and maintain continual engagement - Estimate and consider the value proposition associated with solving the problem - Identify the key performance indicators and any drivers of interest - Operationally define all variables and indicators to be measured or observed • Start with a priori hypotheses based on subject matter expertise and industry/academic literature - Brainstorm with key stakeholders & involve people with diverse backgrounds & views - Generate hypotheses to test prior to specifying data requirements - Review and align on all assumptions • Document business requirements
  • 5. HAVI | Confidential & Proprietary | 2/13/2020 | 5 Specifying the data requirements - Do your data meet all requirements? • Granularity & Cadence - Do you need daily level data, location level data, etc.? - When are decisions made? Every day, every week? • Representativeness & Fidelity - How generalizable are the cases you are studying to the problem as a whole? • If doing a POC or small pilot, do you have a representative set of cases? • Are you working with cases that have a high probability of treatment fidelity? - Example: If testing a new customer experience program, are the stores that you are using as part of the POC going to implement the program as intended? A low fidelity situation can do more harm than not testing at all.
  • 6. Data preparation - Normalizing Variables • In many cases, you will need to standardize your variables before analysis - Using z-scores is a good way to avoid data mistakes in modeling - Identifying and Managing Anomalies or Outliers • Sometimes anomalies are what you are interested in • When anomalies or outliers are problematic, consider dropping those cases or imputing, but watch out for errors in this approach - Example: Some values may appear as outliers in time series data with high seasonality - Model assumptions • Ensure that the characteristics of your data, and the problem you are trying to solve, align with the model you are implementing
  • 7. Data exploration - Go beyond univariate exploratory data analysis (EDA) • Explore interaction effects [Figure: two charts comparing green and yellow feeders. First chart's conclusion: There is no difference between green and yellow feeders. Second chart's conclusion: There is a difference between green and yellow feeders.]
  • 8. Causality & spurious relationships - Causality - three conditions must exist: • X and Y must be correlated • X must precede Y in time • All other rival causes must be ruled out (e.g., internal validity) - Beware of Spurious Relationships & Rival Causes • Example 1: You launch a promotion in March. You believe the success of your promotion (increased sales) is due to your marketing campaign, but it is a result of a third-party cause (increased consumer buying power due to tax refunds) • Example 2: You launch a new crime watch campaign in December and you see a significant decrease in crime month-over-month. The true cause is seasonality. • Solution: Design your campaign to minimize potential rival causes. This is where including the SME is critical.
  • 9. Suppressor Effects - Look for Suppressor Effects • Example: You launch an employee training program. You compare participants' performance at the end of the program to the general employee population. You find that those in the training program had lower performance ratings than the general population. • Problem: You didn't consider pre-existing differences. You find out that those who were selected for the program were low performers. • Solution: Use deltas (change from baseline to post) and/or include control variables in your model (prior performance, demographics, etc.). [Chart: Employee Performance Rating (5-point scale) at baseline and final, for training participants and the general population; plotted values: 1.7, 3.2, 3.5, 3.6]
  • 10. Time to value & diminishing returns - Progress vs. Perfection • Time to value is an important factor to consider. It is better to provide something for the business to work with and continually improve than to wait until you reach perfection before sharing with the business - Data & Analytics ROI • Know when improving the model and/or adding more external data no longer yields the return on investment - Cost-to-benefit analysis - Assess forecastability
  • 11. Other important considerations - Are there specific constraints that might impact your approach? • Example 1: Can't recommend an alcoholic beverage, even if customer is likely to buy • Example 2: Must use interpretable models • Example 3: Must include non-significant promotions for simulation purposes - Do you need to scale your solution? • If you need to scale your solution, try to prototype within the same ecosystem (e.g., Azure, Python, Spark, etc.) in which you plan to scale. - Many times results do not replicate when using different software or platforms - Avoid using data for modeling that is not available at decision time • Weather data or other data that you have historical information for, but no future data - Avoid data leakage when building models • Don't commingle model training data with model validation data
  • 12. QUESTIONS?
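
Slide 6's points on standardizing variables, and on seasonal values that only look like outliers, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenter's method: the quarterly sales figures, the `period` of 4, the 1.5 threshold, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def global_outliers(values, threshold):
    """Flag indices whose z-score against the whole series exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


def seasonal_outliers(values, period, threshold):
    """Flag indices after removing each seasonal position's own mean.

    Each value is compared with the average of the values at the same
    position in the cycle (e.g., all fourth quarters), then the residuals
    are z-scored together.
    """
    season_means = [mean(values[pos::period]) for pos in range(period)]
    residuals = [v - season_means[i % period] for i, v in enumerate(values)]
    return [i for i, z in enumerate(z_scores(residuals)) if abs(z) > threshold]


# Hypothetical quarterly sales for three years (assumption, illustration only).
# Every fourth quarter is a regular holiday peak; index 9 is a genuine anomaly.
sales = [100, 102, 98, 300, 101, 99, 103, 310, 100, 180, 97, 305]

naive_flags = global_outliers(sales, threshold=1.5)
seasonal_flags = seasonal_outliers(sales, period=4, threshold=1.5)

print(naive_flags)     # [3, 7, 11]: the expected holiday peaks are flagged
print(seasonal_flags)  # [9]: only the value that is unusual for its season
```

The naive check would drop or impute the regular holiday peaks, which is the error the slide warns about; comparing each value with its own season keeps them and isolates the one unusual quarter.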
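
Slide 9's proposed solution, comparing deltas rather than end-of-program ratings, can be illustrated with a short Python sketch. The ratings below are hypothetical values on a 5-point scale chosen for illustration; they are not the figures from the presentation's chart, and `mean_change` is an assumed helper name.

```python
from statistics import mean


def mean_change(baseline, final):
    """Average per-person change from baseline rating to final rating."""
    return mean([f - b for b, f in zip(baseline, final)])


# Hypothetical ratings (assumption): participants were selected because
# they were low performers, so they start well below everyone else.
trained_baseline = [1.5, 2.0, 1.5, 2.0]
trained_final = [3.0, 3.5, 3.0, 3.5]
others_baseline = [3.5, 3.5, 3.0, 4.0]
others_final = [3.5, 4.0, 3.0, 4.0]

# Comparing final ratings only: the program looks harmful.
final_gap = mean(trained_final) - mean(others_final)

# Comparing change from baseline: the program looks helpful.
trained_gain = mean_change(trained_baseline, trained_final)
others_gain = mean_change(others_baseline, others_final)

print(final_gap)     # -0.375
print(trained_gain)  # 1.5
print(others_gain)   # 0.125
```

The sign of the conclusion flips once the pre-existing difference is accounted for. Including prior performance as a control variable in a model, as the slide also suggests, addresses the same problem.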

Editor's Notes

  1. As I go through the examples, I will speak to how these aforementioned mistakes can have economic implications.
  2. We will unpack these key concepts throughout the presentation.