SlideShare a Scribd company logo
1 of 21
Download to read offline
Your Data Scientist Hates You
Bradford Stephens
bradford@roboticprofit.com
ft. help from Nick Kypreos
nick@roboticprofit.com
About Us
Data Infrastructure and Data Science at scale.
fact
71%1 of Data Science projects fail
fact
63%2 of Data Scientists quit in < 2 years
• Data Scientists aren’t involved from the beginning
• No strategy
• Bad Data: more common than you think and
untestable
• Pointlessness
Why
• Everything stems from this
• Goals need to be attainable
• Data needs to be accessible and formatted
correctly
• You can’t conceive of what’s possible (or impossible)
Involvement
Your Data Strategy
Your Data Strategy: Diagnostic
• Diagnostic: How did we get here?
• Understanding history and how your org drives decisions is key
• What will your org’s immune system allow?
• Infrastructure: what is currently in place and how did it happen?
• Goals: How do we drive revenue or KPIs?
Your Data Strategy: Roadmapping
• Roadmapping: What are we going to build?
• Data Architecture?
• Platform feasible?
• Who builds what when, for how much?
• How do we ensure a low-latency feedback loop? DS highly iterative
Your Data Strategy: Development
• Platform: What’s our stack?
• Storage: Where does data come from, go to, and latency/throughput
requirements on storage?
• Processing: Where do we transform data? Batch? Real-time? Bounds?
• Collaboration: How do we share results, data, and APIs across the org?
(always forgotten)
Bad Data
Data Science is Untestable
Data Science = Math + QA + CS
+ PM + Psionics
Untestable
• Data Scientists spend vast amounts of time fixing data
• …and you need to be OK with that
• Unit Testing doesn’t make sense in science
• Distributions fittings, etc
• Can only test via simulation: a whole ‘nother process
• “Simple” things take weeks to verify
Instrumentation
• Can you even verify your instrumentation?
• Are you collecting everything?
• Collecting the right thing?
• What if only 85% of the time?
• Systematically drop at high enough traffic?
• Someone comes into site through different channel from an acquisition 2 yr ago?
Software is Garbage
• Remember Hadoop?
• Spark?
• MLib bugs for years
• Wrong math won’t fail unit tests
• GIGO
• JSON, weekly microversioning, schema entropy…
• This is why DS efforts are so slow to start w/o initial involvement
• Don’t build the One True Data Platform
• one of our customers had 30 DBs including a critical out-of-license DB2
box
Pointlessness
! Dashboards ! are ! not
! a ! strategy
“Here’s some data, just tell us what’s
interesting…”
“We didn’t think that was interesting, you’re
bad at your job.”
Data Must be Treated like a Product
• Build a Data Products Team
• Engineers, PMs, Design. Data Science. Not just analysts.
• KPIs, Goals, Measurability, Backlogs
• Budget
• Freedom to Innovate
• Staff of diverse backgrounds
A Data Platform will touch every part of your org

More Related Content

What's hot

Importance of test automation, excuses and TDD introduction
Importance of test automation, excuses and TDD introductionImportance of test automation, excuses and TDD introduction
Importance of test automation, excuses and TDD introductionNicolas De Boose
 
Bridging Big Data and Data Science Using Scalable Workflows
Bridging Big Data and Data Science Using Scalable WorkflowsBridging Big Data and Data Science Using Scalable Workflows
Bridging Big Data and Data Science Using Scalable WorkflowsIlkay Altintas, Ph.D.
 
Ilkay Altintas: Kepler
Ilkay Altintas: KeplerIlkay Altintas: Kepler
Ilkay Altintas: KeplerDavid LeBauer
 
Moving Fast At Scale
Moving Fast At ScaleMoving Fast At Scale
Moving Fast At ScaleRandy Shoup
 
WorDS of Data Science in the Presence of Heterogenous Computing Architectures
WorDS of Data Science in the Presence of Heterogenous Computing ArchitecturesWorDS of Data Science in the Presence of Heterogenous Computing Architectures
WorDS of Data Science in the Presence of Heterogenous Computing ArchitecturesIlkay Altintas, Ph.D.
 
Cloud-native Enterprise Data Science Teams
Cloud-native Enterprise Data Science TeamsCloud-native Enterprise Data Science Teams
Cloud-native Enterprise Data Science TeamsBoston Consulting Group
 
MLconf NYC Josh Wills
MLconf NYC Josh WillsMLconf NYC Josh Wills
MLconf NYC Josh WillsMLconf
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...Ilkay Altintas, Ph.D.
 
Experiences with big data by Srinivasan Seshadri
Experiences with big data by Srinivasan SeshadriExperiences with big data by Srinivasan Seshadri
Experiences with big data by Srinivasan SeshadriThe Hive
 
Using Machine Learning to Optimize DevOps Practices
Using Machine Learning to Optimize DevOps PracticesUsing Machine Learning to Optimize DevOps Practices
Using Machine Learning to Optimize DevOps PracticesPeter Varhol
 
Applying SRE techniques to micro service design
Applying SRE techniques to micro service designApplying SRE techniques to micro service design
Applying SRE techniques to micro service designTheo Schlossnagle
 
The top mistakes you're making in your Data Science interview - Omri Allouche
The top mistakes you're making in your Data Science interview - Omri AlloucheThe top mistakes you're making in your Data Science interview - Omri Allouche
The top mistakes you're making in your Data Science interview - Omri AlloucheOmri Allouche
 
Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...
Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...
Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...Dan Rockwell
 
Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...
Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...
Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...Lviv Startup Club
 
Software development project estimation
Software development project estimationSoftware development project estimation
Software development project estimationOleksandr Katrusha
 
DevOps - It's About How We Work
DevOps - It's About How We WorkDevOps - It's About How We Work
DevOps - It's About How We WorkRandy Shoup
 

What's hot (20)

Importance of test automation, excuses and TDD introduction
Importance of test automation, excuses and TDD introductionImportance of test automation, excuses and TDD introduction
Importance of test automation, excuses and TDD introduction
 
Bridging Big Data and Data Science Using Scalable Workflows
Bridging Big Data and Data Science Using Scalable WorkflowsBridging Big Data and Data Science Using Scalable Workflows
Bridging Big Data and Data Science Using Scalable Workflows
 
Ilkay Altintas: Kepler
Ilkay Altintas: KeplerIlkay Altintas: Kepler
Ilkay Altintas: Kepler
 
Moving Fast At Scale
Moving Fast At ScaleMoving Fast At Scale
Moving Fast At Scale
 
Analysis paralysis
Analysis paralysisAnalysis paralysis
Analysis paralysis
 
WorDS of Data Science in the Presence of Heterogenous Computing Architectures
WorDS of Data Science in the Presence of Heterogenous Computing ArchitecturesWorDS of Data Science in the Presence of Heterogenous Computing Architectures
WorDS of Data Science in the Presence of Heterogenous Computing Architectures
 
Cloud-native Enterprise Data Science Teams
Cloud-native Enterprise Data Science TeamsCloud-native Enterprise Data Science Teams
Cloud-native Enterprise Data Science Teams
 
MLconf NYC Josh Wills
MLconf NYC Josh WillsMLconf NYC Josh Wills
MLconf NYC Josh Wills
 
Hadoop Meets Scrum
Hadoop Meets ScrumHadoop Meets Scrum
Hadoop Meets Scrum
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
 
Experiences with big data by Srinivasan Seshadri
Experiences with big data by Srinivasan SeshadriExperiences with big data by Srinivasan Seshadri
Experiences with big data by Srinivasan Seshadri
 
Using Machine Learning to Optimize DevOps Practices
Using Machine Learning to Optimize DevOps PracticesUsing Machine Learning to Optimize DevOps Practices
Using Machine Learning to Optimize DevOps Practices
 
Applying SRE techniques to micro service design
Applying SRE techniques to micro service designApplying SRE techniques to micro service design
Applying SRE techniques to micro service design
 
Time tracking
Time trackingTime tracking
Time tracking
 
The top mistakes you're making in your Data Science interview - Omri Allouche
The top mistakes you're making in your Data Science interview - Omri AlloucheThe top mistakes you're making in your Data Science interview - Omri Allouche
The top mistakes you're making in your Data Science interview - Omri Allouche
 
Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...
Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...
Tech Impact on Healthcare talk for P4 (Personalized Medicine) Lecture - Class...
 
Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...
Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...
Gerrit Coetzee “Thou Shalt Write Things Down. And Other Rules for Managing Pr...
 
Software development project estimation
Software development project estimationSoftware development project estimation
Software development project estimation
 
DevOps - It's About How We Work
DevOps - It's About How We WorkDevOps - It's About How We Work
DevOps - It's About How We Work
 
GTD(R) Workshop
GTD(R) WorkshopGTD(R) Workshop
GTD(R) Workshop
 

Viewers also liked

Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?Fabricio Quintanilla
 
Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Adrian Cockcroft
 
The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...
The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...
The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...Joe Clements
 
What kind of Data Scientist do you need?
What kind of Data Scientist do you need?What kind of Data Scientist do you need?
What kind of Data Scientist do you need?Agnieszka Zdebiak
 
Data Scientist Toolbox
Data Scientist ToolboxData Scientist Toolbox
Data Scientist ToolboxAndrei Savu
 
Becoming a Data Scientist: Advice From My Podcast Guests
Becoming a Data Scientist: Advice From My Podcast GuestsBecoming a Data Scientist: Advice From My Podcast Guests
Becoming a Data Scientist: Advice From My Podcast GuestsRenee Teate
 
Data Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st CenturyData Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st CenturyLyn Fenex
 
What is a Data Scientist
What is a Data Scientist What is a Data Scientist
What is a Data Scientist Experian_US
 
Data science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi PeriasamyData science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi PeriasamyPeter Kua
 
Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps! Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps! PromptCloud
 
The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientistPoo Kuan Hoong
 
A Data Scientist Experiment
A Data Scientist ExperimentA Data Scientist Experiment
A Data Scientist ExperimentJan Chipchase
 
Data Scientist 101 BI Dutch
Data Scientist 101 BI DutchData Scientist 101 BI Dutch
Data Scientist 101 BI DutchJos van Dongen
 
Вебинар: Инструменты для работы Data Scientist
Вебинар: Инструменты для работы Data ScientistВебинар: Инструменты для работы Data Scientist
Вебинар: Инструменты для работы Data ScientistFlyElephant
 
Data Science Day New York: Data Scientist - The New Data Analyst
Data Science Day New York: Data Scientist - The New Data AnalystData Science Day New York: Data Scientist - The New Data Analyst
Data Science Day New York: Data Scientist - The New Data AnalystCloudera, Inc.
 
Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...
Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...
Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...Dataconomy Media
 
Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!Edureka!
 

Viewers also liked (20)

So you want to be a Data Scientist?
So you want to be a Data Scientist?So you want to be a Data Scientist?
So you want to be a Data Scientist?
 
Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?Sql saturday el salvador 2016 - Me, A Data Scientist?
Sql saturday el salvador 2016 - Me, A Data Scientist?
 
Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1
 
The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...
The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...
The First Data Scientist: Forgotten Lessons From Ancient Greece On Winning Wi...
 
Data scientist start now!
Data scientist   start now!Data scientist   start now!
Data scientist start now!
 
What kind of Data Scientist do you need?
What kind of Data Scientist do you need?What kind of Data Scientist do you need?
What kind of Data Scientist do you need?
 
Data Scientist Toolbox
Data Scientist ToolboxData Scientist Toolbox
Data Scientist Toolbox
 
Becoming a Data Scientist: Advice From My Podcast Guests
Becoming a Data Scientist: Advice From My Podcast GuestsBecoming a Data Scientist: Advice From My Podcast Guests
Becoming a Data Scientist: Advice From My Podcast Guests
 
Data Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st CenturyData Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st Century
 
What is a Data Scientist
What is a Data Scientist What is a Data Scientist
What is a Data Scientist
 
Data science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi PeriasamyData science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi Periasamy
 
Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps! Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps!
 
Data Scientist Why now?
Data Scientist Why now?Data Scientist Why now?
Data Scientist Why now?
 
The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientist
 
A Data Scientist Experiment
A Data Scientist ExperimentA Data Scientist Experiment
A Data Scientist Experiment
 
Data Scientist 101 BI Dutch
Data Scientist 101 BI DutchData Scientist 101 BI Dutch
Data Scientist 101 BI Dutch
 
Вебинар: Инструменты для работы Data Scientist
Вебинар: Инструменты для работы Data ScientistВебинар: Инструменты для работы Data Scientist
Вебинар: Инструменты для работы Data Scientist
 
Data Science Day New York: Data Scientist - The New Data Analyst
Data Science Day New York: Data Scientist - The New Data AnalystData Science Day New York: Data Scientist - The New Data Analyst
Data Science Day New York: Data Scientist - The New Data Analyst
 
Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...
Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...
Girish Sathyanarayana, Senior Data Scientist at AppLift, " Business Value Thr...
 
Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!
 

Similar to Your Data Scientist Hates You

NYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talkNYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talkVivian S. Zhang
 
The Analysis Part of Integration Projects
The Analysis Part of Integration ProjectsThe Analysis Part of Integration Projects
The Analysis Part of Integration ProjectsBizTalk360
 
IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...
IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...
IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...IWMW
 
You've Got No UI?! (Agile Data Teams)
You've Got No UI?! (Agile Data Teams)You've Got No UI?! (Agile Data Teams)
You've Got No UI?! (Agile Data Teams)Mark Barber
 
Market Sounding Brief: ACT Government Data Management
Market Sounding Brief: ACT Government Data ManagementMarket Sounding Brief: ACT Government Data Management
Market Sounding Brief: ACT Government Data ManagementChristopher Norman
 
Getting started in data science (4:3)
Getting started in data science (4:3)Getting started in data science (4:3)
Getting started in data science (4:3)Thinkful
 
Getting started in data science (4:3)
Getting started in data science (4:3)Getting started in data science (4:3)
Getting started in data science (4:3)Thinkful
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data ScienceThinkful
 
Games Industry Analytics Forum 2 - Plumbee
Games Industry Analytics Forum 2 - PlumbeeGames Industry Analytics Forum 2 - Plumbee
Games Industry Analytics Forum 2 - PlumbeeGIAF
 
Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...
Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...
Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...Andy Talbot
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Thinkful
 
Big Data Rampage
Big Data RampageBig Data Rampage
Big Data RampageNiko Vuokko
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryMark Constable
 
SRE Topics with Charity Majors and Liz Fong-Jones of Honeycomb
SRE Topics with Charity Majors and Liz Fong-Jones of HoneycombSRE Topics with Charity Majors and Liz Fong-Jones of Honeycomb
SRE Topics with Charity Majors and Liz Fong-Jones of HoneycombDaniel Zivkovic
 
Feature development cycle at Dashlane - Agile en Seine 2021
Feature development cycle at Dashlane - Agile en Seine 2021Feature development cycle at Dashlane - Agile en Seine 2021
Feature development cycle at Dashlane - Agile en Seine 2021Agile En Seine
 
Mapping your network
Mapping your network Mapping your network
Mapping your network Maria H
 
No, you don't need to learn python
No, you don't need to learn pythonNo, you don't need to learn python
No, you don't need to learn pythonQuantUniversity
 
Leveraging Analytics In Gaming - Tiny Mogul Games
Leveraging Analytics In Gaming - Tiny Mogul GamesLeveraging Analytics In Gaming - Tiny Mogul Games
Leveraging Analytics In Gaming - Tiny Mogul GamesInMobi
 
Journey of The Connected Enterprise - Knowledge Graphs - Smart Data
Journey of The Connected Enterprise - Knowledge Graphs - Smart DataJourney of The Connected Enterprise - Knowledge Graphs - Smart Data
Journey of The Connected Enterprise - Knowledge Graphs - Smart DataBenjamin Nussbaum
 

Similar to Your Data Scientist Hates You (20)

Lean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science teamLean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science team
 
NYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talkNYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talk
 
The Analysis Part of Integration Projects
The Analysis Part of Integration ProjectsThe Analysis Part of Integration Projects
The Analysis Part of Integration Projects
 
IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...
IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...
IWMW 2004: It Always Takes Longer Than You Think (Even If You Think It Will T...
 
You've Got No UI?! (Agile Data Teams)
You've Got No UI?! (Agile Data Teams)You've Got No UI?! (Agile Data Teams)
You've Got No UI?! (Agile Data Teams)
 
Market Sounding Brief: ACT Government Data Management
Market Sounding Brief: ACT Government Data ManagementMarket Sounding Brief: ACT Government Data Management
Market Sounding Brief: ACT Government Data Management
 
Getting started in data science (4:3)
Getting started in data science (4:3)Getting started in data science (4:3)
Getting started in data science (4:3)
 
Getting started in data science (4:3)
Getting started in data science (4:3)Getting started in data science (4:3)
Getting started in data science (4:3)
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data Science
 
Games Industry Analytics Forum 2 - Plumbee
Games Industry Analytics Forum 2 - PlumbeeGames Industry Analytics Forum 2 - Plumbee
Games Industry Analytics Forum 2 - Plumbee
 
Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...
Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...
Building SharePoint Enterprise Platforms - Off the beaten path - SharePoint S...
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
 
Big Data Rampage
Big Data RampageBig Data Rampage
Big Data Rampage
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project Delivery
 
SRE Topics with Charity Majors and Liz Fong-Jones of Honeycomb
SRE Topics with Charity Majors and Liz Fong-Jones of HoneycombSRE Topics with Charity Majors and Liz Fong-Jones of Honeycomb
SRE Topics with Charity Majors and Liz Fong-Jones of Honeycomb
 
Feature development cycle at Dashlane - Agile en Seine 2021
Feature development cycle at Dashlane - Agile en Seine 2021Feature development cycle at Dashlane - Agile en Seine 2021
Feature development cycle at Dashlane - Agile en Seine 2021
 
Mapping your network
Mapping your network Mapping your network
Mapping your network
 
No, you don't need to learn python
No, you don't need to learn pythonNo, you don't need to learn python
No, you don't need to learn python
 
Leveraging Analytics In Gaming - Tiny Mogul Games
Leveraging Analytics In Gaming - Tiny Mogul GamesLeveraging Analytics In Gaming - Tiny Mogul Games
Leveraging Analytics In Gaming - Tiny Mogul Games
 
Journey of The Connected Enterprise - Knowledge Graphs - Smart Data
Journey of The Connected Enterprise - Knowledge Graphs - Smart DataJourney of The Connected Enterprise - Knowledge Graphs - Smart Data
Journey of The Connected Enterprise - Knowledge Graphs - Smart Data
 

Recently uploaded

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Your Data Scientist Hates You

  • 1. Your Data Scientist Hates You Bradford Stephens bradford@roboticprofit.com ft. help from Nick Kypreos nick@roboticprofit.com
  • 2. About Us Data Infrastructure and Data Science at scale.
  • 3. fact 71%1 of Data Science projects fail
  • 4. fact 63%2 of Data Scientists quit in < 2 years
  • 5. • Data Scientists aren’t involved from the beginning • No strategy • Bad Data: more common than you think and untestable • Pointlessness Why
  • 6. • Everything stems from this • Goals need to be attainable • Data needs to be accessible and formatted correctly • You can’t conceive of what’s possible (or impossible) Involvement
  • 8. Your Data Strategy: Diagnostic • Diagnostic: How did we get here? • Understanding history and how your org drives decisions is key • What will your org’s immune system allow? • Infrastructure: what is currently in place and how did it happen? • Goals: How do we drive revenue or KPIs?
  • 9. Your Data Strategy: Roadmapping • Roadmapping: What are we going to build? • Data Architecture? • Platform feasible? • Who builds what when, for how much? • How do we ensure a low-latency feedback loop? DS highly iterative
  • 10. Your Data Strategy: Development • Platform: What’s our stack? • Storage: Where does data come from, go to, and latency/throughput requirements on storage? • Processing: Where do we transform data? Batch? Real-time? Bounds? • Collaboration: How do we share results, data, and APIs across the org? (always forgotten)
  • 12. Data Science is Untestable
  • 13. Data Science = Math + QA + CS + PM + Psionics
  • 14. Untestable • Data Scientists spend vast amounts of time fixing data • …and you need to be OK with that • Unit Testing doesn’t make sense in science • Distributions fittings, etc • Can only test via simulation: a whole ‘nother process • “Simple” things take weeks to verify
  • 15. Instrumentation • Can you even verify your instrumentation? • Are you collecting everything? • Collecting the right thing? • What if only 85% of the time? • Systematically drop at high enough traffic? • Someone comes into site through different channel from an acquisition 2 yr ago?
  • 16. Software is Garbage • Remember Hadoop? • Spark? • MLib bugs for years • Wrong math won’t fail unit tests • GIGO • JSON, weekly microversioning, schema entropy… • This is why DS efforts are so slow to start w/o initial involvement • Don’t build the One True Data Platform • one of our customers had 30 DBs including a critical out-of-license DB2 box
  • 18. ! Dashboards ! are ! not ! a ! strategy
  • 19. “Here’s some data, just tell us what’s interesting…”
  • 20. “We didn’t think that was interesting, you’re bad at your job.”
  • 21. Data Must be Treated like a Product • Build a Data Products Team • Engineers, PMs, Design. Data Science. Not just analysts. • KPIs, Goals, Measurability, Backlogs • Budget • Freedom to Innovate • Staff of diverse backgrounds A Data Platform will touch every part of your org