The document discusses the role of data scientists and trends in data science. It describes how data scientists identify business needs, prepare and analyze data, interpret results, and communicate findings. However, emerging tools are automating some of these tasks using techniques like machine learning and natural language processing. This could change the role of data scientists and enable more self-service data analysis. The document also lists some vendors developing tools to support self-service data science through augmented intelligence.
1. JULY 12, 2018
The Disappearing Data Scientist
Adrian J Bowles, PhD
Founder, STORM Insights, Inc.
Lead Analyst, AI, Aragon Research
2. Copyright (c) 2018 by STORM Insights Inc. All Rights Reserved.
FROM THE GUARDIAN CAREER CHOICES SECTION
“What's a data scientist and how do I become one?”
“There is currently a shortage of data scientists – with companies looking for
programmers and analytical thinkers to plug the gap”
“…the next three years offer a veritable goldmine for data scientists.”
June 30, 2015
3. AGENDA
What IS a Data Scientist Anyway?
The role, responsibilities, skills
How the role will change
Emerging tools to augment or automate data science
4. WHAT IS A DATA SCIENTIST ANYWAY?
Someone who can
Identify/Interpret Business Need for Insights
Identify and Prepare Data
Analyze, using tools and algorithms appropriate for the problem at hand
Interpret the results
Tell the story
Skills:
Probability and Statistics
Experimental Design
Communication Skills
Image credits:
Wikipedia contributors. "Slide rule." Wikipedia, The Free Encyclopedia, 11 Jul. 2018. Web. 11 Jul. 2018.
Wikipedia contributors. "Sextant (astronomical)." Wikipedia, The Free Encyclopedia, 12 Oct. 2017. Web. 11 Jul. 2018.
5. TWO APPROACHES: REPORTING VS EXPLORING
[Diagram: two workflows that share the steps Data Discovery, Data Preparation, Model, Analyze, and Interpret. One adds Problem Definition (~supervised ML); the other adds Pattern Detection (~unsupervised ML).]
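The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, assuming that reporting starts from a defined question with labeled examples (roughly supervised learning) and that exploring looks for structure in unlabeled data (roughly unsupervised learning). The sample data and the function names `fit_threshold` and `two_means` are hypothetical and do not come from the presentation.

```python
from statistics import mean

def fit_threshold(values, labels):
    """Supervised sketch: learn a cut-off from labeled examples.

    The cut-off is the midpoint between the two class means.
    """
    positives = [v for v, y in zip(values, labels) if y == 1]
    negatives = [v for v, y in zip(values, labels) if y == 0]
    return (mean(positives) + mean(negatives)) / 2

def two_means(values, iterations=10):
    """Unsupervised sketch: split unlabeled values into two groups.

    This is a one-dimensional k-means with k=2, seeded at min and max.
    """
    low, high = min(values), max(values)
    group_low, group_high = list(values), []
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not group_high:  # all values identical: nothing to separate
            break
        low, high = mean(group_low), mean(group_high)
    return sorted(group_low), sorted(group_high)

# Hypothetical data: monthly spend per customer, plus a known label.
spend = [12, 15, 14, 80, 85, 90]
premium = [0, 0, 0, 1, 1, 1]

threshold = fit_threshold(spend, premium)  # question defined up front
groups = two_means(spend)                  # pattern found without labels
print(threshold)
print(groups)
```

In the first call the analyst already knows what to ask (which customers are premium). In the second, the grouping emerges from the data and still has to be interpreted by a person.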
7. TECHNOLOGY AUGMENTATION & AUTOMATION: PROGRAMMERS
[Chart labels: Scope, Technology, Impact]
8. TECHNOLOGY AUGMENTATION & AUTOMATION: DATA SCIENTISTS
[Chart labels: Scope, Technology, Impact]
9. THE NATURAL PROGRESSION
[Diagram: successive stages relating Business User, IT, Data Scientist, and Data]
12. FORCES DRIVING SELF-SERVICE DATA SCIENCE
Data Growth
Budgets for Analysis Tools Bypassing IT
Dearth of Skills
AI Technologies Maturing to Augment Business Analysis Requirements
[Diagram labels: Issues, Demand, Supply]
13. WHAT CAN BE AUTOMATED?
Identify/Interpret Business Need for Insights
Identify and Prepare Data
Analyze
Interpret
14. DATA SCIENCE TRENDS
Self-Service Data Science, BI, AI
15. AUTOMATION/AUGMENTATION: MODEL GENERATION
Properties of the Data
+
Comparative Analysis | User interrogation
+
Machine Learning
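One way to read the recipe above is as a small model-selection loop: inspect the data, compare candidate models, then fit the winner. The Python sketch below illustrates that reading under stated assumptions. The two candidate models, the leave-one-out comparison, and the names `fit_mean`, `fit_line`, `loo_error`, and `generate_model` are hypothetical choices made for this example, not the method of any tool named in this presentation. The user interrogation step is left out.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline model: always predict the average target."""
    avg = mean(ys)
    return lambda x: avg

def fit_line(xs, ys):
    """Least-squares straight line through the (x, y) pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # no variation in x: fall back to the mean
        return lambda x: y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return lambda x: y_bar + slope * (x - x_bar)

def loo_error(fit, xs, ys):
    """Leave-one-out mean squared error of a fitting procedure."""
    errors = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append((model(xs[i]) - ys[i]) ** 2)
    return mean(errors)

def generate_model(xs, ys):
    # 1. Properties of the data: only try a line if x actually varies.
    candidates = {"mean": fit_mean}
    if len(set(xs)) > 2:
        candidates["line"] = fit_line
    # 2. Comparative analysis: score every candidate the same way.
    scores = {name: loo_error(fit, xs, ys) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    # 3. Machine learning: fit the winning candidate on all the data.
    return best, candidates[best](xs, ys), scores

# Hypothetical data with a roughly linear trend.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
name, model, scores = generate_model(xs, ys)
print(name, round(model(6), 2))
```

The user never picks an algorithm here. The choice follows from the data and the comparison, which is the kind of step these tools aim to automate or augment.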
16. PROBLEM DEFINITION TREND: FROM SQL TO NLP
Structured Queries -> Visual Queries -> Natural Language
Data-Centric -> Business User-Centric
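The shift from structured queries to natural language can be illustrated with a toy translator. The Python sketch below accepts only one question shape ("<aggregate> <measure> by <dimension>") and turns it into SQL. The vocabulary in `AGGREGATES`, the function `question_to_sql`, and the `orders` table are hypothetical, and real products use far richer language processing than a single regular expression.

```python
import re
import sqlite3

# Hypothetical vocabulary mapping question words to SQL aggregates.
AGGREGATES = {"total": "SUM", "average": "AVG", "highest": "MAX", "lowest": "MIN"}

def question_to_sql(question, table, columns):
    """Translate a question such as 'total sales by region' into SQL.

    Column names are checked against a known set, so only whitelisted
    identifiers ever reach the generated query.
    """
    match = re.fullmatch(r"(\w+) (\w+) by (\w+)\??", question.strip().lower())
    if not match:
        raise ValueError("question not understood")
    word, measure, dimension = match.groups()
    if word not in AGGREGATES or measure not in columns or dimension not in columns:
        raise ValueError("question not understood")
    return (f"SELECT {dimension}, {AGGREGATES[word]}({measure}) "
            f"FROM {table} GROUP BY {dimension} ORDER BY {dimension}")

# Hypothetical in-memory data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, sales REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 50.0)])

sql = question_to_sql("Total sales by region?", "orders", {"region", "sales"})
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

The business user types the question. The structured query still exists, but it is generated rather than written by hand.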
17. DATA EXPLORATION TREND
…Interactive to Conversational
Distributed Analysis - Put the Power Close to the Pain
18. VENDORS MEETING SELF-SERVICE DATA SCIENCE DEMAND (REPRESENTATIVE LIST)
IBM - Watson Explorer, Watson Analytics
Microsoft Power BI
MicroStrategy
Oracle Data Visualization
Qlik Sense
SAP Lumira, Analytics Cloud
SAS Visual Analytics
Sisense
Tableau
TIBCO Spotfire
19. NATURAL LANGUAGE GENERATION FOR STORYTELLING
Narrative Science, Automated Insights
SAP Lumira
Sisense
Microsoft Power BI
MicroStrategy
Qlik Sense
Tableau
TIBCO Spotfire
20. FINDINGS AND RECOMMENDATIONS
Findings
Increasing demand will continue to drive self-service analytics
Quality of automation vs augmentation varies widely
Biggest benefits are coming from AI classification and NLP technologies
Recommendations
Don’t think in terms of “citizen data scientist” - think productivity
Evaluate your current BI vendor’s roadmap
Train users on analysis fundamentals and experimental design before deploying the new tools - they may look easy, but they may also just help you solve the wrong problem faster
Test the new interfaces in your environment before choosing a tool
If you’re a data scientist - relax