This document discusses how to develop effective data scientists. It suggests focusing on "skill hyperparameters" or meta-skills like creativity, curiosity, and a scientific mindset. These attributes indicate someone who can learn new data science tools and approaches. The document also advocates for exploring different domains to foster a multidisciplinary approach and flexibility. It proposes applying principles from machine learning and agile methods to create a lab environment for developing data science teams through guided exploration and collaboration between trainees, technicians, and decision makers.
2. Who am I?
• Jordan Engbers, PhD
• Founder of Desid Labs
• Primary background in biological and health data
• Interested in data systems and decision support
4. For the want of a data scientist…
“By 2018…the United States alone may face a 50 to 60 percent gap between supply and requisite demand of deep analytical talent…” – McKinsey Global Institute
“…the need for data scientists is growing at about 3x those for statisticians and BI analysts, and an anticipated 100,000+ person analytic talent shortage through 2020.” – Gartner
… the kingdom was lost (?)
5. Where do you get data scientists (or team members)?
source talent
develop talent
6. The real question: who will make an effective data scientist?
This is a prediction problem… so can we borrow from machine learning?
7. how to train your model
1. Pick your model
• desired technical skillset
• skill hyperparameters (i.e. metaskills)
2. Make it learn
• start here → goal
• stochastic gradient descent (i.e. guided trial-and-error)
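The "guided trial-and-error" idea on this slide can be made concrete with a small sketch. The Python below is an illustration only and is not taken from the talk: it fits a line y = w*x + b with stochastic gradient descent on made-up, noise-free data. The function name `sgd_fit`, the data, and the values of `learning_rate` and `epochs` are all assumptions chosen for the example. The learning rate and epoch count are hyperparameters: they are fixed before training and shape how well the model can learn, which is the analogy the slide draws with metaskills.

```python
import random


def sgd_fit(points, learning_rate=0.1, epochs=500, seed=0):
    """Fit y = w*x + b by stochastic gradient descent (illustrative sketch)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # "start here": an arbitrary starting guess
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a random order each pass
        for x, y in data:
            error = (w * x + b) - y  # the trial: how wrong is the current guess?
            # The guidance: nudge each parameter against its squared-error gradient.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


# Hypothetical data generated from y = 2x + 1, the "goal" for this example.
points = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(11)]
w, b = sgd_fit(points)
print(f"w = {w:.3f}, b = {b:.3f}")
```

Changing `learning_rate` or `epochs` changes how quickly (or whether) the loop reaches the goal, even though neither value is something the model learns from the data.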
8. skill hyperparameters
“A data scientist is somebody who is inquisitive, who can stare at data and spot trends. It’s almost like a Renaissance individual who really wants to learn and bring change to an organization.” - Anjul Bhambhi, IBM
Examples
• Creativity
• Curiosity
• Critical Thinking
• Scientific Mindset (Systematic Approach)
If we can identify people with these attributes, chances are they can become data scientists (given the proper training)
9. Why should we focus on metaskills?
Data science tools (May 2014)
https://dreamtolearn.com/ryan/data_analytics_viz/54
11. Why advocate exploration?
• Fosters a multidisciplinary approach
• Indicates a passion for new knowledge
• Shows a flexibility in thinking
• Demonstrates an ability to learn
• Most important: we don’t know what makes the ideal data scientist.
• Caveat: Guided exploration vs flakiness
13. well, this is science…
[Diagram labels: lab (PI, tech, PDF, trainees), collaboration, lab (PI, tech, trainees, PDF), decision makers]
Welcome to the lab!
14. …and if we run labs with agile methods
http://www.inqbation.com/agile-methodology-of-web-developmen
15. Now we have
data science team + lab environment + agile processes
= system for effective data science
16. Data systems research and development
• Research: Because there are always new things to learn
• Development: Because we need tools to translate knowledge into action
• Consulting: Because we want to empower you and your organization
Jordan Engbers
jordan@desidlabs.com
desidlabs.com
Editor's Notes
This is not traditional IT – this is not traditional BI. These tools move fast. And can you really afford to ignore them if they are going to provide significant improvement over your current stack?
The question is not what tools your data science team should know. It is “how can we make sure they will keep learning?”