Many disciplines are on the wrong side of speed: development speed is traded off against security, data science, compliance, etc. Let us look at disciplines that have succeeded in shifting left by integrating with development, and learn from their successful patterns: testing, DevOps, agile, DataOps.
19. www.scling.com
Big data workflows
immutable datasets
functional transform processing
data pipelines
homogeneous, coordinated workflows
democratised data
minimal operations
minimal risk deployments
reproducibility
Real value of big data.
Not achieved by late adopters.
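A minimal sketch of how immutable datasets, functional transforms and reproducibility fit together. It is an illustration only, not code from the talk: the `Dataset` class, the `transform` helper, the `(user, amount)` record shape and the sample values are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Tuple

Record = Tuple[str, int]  # hypothetical record shape: (user, amount)


@dataclass(frozen=True)
class Dataset:
    """An immutable dataset: once written, a partition is never changed."""
    name: str
    partition: str
    records: Tuple[Record, ...]

    def fingerprint(self) -> str:
        # Content hash, used here to check that reruns give identical output.
        payload = json.dumps([self.name, self.partition, self.records])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def transform(
    source: Dataset,
    name: str,
    fn: Callable[[Tuple[Record, ...]], Tuple[Record, ...]],
) -> Dataset:
    """Apply a pure function to a dataset and return a new dataset.

    The source is left untouched, so the step can be rerun at any time.
    """
    return Dataset(name=name, partition=source.partition, records=fn(source.records))


def total_per_user(records: Tuple[Record, ...]) -> Tuple[Record, ...]:
    """Pure transform: sum amounts per user, sorted for a deterministic result."""
    totals = {}
    for user, amount in records:
        totals[user] = totals.get(user, 0) + amount
    return tuple(sorted(totals.items()))


raw = Dataset("orders", "2020-01-01", (("ann", 3), ("bob", 5), ("ann", 4)))
first = transform(raw, "order_totals", total_per_user)
second = transform(raw, "order_totals", total_per_user)  # rerun of the same step
```

Because the input is never mutated and the transform has no side effects, rerunning the step yields an identical dataset, which is what makes redeploying or backfilling a pipeline step low risk.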
22. www.scling.com
Enterprise culture big data
Hadoop / Spark / Flink
+
traditional workflows
mutable data
microservices
functional teams
heterogeneous data platform
=
worst of two worlds
24. www.scling.com
How to shift left
common goals
same definition of success
incentives, not gateways
common tools & environments
common processes
common teams
28. www.scling.com
IT craft to factory
Lotus final assembly, Brian Snelson, https://www.flickr.com/photos/32659528@N00/2868525496
AMC Javelin, CZmarlin, https://commons.wikimedia.org/wiki/File:AMC_Javelin_1971-74_purple_blown_custom.JPG
Security
Waterfall
Application delivery
Traditional operations
Traditional QA
Infrastructure
DevSecOps
Agile
Containers
DevOps
CI/CD
Infrastructure as code
29. www.scling.com
IT craft to factory
Security
Waterfall
Application delivery
Traditional operations
Traditional QA
Infrastructure
DB-oriented architecture
DevSecOps
Agile
Containers
DevOps
CI/CD
Infrastructure as code
Data factories, data pipelines, DataOps
35. www.scling.com
Towards sustainable production ML
Multiple models, parameters, features
Assess ingress data quality
Repair broken data from complementary source
Choose model and parameters based on performance and input data
Benchmark models
Try multiple models, measure, A/B test
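The steps on this slide can be sketched as a small pipeline: assess ingress data quality, repair broken values from a complementary source, benchmark candidate models, and choose the best one. This is an illustration under stated assumptions, not the speaker's implementation: the function names, the `MIN_QUALITY` threshold, the two toy models and the sample data are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

Model = Callable[[float], float]


def quality(values: Sequence[Optional[float]]) -> float:
    """Ingress data quality, here simply the fraction of non-missing values."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def repair(
    primary: Sequence[Optional[float]],
    complementary: Sequence[float],
) -> List[float]:
    """Fill missing primary values from a complementary source."""
    return [p if p is not None else c for p, c in zip(primary, complementary)]


def mean_abs_error(model: Model, inputs: Sequence[float], targets: Sequence[float]) -> float:
    """Benchmark one model against known targets."""
    return sum(abs(model(x) - y) for x, y in zip(inputs, targets)) / len(targets)


def choose_model(models: Dict[str, Model], inputs: Sequence[float], targets: Sequence[float]) -> str:
    """Pick the model name with the lowest benchmark error on this input."""
    return min(models, key=lambda name: mean_abs_error(models[name], inputs, targets))


MIN_QUALITY = 0.9  # hypothetical threshold below which data is repaired

primary = [1.0, None, 3.0, 4.0]       # ingress data with one broken value
complementary = [1.0, 2.0, 3.0, 4.0]  # second source used for repair
targets = [2.0, 4.0, 6.0, 8.0]

if quality(primary) >= MIN_QUALITY:
    inputs = list(primary)
else:
    inputs = repair(primary, complementary)

models = {"double": lambda x: 2 * x, "plus_one": lambda x: x + 1}
best = choose_model(models, inputs, targets)
```

In a real pipeline the quality check, the repair and the benchmark would each be a separate pipeline step producing its own dataset, so that every decision can be inspected and rerun.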
39. www.scling.com
Doing the shift
long game
very long
ladders
recipes
people & teams
craft to process
focus on delta
learn faster by collaboration
40. www.scling.com
Data value = data + domain expertise + data practices
Disrupt?
https://xkcd.com/1831/
Adapt?
+ 1000s of failures...
42. www.scling.com
Data value = data + domain expertise + data practices
Data lake
Stream storage
Client data + domain expertise
Practices from data leaders
Disrupt?
https://xkcd.com/1831/
Collaborate?
Data-value-as-a-service
Adapt?
+ 1000s of failures...