If you listen to grandiose tales of DevOps journeys, everything is awesome. But how can those of us not living in The Lego Movie transform our technology in smart and systematic ways? What is “awesome”? How do we point our organizations in that direction, and how will we know progress when we see it?
The best-performing IT organizations have the highest quality, throughput, and reliability while also showing value on the bottom line. When embarking on a journey of transformation, you want to measure your current status and subsequent progress while keeping tabs on factors that drive improvement in technology performance. Nicole Forsgren explains the importance of knowing how (and what) to measure—ensuring you catch successes and failures when they first show up, not just when they’re epic. Measuring progress lets you focus on what’s important and helps you communicate this progress to peers, leaders, and executives who decide budget. Business outcomes don’t realize themselves, after all, and “doing DevOps” doesn’t define stakeholder value any more than “being awesome” does.
@nicolefv
Our direction in DevOps
IT Performance: developing and delivering software with both speed and stability
- Deploy frequency
- Lead time for changes
- Mean time to recover (MTTR)
- Change fail rate
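As a minimal sketch of how these four signals could be computed, assuming hypothetical deploy records (commit time, deploy time, failure flag) and incident records (start and resolve times); the field names are illustrative, not a standard schema:

```python
from datetime import datetime, timedelta

def it_performance(deploys, incidents, period_days):
    """Compute the four IT-performance signals from simple records.

    deploys: dicts with 'committed_at', 'deployed_at', 'failed' (bool)
    incidents: dicts with 'started_at', 'resolved_at'
    """
    n = len(deploys)
    # Deploy frequency: deploys per day over the measurement window
    deploy_frequency = n / period_days
    # Lead time for changes: commit -> running in production
    lead_times = [d["deployed_at"] - d["committed_at"] for d in deploys]
    mean_lead_time = sum(lead_times, timedelta()) / n
    # Change fail rate: share of deploys that caused a failure
    change_fail_rate = sum(d["failed"] for d in deploys) / n
    # MTTR: mean time from incident start to resolution
    recoveries = [i["resolved_at"] - i["started_at"] for i in incidents]
    mttr = sum(recoveries, timedelta()) / len(recoveries) if recoveries else timedelta()
    return {
        "deploy_frequency_per_day": deploy_frequency,
        "mean_lead_time": mean_lead_time,
        "mttr": mttr,
        "change_fail_rate": change_fail_rate,
    }
```

Note that MTTR and change fail rate are the stability side of the pair, while deploy frequency and lead time are the speed side; tracking all four together is what guards against trading one off for the other.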
Is this the DevOps Journey?
[Chart: IT Performance on a Low–Med–High scale, rising from low speed, stability, innovation, and retention to high speed, stability, innovation, and retention]
J curve
[Chart: the same IT Performance axis, with “The WORK” tracing a J-curve: performance dips before climbing from low speed, stability, innovation, and retention to high]
Some evidence of the J curve
• 2016 State of DevOps Report: unplanned rework
• 2017 State of DevOps Report: manual work
Share reported by performer group: High 21% · Medium 32% · Low 27%
Medium performers report more waste work than low performers; that is the dip of the J curve.
Can both be true?
[Chart: the Low–Med–High IT Performance line from low speed, stability, innovation, and retention to high, overlaid with “The WORK” tracing the J-curve]
Why do we care?
Why even start this journey? Why hit that dip in the J-curve?
High Performing organizations are twice as likely to achieve or exceed goals

Commercial Goals
• Productivity
• Profitability
• Market share
• Number of customers

Non-Commercial Goals
• Quantity of products or services
• Operating efficiency
• Customer satisfaction
• Quality of products or services provided
• Achieving organizational and mission goals
But wait, there’s more!
• Once you’re a high performer, there’s evidence that you can overcome the mythical man-month.

Etsy code deployment: what once required 6–14 hours and an “Army” now takes 15 minutes and 1 person.
• 2013: 30+ deploys per day (Mike Brittain, “Continuous Deployment: The Dirty Details”)
• March 2014: 50 deploys per day (Daniel Schauenberg, QCon London)
• April 2014: 80–90 deploys per day (ChefConf; tweet by @philkates)
Your DevOps Signposts
• IT Performance: direction
  • Deploy frequency
  • Lead time
  • MTTR
  • Change fail rate
• Look for improvements in all four; we know that all four in tandem are possible.
• Watch for sustained tradeoffs! Not possible to see or notice them if you aren’t measuring.
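One way the “watch for sustained tradeoffs” advice might be automated is to flag when one signal keeps improving while another keeps degrading across consecutive measurement periods. This is a sketch under that assumption, not a standard DORA tool:

```python
def sustained_tradeoff(series_a, series_b, window=3):
    """Flag a sustained tradeoff: metric A improving while metric B degrades
    for `window` consecutive periods. Both series are oriented so that
    higher is better (invert MTTR or change fail rate before passing them in).
    """
    for start in range(len(series_a) - window):
        # A strictly improves across every step in the window...
        a_up = all(series_a[i + 1] > series_a[i] for i in range(start, start + window))
        # ...while B strictly degrades across the same steps
        b_down = all(series_b[i + 1] < series_b[i] for i in range(start, start + window))
        if a_up and b_down:
            return True
    return False
```

For example, deploy frequency rising quarter over quarter while an inverted change-fail-rate series falls over the same quarters would trip the flag.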
Your DevOps Signposts
• Waste work: detours (the J-curve)
  • Unplanned rework
  • Manual work
  • Quality proxies specific to your context (e.g., defect incidents, security remediation)
These help you judge the depth and breadth of your J-curve.
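A minimal sketch of tracking that waste-work share per period, assuming a hypothetical time log tagged with categories such as "unplanned_rework" and "manual_work"; plotting the resulting shares over time traces how deep and how long your J-curve dip is:

```python
def waste_share_by_period(work_log):
    """work_log: iterable of (period, category, hours) tuples.
    Categories in WASTE count as waste work; everything else is value work.
    Returns {period: waste_hours / total_hours}, sorted by period.
    """
    WASTE = {"unplanned_rework", "manual_work"}  # adjust to your context
    totals, waste = {}, {}
    for period, category, hours in work_log:
        totals[period] = totals.get(period, 0) + hours
        if category in WASTE:
            waste[period] = waste.get(period, 0) + hours
    return {p: waste.get(p, 0) / totals[p] for p in sorted(totals)}
```

A falling share from one period to the next suggests you are climbing out of the dip; a rising share in the middle of a transformation is the detour the slides warn about, not necessarily a failure.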
We know there are key capabilities that drive IT performance. They fall into four categories:
• Tech and automation
• Process
• Measurement/monitoring
• Culture
Tech and automation
• Version control
• Deployment automation
• Continuous integration
• Trunk-based development
• Test automation
• Test data management
• Shift left on security
• Continuous delivery
• Loosely-coupled architecture
• Architect for empowered teams
Process
• Gather and implement customer feedback
• Work in small batches
• Lightweight change approval process
• Team experimentation
Trunk-based development & change approval process
By focusing on trunk-based development and streamlining their change approval processes, Capital One saw stunning improvements in just two months.
Measurement and Monitoring
• Visual management
• Monitoring for business decisions
• Check system health proactively
• WIP limits
• Visualizations
The hard part: prioritizing work
The quick-wins part of the J-curve makes prioritization easy; the growing-complexity part makes it difficult.
Where should I start?
• “It depends.” Everyone is different.
• Patterns I see often:
  • Architecture is the highest contributor to continuous delivery (SODR 2017) and shows up for very many teams (DORA: as the need for loosely coupled architecture or trunk-based development)
  • A lightweight change approval process is a constraint for most teams (DORA)
  • Continuous integration, and its full complement (DORA)
So what to do?
1. Identify your constraints. Pick 1-3.
2. Work to eliminate those constraints.
3. Re-evaluate your environment and system.
4. Rinse and repeat.
How to measure?
• It’s important to have good metrics, wherever you
get them from
• If you know they’re sh*t, toss them and start over.
• Value a real baseline, even if it’s not encouraging.
Getting a full picture of your system is important
• Start now.
• Full system instrumentation takes time.
• Use system and survey measures to give you coverage.
Types of measures to collect
• Data about systems and process from systems
• Data about systems and process from people
• Data about people from people
Data about systems and process from systems
• This data is good for:
  • Precision
  • Continuous data
  • Specific data
  • Volume/scale
• This data is not good for:
  • A holistic view of your system
  • Capturing drifts in data
  • Capturing behavior outside of the system
  • Cultural or perceptual measures
Data about systems and process from people
• This data is good for:
  • Accuracy (if collected with validated measures)
  • A holistic view of your system
  • Triangulation with system data
  • Capturing behavior outside of the system
  • Perceptual measures related to the system (e.g., deploy pain)
• This data is not good for:
  • Precision (e.g., milliseconds)
  • Continuous data (survey fatigue is a thing)
  • Measures in strained environments (where it is not safe to be honest)
  • Volume (once a survey passes ~20 minutes, it’s generally bad news)
Data about people from people
• This data is good for:
  • Understanding your culture
  • Capturing perceptual measures
  • Serving as a leading indicator
• This data is not good for:
  • Continuous data (survey fatigue is a thing)
  • Measures in strained environments
• By contrast, HR system data is only a lagging indicator of what has already happened.
Leadership: necessary but not sufficient
• Teams with the least transformational leaders (the bottom third) were half as likely to be high IT performers.
• Leaders cannot do it alone! Teams with the top 10% of transformational leaders performed no better than the median.
What can you do?
• Read and share Data Driven by @dpatil and @hmason
• Ask others about metrics:
  • Where is metrics collection happening?
  • Is it considered key to improvement?
• Think about your own metrics and signposts:
  • Start with outcomes: IT Performance
  • Then think about what influences IT Performance: technology, process, monitoring, culture
  • Also measure your waste work to track your J-curve
What we’ve talked about
• Direction, not destination
• Your DevOps direction
• Moving along your journey
• Checking progress
• Measures
For more metrics and data:
• ROI whitepaper
• Case studies
• Peer-reviewed research
• 2014–2017 State of DevOps Reports
• Learn about assessment
devops-research.com