Computer simulation of a software team's process helps you gain insight into its inherent complexity and assists in making better decisions about process policies
7. Purpose of This Presentation
Learn how to use process simulations to...
Gain insights into the software development process
Make better decisions about the software process
Convince ourselves and others
Bend the laws of physics
8. Perspectives on a Software Development Operation
There are many ways to look at a software development operation
10. Perspectives on a Software Development Operation
Here we are talking about this perspective
11. Kanban Board
In Kanban, this machine is visualized on a board:
Ready (5) | Development (4) | Testing (2) | Done
12. Questions
What are the right WIP limits? (*)
What is the optimal user story size?
Hire a developer or a tester?
Invest time in code inspections? In test automation?
How can I measure productivity?
(*) WIP = Work in Process. WIP limits are limits we impose on the size of queues in order to improve efficiency.
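Questions like these are exactly what a simulation of the board above can explore. A minimal sketch of such a "machine" follows; the WIP limits match the board, but the per-item completion probabilities, the unlimited backlog, and the pull policy are invented assumptions, not the presenters' model:

```python
import random

def simulate_board(days=100, wip_dev=4, wip_test=2,
                   p_dev=0.3, p_test=0.5, seed=1):
    """Toy Kanban pipeline: Ready -> Development -> Testing -> Done.
    Each day, in-progress items finish with a fixed probability; finished
    development items block a Development slot until Testing can pull them."""
    rng = random.Random(seed)
    dev_wip, dev_done = 0, 0      # being developed / waiting to be pulled
    test_wip, delivered = 0, 0
    for _ in range(days):
        # Testing finishes some items.
        f = sum(rng.random() < p_test for _ in range(test_wip))
        test_wip -= f; delivered += f
        # Development finishes some items (they wait for a Testing slot).
        f = sum(rng.random() < p_dev for _ in range(dev_wip))
        dev_wip -= f; dev_done += f
        # Pull into Testing, respecting its WIP limit.
        pull = min(dev_done, wip_test - test_wip)
        dev_done -= pull; test_wip += pull
        # Refill Development from an unlimited Ready queue, respecting WIP.
        dev_wip += max(0, wip_dev - dev_wip - dev_done)
    return delivered

print(simulate_board(days=365))   # features delivered in a simulated year
```

Varying `wip_test`, `p_dev`, and the team sizes in a model like this is how one can probe "what are the right WIP limits?" before changing policy on a real team.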
29. What is Our Performance?
How do we measure performance?
Profit/Loss is great...
But it is a trailing indicator (after the fact)
And it takes into account much more than the efficiency of the software development operation (Sales & Marketing etc.)
Other “gauges” need to be used
30. Cumulative Flow
[Chart with series: Ready, Development, Testing, Done]
At any point in time, how many items are in each stage (queue)?
31. Cycle Time
On day 47, a work item was delivered 43 days after it was requested.
This work item was requested on day 40 and delivered on day 65; its cycle time is 25 days.
32. Lesson #1
It’s hard to know your performance when you don’t see it. Use cycle time and cumulative flow to control your process.
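Both gauges fall out of simple per-item timestamps. A sketch, with invented sample data (the third item reproduces the slide's day-40/day-65 example):

```python
from collections import Counter

# Each work item records the day it entered each stage of the board.
items = [
    {"ready": 0,  "development": 2,  "testing": 9,  "done": 12},
    {"ready": 1,  "development": 4,  "testing": 15, "done": 20},
    {"ready": 40, "development": 44, "testing": 60, "done": 65},
]

def cycle_time(item):
    """Days from request (entered Ready) to delivery (entered Done)."""
    return item["done"] - item["ready"]

def cumulative_flow(items, day):
    """How many items sit in each stage (queue) on a given day."""
    stages = ["ready", "development", "testing", "done"]
    counts = Counter()
    for it in items:
        # The item's current stage is the last one it had entered by `day`.
        entered = [s for s in stages if it[s] <= day]
        if entered:
            counts[entered[-1]] += 1
    return counts

print([cycle_time(it) for it in items])   # [12, 19, 25]
print(cumulative_flow(items, day=10))
```

Plotting `cumulative_flow` for every day yields the cumulative flow diagram; the per-item `cycle_time` values give the other gauge.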
43. Some Insights From This Simple Case
1. Top financial performance coincides with reaching top utilization of the team
2. Top financial performance coincides with a low cycle time
3. The longer the queue, the longer the system will take to respond to a change in WIP limits
53. Decision Making
In Agile:
• Frequent inspect & adapt
• What about long-term decisions?
• What about irreversible decisions?
56. The CFO has just approved a new hire for your team
“Great! We’ll hire a new developer so we can crank out more features.”
“No way! We need a new tester in order to improve quality!”
59. Adding a Team Member
[Chart: Total Value Delivered, Simulated Process - Value [$1,000] vs. Time [days], for three team compositions: 5 Developers + 3 Testers, 5 Developers + 2 Testers, 6 Developers + 2 Testers. Annotation: “$500K More Value”]
60. Lesson #3
In some situations, hiring an extra developer will reduce the value delivered.
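A back-of-the-envelope way to see how this can happen: when Testing is the bottleneck, an extra developer only lengthens its queue. The per-role daily rates below are invented for illustration, not taken from the talk's simulation:

```python
def steady_throughput(n_devs, n_testers, dev_rate=1.0, test_rate=2.0):
    """Features flow Development -> Testing; in steady state the slower
    stage caps throughput (features per day)."""
    return min(n_devs * dev_rate, n_testers * test_rate)

for team in [(5, 2), (6, 2), (5, 3)]:
    print(team, steady_throughput(*team))
# (5, 2): testing-bound at 4 features/day
# (6, 2): the extra developer adds nothing
# (5, 3): the extra tester lifts the constraint to 5 features/day
```

In a richer simulation the sixth developer can do worse than nothing, because the extra unfinished inventory also inflates cycle time and value decay.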
62. User Story Size
[Chart: Value Delivered, ranging from ($1,500,000) to $1,125,000, vs. story size of 1–7 days]
63. User-Story Size - High Resolution Simulation
[Chart: Value and Cycle Time vs. Story Size (total estimated effort in days). Annotation: High Cycle Time = Noisy Process]
64. Lesson #4
When the value delivered decays with time, small batches (user stories) increase value.
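The mechanism can be shown with a tiny model: split a fixed backlog into equal-sized stories that ship one after another, and let each story's value decay until the day it ships. The backlog size, value per day, and decay rate below are arbitrary assumptions, not the talk's numbers:

```python
import math

def total_value(story_days, backlog_days=84, value_per_day=10_000, decay=0.01):
    """Total value from a backlog cut into equal stories, where a story's
    value decays exponentially until the day it is delivered."""
    n_stories = backlog_days // story_days
    total = 0.0
    for i in range(1, n_stories + 1):
        ship_day = i * story_days               # stories ship sequentially
        total += value_per_day * story_days * math.exp(-decay * ship_day)
    return total

for size in (1, 3, 7):
    print(f"{size}-day stories: ${total_value(size):,.0f}")
```

Smaller stories ship earlier on average, so less value has decayed by delivery time; the effect here is the pure time-to-value driver the speaker notes describe, ignoring estimation and quality effects of large stories.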
65. From Simple to Complex
From the previous chart, we can see that a combination of simple rules yields bewilderingly complex behavior.
66. Aircraft & Crew
An interesting analogy for how we can deal with complex systems.
Why simulate? To gain insights.
Airplane simulators:
• Complicated system
• Easy to make costly mistakes
• Small input changes can have large unintended consequences
72. Flight Simulation
• Experiment in a safe environment
• Can bring the system to extremes without risk
• Can pause and fast-forward to interesting parts
• Track & analyze during and after
• Cheaper to operate than the real thing
73. Software Development Organizations
• Complicated system
• Easy to make costly mistakes
• Small input changes can have large unintended consequences
74. Playing to Learn
Example: GetKanban.com - looking at the micro-mechanics of Kanban
Here we are looking at the macro-mechanics of flow in product development - the 10,000’ level
Used in Kanban trainings & other settings
75. Lesson #5
Simulations are a cost-effective way to get insights into the operation of complex systems.
77. The Holy Grail of Metrics...
Often sought...
Seldom reliably measured
Productivity = Output / Input
82. Productivity Measurement
Best: measure Value Delivered and maximize it
Holistic - includes Product Management, Operations, Sales, Development, Testing...
Bottom-line
Best correlation to what the business wants - “It’s all about the Benjamins, baby!”
85. Productivity Measurement
Downsides of using bottom-line value:
It is holistic - it measures dev + QA + operations + support + PM + Sales, while we usually want to measure the development operation (incl. testing) separately
Value could be derived from solutions which incorporate multiple products, which requires arbitrary attribution of value to products
95. Productivity Measurement
Alternative to bottom-line value: measure value-add hours vs. paid hours
The delta is waste:
Time spent working on defects
Time spent on items that are still in progress
Idle time
Manual regression testing
Process overhead
Support interruptions
Deployment
102. Productivity Measurement
Example:
5 developers and 2 testers, working for 100 days
Input = 100 * (5 + 2) = 700 person-days
They delivered 70 features whose estimated effort averaged 6 days
Value = 70 * 6 = 420
Productivity = 420 / 700 = 60%
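As a sanity check, the example's arithmetic as code (note that 420 / 700 works out to 60%):

```python
def productivity(team_size, days, features, avg_effort_days):
    """Output / input: estimated effort-days delivered vs. person-days paid."""
    return (features * avg_effort_days) / (team_size * days)

# The example: 5 devs + 2 testers, 100 days, 70 features of ~6 days each.
print(f"{productivity(7, 100, 70, 6):.0%}")  # 60%
```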
104. The observation itself affects the system under observation. This is true in software teams as well...
105. Measuring Productivity
Would this measurement work?
                 In-situ (real teams)              In-vitro (simulations)
Will be gamed?   Absolutely                        Bits don’t cheat
Demoralizing?    Probably                          Bits don’t care
Applicable?      Yes - eminently relevant metric   Yes - insights should port well to real life
106. Team Composition & Productivity
[Chart: productivity and cycle time (avg, stdev) per team composition]
Productivity = efficiency: it can be measured just fine in a simulation
109. Should We Invest in Automated Testing?
[Chart: Value [$1,000] vs. Time [days], “10% Ongoing Dev Effort on Automated Regression Testing” vs. no investment. Annotation: “$300K More Value Generated”]
Investing 10% of ongoing development effort = save 2 days of duration on a full regression-testing run, and keep this saving forever
No investment in automation = 1 more day of regression every quarter
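The shape of this trade-off (pay a steady overhead now, avoid a growing cost later) can be sketched as a break-even calculation. All the numbers below are invented for illustration, not the presenters' figures:

```python
def cumulative_cost(quarters, invest, base_run=10, overhead=4):
    """Toy cumulative cost, in days, of quarterly full regression runs.
    Assumptions: a run takes 10 days; investing keeps runs 2 days shorter
    but costs a flat `overhead` days of automation work each quarter;
    without investment each run grows 1 day longer per quarter."""
    total = 0
    for q in range(quarters):
        run = (base_run - 2) if invest else (base_run + q)
        total += run + (overhead if invest else 0)
    return total

# Find the first quarter where investing is cumulatively cheaper.
q = 1
while cumulative_cost(q, invest=True) >= cumulative_cost(q, invest=False):
    q += 1
print(f"break-even after quarter {q}")
```

A simulation does the same thing with more realism (variability, cost of delay), which is what turns the break-even into the dollar figure on the chart.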
115. Best Practices In Creating Simulations
Use real historical data as a baseline
When real data is not available, use industry data (e.g. Capers Jones)
Validate the model using visual simulation
Keep it simple (low fidelity is often enough)
Translate results into $$$
122. Summary
Simulations are a great addition to our toolset:
To gain insight into complex processes
To make better decisions
To convince (yourself / others)
To bend the laws of physics
To forecast
123. Key Takeaways
Understand that there is a machine within your team’s process
Realize it’s a complex machine, which has a significant impact on the outcome
Simulations are but one of many tools that help us there
They should not be the main consideration in making decisions - people considerations are more important - but they can provide a great deal of help
124. More Information
Check it out at: http://flower.agilesparks.com
Talk to us:
sagi@agilesparks.com
yuval@agilesparks.com
@FLOWerSimulator
Editor's Notes
One disclaimer to get you in the right mindset... One of the many ways to look at teams is to use a perspectives model such as this one.
The machine layer is the one which contains the lists of things to do for the different stages in the process, the policies for how things move from one list to another, and what work items (tasks, bugs etc.) look like.
This talk is about the “machine” layer. Not because it’s more important... it’s not. But it is important, as it provides the people with insights about the basic mechanics and physics laws of their process; then the people can make better decisions.
One of the nice things about Kanban is exposing this machine so we can see what’s going on... There are many decisions to make about how the machine operates: WIP limits; who works on what, and when; policies for how things transition from list to list; classes of service; prioritization; the size of tasks.
Flow is a great book which combines statistics, finance & ops research to get important insights about how to run this machine.
Like in radioactivity; related to the cost of delay; averaged over all items.
Are we doing well?
The longer the queue, the slower the system is to respond to change - in this case, changes to cycle time following changes in WIP limits. 100% utilization marks the start of the sweet spot.
True for this simple setup. If there’s a team that deals with interruptions, it’s a different matter.
This is true only for a queue with WIP limits, feeding off of a backlog with a finite size (i.e. one that has WIP limits itself). This is not an M/M/1/inf. queue; it is a w/x/y/z queue. It is important to notice that here we get optimum performance at peak utilization, which is at odds with the statement that “full utilization creates waste”.
Overhead: queues need to be managed, status needs to be reported...
Each point represents an experiment with 500 working days: how much value was generated. If you don’t operate on this plateau you are wasting money. What determines the width? The cost of delay. Why isn’t this sensitive to changes in WIP limits in the testing queue?
If we keep everything constant, but do a what-if on both scenarios, we can get some data which can support the argument one way or the other.
Keep in mind that this applies to a particular scenario with two working queues: the first queue has 5 workers with an average daily capacity of 5 work units and a WIP limit of 10 tasks; the second queue has 2 workers with the same average capacity and a WIP limit of 5. Tasks have no variability in size. Interestingly, this does not take into account many of the effects of large stories, such as reduced estimate accuracy and reduced quality. The main driver here is that it takes more time to get value out, so the value decays more.
Similar analysis, but with a fine-grained story size, with stdev included. For example, short stories - this is something I believe in, and always felt it was hard to convince others of. In this chart every point represents 2 years’ worth of a team process. Same process, with only one thing changing: the average size of the story. You can see in red the cycle time and its standard deviation. In blue you can see total value delivered over the lifetime of the project. You can see how significant the impact is. One thing that surprised me is the complexity of the result: out of a small set of simple rules, you get such a complex outcome. You can see that there’s a sweet spot around one day - this is where you want to be. Again, this is true only for the particular process I described. You can see that with a smaller average story size, you hurt the
From that chart, we can see how complexity emerges from the combination of a set of fairly simple rules. Complex systems exhibit unpredictable responses to changes: a small change can have a big impact, and a change will sometimes be unpredicted. This does not mean that there is an inconsistent relationship between cause and effect. This is where simulation comes to the rescue - we can gain more insight into the relationship.
One interesting analogy to look at is that of an aircraft and its crew. In that world, getting a handle on complexity is crucial, as the price of making mistakes can be quite high. An airplane and its crew form a complex system, where small changes can have big impacts, and not necessarily in the anticipated direction. So simulations are used to provide crews with a safe environment where they can experiment and take the system to its extremes in a cheaper, safer manner; you can pause, replay, fast-forward.
Software development operations are complex systems as well, exhibiting similar attributes. They don’t look as intimidating as a crew in an airliner’s cockpit, with the shiny uniforms and golden badges, but that doesn’t mean they are different. Yet somehow simulations are rarely used, if at all, as part of the training of team members and management. This means that people need to learn from costly mistakes, or from books - and my experience is that the majority of practitioners don’t read books about process, and even when they do, what they learn is sometimes so different from common sense and the existing dogma that they are likely to face resistance in implementing it. Simulations can be a big help there, and I think there’s a lot more that can be done in using them as a tool in that space.
A shout out to GetKanban, which has been an inspiration in creating the simulations.
Productivity is the holy grail of metrics, and impossible to measure in real teams. It is possible to measure in a simulation in a fairly straightforward way: the user story estimates are correlated to value (assuming a smart PM). So let’s measure the amount of value delivered at any time, measure the time invested by the team, and take the ratio between them - that is productivity, or efficiency.
You probably know that in physical systems you cannot observe a quantity without affecting another (e.g. the Heisenberg uncertainty principle). In software systems we see the same behavior, and it is far from negligible. If you start measuring something in a real team, it will change the behavior of the team. In a simulation you are a god, a creator, and if you program this out, it won’t happen.
Applied to the problem discussed before (the dev/QA mix), we can measure productivity (expressed in % here) and keep a straight face. There is no cheating here, since we haven’t programmed it in. The insights should port pretty well into reality.
Sometimes we think about investing in quality. Examples are test automation, which will slow us down in the short term but help us in the long term. It’s always very hard to make the call, but if we simulate to forecast, we can see where the break-even point is going to be... (5 developers, 2 testers)