
# #NoEstimates project planning using Monte Carlo simulation



Here is the text behind the slides http://www.infoq.com/articles/noestimates-monte-carlo

Here is a video I prepared in order to help people understand how to plan a release using the Monte Carlo simulation in MS Excel http://youtu.be/r38a25ak4co

And here is an Excel file to show how Monte Carlo is done http://modernmanagement.bg/data/NoEstimate_Project_Planning_MonteCarlo.xlsx

Here are the SIPs for the baseline project http://modernmanagement.bg/data/SIPs_MonteCarlo_FVR.xlsx

Here is the planning simulation in Excel http://modernmanagement.bg/data/High_Level_Project_Planning.xlsx

The video (from the 3:00 mark) http://youtu.be/GE9vrJ741WY shows how to use the Excel files


### #NoEstimates project planning using Monte Carlo simulation

1. 1. Dimitar Bakardzhiev Managing Director Taller Technologies Bulgaria @dimiterbak #NoEstimates Project Planning using Monte Carlo simulation
2. 2. Clients come to us with an idea for a new product and they always ask the questions - how long will it take and how much will it cost us to deliver? They need a delivery date and a budget estimate.
3. 3. Reality is uncertain, yet we as software developers are expected to deliver new products with certainty.
4. 4. To increase the chances of project success we need to incorporate the uncertainty in our planning and exploit it.
5. 5. WE CAN’T CONTROL THE WAVES OF UNCERTAINTY, BUT WE CAN LEARN HOW TO SURF!
6. 6. TO ME #NOESTIMATES MEANS "No effort estimates": Effortless estimates and No estimates of effort
7. 7. Deterministic planning used these days forces certainty on uncertain situations and masks the uncertainty instead of highlighting it.
8. 8. The project management paradigm is based on the 1st principle of Scientific Management, namely “In principle it is possible to know all you need to know to be able to plan what to do”.
9. 9. The project management paradigm recognizes that uncertainty plays a role in project management but believes it can be eliminated by more detailed planning.
10. 10. We challenge the project management paradigm and suggest that for planning purposes it is better to model projects as a flow of work items through a system.
11. 11. A project is a batch of work items, each one representing independent customer value, that must be delivered on or before the due date.
12. 12. We don’t try to estimate the size of the work items. There are only two “sizes”: “Small Enough” and “Too Big”. “Too Big” items should be split and not allowed to enter the backlog.
13. 13. A Project in a Kanban System [Figure: Kanban board with the columns Project Backlog, Input Queue, Development, Test, QA and DEPLOYED!; the columns carry the labels WIP 5, WIP 4, WIP 2 and NO WIP]
14. 14. High-level probabilistic planning • The initial budget and the range of the time frame • Does not include detailed project plans • The plan is created with the appropriate buffers • Schedules are the execution of the high-level plan • Keep focus on the project intent
15. 15. Reference class forecasting Reference class forecasting promises more accuracy in forecasts by taking an "outside view" on the project being forecasted based on knowledge about actual performance in a reference class of comparable projects. Daniel Kahneman
16. 16. Reference class forecasting • Identification of a relevant reference class of past, similar projects. The class must be broad enough to be statistically meaningful but narrow enough to be comparable with the specific project. • Establishing a probability distribution for the selected reference class. • Comparing the new project with the reference class distribution, in order to establish the most likely outcome for the new project.
17. 17. IDENTIFICATION OF A REFERENCE CLASS OF SIMILAR PROJECTS
18. 18. Are the Team structures comparable?
19. 19. Are the Technologies used comparable?
20. 20. Are the Development processes comparable?
21. 21. Are the Client types comparable? http://blog.7geese.com/2013/07/04/7-reasons-why-i-decided-to-work-for-a-startup/
22. 22. Are the Business domains comparable? http://www.mindoceantech.com/
23. 23. ESTABLISHING A PROBABILITY DISTRIBUTION FOR THE SELECTED REFERENCE CLASS
24. 24. What metric will be used in the forecast? The metric should: • let us take an “outside view” on the development system that worked on the project • allow calculating delivery time • make sense from the client’s perspective.
25. 25. Takt Time!
26. 26. Takt Time is the average time between two successive deliveries
27. 27. How does manufacturing measure Takt Time?
28. 28. How do knowledge workers measure Takt Time?
29. 29. Inter-Departure Time (IDT) is the time between two successive deliveries [Figure: timeline from Start to Finish with deliveries separated by 5 days, 7 days, 2 days, 2 days, 1 day and 5 days; example labels on individual work items: IDT = 5 days, IDT = 7 days, IDT = 0 days, IDT = 0 days] Project delivery time (T) = 5 + 7 + 2 + 2 + 1 + 5 = 22 days
30. 30. $T = \sum_{i=1}^{N} IDT_i = 5 + 7 + 0 + 2 + 0 + 0 + 2 + \cdots$ • T is the time period over which the project was delivered • IDT is the inter-departure time or the time between two successive deliveries
31. 31. Takt Time $TT = \frac{T}{N} = \frac{\sum_{i=1}^{N} IDT_i}{N}$ • T is the time period over which the project was delivered • N is the number of items to be delivered in period [0,T] • $TT$ is the Takt Time for period [0,T]
32. 32. TT calculation $TT = \frac{T}{N} = \frac{22\ \text{days}}{10\ \text{stories}} = 2.2\ \text{days/story}$
33. 33. Project Delivery time $T = N \cdot TT$ • T is the time period over which the project will be delivered • N is the number of items to be delivered in period [0,T] • $TT$ is the Takt Time for period [0,T]
34. 34. Project Delivery time $T = N \cdot TT = 45\ \text{stories} \times 2.2\ \text{days/story} = 99\ \text{days}$
35. 35. We should NOT use the Takt Time as a single number but a distribution of the Takt Time instead!
36. 36. Bootstrapping • Introduced by Bradley Efron in 1979 • Based on the assumption that a random sample is a good representation of the unknown population. • Does not replace or add to the original data. • Bootstrap distributions usually approximate the shape, spread, and bias of the actual sampling distribution. • Bootstrap is based on the assumption of independence.
37. 37. Bootstrapping the distribution of Takt Time 1. Have an Inter-Departure Time (IDT) sample of size n 2. Have the number of work items delivered (N) 3. Draw n observations $IDT_i$ with replacement from the sample in step 1 4. Calculate Project Delivery time (T) for the resample from step 3 using $T = \sum_i IDT_i$ 5. Calculate Takt Time ($TT$) by $TT = T/N$ using T from step 4 and N from step 2 6. Repeat many times 7. Prepare the distribution for Takt Time ($TT$)
38. 38. Example: Bootstrapping Takt Time ($TT$) • Historical IDT data: $IDT_i = (0,0,0,0,1,2,2,5,5,7)$, $T = \sum_i IDT_i = 22$ days, $TT = T/N = 2.2$ days/story • 1st draw with replacement: sampled IDT data $IDT_i = (0,0,1,1,1,2,2,2,5,7)$, $T = \sum_i IDT_i = 21$ days, $TT = T/N = 2.1$ days/story • Another 998 draws with replacement • 1000th draw with replacement: sampled IDT data $IDT_i = (0,1,1,1,1,2,5,5,5,7)$, $T = \sum_i IDT_i = 28$ days, $TT = T/N = 2.8$ days/story
39. 39. Result: Takt Time ($TT$) distribution • Median 2.2 • STD 0.788833 • Average 2.1943 • 85th percentile 3 • 95th percentile 3.5 • Mode(s) 2.4 • SIP size 1000
40. 40. Stochastic Information Packet (SIP) • Comprised of a list of trials of some uncertain parameter or metric generated from historical data using Monte Carlo simulation (resampling) • Represents an uncertainty as an array of possible outcomes (distribution) • It is unique per context (business domain, team, delivery process used etc.)
41. 41. COMPARING THE NEW PROJECT WITH THE REFERENCE CLASS DISTRIBUTION
42. 42. $T = N \cdot TT$ assumes linear delivery rate [Figure: Completed Work (N) plotted against Project Delivery Time (T) as a straight line; 10 work items in 22 days]
43. 43. Most projects have non-linear delivery rate
44. 44. Z-curve
45. 45. Each leg of the Z-curve is characterized by: • Different work type • Different level of variation • Different staffing in terms of headcount and level of expertise
46. 46. 1st leg – Setup time • climbing the learning curve • conducting experiments to cover the riskiest work items • Innovation! • setting up environments • adapting to client’s culture and procedures • understanding new business domain • mastering new technology
47. 47. 2nd leg – Productivity period If the project is scheduled properly the system should be like a clockwork – sustainable pace, no stress, no surprises…
48. 48. 3rd leg – Cleaning up • Clean up the battlefield • Fix some outstanding defects • Support the transition of the project deliverable into operation https://www.ocoos.com/me/professional-dog-training-in-home/
49. 49. Project delivery time T: $T = T_{z1} + T_{z2} + T_{z3}$ Where: $T_{z1}$ is the duration of the 1st leg of the Z-curve; $T_{z2}$ is the duration of the 2nd leg of the Z-curve; $T_{z3}$ is the duration of the 3rd leg of the Z-curve
50. 50. Project delivery time T: $T = N_{z1} TT_{z1} + N_{z2} TT_{z2} + N_{z3} TT_{z3}$ Where: $TT_{z1}$, $TT_{z2}$ and $TT_{z3}$ are the Takt Times for the 1st, 2nd and 3rd legs of the Z-curve; $N_{z1}$, $N_{z2}$ and $N_{z3}$ are the numbers of items delivered during the 1st, 2nd and 3rd legs of the Z-curve
51. 51. Monte Carlo simulation of Project Delivery Time (T) based on Z-curve 1. Have three Takt Time SIPs ($TT_{z1}$, $TT_{z2}$, $TT_{z3}$), each one of size n, one for each of the three legs of the Z-curve 2. Have the number of work items to be delivered for each of the three legs of the Z-curve ($N_{z1}$, $N_{z2}$, $N_{z3}$) 3. Draw one observation out of the n, with replacement (bootstrap), from each of ($TT_{z1}$, $TT_{z2}$, $TT_{z3}$) 4. Calculate Project Delivery time (T) for the sample from step 3 using $T = N_{z1} TT_{z1} + N_{z2} TT_{z2} + N_{z3} TT_{z3}$ 5. Repeat many times 6. Prepare the Delivery time (T) probability distribution
52. 52. EXAMPLE: MONTE CARLO SIMULATION OF PROJECT DELIVERY TIME (T)
53. 53. The New Project to be delivered • THE SAME Fortune 500 Staffing company • THE SAME development organization • THE SAME technology – Java; Spring; Oracle; • Delivery time TO BE PREDICTED
54. 54. Takt Time distributions for each of the three legs of Z-curve for the reference class
55. 55. Project scope After some analysis the team have broken down the requirements into user stories, accounting for Cost of Delay, added work items for Dark matter and Failure load and decided that: • 12 stories TO BE delivered in the 1st leg of Z-curve • 70 stories TO BE delivered in the 2nd leg of Z-curve • 18 stories TO BE delivered in the 3rd leg of Z-curve
56. 56. Monte Carlo simulated summation of… [Figure: three Takt Time distributions, labeled 12 work items, 70 work items and 18 work items] …will give us the time needed to deliver the project!
57. 57. Monte Carlo simulation of Project Delivery Time (T) • Inputs: Takt Time SIPs $TT_{z1}$, $TT_{z2}$, $TT_{z3}$ and work items $N_{z1}$, $N_{z2}$, $N_{z3}$ • 1st draw with replacement from each of ($TT_{z1}$, $TT_{z2}$, $TT_{z3}$) simulates one Project Delivery Time value: $T = N_{z1} TT_{z1} + N_{z2} TT_{z2} + N_{z3} TT_{z3} = 12 \times 1.43 + 70 \times 0.3 + 18 \times 1.11 = 58.14$ days • 49998 draws with replacement from each of ($TT_{z1}$, $TT_{z2}$, $TT_{z3}$) • 50000th draw with replacement from each of ($TT_{z1}$, $TT_{z2}$, $TT_{z3}$) simulates one Project Delivery Time value: $T = N_{z1} TT_{z1} + N_{z2} TT_{z2} + N_{z3} TT_{z3} = 12 \times 1.81 + 70 \times 0.54 + 18 \times 0.64 = 71.04$ days
58. 58. Mode = 76 days; Median = 77 days; Mean = 78 days; 85th perc = 90 days
59. 59. By taking an outside view when forecasting a new project we will produce more accurate results faster than using the deterministic inside view.
60. 60. References Here are the distributions for the baseline project SIPs_MonteCarlo_FVR.xlsx Here is the planning simulation in Excel High_Level_Project_Planning.xlsx What is SIP?
61. 61. Dimitar Bakardzhiev is the Managing Director of Taller Technologies Bulgaria and an expert in driving successful and cost-effective technology development. As a Lean-Kanban University (LKU)- Accredited Kanban Trainer (AKT) and avid, expert Kanban practitioner, Dimitar puts lean principles to work every day when managing complex software projects with a special focus on building innovative, powerful mobile CRM solutions. Dimitar has been one of the leading proponents and evangelists of Kanban in his native Bulgaria and has published David Anderson’s Kanban book as well as books by Eli Goldratt and W. Edwards Deming in the local language. @dimiterbak

### Editor's Notes

• Hello everybody and Thank you for attending this presentation!
My name is Dimitar and I will try to show you how you can use Monte Carlo simulation for forecasting the delivery time for your next project.
• Customers come to us with a new product idea and they always ask the questions - how long will it take and how much will it cost us to deliver?
• Reality is uncertain, yet we as software developers are expected to deliver new products with certainty.
• We can’t control the Waves of Uncertainty, but we can learn How to Surf!

• We do that by planning using reference class forecasting which promises more accuracy in forecasts by taking an "outside view" on the project being forecasted based on knowledge about actual performance in a reference class of comparable projects. This approach aligns with the #NoEstimates paradigm which aims at "exploring alternatives to estimates [of time, effort, cost] for making decisions in software development" (Zuill, 2013).

To me #NoEstimates means “No effort estimates” which stands both for “Effortless estimates” or estimating with minimal effort and for “Not using estimates of effort”.
• Deterministic planning used these days forces certainty on uncertain situations and masks the uncertainty instead of highlighting it. It calculates the project-specific costs based on a detailed study of the resources required to accomplish each activity of work contained in the project’s work breakdown structure; in other words, it takes an “inside view” on the project being estimated. For high-level planning, deterministic estimation of all work items wastes people’s time and implies precision that isn’t present.
The techniques presented here are fast and for most of the projects they will produce more accurate results.
• The present-day project management paradigm is based on the 1st principle of Scientific Management, namely “In principle it is possible to know all you need to know to be able to plan what to do”.
• It does recognize that uncertainty plays a role in project management but believes uncertainty could be eliminated by more detailed planning. It models projects as a network of activities and calculates the time needed to deliver a project by estimating the effort required to accomplish each activity of work contained in the project’s work breakdown structure.
• We argue that planners could not know everything they needed to know and that the world as such is uncertain and every number is a random variable. We challenge the project management paradigm and suggest that for planning purposes it is better to model projects as a flow of work items through a system.
• Hence the definition: a project is a batch of work items, each one representing independent customer value, that must be delivered on or before the due date. The batch contains all the work that needs to be accomplished to deliver a new product with specified capabilities. In order to prepare the batch, the product scope needs to be broken down into work items, each one representing independent customer value. Even for quality-related requirements or “…ilities” such as “the system should scale horizontally” we need to have a work item. It is important that the work items can be delivered in any order, like user stories created following the INVEST mnemonic.
• We don’t try to estimate the size of the work items. There are only two "sizes" - "small enough" and "too big". The two sizes are context specific. They have no correlation to the "effort" needed. "Too big" should be split and not allowed to enter the backlog.
• Here is an animation of how a batch of work items, or a project, flows through a Kanban board. Initially all work items are in the Backlog; then they flow through the board and end up in the Completed column.

• A probabilistic high-level plan forecasts the initial budget and also the range of the time frame for a project. We don’t plan in detail what is not absolutely necessary to plan. The short-term details, like the scheduling, are worked out based on the immediate needs and capabilities, and we create these schedules as we execute the higher-level plan. When executing the high-level plan we have to keep focus on the project intent, but we can never be certain which paths will offer the best chances of realizing it. We exploit uncertainty by making a series of small choices which open up further options, then observing the effects of our actions and exploiting unexpected successes.
• We plan probabilistically by using reference class forecasting which does not try to forecast the specific uncertain events that could affect the new project, but instead places the project in a statistical distribution of outcomes from the class of reference projects.

Reference Class Forecasting is based on the work of the Princeton psychologist Daniel Kahneman, who won the Nobel Prize in economics in 2002.
• Reference class forecasting for a particular project requires the following three steps:
Identification of a relevant reference class of past, similar projects. The class must be broad enough to be statistically meaningful but narrow enough to be comparable with the specific project.
Establishing a probability distribution for the selected reference class. This requires access to credible, empirical data for a sufficient number of projects within the reference class to make statistically meaningful conclusions.
Comparing the new project with the reference class distribution, in order to establish the most likely outcome for the new project.

Let’s apply the reference class forecasting method to forecast the delivery time for a new project.
• Identification of a reference class of similar projects

The projects in the reference class should have comparable:

Team structures
Technologies used
Development processes used and the method of capturing the requirements
Client types

Please note that along with the internal characteristics of the projects we also compare the contexts in which projects were executed. The same team may have different performance if the client is a startup or Fortune 500 corporation due to the different way of collaboration with the stakeholders. On the other hand when comparing the projects we should not go into great details. Our goal is to establish a reference class that is broad enough to be statistically meaningful but narrow enough to be comparable with the new projects we will be working on.
• Are the Team structures comparable?
• Are the Technologies used comparable?
• Are the Development processes comparable?
• Are the Client types comparable?
• Are the Business domains comparable?


• We need to decide the metric for which we will establish the probability distribution. The metric should allow us to take an “outside view” on the development system that worked on the project, allow for calculating delivery time and should also make sense from client’s perspective. Takt Time is such a metric.
• Takt Time is the rate at which a finished product needs to be completed in order to meet customer demand. It is defined as the ratio of the available production time divided by customer demand. In other words Takt Time is the average time between two successive deliveries to the customer.
• In what units of time do we measure Takt Time? In manufacturing, Takt Time is measured in hours, minutes, or even seconds for mass production.
• In knowledge work we measure Lead time and Takt Time in days.
• On the left we have the start date for the project and on the right we have the end date. We can see that five days after the project started the first work item was delivered. Its Takt Time is 5 days. Seven days after that two new work items were delivered. Now what is their Takt Time? The first work item has a Takt Time of seven days, but the second one has a Takt Time of zero days. That is because the time between the two work items is zero days. It is not zero minutes but since we measure Takt Time in days it is zero days. Two days after that three new work items were delivered. According to the definition of Takt Time one of them has Takt Time of two days but the other two work items both have Takt Time of zero days.

And we see how it went – eventually all 10 work items were delivered.

Important thing to note here is that the sum of all Takt Time values equals the delivery time of the project – in this case 22 days.
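As a minimal sketch, the per-item values in this example can be derived from the day on which each work item was delivered. The delivery days below are an assumption reconstructed from the description (one item on day 5, two on day 12, three on day 14); how the remaining four items are spread over days 16, 17 and 22 is guessed so that the result matches the historical sample, and the function name `inter_departure_times` is illustrative only.

```python
def inter_departure_times(delivery_days):
    """Return the time between successive deliveries, in days.

    delivery_days: the project day on which each work item was delivered,
    counted from the project start (day 0).
    """
    idts = []
    previous = 0  # the project starts on day 0
    for day in sorted(delivery_days):
        idts.append(day - previous)
        previous = day
    return idts


# Assumed delivery days for the 10 work items of the example project.
delivery_days = [5, 12, 12, 14, 14, 14, 16, 16, 17, 22]
idts = inter_departure_times(delivery_days)

print(idts)       # one value per work item; items delivered together get 0
print(sum(idts))  # the values add up to the project delivery time
```

Because each value is the gap to the previous delivery, the values always sum to the day of the last delivery, whatever the grouping of the items.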
• Here is a histogram of the Takt Time for the above delivery rate. Note the number of work items with Takt Time of zero.
• The average Takt Time is calculated by dividing the time period over which the project is or will be delivered by the number of work items delivered.

$TT = \frac{T}{N}$
T is the time period over which the project was delivered
N is the number of items to be delivered or the total arrivals in [0,T]
$TT$ is the Takt Time at the development organization level
• In our project we have a delivery time of 22 days and 10 stories delivered, hence a Takt Time of 2.2 days per story.
$TT = \frac{T}{N} = \frac{22\ \text{days}}{10\ \text{stories}} = 2.2\ \text{days/story}$

That means that on average the time between two successive deliveries is 2.2 days. Note that it is an unqualified average, a single number without variance.
• If we know the Takt Time for the system and we have N work items to be delivered, we can calculate how much time it will take the system to deliver all N work items. The formula is Takt Time times the number of work items to be delivered.
$T = N \cdot TT$
• For instance, if we have to deliver 45 stories and the Takt Time is 2.2 days, then it will take us 99 days to deliver.

$T = N \cdot TT = 45\ \text{stories} \times 2.2\ \text{days/story} = 99\ \text{days}$
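The two single-number calculations above can be written out as a short sketch. The function names are illustrative, not part of the original material, and the result is still a deterministic point forecast with no information about variation.

```python
def average_takt_time(delivery_time_days, items_delivered):
    """Average time between two successive deliveries (days per item)."""
    return delivery_time_days / items_delivered


def forecast_delivery_time(items_to_deliver, takt_time_days):
    """Point forecast of delivery time, assuming a linear delivery rate."""
    return items_to_deliver * takt_time_days


takt_time = average_takt_time(22, 10)             # historical project
forecast = forecast_delivery_time(45, takt_time)  # new batch of 45 stories

print(f"Takt Time: {takt_time:.1f} days/story")
print(f"Forecast:  {forecast:.0f} days")
```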
• Here comes an important point: because Takt Time is an average value of a random variable, we have roughly a 50% chance of missing the forecast. To get better odds, we need to use the probability distribution of the Takt Time. We usually don’t know how the Takt Time is distributed. How could we find that out?
• Using historical samples via bootstrapping we can infer the distribution of Takt Time and its likelihoods.

Bootstrapping is based on the assumption that the sample is a good representation of the unknown population. Bootstrapping is done by repeatedly re-sampling a dataset with replacement, calculating the statistic of interest and recording its distribution. It does not replace or add to the original data.

In this case the statistic of interest is the average time between two successive deliveries or Takt Time.
• Here is the method applied using the Takt Time data for our fictitious project.

The sample is $IDT_i = (0,0,0,0,1,2,2,5,5,7)$
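The resampling procedure can be sketched in a few lines of Python using only the standard library. This is an illustration of the bootstrap as described here, not the Excel workbook itself; the function name, the fixed seed and the choice of 1000 trials are assumptions made for the example.

```python
import random
import statistics


def bootstrap_takt_times(idt_sample, items_delivered, trials=1000, seed=None):
    """Bootstrap a distribution of Takt Time from historical IDT data.

    Each trial resamples the IDT data with replacement (same size as the
    original sample), sums the resample to get a delivery time T, and
    divides by the number of delivered items N to get one Takt Time value.
    """
    rng = random.Random(seed)
    takt_times = []
    for _ in range(trials):
        resample = rng.choices(idt_sample, k=len(idt_sample))
        takt_times.append(sum(resample) / items_delivered)
    return takt_times


historical_idt = [0, 0, 0, 0, 1, 2, 2, 5, 5, 7]
takt_time_sip = bootstrap_takt_times(historical_idt, items_delivered=10,
                                     trials=1000, seed=42)

print("median:", statistics.median(takt_time_sip))
print("mean:  ", round(statistics.mean(takt_time_sip), 3))
print("stdev: ", round(statistics.stdev(takt_time_sip), 3))
```

The original data is never modified; only the resamples change from trial to trial, so the list of results can be stored and reused as the Takt Time distribution.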
• And here is Takt Time histogram using data from the fictitious project. Note the Median, Mean and the 85th percentile.

Now we have the probability distribution of Takt Time for our fictitious project. Note that the median value of 2.2 is very close to the Takt Time we calculated initially. Now we have not only the average but also the mode, the median and some percentiles. This Takt Time distribution represents a context-specific uncertainty and is unique per context (team structure, delivery process used, technology, business domain and client). This distribution should be preserved in a library to be used for forecasting new projects implemented in the same context. Reusing it greatly reduces both the theoretical knowledge and the effort required, which makes probabilistic modeling easier to adopt. This distribution will be invalidated if any of the following is changed: team structure, development process, technology being used, client and business domain.
• What is this thing called Stochastic Information Packet?

The Stochastic Information Packet or SIP represents an uncertainty as an array of possible outcomes. The concept was formalized in 2006 in an article in OR/MS Today http://probabilitymanagement.org/library/Probability_Management_Part1s.pdf

Every SIP is unique.
We cannot compare one dev system’s historical delivery data with another team’s. If we do have historical data about average Takt Time, it will be invalidated if any of the following happens:
• team structure is changed
• technology being used is changed
• development process being used is changed, say if we introduce pair programming
If any of the above is changed, all historical data will be invalidated, including the SIP itself.
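One way to keep a SIP tied to its context is to store the two together, so a forecast can check that the context still matches before reusing the data. The sketch below is hypothetical: the `Context` and `Sip` classes, their field names and the sample values are invented for illustration and do not follow any published SIP file format.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    """The circumstances under which the historical data was produced."""
    team: str
    process: str
    technology: str
    client: str
    business_domain: str


@dataclass(frozen=True)
class Sip:
    """A list of trials of one uncertain metric, valid for one context."""
    metric: str
    context: Context
    trials: tuple

    def is_valid_for(self, context):
        # Any change in context invalidates the historical data.
        return self.context == context


# Hypothetical context and trial values, for illustration only.
baseline = Context(team="Team A", process="kanban",
                   technology="Java; Spring; Oracle",
                   client="Fortune 500 staffing company",
                   business_domain="staffing")
sip = Sip(metric="takt_time_days", context=baseline, trials=(2.1, 2.2, 2.8))

changed = replace(baseline, process="kanban + pair programming")
print(sip.is_valid_for(baseline))  # same context: the SIP can be reused
print(sip.is_valid_for(changed))   # process changed: the SIP is invalidated
```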

• An important thing to note is that $T = N \cdot TT$ assumes a linear delivery rate. Do projects have a linear delivery rate? Not really.
• This is a diagram that visualizes the rate at which the work items were delivered. On the X axis we have the project time in days. On the Y axis we have the number of work items delivered each day. It turns out that the delivery rate follows a “Z-curve pattern” (Anderson, 2003) as visualized by the red line.
• The Z-curve can be divided into three parts, or we can say it has three legs. There is empirical evidence that for the first 20% of the time the delivery rate will be slow. Then for 60% of the time we’ll go faster; this is the “hyper-productivity” period. And for the final 20% we’ll go slowly again. Of course the numbers may vary depending on the context, but the basic principle about the three sections is correct.

Only the second Z-curve leg is representative for the Dev System capability. It shows the common cause variation specific to each Dev System process.

First and third Z-curve legs are project specific and are affected by special cause variation.
• Each leg of the Z-curve is characterized by:
Different work type
Different level of variation
Different staffing in terms of headcount and level of expertise
• The first leg of the Z-curve is the time when the developers climb the learning curve and setup their minds for the new project. But this leg of the Z-curve could also be used for:
conducting experiments to cover the riskiest work items
Innovation!
setting up environments
adapting to client’s culture and procedures
mastering new technology
All above are examples of special causes of variation specific to a project.
• The second leg of the Z-curve is the productivity period. If the project is scheduled properly the system should be like clockwork – sustainable pace, no stress, no surprises…
• The third leg of the Z-curve is when the team will clean up the battlefield, fix some outstanding defects and support the transition of the project deliverable into operation.
• Project delivery time can be represented as a sum of the duration of each one of the three legs of the Z-curve. Or in other words it equals the duration of the 1st leg plus the duration of the 2nd leg plus the duration of the 3rd leg of the Z-curve.
• Let’s substitute the duration of each of the three legs with the formula $T = N \cdot TT$. Now we have a new formula that, if we know the Takt Time and the number of work items to be delivered during each of the three legs of the Z-curve, will allow us to calculate how much time it will take the system to deliver all N work items, where $N = N_{z1} + N_{z2} + N_{z3}$ is the total number of work items for the project.

$T = N_{z1} TT_{z1} + N_{z2} TT_{z2} + N_{z3} TT_{z3}$
Here we are calculating the delivery of $N_{z1}$ work items with Takt Time $TT_{z1}$ during the 1st leg of the Z-curve, plus $N_{z2}$ work items with Takt Time $TT_{z2}$ during the 2nd leg of the Z-curve, plus $N_{z3}$ work items with Takt Time $TT_{z3}$ during the 3rd leg of the Z-curve. This calculation is not credible because it uses Takt Time as a single number, and we know we should use a distribution of the Takt Time instead. We need distributions of Takt Time for each one of the three legs of the Z-curve. We already know how to get them using the bootstrap. Now we have to sum them, but by definition they are distributions of random variables. How could we sum up random variables? Here comes Monte Carlo analysis. Monte Carlo is a tool for summing up random variables (Savage, 2012).
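A minimal Python sketch of this Monte Carlo summation follows. Each trial draws one Takt Time value per leg from that leg's SIP and applies the three-leg formula. The three small SIPs below are made-up placeholder values, not the reference-class data from the spreadsheets; the item counts 12, 70 and 18 are the ones used in the worked example on the slides, and the function name is illustrative.

```python
import random


def simulate_delivery_times(item_counts, takt_time_sips, trials=50_000, seed=None):
    """Monte Carlo simulation of project delivery time T.

    item_counts:    number of work items per Z-curve leg (N_z1, N_z2, N_z3)
    takt_time_sips: one list of Takt Time trials per leg (TT_z1, TT_z2, TT_z3)
    """
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        total = 0.0
        for n_items, sip in zip(item_counts, takt_time_sips):
            # One draw with replacement from this leg's SIP.
            total += n_items * rng.choice(sip)
        results.append(total)
    return results


# Made-up SIPs for illustration; real SIPs come from bootstrapping
# the historical data of each leg and hold many more trials.
sip_z1 = [1.0, 1.2, 1.43, 1.81, 2.0]
sip_z2 = [0.3, 0.4, 0.54, 0.6, 0.7]
sip_z3 = [0.64, 0.9, 1.11, 1.3, 1.5]

delivery_times = simulate_delivery_times((12, 70, 18),
                                         (sip_z1, sip_z2, sip_z3),
                                         trials=10_000, seed=1)
print(min(delivery_times), max(delivery_times))
```

The list of simulated values is itself a SIP for the project delivery time, from which a histogram and percentiles can be read.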
• Let’s see how we can apply the above algorithm using some real data.
Let’s take a new project that we have to plan and for which we have to provide the customer with a delivery date. We have a reference class of projects and when we compare the new project with the reference class we see that the new project is for the same customer, the same team will be working on it, using the same technology. For the reference class we also have the Takt Time distributions for each of the three legs of the Z-curve.
• Here we have the new project that we have to plan and for which we have to provide the customer with a delivery date. We have the reference project and when we compare the new project with the reference project we see that the new project is for the same customer, the same team will be working on it, using the same technology. We have to predict the delivery time by simulating it using Monte Carlo.
The New Project to be delivered
THE SAME Fortune 500 Staffing company
THE SAME development organization
THE SAME technology – Java; Spring; Oracle;
Delivery time TO BE PREDICTED
• After some analysis the team has broken down the new project scope into user stories and then has added some more work items to account for Dark Matter and Failure Load. After that the team decided that 12 stories will be delivered in the 1st leg of the Z-curve, 70 stories will be delivered in the 2nd leg of the Z-curve and 18 stories or work items will be delivered by the 3rd leg of the Z-curve.

Dark matter
We add work items to compensate for the unknown work, or dark matter. The number of work items will grow through natural expansion as we investigate them. It is possible that some features will be broken into two or more once we get started on them.
In my experience I would buffer at least 20% for this, and with novice teams on a new product (especially a highly innovative new product where we had no prior knowledge or experience) I might go as high as 100%.
Failure load
We add work items to compensate for the expected failure load in terms of defects, rework, and technical debt.
Failure load tracks how many work items the Kanban system processes due to poor quality. That includes production defects (bugs) in software and new features requested by the users because of poor usability or a failure to properly anticipate user needs.
Defects represent opportunity cost and affect the lead time and throughput of the Kanban system. The count of defects is a good indicator of whether the organization is improving or not.
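As a rough sketch of how such buffers change the scope to be simulated, the extra work items can be computed as percentages of the analyzed scope. The base count of 80 stories and the 5% failure load below are made-up numbers for illustration; only the 20% to 100% range for dark matter comes from the notes above, and adding the two percentages together is an assumption of this example.

```python
import math


def buffered_item_count(base_items, dark_matter_pct, failure_load_pct):
    """Total work items after adding dark matter and failure load buffers.

    Both buffers are given as whole percentages of the analyzed scope;
    the extra items are rounded up to whole work items.
    """
    extra = base_items * (dark_matter_pct + failure_load_pct) / 100
    return base_items + math.ceil(extra)


# Hypothetical scope: 80 analyzed stories, 20% dark matter, 5% failure load.
total_items = buffered_item_count(80, dark_matter_pct=20, failure_load_pct=5)
print(total_items)
```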


• If we visualize the random variables using their respective PDFs (their “generating functions”) then the Monte Carlo simulated summation of…
…will give us the time needed to deliver the project!
We are simulating this summation say 50,000 times. That will give us the simulated time needed to deliver the new project.
• We end up with a histogram of the projected delivery time for our new project. What we are interested in is the Median, Average and the 85th percentile of the project delivery time (T) and in the shape of the distribution. Based on the Projected Delivery Time histogram we can take the 85th percentile and use it as single number. For this project the 85th percentile is 90 days. So 6 times out of 7 we should have the project delivered in 90 days or less.
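Reading a percentile off the simulated values can be sketched as follows. The nearest-rank definition used here is one of several common percentile definitions and is an assumption of this example (a spreadsheet may interpolate instead); the list of simulated days is made up for illustration and is far smaller than a real run.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the values are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up simulated delivery times in days (a real run has thousands).
simulated_days = [58.14, 62.0, 66.5, 71.04, 73.0, 74.5, 76.0, 77.0,
                  79.0, 82.0, 85.0, 88.0, 90.0, 95.0, 102.0]

p50 = percentile(simulated_days, 50)
p85 = percentile(simulated_days, 85)
print("median:", p50)
print("85th percentile:", p85)
```

Quoting the 85th percentile rather than the median trades a later date for a much lower chance of missing it.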

• By taking an outside view when forecasting a new project we will produce more accurate results faster than using the deterministic inside view. The method presented can be used by any team that uses user stories for planning and tracking project execution no matter the development process used (Scrum, XP, kanban systems).

My hope is that you will start using the techniques presented here for planning your next project. And don’t forget that even if we can’t control the waves of uncertainty we can learn how to surf!
• http://modernmanagement.bg/data/SIPs_MonteCarlo_FVR.xlsx
http://modernmanagement.bg/data/High_Level_Project_Planning.xlsx
• Thank you very much for your attention. My hope is that you will start using this approach for high-level planning your next project.

Dimitar Bakardzhiev is the Managing Director of Taller Technologies Bulgaria and an expert in driving successful and cost-effective technology development. As a Lean-Kanban University (LKU)-Accredited Kanban Trainer (AKT) and avid, expert Kanban practitioner, Dimitar puts lean principles to work every day when managing complex software projects with a special focus on building innovative, powerful mobile CRM solutions. Dimitar has been one of the leading proponents and evangelists of Kanban in his native Bulgaria and has published David Anderson’s Kanban book as well as books by Eli Goldratt and W. Edwards Deming in the local language. He is also a lecturer and frequent speaker at numerous conferences and his passion is to educate audiences on the benefits of lean principles and agile methodologies for software development.