PREFACE
This is a management book. As such, it has only one purpose—to help managers
do their jobs better.
Why then does it have the word “science” in the title? Isn’t science the arcane
pursuit of nerdy guys in lab coats? Doesn’t a scientist seek only to understand the
world, not improve it? Aren’t scientists about as far removed from management as
any group of people we can think of (other than artists maybe)?
It is certainly true that managers are not generally interested in science for its
own sake. But many professionals with no intrinsic interest in science nonetheless
rely on it heavily. A civil engineer uses the science of mechanics to design a bridge.
A physician uses the science of physiology to diagnose an illness. Even a lawyer (to
stretch a point) uses the science of formal logic to argue a case. The main premise
of this book is that managers need science too.
But what kind of science? By its very nature, management is interdisciplinary.
Managers deal regularly with issues that involve questions of finance, marketing,
accounting, organizational behavior, operations and many other disciplines. Hence,
a comprehensive science of management is probably a hopeless pipe dream. But
the fact that there is no unified science of medicine does not stop physicians from
relying on several different scientific frameworks. So why should it stop managers
from looking to science for help?
In this book we focus specifically on the science of supply chains. This addresses
the collection of people, resources, and activities involved in bringing materials and
information together to produce and deliver goods and services to customers. Our
goal is to provide a framework for understanding how complex production and sup-
ply chain systems behave and thereby provide a basis for better decision making.
Specifically, the science we present here is useful in answering questions such as
the following:
• You have read the literature on JIT and lean and are up to your eyeballs in
stories about Toyota. But your business is very different from the automotive
industry. Which elements of the Toyota Production System are relevant and
which are not?
• You have implemented some lean manufacturing practices and have reduced
in-process inventories. What should be your next step? How do you identify
the portion of your system that offers the greatest leverage?
• You are managing a service operation and (since services cannot be invento-
ried) are wondering whether any of the underlying ideas of lean manufacturing
apply to you. How can you decide what can be adapted?
• You are managing a multi-product manufacturing system. Which of your
products should be made to order and which should be made to stock? What
should you consider in controlling stock levels of both components and finished
goods?
• You have problems getting on-time deliveries from your suppliers. How much
of an impact does this have on your bottom line? What are your best options
for improving the situation?
• You are considering entering into some kind of collaborative relationship with
your suppliers. What factors should you consider in deciding on an appropriate
structure for the partnership?
• You feel that better supply chain management could be a source of competi-
tive advantage. How do you identify the improvements that would make the
most difference? Once you identify them, how do you justify them to upper
management?
Of course, these questions are only the tip of the iceberg. Because each system
is unique, the range of problems faced by managers dealing with supply chains is
almost infinite. But this is precisely the reason that a scientific approach is needed.
A book that tells you how to solve problems can only provide answers for a limited
set of situations. But a book that tells you why systems behave as they do can give
you the tools and insights to deal effectively with almost any scenario.
Our goal is to provide the why of supply chains.
Chapter 0
Scientific Foundations
A supply chain is a goal-oriented network of processes and stockpoints
used to deliver goods and services to customers.
0.1 Defining a Supply Chain
By necessity science is reductionist. All real-world systems are too complex to
study in their totality. So scientists reduce them to a manageable size by restricting
their scope and by making simplifying assumptions. For example, all introductory
physics students begin their study of mechanics by learning about objects moving
at sub-relativistic speeds in frictionless environments. Although almost all practical
mechanical systems violate these conditions, the insights one gains from the styl-
ized systems of classical mechanics are vital to the understanding of more realistic
systems. Hence, the friction-free model of moving bodies satisfies the fundamental
criterion of any scientific model—it captures an essential aspect of a real system in
a form that is simple enough to be tractable and understandable.
To get anywhere with a science of supply chains we must first reduce the complex
arrays of suppliers, plants, warehouses, customers, transportation networks and in-
formation systems that make up actual supply chains to structures that are simple
enough to study rigorously. To do this, we must choose a level at which to model a
supply chain. Clearly the level of the entire business is too high; the resulting models
would be hopelessly complex and the details would obscure important commonali-
ties between different supply chains. Similarly, the level of an individual operation
is too low; while modeling a specific process (e.g., metal cutting) in detail may be
tractable, it will give us little insight into what drives the performance metrics (e.g.,
profit) a manager cares about.
An intermediate view is the following.
Definition (Supply Chain): A supply chain is a goal-oriented network of processes
and stockpoints used to deliver goods and services to customers.
In this definition, processes represent the individual activities involved in pro-
ducing and distributing goods and services. They could be manufacturing oper-
ations, service operations, engineering design functions or even legal proceedings.
But, since our focus is on the overall performance of the supply chain, we will
concentrate primarily on the flow of goods and services. So we will usually view
the processes in generic terms, with only as much specification as necessary to de-
scribe their effect on these flows. This perspective will enable us to apply our models
across a broad range of industrial settings and adapt insights from one industry to
another.
In addition to processes, our definition involves stockpoints, which represent
locations in the supply chain where inventories are held. These inventories may be
the result of deliberate policy decisions (e.g., as in the case of retail stocks) or the
consequence of problems in the system (e.g., as in the case of a backlog of defective
items awaiting repair). Because managing inventories is a key component of effective
supply chain management, it is vital to include stockpoints in the definition of a
supply chain.
Processes and stockpoints are connected by a network, which describes the
various paths by which goods and services can flow through a supply chain. Figure
1 represents an example of such a network, but the number of possible configurations
is virtually unlimited. So, in the spirit of scientific reductionism, we will often find
it useful to break down complex networks into simpler pieces. A feature of our
definition that helps facilitate this is that, at this level of generality, supply chains
and production operations are structurally similar. As illustrated in Figure 1, if
we probe into a process within a supply chain it will also consist of a network of
processes and stockpoints. Although, as we will see in Part 3 of this book, the size
and complexity of supply chain systems does introduce some interesting management
challenges, we can make use of the same framework to gain a basic understanding
of both individual production systems and aggregations of these in supply chains.
Finally, note that our definition of a supply chain specifies that it is goal ori-
ented. Supply chains are not features of nature that we study for their own sake.
They exist only to support business activities and therefore must be evaluated in
business terms. Usually this means that the fundamental objective of a supply chain
is to contribute to long term profitability. (We say “usually” here because military
and other public sector supply chains are not tied to profits, but instead have cost
effectiveness as their ultimate goal.) But profitability (or cost effectiveness) is too
general to serve as a metric for guiding the design and control of supply chains.
Therefore, a key starting point for a supply chain science is a description of the
strategic objectives the system should support.
0.2 Starting with Strategy
From an operations perspective, a business unit is evaluated in terms of:
1. Cost
2. Quality
3. Speed
4. Service
5. Flexibility
because these are the dimensions along which manufacturing and service enterprises
compete. However, as we illustrate in the following examples, the relative weights
a given firm attaches to these measures can vary greatly.

Figure 1: Supply Chains as Flow Networks.
Quality vs. Cost: Few people would regard the Ford Focus as competition for the
Jaguar XKR. The reason is that, although all of the above dimensions matter
to customers for both cars, buyers of the Focus are concerned primarily with
cost, while buyers of the Jaguar are concerned primarily with quality (as they
perceive it). Therefore, the logistics systems to support the two cars should
be designed with different sets of priorities in mind. For instance, while the
Jaguar system might be able to afford to have quality technicians “inspect in”
quality, single pass “quality at the source” methods are almost mandatory for
the Focus in order for it to compete in its price range.
Speed vs. Cost: W.W. Grainger is in the MRO (maintenance, repair and oper-
ating) supplies business. Through catalog and on-line sales, Grainger offers
hundreds of thousands of products, ranging from cleaning supplies to power
tools to safety equipment. But all of these products are made by suppli-
ers; Grainger doesn’t manufacture anything. So, a customer could choose to
purchase any of Grainger’s products directly from a supplier at a lower unit
cost. Given this, why would a customer choose Grainger? The reason is that
Grainger can ship small orders with short lead times, while the suppliers re-
quire longer lead times and bulk purchases. Grainger’s business strategy is to
offer speed and responsiveness in exchange for price premiums. They support
this strategy with a logistics system that inventories products in warehouses
and focuses on efficient order fulfillment. In contrast, the logistics systems of
the suppliers concentrate on production efficiency and therefore tend to make
and ship products in large batches.
Service vs. Cost: Peapod.com advertises itself as an “on-line grocery store.” When
it was founded, Peapod functioned by “picking” orders from local grocery
stores and delivering them to customers’ homes. More recently Peapod has
developed its own system of warehouses, from which deliveries are made. By
offering customers the opportunity to shop on-line and forego visiting the
supermarket, Peapod’s business strategy is based primarily on service. Cus-
tomers willing to shop for bargains and transport their own groceries can
almost certainly achieve lower costs. To achieve this service advantage over
traditional grocery stores, however, Peapod requires an entirely different logis-
tics system, centered around internet ordering and home delivery as opposed
to stocking and sale of merchandise in retail outlets.
Flexibility vs. Cost: Before they outsourced it, IBM manufactured printed cir-
cuit boards (PCB’s) in Austin, Texas. Although they made thousands of
different PCB’s, a high fraction of their sales dollars came from a small frac-
tion of the end items. (This type of demand distribution is called a Pareto
distribution and is very common in industry.) Because all of the products
required similar processes, it would have been feasible to manufacture all of the
PCB’s in a single plant. However, IBM divided the facility into two entirely
separate operations, one to produce low volume, prototype boards, and one
to produce high volume boards. The high volume plant made use of heavily
utilized specialized equipment to achieve cost efficiency, while the low volume
plant employed flexible equipment subject to frequent changeovers. Because
the two environments were so different, it made sense to keep them physically
separate. This sort of focused factory strategy is well-suited to a variety of
production environments with widely varying products.
Having observed that different business conditions call for different operational
capabilities, we can look upon supply chain design as consisting of two parts:
1. Ensuring operational fit with strategic objectives.
2. Achieving maximal efficiency within the constraints established by strategy.
Copying best practices, generally called benchmarking, can only partially en-
sure that an operations system fits its strategic goals (i.e., because the benchmarked
system can only approximate the system under consideration). And benchmarking
cannot provide a way to move efficiency beyond historical levels, since it is by na-
ture imitative. Thus, effective operations and supply chain management requires
something beyond benchmarking.
0.3 Setting Our Goals
Some firms have been lucky enough to find their special “something” in the form of
bursts of genius, such as those achieved by Taiichi Ohno and his colleagues at Toyota
in the 1970s. Through a host of clever techniques that were extremely well adapted
to their business situation, Toyota was able to translate world class operations into
impressive long term growth and profitability. But since geniuses are scarce, effective
supply chain management must generally be based on something more accessible to
the rest of us.
The premise of this book is that the only reliable foundation for designing op-
erations systems that fit strategic goals and push out the boundaries of efficiency
is science. By describing how a system works, a supply chain science offers the
potential to:
• Identify the areas of greatest leverage;
• Determine which policies are likely to be effective in a given system;
• Enable practices and insights developed for one type of environment to be
generalized to another environment;
• Make quantitative tradeoffs between the costs and benefits of a particular
action;
• Synthesize the various perspectives of a manufacturing or service system, in-
cluding those of logistics, product design, human resources, accounting, and
management strategy.
Surprisingly, however, many basic principles of supply chain science are not well
known among professional managers. As a result, the field of supply chain manage-
ment is plagued by an overabundance of buzzwords and gurus who sell ideas on the
basis of personality and style rather than on substance. The purpose of this book is
to introduce the major concepts underlying the supply chain science in a structured,
although largely non-mathematical, format.
0.4 Structuring Our Study
Defining a supply chain as a network suggests a natural way to organize the princi-
ples that govern its behavior. The basic building blocks of the network are processes
and stockpoints. So to get anywhere we must first understand these. We regard a
single process fed by a single stockpoint as a station. A milling machine processing
castings, a bank teller processing customers and a computer processing electronic
orders are examples of stations.
But, while station behavior is important as a building block, few products are
actually produced in a single station. So, we need to understand the behavior of
a line or a routing, which is a sequence of stations used to generate a product or
service. A manufacturing line, such as the moving assembly line used to produce
automobiles, is the prototypical example of a routing. But a sequence of clerks
required to process a loan application and a series of steps involved in developing
a new product are also examples of routings.
Finally, since most manufacturing and service systems involve multiple lines
producing multiple products in many different configurations, we need to build upon
our insights for stations and lines to understand the behavior of a supply chain.
With this as our objective, the remainder of this book is organized into three
parts:
1. Station Science: considers the operational behavior of an individual process
and the stockpoint from which it receives material. Our emphasis is on the
factors that serve to delay the flow of entities (i.e., goods, services, information
or money) and hence cause a buildup of inventory in the inbound stockpoint.
2. Line Science: considers the operational behavior of process flows consisting
of logically connected processes separated by stockpoints. We focus in partic-
ular on the issues that arise due to the coupling effects between processes in
a flow.
3. Supply Chain Science: considers operational issues that cut across supply
chains consisting of multiple products, lines and levels. A topic of particular
interest that arises in this context is the coordination of supply chains that
are controlled by multiple parties.
Chapter 1
Capacity
Over the long run, average throughput of a process is always strictly less
than capacity.
1.1 Introduction
The fundamental activity of any operations system centers around the flow of en-
tities through processes. The entities can be parts in a manufacturing system,
people in a service system, jobs in a computer system, or transactions in a financial
system. The processes can be machining centers, bank tellers, computer CPU’s, or
manual workstations. The flows typically follow routings that define the sequences
of processes visited by the entities. Clearly, the range of systems that exhibit this
type of generic behavior is very broad.
In almost all operations systems, the following performance measures are key:
• Throughput: the rate at which entities are processed by the system,
• Work in Process (WIP): the number of entities in the system, which can
be measured in physical units (e.g., parts, people, jobs) or financial units (e.g.,
dollar value of entities in system),
• Cycle Time: the time it takes an entity to traverse the system, including
any rework, restarts due to yield loss, or other disruptions.
Typically, the objective is to have throughput high but WIP and cycle time low.
The extent to which a given system achieves this is a function of the system’s overall
efficiency. A useful measure of this efficiency is inventory turns, defined as

inventory turns = throughput / WIP
where throughput is measured as the cost of goods sold in a year and WIP is the
dollar value of the average amount of inventory held in the system. This measure
of how efficiently an operation converts inventory into output is the operational
analogy of the return-on-investment (ROI) measure of how efficiently an investment
converts capital into revenue. As with ROI, higher turns are better.
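To make the calculation concrete, here is a short sketch in Python (the dollar figures are hypothetical, chosen only for illustration):

```python
def inventory_turns(annual_cogs, avg_inventory_value):
    """Inventory turns = throughput / WIP, where throughput is measured as
    annual cost of goods sold and WIP as average inventory value (dollars)."""
    return annual_cogs / avg_inventory_value

# Hypothetical plant: $12M annual cost of goods sold, $2M average inventory.
print(inventory_turns(12_000_000, 2_000_000))  # 6.0 turns per year
```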
Figure 1.1: A System with Yield Loss.
1.2 Measuring Capacity
A major determinant of throughput, WIP, and cycle time, as well as inventory turns,
is the system’s capacity. Capacity is defined as the maximum average rate at which
entities can flow through the system, and is therefore a function of the capacities of
each process in the system. We can think of the capacity of an individual process
as:
process capacity = base capacity − detractors
where base capacity refers to the rate of the process under ideal conditions and
detractors represent anything that slows the output of the process.
For example, consider a punch press that can stamp out metal parts at a rate of
two per hour. However, the press is subject to mechanical failures which cause its
availability (fraction of uptime) to be only 90 percent. Hence, one hour in ten, on
average, is lost to downtime. This means that, over the long term, (0.1)(2) = 0.2
parts per hour are lost because of the failures. Hence, the capacity of the process can
be computed as either 90 percent of the base rate (0.9 × 2 per hour = 1.8 per hour)
or as the base rate minus the lost production (2 per hour − 0.2 per hour =
1.8 per hour). Similar calculations can be done for other types of detractors, such
as setups, rework, operator unavailability, and so on.
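The punch press calculation can be sketched in a few lines; the two ways of computing effective capacity described above give the same answer:

```python
base_rate = 2.0      # parts per hour under ideal conditions
availability = 0.9   # fraction of uptime (failures are the only detractor here)

# Capacity as a fraction of the base rate...
capacity_from_fraction = availability * base_rate
# ...or as the base rate minus the production lost to downtime.
capacity_from_loss = base_rate - (1 - availability) * base_rate

print(round(capacity_from_fraction, 2), round(capacity_from_loss, 2))  # 1.8 1.8
```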
The process that constrains the capacity of the overall system is called the bot-
tleneck. Often, this is the slowest process. However, in systems where different
types of entities follow different paths (routings) through the system, where yield
loss causes fallout, or where routings require some entities to visit some stations
more than once (either for rework or because of the nature of the processing re-
quirements), the slowest process need not be the system bottleneck. The reason is
that the amount of work arriving at each station may not be the same. For instance,
consider the system shown in Figure 1.1, in which 50 percent of the entities drop
out (e.g., due to quality problems) after the second station. This means that the
third and fourth stations only have half as much work to handle as do the first and
second.
Clearly, the station that will limit flow through a line like that in Figure 1.1 is
the one that is busiest. We measure this through the utilization level, which is the
fraction of time a station is not idle, and is computed as:
utilization = rate into station / capacity of station
With this, we can give a general definition of bottlenecks as:
Definition (Bottleneck): The bottleneck of a routing is the process with the high-
est utilization.
To illustrate the procedure for identifying the bottleneck of a routing, let us
return to the example of Figure 1.1 and assume that jobs enter the system at a
rate of 1 per minute and the processing times (including all relevant detractors) at
stations 1–4 are 0.7, 0.8, 1, and 0.9 minutes, respectively. Since the arrival rate to
stations 1 and 2 is 1 per minute, while the arrival rate to stations 3 and 4 is only
0.5 per minute (due to yield loss), the utilizations of the four stations are:
u(1) = 1/(1/0.7) = 0.70
u(2) = 1/(1/0.8) = 0.80
u(3) = 0.5/(1/1) = 0.50
u(4) = 0.5/(1/0.9) = 0.45
Notice that while station 3 is the slowest, it is station 2 that is the bottleneck, since
it has the highest utilization level. Therefore, given this yield loss profile it is station
2 that will define the maximum rate of this line. Of course, if the yield loss fraction
is reduced, then stations 3 and 4 will become busier. If yield is improved enough,
station 3 will become the bottleneck.
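The bottleneck identification procedure is mechanical enough to sketch in code (using the numbers from the example above):

```python
process_times = [0.7, 0.8, 1.0, 0.9]   # minutes per job at stations 1-4
arrival_rates = [1.0, 1.0, 0.5, 0.5]   # jobs per minute; yield loss halves
                                       # the flow after station 2

# utilization = rate into station / capacity, where capacity = 1/process time
utilizations = [r / (1.0 / t) for r, t in zip(arrival_rates, process_times)]
bottleneck = max(range(len(utilizations)), key=lambda i: utilizations[i]) + 1

print([round(u, 2) for u in utilizations])  # [0.7, 0.8, 0.5, 0.45]
print(bottleneck)                           # station 2
```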
1.3 Limits on Capacity
We now state the first fundamental principle of capacity as:
Principle (Capacity): The output of a system cannot equal or exceed its capacity.
While this law may appear to be a statement of the obvious (aren’t we all prone
to saying there are only 24 hours in a day?), it is commonly neglected in practice.
For instance, one frequently hears about production facilities that are running at 120
percent of capacity. What this really means, of course, is that the system is running
at 120 percent of an arbitrarily defined “capacity”, representing one shift with no
overtime, normal staffing levels, a historical average rate, or whatever. But it does
not represent the true limiting rate of the system, or we could not be exceeding it.
More subtly in error are claims that the system is running at 100 percent of
capacity. While it may seem intuitively possible for a workstation to be completely
utilized, it actually never happens over the long term in the real world. This is due
to the fact that all real systems contain variability. We will discuss this important
issue in more detail in the next chapter. For now, we will consider some simple
examples to illustrate the point.
First, suppose that in the previously mentioned punch press example that de-
tractors (downtime, setups, breaks/lunches, etc.) reduce the base rate of 2 parts
per hour to an effective rate of 1.43 parts per hour. If we were to ignore the detrac-
tors and release parts into the station at a rate of 2 per hour, what would happen?
Clearly, the press would not be able to keep up with the release rate, and so work
in process (WIP) would build up over time, as shown in Figure 1.2. The short-term
fluctuations are due to variability, but the trend is unmistakably toward station
overload.
Second, suppose that we release parts to the station at exactly the true capacity
of the system (1.43 parts per hour). Now performance is no longer predictable.
Sometimes the WIP level will remain low for a period of time; other times (e.g.,
when an equipment failure occurs) WIP will build rapidly. Figure 1.3 shows two
possible outcomes of the punch press example when releases are equal to capacity.
The results are very different due to the unpredictable effects of variability in the
processing rates. In the left plot, which did not experience any long equipment
outages, the station is keeping up with releases. However, in the right plot, a long
outage occurred after about 20 days and caused a large increase in WIP. After a
period of recovery, another disruption occurred at about the 40 day mark, and WIP
built up again.
Unfortunately, over the long run, we will eventually be unlucky. (This is what
casinos and states with lotteries count on to make money!) When we are, WIP
will go up. When release rate is the same as the production rate, the WIP level
will stay high for a long time because there is no slack capacity to use to catch up.
Theoretically, if we run for an infinite amount of time, WIP will go to infinity even
though we are running exactly at capacity.
In contrast, if we set the release rate below capacity, the system stabilizes. For
example, Figure 1.4 shows two possible outcomes of the punch press example with a
release rate of 1.167 parts per hour (28 parts per day), which represents a utilization
of 1.167/1.43 = 82%. Although variability causes short-term fluctuations, both
instances show WIP remaining consistently low.
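A toy simulation reproduces this qualitative behavior. (This is not the model used to generate the figures; the fluid-style sketch below, with exponentially distributed hourly workloads, is our own simplification for illustration.)

```python
import random

def final_wip(release_rate, capacity, hours, seed=42):
    """Single-station fluid sketch: each hour an exponentially distributed
    amount of work arrives (mean = release_rate) and the station completes
    an exponentially distributed amount (mean = capacity), WIP never
    dropping below zero."""
    rng = random.Random(seed)
    wip = 0.0
    for _ in range(hours):
        wip += rng.expovariate(1.0 / release_rate)          # work released
        wip = max(0.0, wip - rng.expovariate(1.0 / capacity))  # work done
    return wip

overloaded = final_wip(2.0, 1.43, 5000)   # releases above capacity: WIP grows
stable = final_wip(1.167, 1.43, 5000)     # 82% utilization: WIP stays modest
print(overloaded > stable)  # True
```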
1.4 Impact of Utilization
The behavior illustrated in the above examples underlies the second key principle
of capacity:
Principle (Utilization): Cycle time increases in utilization and does so sharply
as utilization approaches 100%.
As we have seen, when utilization is low, the system can easily keep up with the
arrival of work (e.g., Figure 1.4) but when utilization becomes high the system will
get behind any time there is any kind of temporary slowdown in production (e.g.,
Figure 1.3). One might think that the “law of averages” would make things work
Figure 1.2: WIP versus Time in a System with Insufficient Capacity.
Figure 1.3: Two Outcomes of WIP versus Time with Releases at 100% of Capacity.
Figure 1.4: Two Outcomes from Releasing at 82% of Capacity.
out. But because the machine cannot “save up” production when it is ready but
there is no WIP, the times the machine is starved do not make up for the times it
is swamped.
The only way the machine can be always busy is to have a large enough pile
of WIP in front of it so that it never starves. If we set the WIP level to anything
less than infinity there is always a sequence of variations in process times, outages,
setups, etc. that will exhaust the supply of WIP. Hence, achieving higher and higher
utilization levels requires more and more WIP. Since entities must wait behind longer
and longer queues, the cycle times also increase disproportionately with utilization.
The result is depicted in Figure 1.5, which shows that as a station is pushed closer to
capacity (i.e., 100 percent utilization), cycle times increase nonlinearly and explode
to infinity before actually reaching full capacity.
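The shape of Figure 1.5 can be illustrated with the simplest queueing formula, that for a station with exponential interarrival and process times (the M/M/1 model, used here purely as an illustration), where cycle time = process time/(1 − utilization):

```python
def mm1_cycle_time(process_time, utilization):
    """Cycle time at an M/M/1 station: CT = t / (1 - u).
    Explodes as utilization approaches 1."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return process_time / (1.0 - utilization)

for u in (0.50, 0.80, 0.95, 0.99):
    print(u, round(mm1_cycle_time(1.0, u), 1))
# With one hour of raw process time, pushing utilization from 95% to 99%
# multiplies cycle time five-fold (from about 20 hours to about 100 hours).
```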
INSIGHT BY ANALOGY - A Highway
On a highway, any empty spot of pavement (i.e., a gap between vehicles) rep-
resents underutilized capacity. Hence, the theoretical capacity of a highway is the
volume it would handle with bumper-to-bumper traffic travelling at the speed limit.
But, of course, we all know this is impossible. Experience shows us that heavier
traffic results in longer travel times. The only time the highway is fully utilized (i.e.,
completely bumper-to-bumper) is when traffic is stopped.
The reasons for this are exactly the same as those responsible for the Capacity
and Utilization principles. The only way for vehicles to travel bumper-to-bumper is
for them to move at precisely the same speed. Any variation, whether the result of
braking to change lanes, inability to maintain a constant speed, or whatever, will
result in gaps and hence less than 100% utilization.
Since no highway is completely variability free, all highways operate at signif-
icantly less than full capacity. Likewise, no production system or supply chain is
without variability and hence these too operate at less than full capacity. Further-
more, just as travel times increase with utilization of a highway, cycle times increase
with utilization in a production system.
Figure 1.5: Nonlinear Relationship of Cycle Time to Utilization.
Figure 1.6: Mechanics Underlying Overtime Vicious Cycle.
The science behind the above law and Figure 1.5 drives a common type of be-
havior in industry, which we term the overtime vicious cycle. This plays out as
follows: Because (a) maximizing throughput is desirable, and (b) estimating true
theoretical capacity is difficult, managers tend to set releases into the plant close
to or even above theoretical capacity (see Figure 1.6). This causes cycle times to
increase, which in turn causes late orders and excessive WIP. When the situation
becomes bad enough, management authorizes overtime, which changes the capacity
of the system (see Figure 1.6 again), and causes cycle times to come back down.
But as soon as the system has recovered, overtime has been discontinued, and man-
agement has vowed “not to let that happen again,” releases are aimed right back
at theoretical capacity and the whole cycle begins again. Depending on how much
variability is in the system and how close management tries to load to capacity, this
cycle can be swift and devastating.
PRINCIPLES IN PRACTICE - Motorola
Motorola Semiconductor Products Sector produces integrated circuits both for
use in Motorola products (e.g., cell phones) and for other OEMs (original equipment
manufacturers). They do this in vastly complex wafer fabs that can cost $2 billion
or more to construct. Not surprisingly, efficient utilization of these enormously
expensive resources is a key concern in the semiconductor industry. Despite this,
Motorola deliberately sizes capacity of each process in a wafer fab so that utilization
will be no higher than a specified limit, typically in the range of 75-85%.
Clearly this excess capacity is very expensive. But Motorola is well
aware of the Utilization Principle. In a system as complex as a wafer fab, the
dynamics of Figure 1.5 are dramatic and severe. Operating close to full utilization
would require vast amounts of inventory and hence would result in extremely long
cycle times. Excessive inventory would inflate costs, while long cycle times would
hardly suit the needs of customers who are under their own cost and time pressures.
Limiting utilization is expensive, but being uncompetitive is fatal. So Motorola
wisely plans to run at less than full utilization.
Chapter 2
Variability
Increasing variability always degrades the performance of a production
system.
2.1 Introduction
Chapter 1 focused on the performance of a single process in terms of throughput,
and examined the roles of utilization and capacity. We now turn to two other key
performance measures, WIP (work in process) and cycle time.
2.2 Little’s Law
The first observation we can make is that these three measures are intimately related
via one of the most fundamental principles of operations management, which can
be stated as:
Principle (Little’s Law): Over the long-term, average WIP, throughput, and cy-
cle time for any stable process are related according to:
WIP = throughput × cycle time
Little’s law is extremely general. The only two restrictions are: (1) it refers to
long-term averages, and (2) the process must be stable. Restriction (1) simply means
that Little’s law need not necessarily hold for daily WIP, throughput, and cycle time,
but for averages taken over a period of weeks or months it will hold. Restriction
(2) means that the process cannot be exhibiting a systematic trend during the
interval over which data is collected (e.g., steadily building up WIP, increasing the
throughput rate, or anything else that makes the process substantially different at
the end of the data collection interval than it was at the beginning). However, this
stability restriction does not preclude cyclic behavior (e.g., WIP rising and falling),
bulk arrivals, batch processing, multiple entity types with different characteristics,
or a wide range of other complex behavior. Indeed, Little’s law is not even restricted
to a single process. As long as WIP, throughput, and cycle time are measured in
consistent units, it can be applied to an entire line, a plant, a warehouse, or any
other operation through which entities flow.
One way to think of Little’s law, which offers a sense of why it is so general, is
as a simple conversion of units. We can speak of WIP in terms of number of entities
or in terms of “days of supply”. So, the units of Little’s law are simply
entities = entities/day × days.
For that matter, we could use dollars to measure inventory and output, so that
Little’s law would have units of
dollars = dollars/day × days.
This would make it possible for us to aggregate many different types of entity into
a single relationship. Note, however, that if we want to, we can also apply Little’s
law separately to each entity type.
Although Little's law is very simple, it is extremely useful. Some common applications include:
1. Basic Calculations: If we know any two of the quantities, WIP, cycle time,
and throughput, we can calculate the third. For example, consider a firm’s
accounts receivable. Suppose that the firm bills an average of $10,000 per day
and that customers take 45 days on average to pay. Then, working in units
of dollars, throughput is $10,000 per day and cycle time is 45 days, so WIP
(i.e., the total amount of outstanding accounts receivable on average) will be
$450,000.
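This calculation is trivial to encode. A minimal sketch using the numbers from the example (the function name is ours):

```python
def littles_law_wip(throughput, cycle_time):
    """Little's law: average WIP = throughput x cycle time."""
    return throughput * cycle_time

# Accounts receivable: $10,000 billed per day, 45 days on average to pay.
print(littles_law_wip(throughput=10_000, cycle_time=45))  # 450000
```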
2. Measure of Cycle Time: Measuring cycle time directly can be tedious. We
must time stamp each entity as it enters the system, record its completion
time, and maintain a running average. While many manufacturing execution
systems (MES) are capable of tracking such data, it is often simpler to keep
track of WIP and throughput than cycle time (i.e., everyone tracks throughput
since it is directly related to revenue and tracking WIP involves only a periodic
system-wide count, while cycle time requires detailed data on every entity).
Notice that we can rearrange Little's law as

    cycle time = WIP / throughput

Therefore, if we have averages for WIP and throughput, their ratio defines a
perfectly consistent measure of cycle time. Notice that this definition remains
consistent even for assembly systems. For instance, the cycle time of a personal
computer is very difficult to define in terms of tracking entities, since it is made
up of many subcomponents, some of which are processed in parallel. However,
if we can measure total WIP in dollars and throughput in terms of cost-of-
goods-sold, then the ratio still defines a measure of cycle time.1
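As an illustration of this indirect measurement, the following sketch (with made-up numbers) estimates average cycle time from routinely tracked WIP counts and throughput figures:

```python
from statistics import mean

# Hypothetical data: periodic system-wide WIP counts (entities)
# and daily throughput observations (entities/day).
wip_counts = [120, 135, 110, 140, 125]
daily_throughput = [24, 26, 25, 23, 27]

avg_wip = mean(wip_counts)        # 126 entities
avg_tp = mean(daily_throughput)   # 25 entities/day

cycle_time = avg_wip / avg_tp     # Little's law: CT = WIP / throughput
print(round(cycle_time, 1))       # 5.0 days
```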
3. Cycle Time Reduction: The literature on JIT and lean manufacturing extols
the virtues of WIP reduction, while the literature on time-based competition
and agile manufacturing calls for cycle time reduction. However, since

    cycle time = WIP / throughput

Little’s law indicates that WIP and cycle time reduction are really two sides
of the same coin. As long as throughput remains constant, any reduction in
WIP must be accompanied by a reduction in cycle time and vice versa. This
implies that separate programs are not needed to reduce WIP and cycle time.
It also implies that “where there is WIP there is cycle time”, so the places
to look for improvements in cycle time are the locations in the production
process where WIP is piling up.
2.3 Measuring Variability
Because of applications like those given above, Little’s law is an essential tool in the
arsenal of every operations or supply chain professional. However, it falls well short
of painting a complete picture of an operations system. Writing Little's law in yet
another form

    throughput = WIP / cycle time

suggests that it is possible to have two systems with the same throughput but where
one has high WIP and long cycle time, while the other has low WIP and short cycle
time. Of course, any manager would prefer the system with low WIP and short cycle
times—such a system is more “efficient” in the sense of its ability to convert WIP
into throughput. But in practice, operations and supply chain systems can exhibit
dramatic differences in efficiency. Why? The answer—and this is a fundamental
insight of the science of logistics—is variability!
Variability is a fact of life. Heights of individuals, SAT scores, light bulb life-
times, daily barometric pressure readings, highway travel times, soil acidity levels,
service times at a bank teller, fraction of people who vote in presidential elections,
and millions of other everyday phenomena are subject to variability. Any collec-
tion of numerical measures that is not perfectly uniform is said to be variable. In
logistical systems, many important quantities are variable, including process times,
equipment uptimes, equipment downtimes, product demands, yield rates, number of
workers who show up on a given day, and a host of others. Because of the prevalence
of variability and its disruptive influence on system performance, understanding it
is critical to effective logistics management. This involves two basic steps: (1)
specification of consistent and appropriate measures of variability, and (2)
development of the cause-and-effect roles of variability in logistical systems.

1. Of course, when thinking about cycle time from a customer standpoint we must be
careful to note which part of cycle time the customer actually sees. Because of this
we are careful to distinguish between manufacturing cycle time (the time an entity
spends in the system) and customer lead time (the time between when a customer
order is placed and when it is received). Our Little's Law example addresses
manufacturing cycle time. We will treat customer lead time more carefully in
Chapter 9.
We begin with measures. First, we note that a quantitative measure whose
outcomes are subject to variability is termed a random variable. The set of all
possible realizations of a random variable is called its population. For example, the
height of a randomly chosen American adult male is a random variable whose
population consists of the set of heights of all American adult males.2 Often, we do
not have data for the entire population of a random variable and therefore consider
a subset or sample of the possible outcomes. For instance, we might estimate
the height characteristics of the American male adult population from a sample of
10,000 randomly chosen individuals.
One way to describe either a population or a sample is by means of summary
statistics. A statistic is a single-number descriptor calculated as a function of
the outcomes in a population or sample. The most common statistic is the mean,
which measures the average or central tendency of a random variable.3 Second most
common is the standard deviation, which measures the spread or dispersion of
the random variable about its mean.4
For example, the mean and standard deviation of the scores on the 1999 SAT
test were 1,017 and 209. For most random variables, a high percentage (e.g., 95
percent or so) of the population lies within two standard deviations of the mean.
In the case of SAT scores, two standard deviations around the mean represents the
range from 599 to 1435. Since roughly 2 percent of test takers scored above 1435 and
2 percent scored below 599, this interval contains about 96 percent of test scores,
which is termed “normal” behavior.
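A quick check of the interval arithmetic:

```python
mean_score, std_dev = 1017, 209   # 1999 SAT statistics quoted in the text

low = mean_score - 2 * std_dev    # 599
high = mean_score + 2 * std_dev   # 1435
print(low, high)  # 599 1435
```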
Standard deviation is a measure of variability. However, it is not always the most
suitable one. To see why, suppose we are told that a sample has a standard deviation
of 5. Is this a high or low level of variation? Along these same lines, suppose we
were told that the height of American males averages 68 inches with a standard
deviation of 4 inches. Which is more variable, heights of American males or SAT
scores? We cannot answer questions like these on the basis of standard deviation
alone. The reason is that standard deviations have units, indeed the same units as
the mean (e.g., inches for heights, points for SAT scores). A standard deviation of 5
is meaningless without knowing the units. Similarly, we cannot compare a standard
deviation measured in inches with one given in points.
2. A more mundane example of a random variable is the numerical outcome of the
throw of a single die. The population for this random variable is the set
S = {1, 2, 3, 4, 5, 6}.

3. The mean of a set of outcomes, x1, . . . , xn, is computed by summing them and
dividing by their number, that is x̄ = (x1 + · · · + xn)/n. Note that x̄ ("x-bar") is
commonly used to depict the mean of a sample, while the Greek letter µ ("mu") is
commonly used to depict the mean of a population.

4. The variance of a set of outcomes, x1, . . . , xn, is computed as
s² = [(x1 − x̄)² + · · · + (xn − x̄)²]/(n − 1). Note that this is almost the average of
the squared deviations from the mean, except that we divide by n − 1 instead of n.
The standard deviation is the square root of the variance, or s. Note that s is
commonly used to denote the standard deviation of a sample, while the Greek letter
σ ("sigma") is generally used to represent the standard deviation of a population.

Because of this, a more appropriate measure of variability is frequently the
coefficient of variation (CV), which is defined as:

    CV = standard deviation / mean

Because mean and standard deviation have the same units, the coefficient of vari-
ation is unitless. This makes it a consistent measure of variability across a wide
range of random variables. For example, the CV of heights of American males is
4/68 = 0.06, while the CV of SAT scores is 209/1,017 = 0.21, implying that SAT
scores are substantially more variable than heights. Furthermore, because it is
unitless, we can use the coefficient of variation to classify random variables. Random
variables with CVs substantially below 1 are called low variability, while those
with CVs substantially above 1 are called high variability. Random variables
with CVs around 1 (say, between 0.75 and 1.33) are called moderate variability.
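These definitions translate directly into code. The sketch below (function names and sample data are our own) computes a CV from a sample and applies the classification just given:

```python
from statistics import mean, stdev

def coefficient_of_variation(sample):
    """Sample CV: standard deviation divided by mean (unitless)."""
    return stdev(sample) / mean(sample)

def classify(cv):
    # Cutoffs from the text: roughly 0.75 and 1.33.
    if cv < 0.75:
        return "low variability"
    if cv > 1.33:
        return "high variability"
    return "moderate variability"

# Heights of American males: mean 68 in, std dev 4 in -> CV = 0.06.
print(classify(4 / 68))      # low variability
# 1999 SAT scores: mean 1017, std dev 209 -> CV = 0.21.
print(classify(209 / 1017))  # low variability

# Hypothetical bank teller service times (minutes), one unusually long:
service_times = [3.1, 2.8, 3.4, 2.9, 12.0]
print(round(coefficient_of_variation(service_times), 2))  # 0.83
```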
We now consider variability specifically as it relates to operations systems. As we
noted above, there are many sources of variability in production and service systems,
some of which will be considered in more detail later. However, at the level of a
single process, there are two key sources of variability: (1) interarrival times, and
(2) effective process times. Interarrival times are simply the times between the
arrival of entities to the process, which can be affected by vendor quality, scheduling
policies, variability in upstream processes, and other factors. Effective process
times are measured from when an entity reaches the head of the line
(i.e., there is space in the process for it) to when it is finished. Notice that under
this definition, effective process times include detractors, such as machine failures,
setup times, operator breaks, or anything that extends the time required to complete
processing of the entity.
We can characterize the variability in both interarrival times and effective process
times via the coefficient of variation. For interarrival times, we could envision doing
this by standing in front of the process with a stopwatch and logging the times
between arrivals. If two entities arrive at the same time (e.g., as would be the case
if two customers arrived to a fast food restaurant in the same car), then we record
the interarrival time between these as zero. With this data, we would compute
the mean and standard deviation of the interarrival times, and take the ratio to
compute the coefficient of variation. Figure 2.1 shows two arrival time lines. The
top line illustrates a low variability arrival process (CV = 0.07), while the bottom line
illustrates a high variability arrival process (CV = 2). Notice that low variability
arrivals are smooth and regular, while high variability arrivals are “bursty” and
uneven. Interestingly, if we have a large collection of independent customers arriving
to a server (e.g., toll booths, calls to 9-1-1) the CV will always be close to one.
Such arrival processes are called Poisson and fall right between the high variability
(CV > 1) and low variability (CV < 1) cases.
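This property of Poisson arrivals is easy to check by simulation. The sketch below draws exponential interarrival times (the interarrival distribution of a Poisson process) and confirms that the sample CV lands near 1:

```python
import random
from statistics import mean, stdev

# Poisson arrivals have exponentially distributed interarrival times,
# whose CV is exactly 1; a large sample should come out close to that.
random.seed(42)
interarrivals = [random.expovariate(1.0) for _ in range(20_000)]

cv = stdev(interarrivals) / mean(interarrivals)
print(round(cv, 2))  # close to 1
```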
Analogously, we could collect data on effective process times by recording the
time between when the entity enters the process and when it leaves. Again, we would
compute the mean and standard deviation and take the ratio to find the coefficient of
variation. Table 2.1 illustrates three cases. Process 1 has effective process times that
vary slightly about 25 minutes, so that the CV is 0.1. This low variability process
is representative of automated equipment and routine manual tasks. Process 2 has
Figure 2.1: High and low variability arrivals.
short process times of around 6 minutes punctuated by an occasional 40 minute time.
This results in moderate variability with a CV of 1.2 and is representative of the
situation where a process has fairly short, regular process times except when a setup
is required (e.g., to change from one product type to another). Finally, Process 3
is identical to Process 2 except that the 12th observation is much longer. This
behavior, which results in a high variability process with a CV of 2.9, could be the
result of a long machine failure. The key conclusion to draw from these examples is
that low, moderate, and high variability effective process times are all observed in
logistical systems. Depending on factors like setups, failures, and other disruptive
elements, it is possible to observe CVs ranging from zero to as high as 10 or more.
2.4 Influence of Variability
Now that we have defined an appropriate measure of variability and have identified
the key types of variability at the level of an individual process, we turn to the cause-
and-effect relationships between variability and performance measures in a logistical
system. These are characterized through the science of queueing theory, which
is the study of waiting line phenomena.5 In an operations system, entities queue up
behind processes, so that
Cycle Time = Delay + Process Time
where delay represents the time entities spend in the system not being processed.
As we will see, there are several causes of delay. One of the most important is
queueing delay, in which entities are ready for processing but must wait for a
resource to become available to start processing.
5. Queueing is also the only word we know of with five consecutive vowels, which
makes it handy in cocktail party conversation, as well as supply chain management.
Trial Process 1 Process 2 Process 3
1 22 5 5
2 25 6 6
3 23 5 5
4 26 35 35
5 24 7 7
6 28 45 45
7 21 6 6
8 30 6 6
9 24 5 5
10 28 4 4
11 27 7 7
12 25 50 500
13 24 6 6
14 23 6 6
15 22 5 5
t_e (mean) 25.1 13.2 43.2
σ_e (std dev) 2.5 15.9 127.0
c_e (CV) 0.1 1.2 2.9
Class LV MV HV
Table 2.1: Effective Process Times from Various Processes.
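The summary rows of Table 2.1 can be reproduced from the trial data using the sample statistics defined earlier (standard deviation dividing by n − 1). A sketch for Processes 2 and 3:

```python
from statistics import mean, stdev

# Trial data from Table 2.1; Process 3 differs only in trial 12 (50 -> 500).
process_2 = [5, 6, 5, 35, 7, 45, 6, 6, 5, 4, 7, 50, 6, 6, 5]
process_3 = process_2[:11] + [500] + process_2[12:]

for name, times in (("Process 2", process_2), ("Process 3", process_3)):
    t_e = mean(times)        # average effective process time
    sigma_e = stdev(times)   # sample std dev (divides by n - 1)
    c_e = sigma_e / t_e      # coefficient of variation
    print(name, round(t_e, 1), round(sigma_e, 1), round(c_e, 1))
# Process 2 13.2 15.9 1.2
# Process 3 43.2 127.0 2.9
```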
We can characterize the fundamental behavior of queueing delay at a station
with the following principle.
Principle (Queueing Delay): At a single station with no limit on the number of
entities that can queue up, the delay due to queueing is given by
Delay = V × U × T
where
V = a variability factor
U = a utilization factor
T = average effective process time for an entity at the station
This expression, which we term the VUT equation, tells us that queueing delay
will be V U multiples of the actual processing time T . A corollary to this is
Cycle Time = V U T + T
These equations are major results in supply chain science, since they provide basic
understanding and useful tools for examining the primary causes of cycle time.
The first insight we can get from the VUT equation is that variability and
utilization interact. High variability (V ) will be most damaging at stations with
Figure 2.2: Impact of Utilization and Variability on Station Delay.
high utilization (U ), that is, at bottlenecks. So, reducing queueing delay can be done
through a combination of activities that lower utilization and/or reduce variability.
Furthermore, variability reduction will be most effective at bottlenecks.
To draw additional insights, we need to further specify the factors that determine
the U and V factors.
The utilization factor is a function of station utilization (fraction of time station
is busy). While exact expressions do not exist in general and approximations vary
depending on the nature of the station (e.g., whether the station consists of a single
process or multiple processes in parallel), the utilization factor will be proportional to
1/(1 − u), where u is the station utilization. This means that as utilization approaches
100 percent, delay will approach infinity. Furthermore, as illustrated in Figure 2.2,
it does so in a highly nonlinear fashion. This gives a mathematical explanation for
the utilization principle introduced in Chapter 1. The principal conclusion we can
draw here is that unless WIP is capped (e.g., by the existence of a physical or logical
limit), queueing delay will become extremely sensitive to utilization as the station
is loaded close to its capacity.
The variability factor is a function of both arrival and process variability, as
measured by the CV’s of interarrival and process times. Again, while the exact
expression will depend on station specifics, the V factor is generally proportional
to the squared coefficient of variation (SCV) of both interarrival and process
times.
Figure 2.2 also illustrates the impact of increasing process and/or arrival vari-
ability on station delay. In this figure we illustrate what happens to delay at two
stations that are identical except that one has a V coefficient of 0.5 and the other
has a V coefficient of 2. By the Queueing Delay Principle, the delay will be four
times higher for any given level of utilization in the latter system than in the former.
But, as we see from Figure 2.2, this has the effect of making delay in the system
with V = 2 “blow up” much more quickly. Hence, if we want to achieve the same
level of delay in these two systems, we will have to operate the system with V = 2
at a much lower level of utilization than we will be able to maintain in the system
with V = 0.5.
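The behavior shown in Figure 2.2 can be sketched numerically. The snippet below (an illustration, using the single-process utilization factor u/(1 − u) and T = 1) tabulates delay for the two variability levels:

```python
def queueing_delay(v, u, t=1.0):
    """VUT delay with the single-process utilization factor u/(1 - u)."""
    assert 0 <= u < 1, "delay grows without bound as u approaches 1"
    return v * (u / (1 - u)) * t

for u in (0.50, 0.80, 0.90, 0.95):
    print(u, queueing_delay(0.5, u), queueing_delay(2.0, u))
# The V = 2 station always waits 4x longer than the V = 0.5 station,
# and both delays blow up nonlinearly as utilization approaches 1.
```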
From this discussion we can conclude that reductions in utilization tend to have
a much larger impact on delay than reductions in variability. However, because
capacity is costly, high utilization is usually desirable. By the VUT equation, the
only way to have high utilization without long delays is to have a low variability
factor. For this reason, variability reduction is often the key to achieving high
efficiency logistical systems.
INSIGHT BY ANALOGY - A Restaurant
A restaurant is a service system that is subject to both demand and supply vari-
ability. On the demand side customer arrivals are at least somewhat unpredictable,
while on the supply side the time it takes to feed a customer is uncertain. This
variability can degrade performance in three ways: (1) customers can be forced to
wait for service, (2) customers can balk (go away) if they feel the wait will be too
long, which causes a lost sale and possibly loss of customer good will, and (3) ca-
pacity (waiters, tables, etc.) can experience excessive idleness if the restaurant is
sized to meet peak demand. Because the restaurant business is very competitive,
the manner in which a particular establishment copes with variability can mean the
difference between success and failure.
But because customer expectations are not the same for all types of restaurants,
specific responses vary. For instance, in a fast food restaurant, customers
expect to be able to drop in unannounced (so arrival variability is high) and re-
ceive quick service. To respond, fast food restaurants do whatever they can to keep
process variability low. They have a limited menu and often discourage special or-
ders. They keep food on warming tables to eliminate delay due to cooking. They use
simple cash registers with pictures of food items, so that all employees can process
orders quickly, not just those who are adept at operating a keypad and making
change. But even with very low variability on the supply side, the variability on
the demand side ensures that the V coefficient will be quite high in a fast food
restaurant. Hence, because they need to remain fast, such establishments typically
retain excess capacity. In order to be able to respond to peaks in demand, fast food
restaurants will generally be over-staffed during slow periods. Furthermore, they
will frequently shift capacity between operations to respond to surges in demand
(e.g., employees will move from food preparation activities in the back to staff the
front counter when lines get long).
Contrast this with an upscale restaurant. Since customers do not expect walk-in
service, the restaurant can greatly reduce arrival variability by taking reservations.
Even though the broader menu probably results in higher process variability than
in a fast food restaurant, lower arrival variability means that the upscale restaurant
will have a substantially lower overall V coefficient. Hence, the upscale restaurant
will have a delay curve that resembles the one in Figure 2.2 labelled V = 0.5, while
the fast food restaurant has one that resembles the V = 2 curve. As a result, the
upscale restaurant will be able to achieve higher utilization of their staff and facilities
(a good thing, since pricey chefs and maitre d’s are more expensive to idle than are
fast food fry cooks). Despite their higher utilization, upscale restaurants typically
have lower delays as a percentage of service times. For example, one might wait on
average two minutes to receive a meal that takes 20 minutes to eat in a fast food
restaurant, which implies that V × U = 0.1 (since delay is one tenth of service time).
In contrast, one might wait five minutes for a reserved table to eat a 100 minute
meal, which implies V × U = 0.05. Clearly, the variability reduction that results
from taking reservations has an enormous impact on performance.
To illustrate the behavior described by the VUT equation, let us consider a
simple station where the average effective process time for an entity is T = 1 hour
and the CV for both interarrival times and process times are equal to 1 (which for
this system implies that V = 1). Then the capacity of the process is 1 per hour and
utilization (u) is given by
    u = (rate in)/capacity = (rate in)/1 = rate in
Suppose we feed this process at a rate of 0.5 entities per hour, so that utilization
equals 0.5. In this simple system, the utilization factor (U ) is given by
    U = u/(1 − u) = 0.5/(1 − 0.5) = 1
Hence, the queueing delay experienced by entities will be:
Delay = V × U × T = 1 × 1 × 1 = 1 hour
and
Cycle Time = V U T + T = 1 + 1 = 2 hours
If we were to double the variability factor to V = 2 (i.e., by increasing the CV
of either interarrival times or process times), without changing utilization, then
queueing delay would double to 2 hours.
However, suppose that we feed this process at a rate of 0.9 entities per hour, so
that utilization is now 0.9. Then, the utilization factor is:
    U = u/(1 − u) = 0.9/(1 − 0.9) = 9
so queueing delay will be:
Delay = V × U × T = 1 × 9 × 1 = 9 hours
and
Cycle Time = V U T + T = 9 + 1 = 10 hours
Furthermore, doubling the variability factor to V = 2 would double the delay to 18
hours (19 hours for cycle time). Clearly, as we noted, highly utilized processes are
much more sensitive to variability than are lowly utilized ones.
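The arithmetic of this worked example can be condensed into a few lines (a sketch using the same single-station form U = u/(1 − u)):

```python
def cycle_time(v, u, t):
    """VUT equation: cycle time = V*U*T + T, with U = u/(1 - u)."""
    return v * (u / (1 - u)) * t + t

print(round(cycle_time(v=1, u=0.5, t=1.0), 1))  # 2.0 hours
print(round(cycle_time(v=1, u=0.9, t=1.0), 1))  # 10.0 hours
print(round(cycle_time(v=2, u=0.9, t=1.0), 1))  # 19.0 hours
```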
Examples of the above relationship between variability, utilization, and delay
abound in everyday life. A common but dramatic instance is that of ambulance
service. Here, the process is the paramedic team, while the entities are patients
requiring assistance.6 In this system, short delay (i.e., the time a patient must wait
for treatment) is essential. But, because the very nature of emergency calls implies
that they will be unpredictable, the system has high arrival variability and hence a
large variability factor. The only way to achieve short delay is to keep the utilization
factor low, which is precisely what ambulance services do. It is not unusual to find
an ambulance with overall utilization of less than 5 percent, due to the need to
provide rapid response.
A sharply contrasting example is that of a highly automated production process,
such as an automatic soft drink filling line. Here, cans are filled quickly (i.e., a second
or less per can) and with a great deal of regularity, so that there is little process
variability. The filling process is fed by a conveyor that also runs at a very steady
rate so that there is little arrival variability. This implies that the variability factor
(V ) is small. Hence, it is possible to set the utilization close to 1 and still have little
delay.
However, one must be careful not to over-interpret this example and assume that
there are many situations where utilization close to 1 is possible. If the automatic
filling process is subject to failures, requires periodic cleaning, or is sometimes slowed
or stopped due to quality problems, then the variability factor will not be near zero.
This means that entities will have to build up somewhere (e.g., in the form of
raw materials at the beginning of the filling line perhaps) in order to ensure high
utilization and will therefore be subject to delay. If there is limited space for these
materials (and there always is for a very fast line), the line will have to shut down.
In these cases, the utilization will ultimately be less than 1 even though the planned
releases were designed to achieve utilization of 1.
PRINCIPLES IN PRACTICE - Toyota
The Toyota Production System (TPS) has had a profound impact on manufac-
turing practice around the globe. Many specific practices, such as kanban, kaizen,
and SMED (single minute exchange of die), have received considerable attention in
popular management publications. But if one looks closely at the early publications
on TPS, it is apparent that the Queueing Delay Principle is at the core of what
Toyota implemented. For instance, in his seminal book,7 Taiichi Ohno, the father
of TPS, begins his description of the system with a writeup replete with sections
entitled "Establishing a Production Flow," "Production Leveling," and "Mountains
Should be Low and Valleys Should be Shallow." All of these drive home the point
that the only way for production processes to operate with low delay (and, by
Little's Law, low inventory) is for them to have low variability. Eliminating arrival
variability at stations is the very foundation of the Toyota Production System (as
well as just-in-time, lean, and the rest of its descendants).

6. Notice that it makes no difference logically whether the process physically moves
to the entities or the entities move to the process. In either case, we can view the
entities as queueing up for processing and hence the VUT equation applies.
While Ohno recognized the need for smooth flow into processes, he also recog-
nized that variability in demand is a fact of business life. To compensate, Toyota
placed tremendous emphasis on production smoothing. That is, they took a forecast
of demand for a month and divided it up so that planned production volume and
mix were the same for each day, and indeed each hour. If monthly demand required
producing 75% sedans, then the plant should produce 75% sedans each and every
hour. This avoided the pulsing through the line that would occur if different body
types were produced in batches (e.g., a stream of sedans followed by a stream of
hardtops followed by a stream of wagons).
Of course, feeding one station with a steady arrival stream only ensures that the
next station will receive steady arrivals if the upstream station has low process vari-
ability. So, Toyota also laid great emphasis on reducing variability in process times.
Standard work procedures, total quality control, total preventive maintenance, setup
reduction, and many other integral parts of the TPS were firmly directed at reduc-
ing process variability. With these in place along with the production smoothing
measures, Toyota was able to achieve exceptionally low arrival variability at stations
throughout their production system. By the logic depicted in Figure 2.2 this en-
abled them to run their processes at high levels of utilization and low levels of delay
and inventory. Moreover, because the myriad methods they used to drive variability
out of their processes were notoriously hard to copy, Toyota was able to maintain
a competitive edge in their operations for over twenty years despite being the most
intensely benchmarked company on the planet.
Chapter 3
Batching
Delay due to batching (eventually) increases in proportion to the lot size.
3.1 Introduction
Many operations are done in batches. A painting process may paint a number of red
cars before switching to blue ones. A secretary may collect a bundle of copying jobs
before going to the Xerox room to process them. A foundry might place a number of
wrenches into a furnace simultaneously for heat treating. A forklift operator might
allow several machined parts to accumulate before moving them from one operation
to another. The number of similar jobs processed together, either sequentially or
simultaneously, is known variously as the batch size or lot size of the operation.
Why is batching used? The answer is simple: capacity. In many instances, it
is more efficient to process a batch of entities than to process them one at a time.
There are three basic reasons for the increase in efficiency due to batching:
1. Setup Avoidance: A setup or changeover is any operation that must be done
at the beginning of a batch (e.g., removing the paint of one color before going
to a new color in a paint operation, walking to the Xerox room). The larger
the batch size, the fewer setups required, and hence the less capacity lost to
them.
2. Pacing Improvement: In some operations, particularly manual ones, it may be
possible to get into a good “rhythm” while processing a number of like jobs in a
row. For instance, a secretary may handle copying jobs quicker if they are part
of a batch than if they are done separately. The reason is that repetition of
motion tends to eliminate extraneous steps. One could think of the extraneous
motions as setups that are done at the beginning of a batch and then dropped,
but since they are not as obvious as a setup due to cleaning and they may
take several repetitions to disappear, we distinguish pacing improvement from
setup avoidance.
Figure 3.1: Mechanics of Simultaneous Batching.
3. Simultaneous Processing: Some operations are intrinsically batch in nature
because they can process a batch of entities as quickly as they can process a
single entity. For instance, heat treating may require three hours regardless
of whether the furnace is loaded with one wrench or a hundred. Similarly,
moving parts between operations with a forklift may require the same amount
of time regardless of whether the move quantity is one part or a full load.
Obviously, the larger the batch size the greater the capacity of a simultaneous
operation like this.
Because they are physically different, we distinguish between simultaneous
batches, where entities are processed together, and sequential batches, where
entities are processed one-at-a-time between setups. Although the source of effi-
ciency from batching can vary, the basic mechanics are the same. Larger batch sizes
increase capacity but also increase wait-for-batch time (time to build up a batch)
or wait-in-batch time (time to process a batch) or both. The essential tradeoff
involved in all batching is one of capacity versus cycle time.
3.2 Simultaneous Batching
We begin by examining the tradeoffs involved in simultaneous batching. That is, we
consider an operation where entities are processed simultaneously in a batch and the
process time does not depend on how many entities are being processed (as long as
the batch size does not exceed the number of entities that can fit into the process).
This situation is illustrated in Figure 3.1. Examples of simultaneous batching include
heat treat and burn-in operations, bulk transportation of parts between processes,
and showing a training video to a group of employees. Regardless of the application,
the purpose of simultaneous batching is to make effective use of the capacity of the
process.
Note that simultaneous batching characterizes both process batches (number
of entities processed together at a station) and move batches (number of enti-
ties moved together between stations). From a logistical perspective, heat treating
wrenches in a furnace and moving machined parts between processes is essentially
the same. Both are examples of simultaneous batch operations.
The fundamental relationship underlying simultaneous batching behavior is the
effect of batch size on utilization. As always, utilization is given by

utilization = rate in / capacity

Since the process takes a fixed amount of time regardless of the batch size, capacity
is equal to

capacity = batch size / process time

Hence,

utilization = (rate in × process time) / batch size

For the system to be stable, utilization must be less than 100%, which requires

batch size > rate in × process time
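To make the stability condition concrete, here is a small sketch (the helper name and the numbers used are ours for illustration, not from the text):

```python
import math

def min_simultaneous_batch(rate_in, process_time):
    """Smallest integer batch size keeping utilization below 100%.

    Since utilization = rate_in * process_time / batch_size, stability
    (utilization < 1) requires batch_size > rate_in * process_time.
    """
    return math.floor(rate_in * process_time) + 1

# Example: 100 parts/hour arriving at an operation with a 1-hour cycle.
print(min_simultaneous_batch(100, 1))    # -> 101
# Example: 10 parts/hour, 15-minute cycle.
print(min_simultaneous_batch(10, 0.25))  # -> 3
```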
While this enables us to compute the minimum batch size needed to keep up
with a given throughput rate, it usually makes sense to run simultaneous batching
operations with batch sizes larger than the minimum. The reason is that, as the
above analysis makes clear, utilization decreases in the batch size. Since, as we
noted in Chapter 1, cycle time increases in utilization, we would expect increasing
batch size to decrease cycle time. And this is exactly what happens as long as larger
batch sizes do not cause entities to wait while forming a batch. For instance, if the
entire batch arrives together, then none of the entities will have to wait and hence
cycle time will unambiguously decrease with batch size.
However, if parts arrive one at a time to a simultaneous batch operation, then
it is possible for cycle time to increase in the batch size. For example, if arrivals to
the operation are slow and the batch size is fixed and large, then the first entities
to arrive will wait a long time for a full batch to form. In this case, even though
reducing the batch size will increase utilization, it might well reduce average cycle
time by reducing the time entities wait to form a batch.
A more effective way to avoid excessive wait-for-batch time is to abandon the
fixed batch size policy altogether. For instance, if whenever the operation finishes
a batch we start processing whatever entities are waiting (up to the number that
can fit into the operation, of course), then we will never have an idle process with
entities waiting. But even this does not entirely eliminate the wait-for-batch time.
To see this, let us consider a batch operation with unlimited space. Suppose the
time to process a batch is t minutes, regardless of the number processed. So, we
will start a new batch every t minutes, consisting of whatever entities are available.
If the arrival rate is r, then the average number of parts that will be waiting is rt,
which will therefore be the average batch size. On average, these parts will wait
t/2 minutes (assuming they arrive one-at-a-time over the t minute interval), and so
their entire cycle time for the operation will be t/2 + t = 3t/2. By Little's law,
the average WIP in the station will be r × 3t/2. Note that the average batch size,
average cycle time, and average WIP are all proportional to t. Speeding up the
process, by decreasing t, will allow smaller batches, which in turn will decrease WIP
and cycle time.
As an example, consider a tool making plant that currently heat treats wrenches
in a large furnace that can hold 120 wrenches and takes one hour to treat them.
Suppose that throughput is 100 wrenches per hour. If we ignore queueing (i.e.,
having more than 120 wrenches accumulated at the furnace when it is ready to
start a new batch) then we can use the above analysis to conclude that the average
batch size will be 100 wrenches, the average cycle time will be 90 minutes, and the
average WIP at (and in) heat treat will be 150 wrenches.
Now, suppose that a new induction heating coil is installed that can heat treat
one wrench at a time in 30 seconds. The capacity, therefore, is 120 wrenches per
hour, which is the same as the furnace and is greater than the throughput rate.
If we again ignore queueing effects, then the average process time is 0.5 minutes
or 0.00833 hours. So, by Little's Law, the average WIP is 100 × 0.00833 = 0.833
wrenches. Even if we were to include queueing in the two cases, it is clear that the
WIP and cycle time for this one-at-a-time operation will be vastly smaller than that
for the batch operation. This behavior is at the root of the “lot size of one” goal of
just-in-time systems.
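The furnace-versus-coil comparison can be tallied in a few lines with Little's law; this is a sketch that, like the text, ignores queueing:

```python
# Rates in wrenches/hour, times in hours.
r = 100.0            # throughput (wrenches per hour)

# Furnace: start a batch every t hours with whatever has accumulated.
t = 1.0                          # batch process time
avg_batch = r * t                # average batch size = rt
avg_ct = 1.5 * t                 # t/2 wait-for-batch + t process = 3t/2
avg_wip = r * avg_ct             # Little's law: WIP = rate x cycle time

# Induction coil: one wrench at a time, 30 seconds each.
t_coil = 0.5 / 60                # 0.00833 hours
wip_coil = r * t_coil            # Little's law again

print(avg_batch, avg_ct * 60, avg_wip, round(wip_coil, 3))
# batch of 100 wrenches, 90-minute cycle time, 150 wrenches of WIP,
# versus 0.833 wrenches of WIP for one-at-a-time processing
```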
3.3 Sequential Batching
A sequential batch operation is one that processes entities sequentially (one at a
time) but requires time to setup or change over before moving to a different type
of entity. This situation is illustrated in Figure 3.2. A classic example is a punch
press, which can stamp parts from sheet metal at a very fast rate but can take a
significant amount of time to change from one part type to another. The decision of
how many parts of a certain type to process before switching to a different type is
a batch (or lot) sizing decision that involves a tradeoff between capacity and cycle
time.
As in the case of simultaneous batching, there exists a minimum sequential batch
size necessary to ensure sufficient capacity to keep up with demand. To compute
this, we define
r = arrival rate of entities (number per hour)
t = process time for a single entity (hours)
s = setup time (hours)
Q = batch size
Since it takes s + Qt time units to process a batch of Q entities, the capacity of
a sequential batch operation is

capacity = Q / (s + Qt)

Hence, utilization is

utilization = rate in / capacity = r(s + Qt) / Q
Figure 3.2: Mechanics of Sequential Batching.
and for utilization to be less than 100% we require

Q > rs / (1 − rt)
But as with simultaneous batching, it is frequently appropriate to set the batch
size larger than the minimum level. The reason is that cycle time can often be
reduced by striking a better balance between capacity and delay. To see this, let
us divide cycle time at a station into the following components:
cycle time = wait-for-batch time + queue time + setup time + process time
Wait-for-batch time is the time it takes to form a batch in front of the operation.
For simplicity, we assume that entities arrive one at a time and that we do not start
processing until a full batch is in place. Under these conditions, the time to form a
batch will increase in proportion to the batch size.
From Chapter 1, we know that queue time increases in utilization. Since larger
batches mean fewer setups, and hence lower utilization, queue time will decrease
(nonlinearly) as batch size increases.
Finally, if we assume that we must process the entire batch before any of the
entities can depart from the operation, the setup plus process time for a batch is
s + Qt, which clearly increases in proportion to the batch size.
Adding all of these times together results in a relationship between cycle time
and batch size like that shown in Figure 3.3. This figure illustrates a case where
the minimum batch size required to keep up with arrivals is larger than one. But
eventually the wait-for-batch and process times become large enough to offset the
utilization (and hence queueing) reduction due to large batch sizes. So, there is
some intermediate batch size, Q∗ , that minimizes cycle time.
The main points of our discussion of batching leading up to Figure 3.3 can be
captured in the following:
Figure 3.3: Effect of Sequential Batching on Cycle Time.
Principle (Batching): In a simultaneous or sequential batching environment:
1. The smallest batch size that yields a stable system may be greater than
one,
2. Delay due to batching (eventually) increases proportionally in the batch
size.
In sequential batching situations, the Batching Principle assumes that setup
times are fixed and batch sizes are adjusted to accommodate them. However, in
practice, reducing setup times is often an option. This can have a dramatic impact
on cycle times. To see this, consider a milling machine that receives 10 parts per
hour to process. Each part requires four minutes of machining, but there is a one
hour setup to change from one part type to another.
We first note that the minimum batch size that will allow the operation to keep
up with the arrival rate is
Q > rs / (1 − rt) = (10)(1) / (1 − (10)(4/60)) = 30
So, batch size must be at least 30. However, because utilization is still high when
batch size is 30, significant queueing occurs. Figure 3.4 shows that for this case,
cycle time is minimized by using a batch size of 63. At this batch size, total cycle
time is approximately 33 hours.
Figure 3.4: Effect of Setup Reduction on Sequential Batching and Cycle Time.
A batch size of 63 is very large and results in significant wait-for-batch delay. If
we cut setup time in half, to 30 minutes, the minimum batch size is also halved, to
15, and the batch size that minimizes cycle time falls to 31. Total cycle time at this
batch size is reduced from 33 hours to 16.5 hours.
Further reduction of setup time will facilitate smaller batch sizes and shorter
cycle times. Eventually, a batch size of one will become optimal. For instance, if
setup time is reduced to one minute, then, as Figure 3.4 illustrates, a batch size of
one achieves a cycle time of 0.5 hours, which is lower than that achieved by any
other batch size. Clearly, setup reduction and small batch production go hand in
hand.
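The cycle time curves behind these numbers can be sketched with a simple model. The queue term below uses an M/M/1-style approximation (utilization/(1 − utilization) times the batch time), which is an assumption on our part, though it reproduces the batch sizes and cycle times quoted above:

```python
def cycle_time(Q, r, t, s):
    """Approximate cycle time (hours) at a sequential batch operation.

    Q : batch size, r : arrival rate (parts/hr),
    t : unit process time (hr), s : setup time (hr).
    Components: wait-for-batch + queue + setup-plus-process, with the
    queue term from an M/M/1-style approximation (an assumption, not
    the only possible queueing model).
    """
    u = r * (s + Q * t) / Q              # utilization
    if u >= 1:
        return float("inf")              # unstable: batch size too small
    wait_for_batch = (Q - 1) / (2 * r)   # entities trickle in one at a time
    queue = u / (1 - u) * (s + Q * t)    # congestion delay
    return wait_for_batch + queue + (s + Q * t)

r, t = 10, 4 / 60                        # 10 parts/hr, 4 min machining
for s in (1.0, 0.5, 1 / 60):             # setup: 60 min, 30 min, 1 min
    best = min(range(1, 201), key=lambda Q: cycle_time(Q, r, t, s))
    print(s, best, round(cycle_time(best, r, t, s), 1))
```

Under this model the one-hour setup gives a cycle-time-minimizing batch size of 63 at roughly 33 hours, and a one-minute setup makes a batch size of one optimal at 0.5 hours, matching the figures above; the 30-minute case comes out near the 16.5 hours quoted.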
Finally, we note that there is no intrinsic reason that the process batch must
equal the move batch. For instance, in the above milling machine example with a
30 minute setup time, the fact that we use a process batch size of 31 to balance
capacity with batching delay does not mean that we must also move lots of 31 items
to the next process downstream. We could transfer partial lots to the next station
and begin processing them before the entire batch has been completed at the milling
station. Indeed, if material handling is efficient enough, we could conceivably move
completed parts in lots of one.
Figure 3.5 illustrates the impact on the cycle time versus batch size relationship
of using move batches of size one. The top curve represents the 30 minute setup
case from Figure 3.4, while the bottom curve represents the case where parts are
moved downstream individually as soon as they are finished at the milling station.
Because parts do not have to wait for their batch-mates, total cycle time is reduced
Figure 3.5: Effect of Move Batch Splitting on Cycle Time.
by this practice of move batch splitting. In systems with lengthy setup times and
large process batch sizes, reducing move batch sizes can have a significant effect on
overall cycle time.
INSIGHT BY ANALOGY - An Intersection
What happens when a power failure causes the stoplight at a busy intersection
to go out? Temporary stop signs are installed and traffic backs up for blocks in all
directions.
Why does this happen? Because traffic etiquette at an intersection with stop
signs calls for drivers to take turns. One car goes through the intersection in the
east-west direction and then one goes in the north-south direction. The batch size
is one. But, there is a setup (i.e., reaction and acceleration) time associated with
each car. So a batch size of one is too small. The excess setup time overloads the
system and causes traffic to pile up.
The opposite of the failed traffic light problem is the situation of a traffic light
that stays green too long in each direction. Again traffic backs up. But this time
it is because cars have to wait a long time for a green light. The batch size is too
large, which causes a substantial delay while the batch builds up.
Optimally timing a traffic light so as to minimize average waiting time is very
much like the problem of finding a batch size to minimize cycle time
through a batch operation. The tradeoff is essentially the same as that depicted
in Figure 3.3. Fortunately, traffic engineers know about this tradeoff and (usually)
time traffic lights appropriately.
3.4 Multi-Product Batching
The above discussions make the fundamental point that batching is primarily about
balancing capacity and delay. If all entities are identical, then the problem is simply
to find a uniform batch size that strikes a reasonable balance. However, in most
systems, entities (products, customers, data packets, etc.) are not identical. There-
fore, in addition to balancing capacity and delay, we must also address the question
of how to differentiate batch sizes between different entity types.
A common approach to the batching problem is the so-called economic order
quantity (EOQ) model. This model, which is presented in Chapter 7, tries to
strike a balance between holding cost (which is proportional to inventory and hence
cycle time) and setup cost. In purchasing situations, where the cost to order a batch
of items is essentially fixed (i.e., does not depend on the size of the order) and orders
are independent (e.g., come from different suppliers), the EOQ model can be very
useful in setting lot sizes.
However, in production settings, where the setup “cost” is really a proxy for
capacity, EOQ can lead to problems. First of all, there is no guarantee that the batch
sizes produced by the EOQ model will even be feasible (i.e., utilization might exceed
one). Even if they are feasible from a capacity standpoint, it may be very difficult to
construct an actual production schedule from them. For instance, demand may not
be neat multiples of the batch sizes, which means we will wind up with “remnants”
in inventory. Finally, even if the batch sizes are feasible and lead to a schedule, the
schedule might be such that a customer has to wait a long time for a particular
entity type to “come around” on the schedule.
The problem with EOQ is not in the details of the model; it is in the fundamental
approach of thinking about the problem in terms of setting batch sizes. A more
effective way to approach the multi-product sequential batching problem is in terms
of allocating setups to product types. That is, suppose we know the processing rate,
setup time to start a batch, and the quantity that must be processed over a fixed
interval (e.g., to meet demand for the upcoming month). Then, if we allocate n
setups to a given product, we will make n runs of it during the month. If monthly
demand for the product is D, then we will run it in batches of Q = D/n. The
problem thus becomes one of how many setups to allocate to each product.
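The arithmetic of this reframing can be made explicit with a quick sketch (the numbers here are hypothetical, not those of Table 3.1):

```python
# Setup-allocation view of lot sizing: given monthly demand D and an
# allocation of n setups, the batch size follows as Q = D / n.
D = 1200        # monthly demand (units) -- hypothetical
p = 40          # production rate (units/hour) -- hypothetical
n = 4           # setups allocated to this product over the month

Q = D / n                  # batch size implied by the allocation
run_hours = D / p          # process time needed to meet monthly demand
print(Q, run_hours)        # -> 300.0 units per run, 30.0 hours of processing
```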
To illustrate how this might be done, let us consider an example in which a
plant produces four products: Basic, Standard, Deluxe and Supreme. Demand for
the upcoming month (D), production rate in units per hour (p), and setup time
(s) for each product are given in Table 3.1. We also compute D/p, which gives the
number of hours of process time required to meet demand for each product. In this