Process Synchronization (Galvin Notes 9th Ed.)
Chapter 5: Process Synchronization
Outline:
CHAPTER OBJECTIVES
To introduce the critical-section problem, whose solutions can be used to ensure the consistency of shared data.
To present both software and hardware solutions of the critical-section problem.
To examine several classical process-synchronization problems.
To explore several tools that are used to solve process synchronization problems.
BACKGROUND
THE CRITICAL SECTION PROBLEM
PETERSON'S SOLUTION
SYNCHRONIZATION HARDWARE
MUTEX LOCKS
SEMAPHORES
o Semaphore Usage
o Semaphore Implementation
o Deadlocks and Starvation
o Priority Inversion
CLASSIC PROBLEMS OF SYNCHRONIZATION
o The Bounded-Buffer Problem
o The Readers–Writers Problem
o The Dining-Philosophers Problem
MONITORS
o Monitor Usage
o Dining-Philosophers Solution Using Monitors
o Implementing a Monitor Using Semaphores
o Resuming Processes within a Monitor
SKIPPED BOOK CONTENT:
SYNCHRONIZATION EXAMPLES
o Synchronization in Windows
o Synchronization in Linux
o Synchronization in Solaris
o Pthreads Synchronization
ALTERNATIVE APPROACHES
o Transactional Memory
o OpenMP
o Functional Programming Languages
Contents
A cooperating process is one that can affect or be affected by other processes executing in the system. Cooperating processes can either directly share a logical address space (that is, both code and data) or be allowed to share data only through files or messages. The former case is achieved through the use of threads, discussed in Chapter 4. Concurrent access to shared data may result in data inconsistency, however. In this chapter, we discuss various mechanisms to ensure the orderly execution of cooperating processes that share a logical address space, so that data consistency is maintained.
BACKGROUND
We’ve already seen that processes can execute concurrently or in parallel. Section 3.2.2 introduced the role of process scheduling and described how the CPU scheduler switches rapidly between processes to provide concurrent execution. This means that one process may only partially complete execution before another process is scheduled. In fact, a process may be interrupted at any point in its instruction stream, and the processing core may be assigned to execute instructions of another process. Additionally, Section 4.2 introduced parallel execution, in which two instruction streams (representing different processes) execute simultaneously on separate processing cores. In this chapter, we explain how concurrent or parallel execution can contribute to issues involving the integrity of data shared by several processes.
In Chapter 3, we developed a model of a system consisting of cooperating sequential processes or threads, all running asynchronously and possibly sharing data. We illustrated this model with the producer–consumer problem, which is representative of operating systems. Specifically, in Section 3.4.1, we described how a bounded buffer could be used to enable processes to share memory.
Coming to the bounded-buffer problem, as we pointed out, our original solution allowed at most BUFFER_SIZE − 1 items in the buffer at the same time. Suppose we want to modify the algorithm to remedy this deficiency. One possibility is to add an integer variable counter, initialized to 0. counter is incremented every time we add a new item to the buffer and is decremented every time we remove one item from the buffer. The code for the producer and consumer processes can be modified as follows:
Although the producer and consumer routines shown above are correct separately, they may not function correctly when executed concurrently. As an illustration, suppose that the value of the variable counter is currently 5 and that the producer and consumer processes concurrently execute the statements “counter++” and “counter--”. Following the execution of these two statements, the value of the variable counter may be 4, 5, or 6! The only correct result, though, is counter == 5, which is generated correctly if the producer and consumer execute separately.
Note: Page 205 of the 9th edition (which we have read well) shows why the value of counter may be incorrect. It is due to the way the statements “counter++” and “counter--” are implemented in assembly (and hence machine language) on a typical machine. Since we know it well, we don't clutter the content here. The following starts after that part in the book.
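For completeness, the skipped demonstration can be replayed in a few lines: model counter++ and counter-- as their three machine steps (load, add or subtract, store) and interleave them by hand in one of the orders a scheduler could produce:

```python
counter = 5

register1 = counter          # producer: register1 = counter       (5)
register1 = register1 + 1    # producer: register1 = register1 + 1 (6)
register2 = counter          # consumer: register2 = counter       (still 5)
register2 = register2 - 1    # consumer: register2 = register2 - 1 (4)
counter = register1          # producer: counter = register1       (6)
counter = register2          # consumer: counter = register2       (4!)

print(counter)  # 4, not the correct value 5
```

Storing the producer's register last instead would yield 6; only a non-interleaved execution yields 5.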
We would arrive at this incorrect state because we allowed both processes to manipulate the variable counter concurrently. A situation like this, where several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which the access takes place, is called a race condition. To guard against the race condition above, we need to ensure that only one process at a time can be manipulating the variable counter. To make such a guarantee, we require that the processes be synchronized in some way.
Situations such as the one just described occur frequently in operating systems as different parts of the system manipulate resources. Furthermore, as we have emphasized in earlier chapters, the growing importance of multicore systems has brought an increased emphasis on developing multithreaded applications. In such applications, several threads—which are quite possibly sharing data—are running in parallel on different processing cores. Clearly, we want any changes that result from such activities not to interfere with one another. Because of the importance of this issue, we devote a major portion of this chapter to process synchronization and coordination among cooperating processes.
THE CRITICAL SECTION PROBLEM
We begin our consideration of process synchronization by discussing the so-called critical-section problem. Consider a system consisting of n processes {P0, P1, ..., Pn−1}. Each process has a segment of code, called a critical section, in which the process may be changing common variables, updating a table, writing a file, and so on. The important feature of the system is that, when one process is executing in its critical section, no other process is allowed to execute in its critical section. That is, no two processes are executing in their critical sections at the same time. The critical-section problem is to design a protocol that the processes can use to cooperate. Each process must request permission to enter its critical section. The section of code implementing this request is the entry section. The critical section may be followed by an exit section. The remaining code is the remainder section. The general structure of a typical process Pi is shown in Figure 5.1. The entry section and exit section are enclosed in boxes to highlight these important segments of code.
A solution to the critical-section problem must satisfy the following three requirements:
Mutual exclusion. If process Pi is executing in its critical section, then no other processes can be executing in their critical sections.
Progress. If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in deciding which will enter its critical section next, and this selection cannot be postponed indefinitely.
Bounded waiting. There exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.
At a given point in time, many kernel-mode processes may be active in the operating system. As a result, the code implementing an operating system (kernel code) is subject to several possible race conditions. Consider as an example a kernel data structure that maintains a list of all open files in the system. This list must be modified when a new file is opened or closed (adding the file to the list or removing it from the list). If two processes were to open files simultaneously, the separate updates to this list could result in a race condition. Other kernel data structures that are prone to possible race conditions include structures for maintaining memory allocation, for maintaining process lists, and for interrupt handling. It is up to kernel developers to ensure that the operating system is free from such race conditions.
Two general approaches are used to handle critical sections in operating systems: preemptive kernels and nonpreemptive kernels. A preemptive kernel allows a process to be preempted while it is running in kernel mode. A nonpreemptive kernel does not allow a process running in kernel mode to be preempted; a kernel-mode process will run until it exits kernel mode, blocks, or voluntarily yields control of the CPU. Obviously, a nonpreemptive kernel is essentially free from race conditions on kernel data structures, as only one process is active in the kernel at a time. We cannot say the same about preemptive kernels, so they must be carefully designed to ensure that shared kernel data are free from race conditions. Preemptive kernels are especially difficult to design for SMP architectures, since in these environments it is possible for two kernel-mode processes to run simultaneously on different processors.
PETERSON’S SOLUTION
We now illustrate a classic software-based solution to the critical-section problem known as Peterson’s solution. Because of the way modern computer architectures perform basic machine-language instructions, such as load and store, there are no guarantees that Peterson’s solution will work correctly on such architectures. However, we present the solution because it provides a good algorithmic description of solving the critical-section problem and illustrates some of the complexities involved in designing software that addresses the requirements of mutual exclusion, progress, and bounded waiting.
Peterson’s solution is restricted to two processes that alternate execution between their critical sections and remainder sections. The processes are numbered P0 and P1. For convenience, when presenting Pi, we use Pj to denote the other process; that is, j equals 1 − i. Peterson’s solution requires the two processes to share two data items: int turn; and boolean flag[2];
The variable turn indicates whose turn it is to enter its critical section. That is, if turn == i, then process Pi is allowed to execute in its critical section. The flag array is used to indicate if a process is ready to enter its critical section. For example, if flag[i] is true, this value indicates that Pi is ready to enter its critical section. With an explanation of these data structures complete, we are now ready to describe the algorithm shown in Figure 5.2. To enter the critical section, process Pi first sets flag[i] to be true and then sets turn to the value j, thereby asserting that if the other process wishes to enter the critical section, it can do so. If both processes try to enter at the same time, turn will be set to both i and j at roughly the same time. Only one of these assignments will last; the other will occur but will be overwritten immediately. The eventual value of turn determines which of the two processes is allowed to enter its critical section first.
We now prove that this solution is correct. We need to show that: 1. Mutual exclusion is preserved. 2. The progress requirement is satisfied. 3. The bounded-waiting requirement is met.
To prove property 1, we note that each Pi enters its critical section only if either flag[j] == false or turn == i. Also note that, if both processes can be executing in their critical sections at the same time, then flag[0] == flag[1] == true. These two observations imply that P0 and P1 could not have successfully executed their while statements at about the same time, since the value of turn can be either 0 or 1 but cannot be both. Hence, one of the processes—say, Pj—must have successfully executed the while statement, whereas Pi had to execute at least one additional statement (“turn == j”). However, at that time, flag[j] == true and turn == j, and this condition will persist as long as Pj is in its critical section; as a result, mutual exclusion is preserved.
To prove properties 2 and 3, we note that a process Pi can be prevented from entering the critical section only if it is stuck in the while loop with the condition flag[j] == true and turn == j; this loop is the only one possible. If Pj is not ready to enter the critical section, then flag[j] == false, and Pi can enter its critical section. If Pj has set flag[j] to true and is also executing in its while statement, then either turn == i or turn == j. If turn == i, then Pi will enter the critical section. If turn == j, then Pj will enter the critical section. However, once Pj exits its critical section, it will reset flag[j] to false, allowing Pi to enter its critical section. If Pj resets flag[j] to true, it must also set turn to i. Thus, since Pi does not change the value of the variable turn while executing the while statement, Pi will enter the critical section (progress) after at most one entry by Pj (bounded waiting).
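The entry and exit protocol of Figure 5.2 can be exercised directly. The sketch below (my own illustration, not the book's figure) runs two Python threads through Peterson's protocol around a shared counter. Note that CPython's global interpreter lock serializes bytecodes and so hides the load/store reordering the text warns about; this shows only the algorithm's logic, not a production technique:

```python
import threading

flag = [False, False]   # flag[i]: Pi is ready to enter its critical section
turn = 0                # whose turn it is to enter
counter = 0
N = 200                 # critical-section entries per process

def peterson(i):
    global turn, counter
    j = 1 - i
    for _ in range(N):
        flag[i] = True              # entry section
        turn = j                    # yield priority to the other process
        while flag[j] and turn == j:
            pass                    # busy wait
        counter += 1                # critical section
        flag[i] = False             # exit section

threads = [threading.Thread(target=peterson, args=(i,)) for i in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400: every increment happened under mutual exclusion
```

The busy-wait loop is exactly the `while (flag[j] && turn == j);` of the book's figure.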
PETERSON’S SOLUTION (WIKIPEDIA)
The algorithm uses two variables, flag and turn. A flag[n] value of true indicates that process n wants to enter the critical section. Entrance to the critical section is granted for process P0 if P1 does not want to enter its critical section or if P1 has given priority to P0 by setting turn to 0.
The algorithm satisfies the three essential criteria to solve the critical-section problem, provided that changes to the variables turn, flag[0], and flag[1] propagate immediately and atomically. The while condition works even with preemption.
The three criteria are mutual exclusion, progress, and bounded waiting. Since turn can take on one of two values, it can be replaced by a single bit, meaning that the algorithm requires only three bits of memory.
Mutual exclusion
P0 and P1 can never be in the critical section at the same time: If P0 is in its critical section, then flag[0] is true. In addition, either flag[1] is false (meaning P1 has left its critical section), or turn is 0 (meaning P1 is just now trying to enter the critical section, but graciously waiting), or P1 is at label P1_gate (trying to enter its critical section, after setting flag[1] to true but before setting turn to 0, and busy waiting). So if both processes are in their critical sections, then we conclude that the state must satisfy flag[0] and flag[1] and turn = 0 and turn = 1. No state can satisfy both turn = 0 and turn = 1, so there can be no state where both processes are in their critical sections. (This recounts an argument that is made rigorous in [5].)
Progress
Progress is defined as the following: if no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in making the decision as to which process will enter its critical section next. This selection cannot be postponed indefinitely.[3] A process cannot immediately re-enter the critical section if the other process has set its flag to say that it would like to enter its critical section.
Bounded waiting
Bounded waiting, or bounded bypass, means that the number of times a process is bypassed by another process after it has indicated its desire to enter the critical section is bounded by a function of the number of processes in the system.[3][4] In Peterson's algorithm, a process will never wait longer than one turn for entrance to the critical section: after giving priority to the other process, this process will run to completion and set its flag to 0, thereby allowing the other process to enter the critical section.
SYNCHRONIZATION HARDWARE
As mentioned, software-based solutions such as Peterson’s are not guaranteed to work on modern computer architectures. In the following discussions, we explore several more solutions to the critical-section problem using techniques ranging from hardware to software-based APIs available to both kernel developers and application programmers. All these solutions are based on the premise of locking—that is, protecting critical regions through the use of locks. As we shall see, the designs of such locks can be quite sophisticated. We start by presenting some simple hardware instructions that are available on many systems and showing how they can be used effectively in solving the critical-section problem. Hardware features can make any programming task easier and improve system efficiency.
The critical-section problem could be solved simply in a single-processor environment if we could prevent interrupts from occurring while a shared variable was being modified. In this way, we could be sure that the current sequence of instructions would be allowed to execute in order without preemption. No other instructions would be run, so no unexpected modifications could be made to the shared variable. This is often the approach taken by nonpreemptive kernels. Unfortunately, this solution is not as feasible in a multiprocessor environment. Disabling interrupts on a multiprocessor can be time consuming, since the message is passed to all the processors. This message passing delays entry into each critical section, and system efficiency decreases. Also consider the effect on a system’s clock if the clock is kept updated by interrupts.
Many modern computer systems therefore provide special hardware instructions that allow us either to test and modify the content of a word or to swap the contents of two words atomically—that is, as one uninterruptible unit. We can use these special instructions to solve the critical-section problem in a relatively simple manner. We abstract the main concepts behind these types of instructions by describing the test_and_set() and compare_and_swap() instructions.
The atomic test_and_set() instruction can be defined as shown in Figure 5.3. If the machine supports the test_and_set() instruction, then we can implement mutual exclusion by declaring a boolean variable lock, initialized to false. The structure of process Pi is shown in Figure 5.4.
The compare_and_swap() instruction, in contrast to the test_and_set() instruction, operates on three operands; it is defined in Figure 5.5. The operand value is set to new_value only if the expression (*value == expected) is true. Regardless, compare_and_swap() always returns the original value of the variable value. Like the test_and_set() instruction, compare_and_swap()
is executed atomically. Mutual exclusion can be provided as follows: a global variable (lock) is declared and is initialized to 0. The first process that invokes compare_and_swap() will set lock to 1. It will then enter its critical section, because the original value of lock was equal to the expected value of 0. Subsequent calls to compare_and_swap() will not succeed, because lock now is not equal to the expected value of 0. When a process exits its critical section, it sets lock back to 0, which allows another process to enter its critical section. The structure of process Pi is shown in Figure 5.6.
Although these algorithms satisfy the mutual-exclusion requirement, they do not satisfy the bounded-waiting requirement. In Figure 5.7, we present another algorithm using the test_and_set() instruction that satisfies all the critical-section requirements. The common data structures are:
boolean waiting[n];
boolean lock;
These data structures are initialized to false. To prove that the mutual-exclusion requirement is met, we note that process Pi can enter its critical section only if either waiting[i] == false or key == false. The value of key can become false only if the test_and_set() is executed. The first process to execute the test_and_set() will find key == false; all others must wait. The variable waiting[i] can become false only if another process leaves its critical section; only one waiting[i] is set to false, maintaining the mutual-exclusion requirement. To prove that the progress requirement is met, we note that the arguments presented for mutual exclusion also apply here, since a process exiting the critical section either sets lock to false or sets waiting[j] to false. Both allow a process that is waiting to enter its critical section to proceed. To prove that the bounded-waiting requirement is met, we note that, when a process leaves its critical section, it scans the array waiting in the cyclic ordering (i + 1, i + 2, ..., n − 1, 0, ..., i − 1). It designates the first process in this ordering that is in the entry section (waiting[j] == true) as the next one to enter the critical section. Any process waiting to enter its critical section will thus do so within n − 1 turns.
Details describing the implementation of the atomic test_and_set() and compare_and_swap() instructions are discussed more fully in books on computer architecture.
MUTEX LOCKS
The hardware-based solutions to the critical-section problem presented in Section 5.4 are complicated as well as generally inaccessible to application programmers. Instead, operating-system designers build software tools to solve the critical-section problem. The simplest of these tools is the mutex lock. (In fact, the term mutex is short for mutual exclusion.) We use the mutex lock to protect critical regions and thus prevent race conditions. That is, a process must acquire the lock before entering a critical section; it releases the lock when it exits the critical section. The acquire() function acquires the lock, and the release() function releases the lock, as illustrated in Figure 5.8.
A mutex lock has a boolean variable available whose value indicates if the lock is available or not. If the lock is available, a call to acquire() succeeds, and the lock is then considered unavailable. A process that attempts to acquire an unavailable lock is blocked until the lock is released. The definitions of acquire() and release() are as follows:
Calls to either acquire() or release() must be performed atomically. Thus, mutex locks are often implemented using one of the hardware mechanisms described in Section 5.4, and we leave the description of this technique as an exercise.
The main disadvantage of the implementation given here is that it requires busy waiting. While a process is in its critical section, any other process that tries to enter its critical section must loop continuously in the call to acquire(). In fact, this type of mutex lock is also called a spinlock because the process “spins” while waiting for the lock to become available. (We see the same issue with the code examples illustrating the test_and_set() instruction and the compare_and_swap() instruction.) This continual looping is clearly a problem in a real multiprogramming system, where a single CPU is shared among many processes. Busy waiting wastes CPU cycles that some other process might be able to use productively.
Spinlocks do have an advantage, however, in that no context switch is required when a process must wait on a lock, and a context switch may take considerable time. Thus, when locks are expected to be held for short times, spinlocks are useful. They are often employed on multiprocessor systems where one thread can “spin” on one processor while another thread performs its critical section on another processor. Later in this chapter (Section 5.7), we examine how mutex locks can be used to solve classical synchronization problems. We also discuss how these locks are used in several operating systems, as well as in Pthreads.
SEMAPHORES
Mutex locks, as we mentioned earlier, are generally considered the simplest of synchronization tools. In this section, we examine a more robust tool that can behave similarly to a mutex lock but can also provide more sophisticated ways for processes to synchronize their activities. A semaphore S is an integer variable that, apart from initialization, is accessed only through two standard atomic operations: wait() and signal(). The definitions of wait() and signal() are as follows:
All modifications to the integer value of the semaphore in the wait() and signal() operations must be executed indivisibly. That is, when one process modifies the semaphore value, no other process can simultaneously modify that same semaphore value. In addition, in the case of wait(S), the testing of the integer value of S (S ≤ 0), as well as its possible modification (S--), must be executed without interruption. We shall see how these operations can be implemented in Section 5.6.2. First, let’s see how semaphores can be used.
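The classical busy-waiting definitions the text refers to (the figure was not captured in these notes) can be sketched like this; remember that each whole operation must be atomic in a real implementation, which this plain Python class does not guarantee:

```python
class BusyWaitSemaphore:
    """Classical (busy-waiting) definition of wait()/signal(); a sketch."""
    def __init__(self, value):
        self.S = value

    def wait(self):
        while self.S <= 0:   # busy wait while no "resources" remain
            pass
        self.S -= 1

    def signal(self):
        self.S += 1

s = BusyWaitSemaphore(1)
s.wait()
print(s.S)   # 0
s.signal()
print(s.S)   # 1
```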
Semaphore Usage
Operating systems often distinguish between counting and binary semaphores. The value of a counting semaphore can range over an unrestricted domain. The value of a binary semaphore can range only between 0 and 1. Thus, binary semaphores behave similarly to mutex locks. In fact, on systems that do not provide mutex locks, binary semaphores can be used instead for providing mutual exclusion.
Counting semaphores can be used to control access to a given resource consisting of a finite number of instances. The semaphore is initialized to the number of resources available. Each process that wishes to use a resource performs a wait() operation on the semaphore (thereby decrementing the count). When a process releases a resource, it performs a signal() operation (incrementing the count). When the count for the semaphore goes to 0, all resources are being used. After that, processes that wish to use a resource will block until the count becomes greater than 0.
We can also use semaphores to solve various synchronization problems. For example, consider two concurrently running processes: P1 with a statement S1 and P2 with a statement S2. Suppose we require that S2 be executed only after S1 has completed. We can implement this scheme readily by letting P1 and P2 share a common semaphore synch, initialized to 0. In process P1, we insert the statements
S1;
signal(synch);
In process P2, we insert the statements
wait(synch);
S2;
Because synch is initialized to 0, P2 will execute S2 only after P1 has invoked signal(synch), which is after statement S1 has been executed.
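The same S1-before-S2 ordering can be sketched with Python threads (the thread bodies and the order list are illustrative additions; Semaphore.acquire/release correspond to wait/signal):

```python
import threading

synch = threading.Semaphore(0)   # initialized to 0, as in the text
order = []

def p1():
    order.append("S1")           # statement S1
    synch.release()              # signal(synch)

def p2():
    synch.acquire()              # wait(synch): blocks until P1 signals
    order.append("S2")           # statement S2

t2 = threading.Thread(target=p2)
t1 = threading.Thread(target=p1)
t2.start()
t1.start()                       # even though P2 starts first, S2 waits for S1
t1.join()
t2.join()
print(order)  # ['S1', 'S2']
```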
Semaphore Implementation
Recall that the implementation of mutex locks discussed in Section 5.5 suffers from busy waiting. The definitions of the wait() and signal() semaphore operations just described present the same problem. To overcome the need for busy waiting, we can modify the definition of the wait() and signal() operations as follows: When a process executes the wait() operation and finds that the semaphore value is not positive, it must wait. However, rather than engaging in busy waiting, the process can block itself. The block operation places a process into a waiting queue associated with the semaphore, and the state of the process is switched to the waiting state. Then control is transferred to the CPU scheduler, which selects another process to execute.
A process that is blocked, waiting on a semaphore S, should be restarted when some other process executes a signal() operation. The process is restarted by a wakeup() operation, which changes the process from the waiting state to the ready state. The process is then placed in the ready queue. (The CPU may or may not be switched from the running process to the newly ready process, depending on the CPU-scheduling algorithm.) To implement semaphores under this definition, we define a semaphore as follows:
Each semaphore has an integer value and a list of processes. When a process must wait on a semaphore, it is added to the list of processes. A signal() operation removes one process from the list of waiting processes and awakens that process. Now, the wait() and signal() semaphore operations can be defined as:
The block() operation suspends the process that invokes it. The wakeup(P) operation resumes the execution of a blocked process P. These two operations are provided by the operating system as basic system calls.
Note that in this implementation, semaphore values may be negative, whereas semaphore values are never negative under the classical definition of semaphores with busy waiting. If a semaphore value is negative, its magnitude is the number of processes waiting on that semaphore. This fact results from switching the order of the decrement and the test in the implementation of the wait() operation.
The list of waiting processes can be easily implemented by a link field in each process control block (PCB). Each semaphore contains an integer value and a pointer to a list of PCBs. One way to add and remove processes from the list so as to ensure bounded waiting is to use a FIFO queue, where the semaphore contains both head and tail pointers to the queue. In general, however, the list can use any queueing strategy.
It is critical that semaphore operations be executed atomically. We must guarantee that no two processes can execute wait() and signal() operations on the same semaphore at the same time. This is a critical-section problem; and in a single-processor environment, we can solve it by simply inhibiting interrupts during the time the wait() and signal() operations are executing. This scheme works in a single-processor environment because, once interrupts are inhibited, instructions from different processes cannot be interleaved. Only the currently running process executes until interrupts are reenabled and the scheduler can regain control. In a multiprocessor environment, interrupts must be disabled on every processor. Otherwise, instructions from different processes (running on different processors) may be interleaved in some arbitrary way. Disabling interrupts on every processor can be a difficult task and furthermore can seriously diminish performance. Therefore,
SMP systems must provide alternative locking techniques—such as compare_and_swap() or spinlocks—to ensure that wait() and signal() are performed atomically.
It is important to admit that we have not completely eliminated busy waiting with this definition of the wait() and signal() operations. Rather, we have moved busy waiting from the entry section to the critical sections of application programs. Furthermore, we have limited busy waiting to the critical sections of the wait() and signal() operations, and these sections are short (if properly coded, they should be no more than about ten instructions). Thus, the critical section is almost never occupied, and busy waiting occurs rarely, and then for only a short time. An entirely different situation exists with application programs whose critical sections may be long (minutes or even hours) or may almost always be occupied. In such cases, busy waiting is extremely inefficient.
Deadlocks and Starvation
The implementation of a semaphore with a waiting queue may result in a situation where two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes. The event in question is the execution of a signal() operation. When such a state is reached, these processes are said to be deadlocked. To illustrate this, consider a system consisting of two processes, P0 and P1, each accessing two semaphores, S and Q, set to the value 1:
Suppose that P0 executes wait(S) and then P1 executes wait(Q). When P0 executes wait(Q), it must wait until P1 executes signal(Q). Similarly, when P1 executes wait(S), it must wait until P0 executes signal(S). Since these signal() operations cannot be executed, P0 and P1 are deadlocked.
Another problem related to deadlocks is indefinite blocking or starvation, a situation in which processes wait indefinitely within the semaphore. Indefinite blocking may occur if we remove processes from the list associated with a semaphore in LIFO (last-in, first-out) order.
Priority Inversion
A scheduling challenge arises when a higher-priority process needs to read or modify kernel data that are currently being accessed by a lower-priority process—or a chain of lower-priority processes. Since kernel data are typically protected with a lock, the higher-priority process will have to wait for a lower-priority one to finish with the resource. The situation becomes more complicated if the lower-priority process is preempted in favor of another process with a higher priority. As an example, assume we have three processes—L, M, and H—whose priorities follow the order L < M < H. Assume that process H requires resource R, which is currently being accessed by process L. Ordinarily, process H would wait for L to finish using resource R. However, now suppose that process M becomes runnable, thereby preempting process L. Indirectly, a process with a lower priority—process M—has affected how long process H must wait for L to relinquish resource R.
This problem is known as priority inversion. It occurs onlyinsystems withmore thantwo priorities, soone solution is to have only two
priorities. That is insufficient for most general -purpose operating systems, however. Typically these systems solve the problem by
implementing a priority-inheritance protocol. According to thisprotocol, allprocesses that are accessing resources needed by a higher-
priorityprocess inherit the higher priorityuntiltheyare finishedwiththe resources inquestion. When they are finished, their priorities
revert to their original values. In the example above, a priority-inheritance protocol would allow process L to temporarilyinherit the priority
of process H, therebypreventing process Mfrom preempting its execution. When process L had finishedusing resource R, it would relinquish
its inherited priorityfrom H andassume its originalpriority. Because resource R would now be available, process H —not M—wouldrun next.
CLASSIC PROBLEMS OF SYNCHRONIZATION
In this section, we present a number of synchronization problems as examples of a large class of concurrency-control problems. These problems are used for testing nearly every newly proposed synchronization scheme. In our solutions to the problems, we use semaphores for synchronization, since that is the traditional way to present such solutions. However, actual implementations of these solutions could use mutex locks in place of binary semaphores.
The Bounded-Buffer Problem
The bounded-buffer problem was introduced in Section 5.1; it is commonly used to illustrate the power of synchronization primitives. Here, we present a general structure of this scheme without committing ourselves to any particular implementation. We provide a related programming project in the exercises at the end of the chapter.
In our problem, the producer and consumer processes share the following data structures:
int n;
semaphore mutex = 1;
semaphore empty = n;
semaphore full = 0;
We assume that the pool consists of n buffers, each capable of holding one item. The mutex semaphore provides mutual exclusion for accesses to the buffer pool and is initialized to the value 1. The empty and full semaphores count the number of empty and full buffers. The semaphore empty is initialized to the value n; the semaphore full is initialized to the value 0.
The code for the producer process is shown in Figure 5.9, and the code for the consumer
process is shown in Figure 5.10. Note the symmetry between the producer and the consumer. We can interpret this code as the producer producing full buffers for the consumer or as the consumer producing empty buffers for the producer.
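Figures 5.9 and 5.10 are not reproduced in these notes; the following is a minimal Python sketch of the same structure, with threading.Semaphore standing in for the counting semaphores (names such as in_idx and out_idx are our own, not the book's):

```python
import threading

N = 5  # number of buffer slots (the book's n)
buffer, in_idx, out_idx = [None] * N, 0, 0

mutex = threading.Semaphore(1)   # mutual exclusion on the buffer pool
empty = threading.Semaphore(N)   # counts empty slots, initialized to n
full  = threading.Semaphore(0)   # counts full slots, initialized to 0

consumed = []

def producer(items):
    global in_idx
    for item in items:
        empty.acquire()          # wait(empty): block if no empty slot
        with mutex:              # wait(mutex) ... signal(mutex)
            buffer[in_idx] = item
            in_idx = (in_idx + 1) % N
        full.release()           # signal(full): one more full slot

def consumer(count):
    global out_idx
    for _ in range(count):
        full.acquire()           # wait(full): block if nothing to consume
        with mutex:
            item = buffer[out_idx]
            out_idx = (out_idx + 1) % N
        empty.release()          # signal(empty): one more empty slot
        consumed.append(item)

p = threading.Thread(target=producer, args=(list(range(20)),))
c = threading.Thread(target=consumer, args=(20,))
p.start(); c.start(); p.join(); c.join()
print(consumed == list(range(20)))  # True: items arrive in FIFO order
```

Note the symmetry the text describes: the producer does wait(empty) / signal(full), and the consumer does the mirror image, wait(full) / signal(empty).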
The Readers–Writers Problem
Suppose that a database is to be shared among several concurrent processes. Some of these processes may want only to read the database, whereas others may want to update (that is, to read and write) the database. We distinguish between these two types of processes by referring to the former as readers and to the latter as writers. Obviously, if two readers access the shared data simultaneously, no adverse effects will result. However, if a writer and some other process (either a reader or a writer) access the database simultaneously, chaos may ensue. To ensure that these difficulties do not arise, we require that the writers have exclusive access to the shared database while writing to the database. This synchronization problem is referred to as the readers–writers problem. Since it was originally stated, it has been used to test nearly every new synchronization primitive.
The readers–writers problem has several variations, all involving priorities. The simplest one, referred to as the first readers–writers problem, requires that no reader be kept waiting unless a writer has already obtained permission to use the shared object. In other words, no reader should wait for other readers to finish simply because a writer is waiting. The second readers–writers problem requires that, once a writer is ready, that writer perform its write as soon as possible. In other words, if a writer is waiting to access the object, no new readers may start reading.
A solution to either problem may result in starvation. In the first case, writers may starve; in the second case, readers may starve. For this reason, other variants of the problem have been proposed. Next, we present a solution to the first readers–writers problem. See the bibliographical notes at the end of the chapter for references describing starvation-free solutions to the second readers–writers problem.
In the solution to the first readers–writers problem, the reader processes share the following data structures:
semaphore rw_mutex = 1;
semaphore mutex = 1;
int read_count = 0;
The semaphores mutex and rw_mutex are initialized to 1; read_count is initialized to 0. The semaphore rw_mutex is common to both reader and writer processes. The mutex semaphore is used to ensure mutual exclusion when the variable read_count is updated. The read_count variable keeps track of how many processes are currently reading the object. The semaphore rw_mutex functions as a mutual-exclusion semaphore for the writers. It is also used by the first or last reader that enters or exits the critical section. It is not used by readers who enter or exit while other readers are in their critical sections.
The code for a writer process is shown in Figure 5.11; the code for a reader process is shown in Figure 5.12. Note that, if a writer is in the critical section and n readers are waiting, then one reader is queued on rw_mutex, and n − 1 readers are queued on mutex. Also observe that, when a writer executes signal(rw_mutex), we may resume the execution of either the waiting readers or a single waiting writer. The selection is made by the scheduler.
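Figures 5.11 and 5.12 are likewise not reproduced here; a rough Python rendering of the first readers–writers solution, using threading.Semaphore (the shared dictionary and the observed list are illustrative additions, not part of the book's code), might look like this:

```python
import threading

rw_mutex = threading.Semaphore(1)  # writers' mutual exclusion; also taken by first/last reader
mutex = threading.Semaphore(1)     # protects read_count
read_count = 0
shared = {"value": 0}              # the "database"
observed = []                      # values seen by readers

def writer(v):
    with rw_mutex:                 # wait(rw_mutex) ... write ... signal(rw_mutex)
        shared["value"] = v

def reader():
    global read_count
    with mutex:
        read_count += 1
        if read_count == 1:        # first reader locks out writers
            rw_mutex.acquire()
    observed.append(shared["value"])  # reading is performed here
    with mutex:
        read_count -= 1
        if read_count == 0:        # last reader lets writers back in
            rw_mutex.release()

threads = [threading.Thread(target=writer, args=(i,)) for i in (1, 2, 3)]
threads += [threading.Thread(target=reader) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(all(v in (0, 1, 2, 3) for v in observed))  # True: every read sees a consistent value
```

This matches the queueing behavior described above: while a writer holds rw_mutex, the first arriving reader blocks on rw_mutex and every later reader blocks on mutex.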
The readers–writers problem and its solutions have been generalized to provide reader–writer locks on some systems. Acquiring a reader–writer lock requires specifying the mode of the lock: either read or write access. When a process wishes only to read shared data, it requests the reader–writer lock in read mode. A process wishing to modify the shared data must request the lock in write mode. Multiple processes are permitted to concurrently acquire a reader–writer lock in read mode, but only one process may acquire the lock for writing, as exclusive access is required for writers.
Reader–writer locks are most useful in the following situations:
o In applications where it is easy to identify which processes only read shared data and which processes only write shared data.
o In applications that have more readers than writers. This is because reader–writer locks generally require more overhead to establish than semaphores or mutual-exclusion locks. The increased concurrency of allowing multiple readers compensates for the overhead involved in setting up the reader–writer lock.
The Dining-Philosophers Problem
Consider five philosophers who spend their lives thinking and eating. The philosophers share a circular table surrounded by five chairs, each belonging to one philosopher. In the center of the table is a bowl of rice, and the table is laid with five single chopsticks (Figure 5.13). When a philosopher thinks, she does not
interact with her colleagues. From time to time, a philosopher gets hungry and tries to pick up the two chopsticks that are closest to her (the chopsticks that are between her and her left and right neighbors). A philosopher may pick up only one chopstick at a time. Obviously, she cannot pick up a chopstick that is already in the hand of a neighbor. When a hungry philosopher has both her chopsticks at the same time, she eats without releasing the chopsticks. When she is finished eating, she puts down both chopsticks and starts thinking again.
The dining-philosophers problem is considered a classic synchronization problem neither because of its practical importance nor because computer scientists dislike philosophers but because it is an example of a large class of concurrency-control problems. It is a simple representation of the need to allocate several resources among several processes in a deadlock-free and starvation-free manner.
One simple solution is to represent each chopstick with a semaphore. A philosopher tries to grab a chopstick by executing a wait() operation on that semaphore. She releases her chopsticks by executing the signal() operation on the appropriate semaphores. Thus, the shared data are
semaphore chopstick[5];
where all the elements of chopstick are initialized to 1. The structure of philosopher i is shown in Figure 5.14.
Although this solution guarantees that no two neighbors are eating simultaneously, it nevertheless must be rejected because it could create a deadlock. Suppose that all five philosophers become hungry at the same time and each grabs her left chopstick. All the elements of chopstick will now be equal to 0. When each philosopher tries to grab her right chopstick, she will be delayed forever.
Several possible remedies to the deadlock problem are the following:
o Allow at most four philosophers to be sitting simultaneously at the table.
o Allow a philosopher to pick up her chopsticks only if both chopsticks are available (to do this, she must pick them up in a critical section).
o Use an asymmetric solution—that is, an odd-numbered philosopher picks up first her left chopstick and then her right chopstick, whereas an even-numbered philosopher picks up her right chopstick and then her left chopstick.
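The asymmetric remedy can be sketched in Python as follows. This is an illustrative adaptation rather than the book's Figure 5.14: each philosopher eats a fixed number of rounds, and the run terminates because the reversed acquisition order for even-numbered philosophers breaks the circular wait.

```python
import threading

# One semaphore per chopstick, each initialized to 1.
chopstick = [threading.Semaphore(1) for _ in range(5)]
meals = [0] * 5

def philosopher(i, rounds=10):
    left, right = i, (i + 1) % 5
    # Odd-numbered philosophers pick up left then right; even-numbered
    # philosophers reverse the order, which prevents the circular wait
    # in which everyone holds a left chopstick and waits for the right.
    first, second = (left, right) if i % 2 == 1 else (right, left)
    for _ in range(rounds):
        with chopstick[first]:       # wait(chopstick[first])
            with chopstick[second]:  # wait(chopstick[second])
                meals[i] += 1        # eat
        # chopsticks released (signal); think

threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(meals)  # every philosopher finishes: [10, 10, 10, 10, 10]
```

By contrast, the naive symmetric version (everyone picking up left then right) can hang exactly as the text describes.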
In the next section on "Monitors", we present a solution to the dining-philosophers problem that ensures freedom from deadlocks. Note, however, that any satisfactory solution to the dining-philosophers problem must guard against the possibility that one of the philosophers will starve to death. A deadlock-free solution does not necessarily eliminate the possibility of starvation.
MONITORS
Background: Although semaphores provide a convenient and effective mechanism for process synchronization, using them incorrectly can result in timing errors that are difficult to detect, since these errors happen only if particular execution sequences take place and these sequences do not always occur. We have seen an example of such errors in the use of counters in our solution to the producer–consumer problem (Section 5.1 "Background" of this chapter). In that example, the timing problem happened only rarely, and even then the counter value appeared to be reasonable—off by only 1. Nevertheless, the solution is obviously not an acceptable one. It is for this reason that semaphores were introduced in the first place.
Unfortunately, such timing errors can still occur when semaphores are used. To illustrate how, we review the semaphore solution to the critical-section problem. All processes share a semaphore variable mutex, which is initialized to 1. Each process must execute wait(mutex) before entering the critical section and signal(mutex) afterward. If this sequence is not observed, two processes may be in their critical sections simultaneously. Next, we examine the various difficulties that may result. Note that these difficulties will arise even if a single process is not well behaved.
Suppose that a process interchanges the order in which the wait() and signal() operations on the semaphore mutex are executed, resulting in the following execution:
signal(mutex);
... critical section ...
wait(mutex);
In this situation, several processes may be executing in their critical sections simultaneously, violating the mutual-exclusion requirement. This error may be discovered only if several processes are simultaneously active in their critical sections. Note that this situation may not always be reproducible.
Suppose that a process replaces signal(mutex) with wait(mutex). That is, it executes:
wait(mutex);
... critical section ...
wait(mutex);
In this case, a deadlock will occur.
Suppose that a process omits the wait(mutex), or the signal(mutex), or both. In this case, either mutual exclusion is violated or a deadlock will occur.
These examples illustrate that various types of errors can be generated easily when programmers use semaphores incorrectly to solve the critical-section problem. Similar problems may arise in the other synchronization models discussed in Section 5.7 (Classic problems). To deal with such errors, researchers have developed high-level language constructs. In this section, we describe one fundamental high-level synchronization construct—the monitor type.
From Wikipedia
In concurrent programming, a monitor is a synchronization construct that allows threads to have both mutual exclusion and the ability to wait (block) for a certain condition to become true. Monitors also have a mechanism for signalling other threads that their condition has been met. A monitor consists of a mutex (lock) object and condition variables. A condition variable is basically a container of threads that are waiting for a certain condition. Monitors provide a mechanism for threads to temporarily give up exclusive access in order to wait for some condition to be met, before regaining exclusive access and resuming their task.
Another definition of monitor is a thread-safe class, object, or module that uses wrapped mutual exclusion in order to safely allow access to a method or variable by more than one thread. The defining characteristic of a monitor is that its methods are executed with mutual exclusion: at each point in time, at most one thread may be executing any of its methods. Using condition variables, it can also provide the ability for threads to wait on a certain condition (thus using the above definition of a "monitor"). For the rest of this article, this sense of "monitor" will be referred to as a "thread-safe object/class/module".
Monitors were invented by Per Brinch Hansen and C. A. R. Hoare, and were first implemented in Brinch Hansen's Concurrent Pascal language.
Note: The Pthreads API provides mutex locks, condition variables, and read–write locks for thread synchronization.
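To make the "thread-safe object" sense of a monitor concrete in a language without built-in monitors, the following Python class (our own construction, not from the text) wraps a bounded buffer so that every method runs under a single lock and condition variables let threads wait inside the monitor:

```python
import threading

class BoundedBufferMonitor:
    """Monitor-style bounded buffer: one lock guards every method, and
    condition variables let threads block inside the monitor until
    the condition they need becomes true."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.lock = threading.Lock()                  # the monitor's mutex
        self.not_full = threading.Condition(self.lock)
        self.not_empty = threading.Condition(self.lock)

    def put(self, item):
        with self.lock:                               # mutual exclusion on entry
            while len(self.items) == self.capacity:
                self.not_full.wait()                  # give up the lock and block
            self.items.append(item)
            self.not_empty.notify()                   # signal a waiting consumer

    def get(self):
        with self.lock:
            while not self.items:
                self.not_empty.wait()                 # block until something arrives
            item = self.items.pop(0)
            self.not_full.notify()                    # signal a waiting producer
            return item

buf = BoundedBufferMonitor(3)
out = []
p = threading.Thread(target=lambda: [buf.put(i) for i in range(10)])
c = threading.Thread(target=lambda: out.extend(buf.get() for _ in range(10)))
p.start(); c.start(); p.join(); c.join()
print(out == list(range(10)))  # True
```

The while-loop around each wait() re-checks the condition after waking, which is the standard discipline for condition variables (and matches how Pthreads condition variables are used).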
SUMMARY
Given a collection of cooperating sequential processes that share data, mutual exclusion must be provided to ensure that a critical section of code is used by only one process or thread at a time. Typically, computer hardware provides several operations that ensure mutual exclusion. However, such hardware-based solutions are too complicated for most developers to use. Mutex locks and semaphores overcome this obstacle. Both tools can be used to solve various synchronization problems and can be implemented efficiently, especially if hardware support for atomic operations is available.
Various synchronization problems (such as the bounded-buffer problem, the readers–writers problem, and the dining-philosophers problem) are important mainly because they are examples of a large class of concurrency-control problems. These problems are used to test nearly every newly proposed synchronization scheme.
The operating system must provide the means to guard against timing errors, and several language constructs have been proposed to deal with these problems. Monitors provide a synchronization mechanism for sharing abstract data types. A condition variable provides a method by which a monitor function can block its execution until it is signaled to continue.
Operating systems also provide support for synchronization. For example, Windows, Linux, and Solaris provide mechanisms such as semaphores, mutex locks, spinlocks, and condition variables to control access to shared data. The Pthreads API provides support for mutex locks and semaphores, as well as condition variables.
Several alternative approaches focus on synchronization for multicore systems. One approach uses transactional memory, which may address synchronization issues using either software or hardware techniques. Another approach uses the compiler extensions offered by OpenMP. Finally, functional programming languages address synchronization issues by disallowing mutability.