Chapter 6
Synchronization Tools
6.1 Background
We’ve already seen that processes can execute concurrently or in parallel.
Section 3.2.2 introduced the role of process scheduling and described how the
CPU scheduler switches rapidly between processes to provide concurrent
execution. This means that one process may only partially complete execution
before another process is scheduled. In fact, a process may be interrupted at
any point in its instruction stream, and the processing core may be assigned to
execute instructions of another process. Additionally, Section 4.2 introduced
parallel execution, in which two instruction streams (representing different
processes) execute simultaneously on separate processing cores. In this
chapter, we explain how concurrent or parallel execution can contribute to
issues involving the integrity of data shared by several processes.
Let’s consider an example of how this can happen. In Chapter 3, we
developed a model of a system consisting of cooperating sequential processes
or threads, all running asynchronously and possibly sharing data. We
illustrated this model with the producer–consumer problem, which is a
representative paradigm of many operating system functions. Specifically, in
Section 3.5, we described how a bounded buffer could be used to enable
processes to share memory.
We now return to our consideration of the bounded buffer. As we pointed
out, our original solution allowed at most BUFFER_SIZE − 1 items in the buffer at the
same time. Suppose we want to modify the algorithm to remedy this
deficiency. One possibility is to add an integer variable, count, initialized to
0. count is incremented every time we add a new item to the buffer and is
decremented every time we remove one item from the buffer. The code for the
producer process can be modified as follows:
while (true) {
    /* produce an item in next_produced */

    while (count == BUFFER_SIZE)
        ; /* do nothing */

    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
    count++;
}

The code for the consumer process can be modified as follows:

while (true) {
    while (count == 0)
        ; /* do nothing */

    next_consumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;

    /* consume the item in next_consumed */
}
Although the producer and consumer routines shown above are correct
separately, they may not function correctly when executed concurrently. As
an illustration, suppose that the value of the variable count is currently 5 and
that the producer and consumer processes concurrently execute the statements
“count++” and “count--”. Following the execution of these two statements,
the value of the variable count may be 4, 5, or 6! The only correct result, though,
is count == 5, which is generated correctly if the producer and consumer
execute separately.
We can show that the value of count may be incorrect as follows. Note
that the statement “count++” may be implemented in machine language (on a
typical machine) as follows:
register1 = count
register1 = register1 + 1
count = register1
where register1 is one of the local CPU registers. Similarly, the statement
“count--” is implemented as follows:
register2 = count
register2 = register2 − 1
count = register2
where again register2 is one of the local CPU registers. Even though register1 and
register2 may be the same physical register, remember that the contents of this
register will be saved and restored by the interrupt handler (Section 1.2.3).
The concurrent execution of “count++” and “count--” is equivalent to a
sequential execution in which the lower-level statements presented previously
are interleaved in some arbitrary order (but the order within each high-level
statement is preserved). One such interleaving is the following:

T0: producer executes register1 = count            {register1 = 5}
T1: producer executes register1 = register1 + 1    {register1 = 6}
T2: consumer executes register2 = count            {register2 = 5}
T3: consumer executes register2 = register2 − 1    {register2 = 4}
T4: producer executes count = register1            {count = 6}
T5: consumer executes count = register2            {count = 4}
Notice that we have arrived at the incorrect state “count == 4”, indicating that
four buffers are full, when, in fact, five buffers are full. If we reversed the order
of the statements at T4 and T5, we would arrive at the incorrect state “count
== 6”.
We would arrive at this incorrect state because we allowed both processes
to manipulate the variable count concurrently. A situation like this, where
several processes access and manipulate the same data concurrently and the
outcome of the execution depends on the particular order in which the access
takes place, is called a race condition. To guard against the race condition
above, we need to ensure that only one process at a time can be manipulating
the variable count. To make such a guarantee, we require that the processes be
synchronized in some way.
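To observe such a race concretely, consider the following minimal sketch using POSIX threads (an illustrative program, not part of the original example; the producer() and consumer() names are ours; compile with -pthread):

#include <pthread.h>
#include <stdio.h>

#define ITERATIONS 1000000

int count = 0; /* shared and unprotected */

void *producer(void *arg) {
    for (int i = 0; i < ITERATIONS; i++)
        count++; /* compiles to load, add, store: not atomic */
    return NULL;
}

void *consumer(void *arg) {
    for (int i = 0; i < ITERATIONS; i++)
        count--; /* interleaves arbitrarily with count++ */
    return NULL;
}

int main(void) {
    pthread_t p, c;

    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);

    printf("count = %d\n", count); /* frequently nonzero */
    return 0;
}

Running the program repeatedly typically prints a different value each time, which is precisely the signature of a race condition.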
Situations such as the one just described occur frequently in operating
systems as different parts of the system manipulate resources. Furthermore,
as we have emphasized in earlier chapters, the prominence of multicore
systems has brought an increased emphasis on developing multithreaded
applications. In such applications, several threads— which are quite possibly
sharing data — are running in parallel on different processing cores. Clearly,
we want any changes that result from such activities not to interfere with one
another. Because of the importance of this issue, we devote a major portion of
this chapter to process synchronization and coordination among cooperating
processes.
6.2 The Critical-Section Problem

We begin our consideration of process synchronization by discussing the
so-called critical-section problem. Consider a system consisting of n processes
{P0, P1, ..., Pn−1}. Each process has a segment of code, called a critical section,
in which the process may be accessing and updating data that is shared with
at least one other process. Each process must request permission to enter its
critical section; the section of code implementing this request is the entry
section. The critical section may be followed by an exit section, and the
remaining code is the remainder section. The general structure of a typical
process is shown in Figure 6.1.

while (true) {

    entry section

        critical section

    exit section

        remainder section

}

Figure 6.1 General structure of a typical process.

A solution to the critical-section problem must satisfy the following three
requirements:

1. Mutual exclusion. If process Pi is executing in its critical section, then no
other processes can be executing in their critical sections.

2. Progress. If no process is executing in its critical section and some
processes wish to enter their critical sections, then only those processes that
are not executing in their remainder sections can participate in deciding which
will enter its critical section next, and this selection cannot be postponed
indefinitely.

3. Bounded waiting. There exists a bound, or limit, on the number of times
that other processes are allowed to enter their critical sections after a process
has made a request to enter its critical section and before that request is
granted.
(Figure 6.2 Race condition when assigning a pid: two processes, P0 and P1,
concurrently request the next available process identifier and, without mutual
exclusion, both could be assigned the same pid, 2615.)
6.3 Peterson's Solution

We next illustrate a classic software-based solution to the critical-section
problem known as Peterson's solution. Because of the way modern computer
architectures perform basic machine-language instructions, such as load and
store, there are no guarantees that Peterson's solution will work correctly on
such architectures; we return to this point later in the section. Peterson's
solution is restricted to two processes that alternate execution between their
critical sections and remainder sections. The processes are numbered P0 and
P1; for convenience, when presenting Pi, we use Pj to denote the other
process, that is, j equals 1 − i. Peterson's solution requires the two processes
to share two data items:

int turn;
boolean flag[2];

while (true) {
    flag[i] = true;
    turn = j;

    while (flag[j] && turn == j)
        ;

    /* critical section */

    flag[i] = false;

    /* remainder section */
}

Figure 6.3 The structure of process Pi in Peterson's solution.
The variable turn indicates whose turn it is to enter its critical section. That is,
if turn == i, then process Pi is allowed to execute in its critical section. The
flag array is used to indicate if a process is ready to enter its critical section.
For example, if flag[i] is true, Pi is ready to enter its critical section. With an
explanation of these data structures complete, we are now ready to describe
the algorithm shown in Figure 6.3.
To enter the critical section, process Pi first sets flag[i] to be true and
then sets turn to the value j, thereby asserting that if the other process wishes
to enter the critical section, it can do so. If both processes try to enter at the same
time, turn will be set to both i and j at roughly the same time. Only one of
these assignments will last; the other will occur but will be overwritten
immediately. The eventual value of turn determines which of the two
processes is allowed to enter its critical section first.
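As a concrete rendering, the following sketch expresses Peterson's entry and exit protocols for two threads using C11 sequentially consistent atomics (an illustrative adaptation, not from the original text; the names enter_critical() and exit_critical() are ours). Sequential consistency rules out the instruction-reordering issues discussed later in this section:

#include <stdatomic.h>
#include <stdbool.h>

atomic_bool flag[2];
atomic_int turn;

void enter_critical(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], true); /* flag[i] = true: announce intent */
    atomic_store(&turn, j);       /* turn = j: yield priority to the other thread */
    while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
        ; /* busy wait */
}

void exit_critical(int i) {
    atomic_store(&flag[i], false); /* flag[i] = false */
}

With these definitions, thread i brackets its critical section with enter_critical(i) and exit_critical(i).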
We now prove that this solution is correct. We need to show that:

1. Mutual exclusion is preserved.
2. The progress requirement is satisfied.
3. The bounded-waiting requirement is met.
To prove property 1, we note that each Pi enters its critical section only
if either flag[j] == false or turn == i. Also note that, if both processes can
be executing in their critical sections at the same time, then flag[0] ==
flag[1] == true. These two observations imply that P0 and P1 could not
have successfully executed their while statements at about the same time,
since the value of turn can be either 0 or 1 but cannot be both. Hence, one of
the processes— say, Pj — must have successfully executed the while statement,
whereas Pi had to execute at least one additional statement (“turn == j”).
However, at that time, flag[j] == true and turn == j, and this condition
will persist as long as Pj is in its critical section; as a result, mutual exclusion is
preserved.
To prove properties 2 and 3, we note that a process Pi can be prevented from
entering the critical section only if it is stuck in the while loop with the condition
flag[j] == true and turn == j; this loop is the only one possible. If Pj is not
ready to enter the critical section, then flag[j] == false, and Pi can enter its
critical section. If Pj has set flag[j] to true and is also executing in its while
statement, then either turn == i or turn == j. If turn == i, then Pi will enter
the critical section. If turn == j, then Pj will enter the critical section. However,
once Pj exits its critical section, it will reset flag[j] to false, allowing Pi to
enter its critical section. If Pj resets flag[j] to true, it must also set turn to i.
Thus, since Pi does not change the value of the variable turn while executing
the while statement, Pi will enter the critical section (progress) after at most
one entry by Pj (bounded waiting).
As mentioned at the beginning of this section, Peterson’s solution is not
guaranteed to work on modern computer architectures for the primary rea-
son that, to improve system performance, processors and/or compilers may
reorder read and write operations that have no dependencies. For a single-
threaded application, this reordering is immaterial as far as program correct-
ness is concerned, as the final values are consistent with what is expected. (This
is similar to balancing a checkbook — the actual order in which credit and debit
operations are performed is unimportant, because the final balance will still be
the same.) But for a multithreaded application with shared data, the reordering of
instructions may render inconsistent or unexpected results.
As an example, consider the following data that are shared between two
threads:
boolean flag = false;
int x = 0;

where Thread 1 performs

while (!flag)
    ;
print x;
and Thread 2 performs
x = 100;
flag = true;
The expected behavior is, of course, that Thread 1 outputs the value 100 for
variable x. However, as there are no data dependencies between the variables
flag and x, it is possible that a processor may reorder the instructions for
Thread 2 so that flag is assigned true before assignment of x = 100. In
this situation, it is possible that Thread 1 would output 0 for variable x. Less
obvious is that the processor may also reorder the statements issued by Thread
1 and load the variable x before loading the value of flag. If this were to occur,
Thread 1 would output 0 for variable x even if the instructions issued by Thread 2
were not reordered.
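One illustrative way to prevent both reorderings is to use C11 release/acquire atomics for flag, as in the following sketch (not from the original text; the writer() and reader() names are ours):

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

atomic_bool flag = false;
int x = 0;

/* Thread 2: the release store orders "x = 100" before "flag = true" */
void *writer(void *arg) {
    x = 100;
    atomic_store_explicit(&flag, true, memory_order_release);
    return NULL;
}

/* Thread 1: the acquire load orders the read of flag before the read of x */
void *reader(void *arg) {
    while (!atomic_load_explicit(&flag, memory_order_acquire))
        ; /* busy wait */
    printf("x = %d\n", x); /* guaranteed to print 100 */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;

    pthread_create(&t1, NULL, reader, NULL);
    pthread_create(&t2, NULL, writer, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}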
(Figure 6.4 The effects of instruction reordering in Peterson's solution.)
How does this affect Peterson’s solution? Consider what happens if the
assignments of the first two statements that appear in the entry section of
Peterson’s solution in Figure 6.3 are reordered; it is possible that both threads
may be active in their critical sections at the same time, as shown in Figure 6.4.
As you will see in the following sections, the only way to preserve mutual
exclusion is by using proper synchronization tools. Our discussion of these
tools begins with primitive support in hardware and proceeds through
abstract, high-level, software-based APIs available to both kernel developers
and application programmers.
6.4 Hardware Support for Synchronization

In this section, we describe several hardware instructions and primitive
operations that can be used in supporting mutual exclusion. The first of these
is the memory barrier, an instruction that forces any changes in memory to be
propagated (made visible) to all other processors. If we add a memory barrier
operation to Thread 1 in the example above,

while (!flag)
    memory_barrier();
print x;
we guarantee that the value of flag is loaded before the value of x.
Similarly, if we place a memory barrier between the assignments performed
by Thread 2

x = 100;
memory_barrier();
flag = true;
we ensure that the assignment to x occurs before the assignment to flag.
With respect to Peterson’s solution, we could place a memory barrier
between the first two assignment statements in the entry section to avoid the
reordering of operations shown in Figure 6.4. Note that memory barriers are
considered very low-level operations and are typically only used by kernel
developers when writing specialized code that ensures mutual exclusion.
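C itself does not define a memory_barrier() function; a close portable analogue is C11's atomic_thread_fence(). The following sketch (illustrative, not from the original text) mirrors the two fragments above using explicit fences paired with relaxed atomic accesses to flag:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

atomic_bool flag = false;
int x = 0;

/* Thread 2 */
void thread2(void) {
    x = 100;
    atomic_thread_fence(memory_order_release); /* barrier: the store to x cannot
                                                  be reordered after the store to flag */
    atomic_store_explicit(&flag, true, memory_order_relaxed);
}

/* Thread 1 */
void thread1(void) {
    while (!atomic_load_explicit(&flag, memory_order_relaxed))
        ; /* busy wait */
    atomic_thread_fence(memory_order_acquire); /* barrier: the load of flag completes
                                                  before the load of x */
    printf("x = %d\n", x);
}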
boolean test_and_set(boolean *target) {
    boolean rv = *target;
    *target = true;

    return rv;
}

Figure 6.5 The definition of the atomic test_and_set() instruction.
do {
    while (test_and_set(&lock))
        ; /* do nothing */

    /* critical section */

    lock = false;

    /* remainder section */
} while (true);

Figure 6.6 Mutual exclusion with the test_and_set() instruction.
The test_and_set() instruction can be defined as shown in Figure 6.5.
The important characteristic of this instruction is that it is executed
atomically. Thus, if two test_and_set() instructions are executed
simultaneously (each on a different core), they will be executed sequentially
in some arbitrary order. If the machine supports the test_and_set()
instruction, then we can implement mutual exclusion by declaring a boolean
variable lock, initialized to false. The structure of process Pi is shown in
Figure 6.6.
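In portable C, the closest standard analogue of test_and_set() is C11's atomic_flag_test_and_set(). A minimal spinlock sketch along the lines of Figure 6.6 (illustrative; the acquire()/release() names are ours):

#include <stdatomic.h>

atomic_flag lock = ATOMIC_FLAG_INIT;

void acquire(void) {
    /* atomically sets the flag and returns its previous value */
    while (atomic_flag_test_and_set(&lock))
        ; /* spin until the previous value was clear */
}

void release(void) {
    atomic_flag_clear(&lock);
}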
The compare_and_swap() instruction (CAS), just like the test_and_set()
instruction, operates on two words atomically, but uses a different mechanism
that is based on swapping the content of two words.
The CAS instruction operates on three operands and is defined in Figure
6.7. The operand value is set to new_value only if the expression (*value
== expected) is true. Regardless, CAS always returns the original value of
the variable value. The important characteristic of this instruction is that it is
executed atomically. Thus, if two CAS instructions are executed simultaneously
(each on a different core), they will be executed sequentially in some arbitrary
order.
Mutual exclusion using CAS can be provided as follows: A global
variable (lock) is declared and is initialized to 0. The first process that
invokes compare_and_swap() will set lock to 1. It will then enter its critical
section,
int compare_and_swap(int *value, int expected, int new_value) {
    int temp = *value;

    if (*value == expected)
        *value = new_value;

    return temp;
}

Figure 6.7 The definition of the atomic compare_and_swap() instruction.
while (true) {
    while (compare_and_swap(&lock, 0, 1) != 0)
        ; /* do nothing */

    /* critical section */

    lock = 0;

    /* remainder section */
}

Figure 6.8 Mutual exclusion with the compare_and_swap() instruction.
because the original value of lock was equal to the expected value of 0.
Subsequent calls to compare_and_swap() will not succeed, because lock now
is not equal to the expected value of 0. When a process exits its critical section,
it sets lock back to 0, which allows another process to enter its critical section.
The structure of process Pi is shown in Figure 6.8.
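C11 exposes a CAS operation as atomic_compare_exchange_strong(). The following sketch rebuilds the lock of Figure 6.8 on top of it (illustrative, not from the original text; note that on failure the C11 operation writes the current value of lock into expected, so the loop resets expected before retrying):

#include <stdatomic.h>

atomic_int lock = 0;

void acquire(void) {
    int expected = 0;

    /* atomically: if (lock == expected) { lock = 1; success } */
    while (!atomic_compare_exchange_strong(&lock, &expected, 1))
        expected = 0; /* CAS failed and overwrote expected; reset and retry */
}

void release(void) {
    atomic_store(&lock, 0);
}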
Although this algorithm satisfies the mutual-exclusion requirement, it
does not satisfy the bounded-waiting requirement. In Figure 6.9, we present
while (true) {
    waiting[i] = true;
    key = 1;
    while (waiting[i] && key == 1)
        key = compare_and_swap(&lock, 0, 1);
    waiting[i] = false;

    /* critical section */

    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;

    if (j == i)
        lock = 0;
    else
        waiting[j] = false;

    /* remainder section */
}

Figure 6.9 Bounded-waiting mutual exclusion with compare_and_swap().
another algorithm using the compare_and_swap() instruction that satisfies all
the critical-section requirements. The common data structures are

boolean waiting[n];
int lock;
The elements in the waiting array are initialized to false, and lock is
initialized to 0. To prove that the mutual-exclusion requirement is met, we note
that process Pi can enter its critical section only if either waiting[i] == false
or key == 0. The value of key can become 0 only if the compare_and_swap()
is executed. The first process to execute the compare_and_swap() will find key
== 0; all others must wait. The variable waiting[i] can become false only if
another process leaves its critical section; only one waiting[i] is set to false,
maintaining the mutual-exclusion requirement.
To prove that the progress requirement is met, we note that the arguments
presented for mutual exclusion also apply here, since a process exiting the
critical section either sets lock to 0 or sets waiting[j] to false. Both allow a
process that is waiting to enter its critical section to proceed.
To prove that the bounded-waiting requirement is met, we note that, when
a process leaves its critical section, it scans the array waiting in the cyclic
ordering (i + 1, i + 2, ..., n − 1, 0, ..., i − 1). It designates the first process in
this ordering that is in the entry section (waiting[j] == true) as the next one to
enter the critical section. Any process waiting to enter its critical section will
thus do so within n − 1 turns.
Details describing the implementation of the atomic test_and_set() and
compare_and_swap() instructions are discussed more fully in books on
computer architecture.
These hardware instructions are typically not used directly to provide mutual
exclusion; rather, they are used as building blocks for other synchronization
tools. One such tool is the atomic variable, which provides atomic operations
on basic data types such as integers and booleans. For example, an increment
operation on an atomic integer can be implemented using the CAS instruction
as follows:

void increment(atomic_int *v) {
    int temp;

    do {
        temp = *v;
    }
    while (temp != compare_and_swap(v, temp, temp + 1));
}
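In C11, this retry loop comes prepackaged: atomic_fetch_add() performs the atomic increment directly. A brief illustrative sketch (the sequence variable is ours):

#include <stdatomic.h>

atomic_int sequence = 0;

void increment(void) {
    /* a single atomic read-modify-write; the retry loop above is
       performed by the implementation (or by hardware) on our behalf */
    atomic_fetch_add(&sequence, 1);
}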
6.5 Mutex Locks

The hardware-based solutions to the critical-section problem presented in
Section 6.4 are complicated as well as generally inaccessible to application
programmers. Instead, operating-system designers build higher-level software
tools to solve the critical-section problem. The simplest of these tools is the
mutex lock. (In fact, the term mutex is short for mutual exclusion.) We use the
mutex lock to protect critical sections and thus prevent race conditions. That
is, a process must acquire the lock before entering a critical section; it releases
the lock when it exits the critical section:

while (true) {
    acquire lock

        critical section

    release lock

        remainder section
}

A mutex lock has a boolean variable available whose value indicates if the
lock is available or not. The definition of acquire() is as follows:

acquire() {
    while (!available)
        ; /* busy wait */
    available = false;
}

The definition of release() is as follows:

release() {
    available = true;
}

Calls to either acquire() or release() must be performed atomically. The
main disadvantage of the implementation given here is that it requires busy
waiting; for this reason, this type of mutex lock is also called a spinlock.
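In Pthreads, mutex locks are provided directly by the API. A minimal sketch of protecting a critical section with pthread_mutex_lock() and pthread_mutex_unlock() (illustrative; the worker() function and shared_count variable are ours):

#include <pthread.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int shared_count = 0;

void *worker(void *arg) {
    pthread_mutex_lock(&mutex);   /* acquire lock */
    shared_count++;               /* critical section */
    pthread_mutex_unlock(&mutex); /* release lock */
    return NULL;
}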
6.6 Semaphores
Mutex locks, as we mentioned earlier, are generally considered the simplest of
synchronization tools. In this section, we examine a more robust tool that can
behave similarly to a mutex lock but can also provide more sophisticated ways
for processes to synchronize their activities.
A semaphore S is an integer variable that, apart from initialization, is
accessed only through two standard atomic operations: wait() and signal().
Semaphores were introduced by the Dutch computer scientist Edsger Dijkstra,
and as such, the wait() operation was originally termed P (from the Dutch
proberen, “to test”); signal() was originally called V (from verhogen, “to
increment”). The definition of wait() is as follows:
wait(S) {
    while (S <= 0)
        ; // busy wait
    S--;
}

The definition of signal() is as follows:

signal(S) {
    S++;
}
All modifications to the integer value of the semaphore in the wait() and
signal() operations must be executed atomically. That is, when one process
modifies the semaphore value, no other process can simultaneously modify
that same semaphore value. In addition, in the case of wait(S), the testing of
the integer value of S (S ≤ 0), as well as its possible modification (S--), must be
executed without interruption. We shall see how these operations can be
implemented in Section 6.6.2. First, let’s see how semaphores can be used.
We can use semaphores to solve various synchronization problems. For
example, consider two concurrently running processes: P1 with a statement
S1 and P2 with a statement S2. Suppose we require that S2 be executed only
after S1 has completed. We can implement this scheme readily by letting P1
and P2 share a common semaphore synch, initialized to 0. In process P1, we
insert the statements

S1;
signal(synch);

In process P2, we insert the statements

wait(synch);
S2;

Because synch is initialized to 0, P2 will execute S2 only after P1 has invoked
signal(synch), which is after statement S1 has been executed.
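POSIX provides counting semaphores through <semaphore.h>, with sem_wait() and sem_post() playing the roles of wait() and signal(). A sketch of the S1/S2 ordering scheme above (illustrative, not from the original text; S1 and S2 are stand-ins printed as messages):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t synch;

void *p1(void *arg) {
    printf("S1\n");   /* statement S1 */
    sem_post(&synch); /* signal(synch) */
    return NULL;
}

void *p2(void *arg) {
    sem_wait(&synch); /* wait(synch) */
    printf("S2\n");   /* S2 executes only after S1 has completed */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;

    sem_init(&synch, 0, 0); /* synch is initialized to 0 */
    pthread_create(&t2, NULL, p2, NULL);
    pthread_create(&t1, NULL, p1, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&synch);
    return 0;
}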
The definitions of the wait() and signal() operations presented so far suffer
from busy waiting. To overcome this problem, we can modify the operations
so that, rather than spinning, a process that must wait suspends itself. To
implement semaphores under this definition, we define a semaphore as
follows:

typedef struct {
    int value;
    struct process *list;
} semaphore;
Each semaphore has an integer value and a list of processes list. When
a process must wait on a semaphore, it is added to the list of processes. A
signal() operation removes one process from the list of waiting processes
and awakens that process.
Now, the wait() semaphore operation can be defined as
wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        sleep();
    }
}

and the signal() semaphore operation can be defined as

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}
The sleep() operation suspends the process that invokes it. The wakeup(P)
operation resumes the execution of a suspended process P. These two
operations are provided by the operating system as basic system calls.
Note that in this implementation, semaphore values may be negative,
whereas semaphore values are never negative under the classical definition of
semaphores with busy waiting. If a semaphore value is negative, its magnitude is
the number of processes waiting on that semaphore. This fact results from
switching the order of the decrement and the test in the implementation of the
wait() operation.
The list of waiting processes can be easily implemented by a link field in
each process control block (PCB). Each semaphore contains an integer value and a
pointer to a list of PCBs. One way to add and remove processes from the list so
as to ensure bounded waiting is to use a FIFO queue, where the semaphore
contains both head and tail pointers to the queue. In general, however, the list
can use any queuing strategy. Correct usage of semaphores does not depend
on a particular queuing strategy for the semaphore lists.
As mentioned, it is critical that semaphore operations be executed
atomically. We must guarantee that no two processes can execute wait() and
signal() operations on the same semaphore at the same time. This is a critical-
section problem, and in a single-processor environment, we can solve it by sim-
ply inhibiting interrupts during the time the wait() and signal() operations
are executing. This scheme works in a single-processor environment because,
once interrupts are inhibited, instructions from different processes cannot be
interleaved. Only the currently running process executes until interrupts are
reenabled and the scheduler can regain control.
In a multicore environment, interrupts must be disabled on every
processing core. Otherwise, instructions from different processes (running on
different cores) may be interleaved in some arbitrary way. Disabling
interrupts on every core can be a difficult task and can seriously diminish
performance. Therefore, SMP systems must provide alternative techniques,
such as compare_and_swap() or spinlocks, to ensure that wait() and
signal() are performed atomically.
It is important to admit that we have not completely eliminated busy
waiting with this definition of the wait() and signal() operations. Rather, we
have moved busy waiting from the entry section to the critical sections of
application programs. Furthermore, we have limited busy waiting to the
critical sections of the wait() and signal() operations, and these sections are
short (if properly coded, they should be no more than about ten instructions).
Thus, the critical section is almost never occupied, and busy waiting occurs
rarely, and then for only a short time. An entirely different situation exists
with application programs whose critical sections may be long (minutes or
even hours) or may almost always be occupied. In such cases, busy waiting is
extremely inefficient.
6.7 Monitors
Although semaphores provide a convenient and effective mechanism for
process synchronization, using them incorrectly can result in timing errors
that are difficult to detect, since these errors happen only if particular
execution sequences take place, and these sequences do not always occur.
We have seen an example of such errors in the use of the count variable in our
solution to the producer–consumer problem (Section 6.1). In that example, the timing
problem happened only rarely, and even then the count value appeared to
be reasonable— off by only 1. Nevertheless, the solution is obviously not an
acceptable one. It is for this reason that mutex locks and semaphores were
introduced in the first place.
Unfortunately, such timing errors can still occur when either mutex locks
or semaphores are used. To illustrate how, we review the semaphore solution
to the critical-section problem. All processes share a binary semaphore variable
mutex, which is initialized to 1. Each process must execute wait(mutex) before
entering the critical section and signal(mutex) afterward. If this sequence is
not observed, two processes may be in their critical sections simultaneously.
Next, we list several difficulties that may result. Note that these difficulties will
arise even if a single process is not well behaved. This situation may be caused
by an honest programming error or an uncooperative programmer.
• Suppose that a program interchanges the order in which the wait() and
signal() operations on the semaphore mutex are executed, resulting in
the following execution:
signal(mutex);
...
critical section
...
wait(mutex);
In this situation, several processes may be executing in their critical
sections simultaneously, violating the mutual-exclusion requirement.
This error may be discovered only if several processes are simultaneously
active in their critical sections. Note that this situation may not always be
reproducible.
• Suppose that a program replaces signal(mutex) with wait(mutex). That
is, it executes
wait(mutex);
...
critical section
...
wait(mutex);
In this case, the process will permanently block on the second call to
wait(), as the semaphore is now unavailable.
• Suppose that a process omits the wait(mutex), or the signal(mutex), or
both. In this case, either mutual exclusion is violated or the process will
permanently block.
These examples illustrate that various types of errors can be generated easily
when programmers use semaphores or mutex locks incorrectly to solve the
critical-section problem. One strategy for dealing with such errors is to
incorporate simple synchronization tools as high-level language constructs. In
this section, we describe one fundamental high-level synchronization
construct— the monitor type.
A monitor type is an abstract data type that includes a set of programmer-defined
operations that are provided with mutual exclusion within the monitor. The
monitor type also declares the variables whose values define the state of an
instance of that type, along with the bodies of functions that operate on those
variables. The syntax of a monitor type is as follows:

monitor monitor_name
{
    /* shared variable declarations */

    function P1 ( . . . ) {
        . . .
    }

    function P2 ( . . . ) {
        . . .
    }

    .
    .
    .

    function Pn ( . . . ) {
        . . .
    }

    initialization code ( . . . ) {
        . . .
    }
}

The monitor construct ensures that only one process at a time is active within
the monitor. However, a programmer who needs to write a tailor-made
synchronization scheme can define one or more variables of type condition:

condition x, y;
The only operations that can be invoked on a condition variable are wait()
and signal(). The operation
x.wait();
means that the process invoking this operation is suspended until another
process invokes
x.signal();
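Condition variables appear in Pthreads as pthread_cond_t, always paired with a mutex that plays the role of the monitor lock. A minimal sketch (illustrative, not from the original text; the ready flag and function names are ours; note the while loop, which rechecks the condition because Pthreads permits spurious wakeups):

#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t x = PTHREAD_COND_INITIALIZER;
bool ready = false;

/* analogous to x.wait(): suspend until another thread signals */
void wait_for_ready(void) {
    pthread_mutex_lock(&lock);
    while (!ready)                    /* recheck: wakeups may be spurious */
        pthread_cond_wait(&x, &lock); /* atomically releases lock while waiting */
    pthread_mutex_unlock(&lock);
}

/* analogous to x.signal(): resume one suspended thread */
void make_ready(void) {
    pthread_mutex_lock(&lock);
    ready = true;
    pthread_cond_signal(&x);
    pthread_mutex_unlock(&lock);
}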
(Figure: Schematic view of a monitor, showing the entry queue, shared data,
operations, and initialization code.)

Now suppose that, when the x.signal() operation is invoked by a process P,
there is a suspended process Q associated with condition x. Clearly, if the
suspended process Q is allowed to resume its execution, the signaling process
P must wait; otherwise, both P and Q would be active simultaneously within
the monitor. Note, however, that conceptually both processes can continue
with their execution. Two possibilities exist:
1. Signal and wait. P either waits until Q leaves the monitor or waits for
another condition.
2. Signal and continue. Q either waits until P leaves the monitor or waits
for another condition.
(Figure: Monitor with condition variables; each condition variable has its own
queue of suspended processes alongside the entry queue, shared data,
operations, and initialization code.)
Consider, for example, the ResourceAllocator monitor shown below, which
controls the allocation of a single resource among competing processes. Each
process, when requesting an allocation of this resource, specifies the maximum
time it plans to use the resource. The monitor allocates the resource to the
process that has the shortest time-allocation request. This requires use of the
conditional-wait construct

x.wait(c);

where c is an integer expression that is evaluated when the wait() operation
is executed. The value of c, which is called a priority number, is then stored
with the name of the process that is suspended. When x.signal() is executed,
the process with the smallest priority number is resumed next.

monitor ResourceAllocator
{
    boolean busy;
    condition x;

    void acquire(int time) {
        if (busy)
            x.wait(time);
        busy = true;
    }

    void release() {
        busy = false;
        x.signal();
    }

    initialization code() {
        busy = false;
    }
}

A process that needs to access the resource in question must observe the
following sequence:

R.acquire(t);
...
access the resource;
...
R.release();

where R is an instance of type ResourceAllocator and t is the maximum time
the process plans to use the resource. Unfortunately, the monitor concept
cannot guarantee that the preceding access sequence will be observed. In
particular, a process might access a resource without first gaining access
permission to the resource, might never release a resource once it has been
granted access, might attempt to release a resource that it never requested, or
might request the same resource twice (without first releasing the resource).
The same difficulties are encountered with the use of semaphores, and
these difficulties are similar in nature to those that encouraged us to develop
the monitor constructs in the first place. Previously, we had to worry about
the correct use of semaphores. Now, we have to worry about the correct use of
higher-level programmer-defined operations, with which the compiler can no
longer assist us.
One possible solution to the current problem is to include the resource-
access operations within the ResourceAllocator monitor. However, using
this solution will mean that scheduling is done according to the built-in
monitor-scheduling algorithm rather than the one we have coded.
To ensure that the processes observe the appropriate sequences, we must
inspect all the programs that make use of the ResourceAllocator monitor and
its managed resource. We must check two conditions to establish the correct-
ness of this system. First, user processes must always make their calls on the
monitor in a correct sequence. Second, we must be sure that an uncooperative
process does not simply ignore the mutual-exclusion gateway provided by the
monitor and try to access the shared resource directly, without using the access
protocols. Only if these two conditions can be ensured can we guarantee that
no time-dependent errors will occur and that the scheduling algorithm will not
be defeated.
Although this inspection may be possible for a small, static system, it is
not reasonable for a large system or a dynamic system. This access-control
problem can be solved only through the use of the additional mechanisms that
are described in Chapter 17.
6.8 Liveness
One consequence of using synchronization tools to coordinate access to critical
sections is the possibility that a process attempting to enter its critical section
will wait indefinitely. Recall that in Section 6.2, we outlined three criteria that
solutions to the critical-section problem must satisfy. Indefinite waiting violates
two of these — the progress and bounded-waiting criteria.
Liveness refers to a set of properties that a system must satisfy to ensure
that processes make progress during their execution life cycle. A process
waiting indefinitely under the circumstances just described is an example of a
“liveness failure.”
There are many different forms of liveness failure; however, all are
generally characterized by poor performance and responsiveness. A very
simple example of a liveness failure is an infinite loop. A busy wait loop
presents the possibility of a liveness failure, especially if a process may loop an
arbitrarily long period of time. Efforts at providing mutual exclusion using
tools such as mutex locks and semaphores can often lead to such failures in
concurrent programming. In this section, we explore two situations that can
lead to liveness failures.
6.8.1 Deadlock
The implementation of a semaphore with a waiting queue may result in a
situation where two or more processes are waiting indefinitely for an event
that can be caused only by one of the waiting processes. The event in question
is the execution of a signal() operation. When such a state is reached, these
processes are said to be deadlocked.
To illustrate this, consider a system consisting of two processes, P0 and P1,
each accessing two semaphores, S and Q, set to the value 1:
P0                  P1
wait(S);            wait(Q);
wait(Q);            wait(S);
  .                   .
  .                   .
  .                   .
signal(S);          signal(Q);
signal(Q);          signal(S);

Suppose that P0 executes wait(S) and then P1 executes wait(Q). When P0
executes wait(Q), it must wait until P1 executes signal(Q). Similarly, when
P1 executes wait(S), it must wait until P0 executes signal(S). Since these
signal() operations cannot be executed, P0 and P1 are deadlocked.
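A standard way to avoid this particular deadlock is to impose a global ordering on semaphore acquisition, so that every process acquires S before Q. A sketch using POSIX semaphores (illustrative, not from the original text; both threads run the same worker() function):

#include <pthread.h>
#include <semaphore.h>

sem_t S, Q;

/* Both threads acquire the semaphores in the same global order
   (S before Q), so the circular wait shown above cannot arise. */
void *worker(void *arg) {
    sem_wait(&S);
    sem_wait(&Q);
    /* ... use both resources ... */
    sem_post(&Q);
    sem_post(&S);
    return NULL;
}

int main(void) {
    pthread_t t0, t1;

    sem_init(&S, 0, 1);
    sem_init(&Q, 0, 1);
    pthread_create(&t0, NULL, worker, NULL);
    pthread_create(&t1, NULL, worker, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return 0;
}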
6.9 Evaluation
We have described several different synchronization tools that can be used to
solve the critical-section problem. Given correct implementation and usage,
these tools can be used effectively to ensure mutual exclusion as well as address
liveness issues. With the growth of concurrent programs that leverage the
power of modern multicore computer systems, increasing attention is being
paid to the performance of synchronization tools. Trying to identify when
to use which tool, however, can be a daunting challenge. In this section, we
present some simple strategies for determining when to use specific synchro-
nization tools.
The hardware solutions outlined in Section 6.4 are considered very low
level and are typically used as the foundations for constructing other synchro-
nization tools, such as mutex locks. However, there has been a recent focus on
using the CAS instruction to construct lock-free algorithms that provide
protection from race conditions without requiring the overhead of locking.
Although these lock-free solutions are gaining popularity due to low overhead
and ability to scale, the algorithms themselves are often difficult to develop
and test. (In the exercises at the end of this chapter, we ask you to evaluate the
correctness of a lock-free stack.)
CAS-based approaches are considered an optimistic approach — you
optimistically first update a variable and then use collision detection to see if
another thread is updating the variable concurrently. If so, you repeatedly
retry the operation until it is successfully updated without conflict. Mutual-
exclusion locking, in contrast, is considered a pessimistic strategy; you assume
another thread is concurrently updating the variable, so you pessimistically
acquire the lock before making any updates.
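To give a flavor of the lock-free style, the following sketch implements the push() operation of a simplified lock-free stack using C11 CAS (illustrative, not from the original text; pop() is omitted because safe memory reclamation raises the well-known ABA problem):

#include <stdatomic.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

_Atomic(struct node *) top = NULL;

void push(int value) {
    struct node *n = malloc(sizeof(struct node));
    n->value = value;
    n->next = atomic_load(&top);

    /* optimistically retry until no other thread has changed top;
       on failure, CAS writes the current top into n->next for the retry */
    while (!atomic_compare_exchange_weak(&top, &n->next, n))
        ;
}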
The following guidelines identify general rules concerning performance
differences between CAS-based synchronization and traditional
synchronization (such as mutex locks and semaphores) under varying
contention loads:

• Uncontended. Although both options are generally fast, CAS protection
will be somewhat faster than traditional synchronization.

• Moderate contention. CAS protection will be faster, possibly much faster,
than traditional synchronization.

• High contention. Under very high contention loads, traditional
synchronization will ultimately be faster than CAS-based synchronization.
6.10 Summary