Termination Detection
Introduction
System Model
At any given time, a process is in exactly one of two states: active,
where it is doing local computation, or idle, where it has
(temporarily) finished its local computation and will be
reactivated only on the receipt of a message from another process.
An active process can become idle at any time.
An idle process can become active only on the receipt of a message from
another process.
Only active processes can send messages.
A message can be received by a process when the process is in either of the
two states, i.e., active or idle. On the receipt of a message, an idle process
becomes active.
The sending of a message and the receipt of a message occur as atomic
actions.
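The state rules above can be sketched as a small Python class. This is only an illustration: the names Process, State, send, and receive are hypothetical, and a message is handed to the receiver immediately, which ignores channel delay.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    IDLE = "idle"


class Process:
    """Hypothetical process that obeys the state rules of the system model."""

    def __init__(self, pid, state=State.IDLE):
        self.pid = pid
        self.state = state
        self.inbox = []

    def become_idle(self):
        # An active process can become idle at any time.
        self.state = State.IDLE

    def send(self, receiver, payload):
        # Only active processes can send messages.
        if self.state is not State.ACTIVE:
            raise RuntimeError("idle processes cannot send messages")
        # Assumption: delivery is immediate (no channel delay is modeled).
        receiver.receive(payload)

    def receive(self, payload):
        # A message can be received in either state; an idle receiver
        # becomes active, and an active receiver stays active.
        self.inbox.append(payload)
        self.state = State.ACTIVE
```

For example, if p is active and q is idle, p.send(q, "m") leaves q active, while a later send from an idle p raises an error.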
The last process to terminate will have the largest clock value. Therefore,
every process will take a snapshot for it; however, it will not take a snapshot
for any other process.
System Model
A process called controlling agent monitors the computation.
A communication channel exists between each of the processes and the
controlling agent and also between every pair of processes.
Initially, all processes are in the idle state.
The weight at each process is zero and the weight at the controlling agent is
1.
The computation starts when the controlling agent sends a basic message to
one of the processes.
A non-zero weight W (0<W≤1) is assigned to each process in the active
state and to each message in transit in the following manner:
Basic Idea
When a process sends a message, it sends a part of its weight in the message.
When a process receives a message, it adds the weight received in the
message to its weight.
Thus, the sum of weights on all the processes and on all the messages in
transit is always 1.
When a process becomes idle, it sends its weight to the controlling agent
in a control message, which the controlling agent adds to its weight.
The controlling agent concludes termination if its weight becomes 1.
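The basic idea can be tried out with a toy, single-threaded simulation. This is a sketch under stated assumptions, not the algorithm as specified: the class and method names are hypothetical, a sender always keeps half of its weight and ships the other half (any split into two positive parts would do), and control messages reach the controlling agent instantly. Exact fractions are used so that the sum of weights can be compared with 1 without rounding error.

```python
from fractions import Fraction


class WeightThrowingSim:
    """Toy simulation; names and the halving rule are assumptions."""

    def __init__(self, num_processes):
        self.agent_weight = Fraction(1)
        self.weights = [Fraction(0)] * num_processes  # weight 0 means idle
        self.basic_in_transit = []  # (destination, weight) pairs

    def start(self, dest):
        # The controlling agent keeps half of its weight and sends the rest.
        self.agent_weight /= 2
        self.basic_in_transit.append((dest, self.agent_weight))

    def send(self, src, dest):
        # Only an active process (weight > 0) may send; it splits its weight.
        if self.weights[src] <= 0:
            raise RuntimeError("idle processes cannot send messages")
        self.weights[src] /= 2
        self.basic_in_transit.append((dest, self.weights[src]))

    def deliver(self):
        # Deliver the oldest basic message; the receiver adds its weight.
        dest, w = self.basic_in_transit.pop(0)
        self.weights[dest] += w

    def become_idle(self, pid):
        # Assumption: the control message is delivered immediately.
        self.agent_weight += self.weights[pid]
        self.weights[pid] = Fraction(0)

    def total_weight(self):
        return (self.agent_weight + sum(self.weights)
                + sum(w for _, w in self.basic_in_transit))

    def terminated(self):
        # The controlling agent concludes termination when its weight is 1.
        return self.agent_weight == 1
```

Running start(0), deliver(), send(0, 1), become_idle(0), deliver(), become_idle(1) keeps total_weight() at 1 throughout, and terminated() becomes true only after the last step.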
Notations
Algorithm
Correctness of Algorithm
Notations
A: set of weights on all active processes
B: set of weights on all basic messages in transit
C: set of weights on all control messages in transit
Wc : weight on the controlling agent.
Two invariants I1 and I2 are defined for the algorithm:
I1: $W_c + \sum_{W \in (A \cup B \cup C)} W = 1$
I2: $\forall W \in (A \cup B \cup C),\ W > 0$
Correctness of Algorithm
Invariant I1 states that sum of weights at the controlling process, at all active
processes, on all basic messages in transit, and on all control messages in
transit is always equal to 1.
Invariant I2 states that weight at each active process, on each basic message
in transit, and on each control message in transit is non-zero.
Hence,
$W_c = 1$
$\Rightarrow \sum_{W \in (A \cup B \cup C)} W = 0$ (by I1)
$\Rightarrow (A \cup B \cup C) = \emptyset$ (by I2)
$\Rightarrow (A \cup B) = \emptyset$.
$(A \cup B) = \emptyset$ implies the computation has terminated. Therefore, the
algorithm never detects a false termination.
Further,
$(A \cup B) = \emptyset$
$\Rightarrow W_c + \sum_{W \in C} W = 1$ (by I1)
Since the message delay is finite, after the computation has terminated,
every control message in transit eventually reaches the controlling agent,
so eventually Wc = 1.
Thus, the algorithm detects a termination in finite time.
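The two invariants and the safety argument can be checked on concrete states with a few lines of Python. This is an illustrative sketch: the function names are hypothetical, and the weights of A, B, and C are passed as plain lists of fractions.

```python
from fractions import Fraction


def invariants_hold(wc, active, basic, control):
    """Check I1 (weights sum to 1) and I2 (every listed weight is positive)."""
    weights = list(active) + list(basic) + list(control)
    i1 = wc + sum(weights) == 1
    i2 = all(w > 0 for w in weights)
    return i1 and i2


def computation_terminated(active, basic):
    # Termination means no active process and no basic message in transit.
    return not active and not basic


# A state in the middle of a computation: Wc = 1/2, one active process,
# one basic message and one control message in transit.
mid_run = (Fraction(1, 2), [Fraction(1, 4)], [Fraction(1, 8)], [Fraction(1, 8)])

# A state with Wc = 1: by I1 and I2, the sets A, B, and C must all be empty.
final = (Fraction(1), [], [], [])
```

Both states satisfy the invariants, but only the second one is terminated. A state such as Wc = 1 with a non-empty A cannot satisfy I1 and I2 together, which is the reason a false detection is impossible.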
A. Kshemkalyani and M. Singhal (Distributed Computing) Termination Detection CUP 2008 14 / 30
Distributed Computing: Principles, Algorithms, and Systems
Two waves of signals are generated, one moving inward and the other outward
through the spanning tree.
Initially, a contracting wave of signals, called tokens, moves inward from
leaves to the root.
If this token wave reaches the root without discovering that termination has
occurred, the root initiates a second outward wave of repeat signals.
As this repeat wave reaches the leaves, the token wave gradually forms and starts
moving inward again. This sequence of events is repeated until
termination is detected.
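[Figure: spanning tree with nodes 0 to 6 (0 at the top, 1 and 2 below it, 3 to 6 at the bottom); T denotes a token, with tokens T1, T5, and T6 shown.]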
A parent process holding a black token (from one of its children) sends only
a black token to its parent, to indicate that message passing occurred
in its subtree.
Tokens are propagated to the root in this fashion. The root, upon receiving a
black token, will know that a process in the tree had sent a message to some
other process. Hence, it restarts the algorithm by sending a Repeat signal to
all its children.
Each child of the root propagates the Repeat signal to each of its children
and so on, until the signal reaches the leaves.
The leaf nodes restart the algorithm on receiving the Repeat signal.
The root concludes that termination has occurred, if
1 it is white,
2 it is idle, and
3 it received a white token from each of its children.
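One inward token wave and the root's test can be sketched as follows. This is a simplified illustration, not the full algorithm: the function names are hypothetical, every non-root node is assumed to be idle when it passes its token up, and a node is assumed to send a black token if it is itself black or if it received a black token from any child.

```python
WHITE, BLACK = "white", "black"


def token_from(node, children, color):
    """Color of the token that `node` passes to its parent in one wave."""
    child_tokens = [token_from(c, children, color)
                    for c in children.get(node, [])]
    # A black node, or a node holding any black token, sends a black token.
    if color[node] == BLACK or BLACK in child_tokens:
        return BLACK
    return WHITE


def root_detects_termination(root, children, color, root_idle):
    """The three conditions under which the root concludes termination."""
    child_tokens = [token_from(c, children, color)
                    for c in children.get(root, [])]
    return (color[root] == WHITE
            and root_idle
            and all(t == WHITE for t in child_tokens))


# Spanning tree with root 0, internal nodes 1 and 2, and leaves 3 to 6.
tree = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
all_white = {n: WHITE for n in range(7)}
one_black = dict(all_white)
one_black[5] = BLACK
```

With all nodes white and the root idle, the root detects termination. If leaf 5 is black, its black token propagates through node 2 to the root, and the root would instead start a Repeat wave.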
Performance
The best case message complexity of the algorithm is O(N), where N is the
number of processes in the computation, which occurs when all nodes send
all computation messages in the first round.
The worst case complexity of the algorithm is O(N*M), where M is the
number of computation messages exchanged, which occurs when only one
computation message is exchanged every time the algorithm is executed.
Figure 2: Node p sends a message m to node q that has already sent a white
token to its parent.
begin
    scan the stack and delete the first entry of the form TO(y);
    if idle then
        stack_cleanup
end;
A node sends a terminate message to its parent when it satisfies all the
following conditions:
1 It is idle.
2 Each of its incoming links is colored (it has received a warning message on
each of its incoming links).
3 Its stack is empty.
4 It has received a terminate message from each of its children (this rule does
not apply to leaf nodes).
When the root node satisfies all of the above conditions, it concludes that
the underlying computation has terminated.
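The four conditions can be written as a single predicate. This is a minimal sketch: the Node fields and the function name are hypothetical, links and children are identified by arbitrary labels, and the sending of messages is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical per-node state used only for this illustration."""
    idle: bool
    incoming_links: set
    colored_links: set = field(default_factory=set)
    stack: list = field(default_factory=list)
    children: set = field(default_factory=set)
    terminated_children: set = field(default_factory=set)


def may_send_terminate(node):
    return (node.idle                                       # condition 1
            and node.colored_links >= node.incoming_links   # condition 2
            and not node.stack                              # condition 3
            and node.terminated_children >= node.children)  # condition 4
```

For a leaf node, children is empty, so condition 4 holds trivially. Applied to the root, the same predicate decides when the root may conclude that the underlying computation has terminated.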
Performance
In the worst case, each node in the network sends one warning message on
each outgoing link. Thus, each link carries two warning messages, one in
each direction.
Since there are |E| links, the total number of warning messages generated by
the algorithm is 2*|E|.
For every message generated by the underlying computation, exactly one
remove_entry message is sent on the network.
If M is the number of messages sent by the underlying computation, then at
most M remove_entry messages are used.
Finally, each node sends exactly one terminate message to its parent and
since there are only |V| nodes and |V|−1 tree edges, only |V| − 1 terminate
messages are sent.
Hence, the total number of messages generated by the algorithm is 2* |E| +
|V| − 1 + M.
Thus, the message complexity of the algorithm is O(|E| + M), as |E| ≥ |V| −
1 for any connected network.
The algorithm is asymptotically optimal in the number of messages.
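The count above can be evaluated for a concrete network with a short helper. The function name and the example sizes are hypothetical; the formula is the worst-case total 2*|E| + |V| − 1 + M derived above.

```python
def detection_messages(num_nodes, num_links, num_basic_messages):
    """Worst-case number of messages generated by the detection algorithm."""
    warning = 2 * num_links            # one warning per link direction
    remove_entry = num_basic_messages  # at most one per computation message
    terminate = num_nodes - 1          # one per tree edge
    return warning + remove_entry + terminate


# Example: 7 nodes, 6 links, and 10 computation messages.
example_total = detection_messages(7, 6, 10)
```

Here the total is 2*6 + 10 + 6 = 28 messages.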