14CS705B-Distributed Systems Scheme
b) What is replication? What are the reasons for replication? 4M
Part - A
1 Answer all questions (1 x 12 = 12 Marks)
a) Define distributed system.
A distributed system is a collection of independent computers that appears to its users as a
single coherent system.
b) Define RPC.
A remote procedure call (RPC) is when a computer program causes a procedure (subroutine)
to execute in a different address space (commonly on another computer on a shared
network), which is coded as if it were a normal (local) procedure call, without the
programmer explicitly coding the details for the remote interaction.
c) What is a socket?
A socket is a communication end point.
d) What is code migration?
Code migration is moving a program, or an entire running process, from one machine to another.
e) What are the two ways to implement structured name resolution?
i. Iterative name resolution
ii. Recursive name resolution
f) List the properties of identifier.
i. An identifier refers to at most one entity
ii. Each entity is referred to by at most one identifier
iii. An identifier always refers to the same entity
g) List the algorithms used to achieve mutual exclusion.
i. Centralized algorithm
ii. Decentralized algorithm
iii. Distributed algorithm
iv. Token ring algorithm
h) Define synchronization.
Synchronization is the coordination of concurrent processes so that they cooperate correctly and
agree on the order in which their actions take place.
i) What is causal consistency?
Causal consistency captures the potential causal relationships between operations, and
guarantees that all processes observe causally-related operations in a common order. In
other words, all processes in the system agree on the order of the causally-related
operations.
j) What is Byzantine problem?
The Byzantine Generals Problem is a term from computer science that describes
a situation where involved parties must agree on a single strategy in order to avoid complete
failure, but where some of the involved parties are corrupt and disseminating false
information or are otherwise unreliable.
k) Define fault tolerance.
Fault tolerance is the property that enables a system to continue operating properly in the
event of the failure of some of its components.
l) Define distributed commit.
The distributed commit problem involves having an operation being performed by each
member of a process group, or none at all.
Part - B
2 a) Explain Distributed system goals. 6M
The four goals of distributed systems are
i. Resource Accessibility
• Support user access to remote resources (printers, data files, web pages, CPU cycles)
and the fair sharing of the resources
• Economics of sharing expensive resources
• Performance enhancement – due to multiple processors; also due to ease of
collaboration and info exchange – access to remote services
– Groupware: tools to support collaboration
• Resource sharing introduces security problems.
ii. Distribution Transparency
• Software hides some of the details of the distribution of system resources.
– Makes the system more user friendly.
• A distributed system that appears to its users & applications to be a single computer
system is said to be transparent.
• Users & apps should be able to access remote resources in the same way they access
local resources.
iii. Openness
• An open distributed system “…offers services according to standard rules that
describe the syntax and semantics of those services.” In other words, the interfaces
to the system are clearly specified and freely available.
– Compare to network protocols
– Not proprietary
• Interface Definition/Description Languages (IDL): used to describe the interfaces
between software components, usually in a distributed system
– Definitions are language & machine independent
– Support communication between systems using different OS/programming
languages; e.g. a C++ program running on Windows communicates with a
Java program running on UNIX
– Communication is usually RPC-based.
iv. Scalability
• Dimensions that may scale:
– With respect to size
– With respect to geographical distribution
– With respect to the number of administrative organizations spanned
• A scalable system still performs well as it scales up along any of the three
dimensions.
b) What is transparency? Explain different types of transparencies. 6M
A distributed system that appears to its users & applications to be a single computer system
is said to be transparent.
Advantages:
1) Makes the system more user friendly by hiding some of the details of the distribution
of system resources.
2) Users & apps should be able to access remote resources in the same way they access
local resources.
Transparency has several dimensions.
Transparency   Description
Location       Hide the location of a resource (the resource can be used without
               knowing its location)
Migration      Hide the possibility that the system may change the location of a
               resource (no effect on access)
Replication    Hide the possibility that multiple copies of the resource exist (for
               reliability and/or availability)
Concurrency    Hide the possibility that the resource may be shared concurrently
A remote procedure call occurs in the following steps:
1. The client procedure calls the client stub in the normal way.
2. The client stub builds a message and calls the local operating system.
3. The client's OS sends the message to the remote OS.
4. The remote OS gives the message to the server stub.
5. The server stub unpacks the parameters and calls the server.
6. The server does the work and returns the result to the stub.
7. The server stub packs it in a message and calls its local OS.
8. The server's OS sends the message to the client's OS.
9. The client's OS gives the message to the client stub.
10. The stub unpacks the result and returns to the client.
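The ten steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: both stubs live in one process, the `transport` function stands in for the two operating systems and the network, JSON is used as the message format, and the names `client_stub`, `server_stub`, `transport` and `PROCEDURES` are hypothetical.

```python
import json

# Server side: the real procedure and a registry of callable procedures (hypothetical).
def add(a, b):
    return a + b

PROCEDURES = {"add": add}

def server_stub(request_bytes):
    """Steps 5-7: unpack the parameters, call the server, pack the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def transport(request_bytes):
    """Stand-in for steps 3-4 and 8-9: the two operating systems moving bytes."""
    return server_stub(request_bytes)

def client_stub(proc, *args):
    """Steps 2 and 10: build the request message, then unpack the reply."""
    request_bytes = json.dumps({"proc": proc, "args": args}).encode("utf-8")
    reply_bytes = transport(request_bytes)
    return json.loads(reply_bytes.decode("utf-8"))["result"]

# Step 1: the client calls the stub as if it were a normal local procedure.
print(client_stub("add", 2, 3))  # prints 5
```

The caller never sees the packing and unpacking, which is the point of the stubs.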
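The basic idea is that applications communicate by putting messages into and taking messages
out of “message queues”.
The only guarantee is that a message will eventually be placed in the receiver’s message queue;
nothing is promised about when, or whether, it will be read.
This leads to “loosely coupled” communication: the sender and the receiver do not need to be
running at the same time. The four combinations for loosely coupled communication using
queues are as follows.
i. Sender running, receiver running
ii. Sender running, receiver passive
iii. Sender passive, receiver running
iv. Sender passive, receiver passive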
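The basic primitives used in the message queuing model are:
i. Put: append a message to a specified queue
ii. Get: block until the specified queue is non-empty, then remove the first message
iii. Poll: check a specified queue for messages and remove the first one, without blocking
iv. Notify: install a handler to be called when a message is put into the specified queue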
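A queue-based interface of this kind might look as follows in Python. This is a minimal in-process sketch, not a real message-queuing system: the class `MessageQueue` is hypothetical, its method names `put`, `get`, `poll` and `notify` follow the commonly described queuing primitives, and persistence and networking are left out.

```python
import queue

class MessageQueue:
    """Hypothetical in-memory queue illustrating put / get / poll / notify."""

    def __init__(self):
        self._q = queue.Queue()
        self._handler = None

    def put(self, msg):
        """Append a message; the sender does not wait for the receiver."""
        self._q.put(msg)
        if self._handler is not None:
            self._handler()          # tell the receiver that something arrived

    def get(self):
        """Block until the queue is non-empty, then remove the first message."""
        return self._q.get()

    def poll(self):
        """Remove and return the first message, or None; never blocks."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None

    def notify(self, handler):
        """Install a handler that is called whenever a message is put."""
        self._handler = handler

mq = MessageQueue()
mq.put("hello")        # the receiver may not be running yet
print(mq.poll())       # hello
print(mq.poll())       # None
```

Because `put` returns immediately, the sender can finish before any receiver ever starts, which is the loose coupling described above.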
toolkits implemented on top of Xlib.
The interesting aspect of X is that the X kernel and the X applications need not necessarily
reside on the same machine. In particular, X provides the X protocol, which is an
application-level communication protocol by which an instance of Xlib can exchange data
and events with an X kernel. For example, Xlib can send requests to the X kernel for
creating or killing a window, setting colors, and defining the type of cursor to display,
among many other requests. In turn, the X kernel will react to local events such as keyboard
and mouse input by sending event packets back to Xlib.
Several applications can communicate at the same time with the X kernel. There is one
specific application that is given special rights, known as the window manager. This
application can dictate the “look and feel” of the display as it appears to the user. For
example, the window manager can prescribe how each window is decorated with extra
buttons, how windows are to be placed on the display, and so on. Other applications will
have to adhere to these rules. In practice, this means that much of the interaction between an
application and an X terminal is redirected through a window manager.
Home-Based approaches:
• Each entity is assigned a home node
    – The home node is typically static (has a fixed access point and address)
    – It keeps track of the current address of the entity
• Entity-home interaction:
    – The entity’s home address is registered at a naming service
    – The entity updates the home about its current address (foreign address)
      whenever it moves
• Name resolution:
    – The client contacts the home to obtain the foreign address
    – The client then contacts the entity at the foreign location
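The home-based lookup can be illustrated with a small Python sketch. It assumes everything runs in one process; the `HomeNode` class, the `naming_service` dictionary, the `resolve` function and the addresses are all hypothetical stand-ins for the home node, the naming service and the client.

```python
class HomeNode:
    """Fixed node that tracks the current (foreign) address of its entities."""

    def __init__(self):
        self._current = {}

    def update(self, entity, foreign_address):
        # Called by the entity whenever it moves.
        self._current[entity] = foreign_address

    def lookup(self, entity):
        return self._current[entity]

naming_service = {}   # entity name -> home node (hypothetical, in-memory)

def resolve(entity):
    home = naming_service[entity]    # the home address is registered at the naming service
    return home.lookup(entity)       # the client asks the home for the foreign address

home = HomeNode()
naming_service["laptop"] = home
home.update("laptop", "10.0.0.7")
print(resolve("laptop"))             # 10.0.0.7
home.update("laptop", "192.168.1.20")   # the entity moves
print(resolve("laptop"))             # 192.168.1.20
```

Every resolution goes through the home, which is why a distant home node adds latency even when the client and the entity are close together.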
Hierarchical Approaches:
6 a) Why are election algorithms needed? Explain ring algorithm. 6M
Many distributed algorithms require one process to act as a coordinator. Typically, it does
not matter which process is chosen, as long as all processes agree on the choice. Election
algorithms are used to select a coordinator, in particular to elect a new one when the
current coordinator has failed.
Ring Algorithm:
• This algorithm is generally used in a ring topology
• When a process Pi detects that the coordinator has crashed, it initiates the election
algorithm
1. Pi builds an “Election” message (E), and sends it to its next node. It inserts its
ID into the Election message
2. When process Pj receives the message, it appends its ID and forwards the
message
i. If the next node has crashed, Pj finds the next alive node
3. When the message gets back to Pi:
i. Pi elects the process with the highest ID as coordinator
ii. Pi changes the message type to a “Coordinator” message (C) and
circulates it around the ring to inform every process of the new coordinator
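The election phase can be simulated with a short Python function. This is a sketch under stated assumptions: the ring is given as a list of process IDs in ring order, crashed processes are modelled by their absence from the `alive` set, the initiator itself is alive, and the function name `ring_election` is hypothetical.

```python
def ring_election(ring, alive, initiator):
    """Return (new_coordinator, election_message) for one election round.

    ring      -- process IDs in ring order
    alive     -- set of IDs that are currently up (must contain the initiator)
    initiator -- the process that detected the coordinator crash
    """
    n = len(ring)
    election = [initiator]                     # step 1: initiator inserts its own ID
    pos = (ring.index(initiator) + 1) % n
    while ring[pos] != initiator:              # step 3 is reached when the message returns
        if ring[pos] in alive:                 # crashed nodes are skipped
            election.append(ring[pos])         # step 2: append ID and forward
        pos = (pos + 1) % n
    return max(election), election             # highest ID becomes coordinator

# Process 9 was the coordinator and has crashed; process 1 notices.
print(ring_election([3, 7, 1, 9, 4], {3, 7, 1, 4}, 1))  # (7, [1, 4, 3, 7])
```

The second circulation, carrying the Coordinator message, is omitted here; it would simply visit the same alive processes again.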
b) Explain client centric consistency models. 6M
There are four client-centric consistency models:
Monotonic Reads:
This model provides guarantees on successive reads
If a client process reads the value of data item x, then any successive read operation by that
process should return the same or a more recent value for x
Monotonic Writes:
This consistency model ensures that writes are monotonic
A write operation by a client process on a data item x is completed before any successive
write operation on x by the same process
• A new write on a replica should wait for all old writes on any replica
Read-Your-Writes:
The effect of a write operation on a data item x by a process will always be seen by a
successive read operation on x by the same process
Example scenario:
In a system where passwords are stored in a replicated database, a password change should be
propagated to the replicas so that the user's next login sees the new password, whichever
replica serves it
Writes-Follow-Reads:
A write operation by a process on a data item x following a previous read operation on x by
the same process is guaranteed to take place on the same or a more recent value of x that
was read
Example scenario:
Users of a newsgroup should post their comments only after they have read the article and
(all) previous comments
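One possible way to enforce these guarantees on the client side is to let each session remember the newest version it has read or written and refuse replicas that are older. The Python sketch below illustrates that idea for a single data item x. It is an assumption-laden illustration rather than a standard implementation: there is a single client, versions are a plain counter, and the `Replica` and `Session` classes are hypothetical.

```python
class Replica:
    """One copy of data item x, tagged with a version counter."""

    def __init__(self):
        self.version = 0
        self.value = None

class Session:
    """Client session that rejects replicas older than what it has already seen."""

    def __init__(self):
        self.min_version = 0   # newest version this client has read or written

    def _check(self, replica):
        if replica.version < self.min_version:
            raise RuntimeError("replica is stale for this session")

    def write(self, replica, value):
        self._check(replica)               # monotonic writes, writes-follow-reads
        replica.version += 1
        replica.value = value
        self.min_version = replica.version

    def read(self, replica):
        self._check(replica)               # monotonic reads, read-your-writes
        self.min_version = replica.version
        return replica.value

r1, r2 = Replica(), Replica()
s = Session()
s.write(r1, "new-password")
print(s.read(r1))                          # new-password
try:
    s.read(r2)                             # r2 has not received the write yet
except RuntimeError as err:
    print(err)                             # replica is stale for this session
```

A real system would forward the missing writes to the stale replica instead of failing, but the check itself is the same.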
7 a) What are the uses of clock synchronization? Discuss about various clock 8M
synchronization algorithms.
Note: Give full marks for the students who write either Physical clock synchronization
algorithms or Logical clock synchronization algorithms
Clock synchronization is a mechanism for keeping the clocks of all the computers in a
distributed system (DS) consistent with one another. It is used to order and coordinate the
activities of processes, based either on absolute time or on the logical order in which
events occur.
Physical Clock Synchronization Algorithms
Cristian’s Algorithm
• Flaviu Cristian (in 1989) provided an algorithm to synchronize networked computers
with a time server
• The basic idea:
• Identify a network time server that has an accurate source for time (e.g., the
time server has a UTC receiver)
• All the clients contact the network time server for synchronization
• However, the network delay incurred when the client contacts the time server means
that the time in the reply is already outdated when it arrives
• The algorithm estimates the network delay and compensates for it
Algorithm:
• Client Cli sends a request to Time Server Ser, timestamped with its local clock time T1
• Ser will record the time of receipt T2 according to its local clock
• dTreq is network delay for request transmission
• Ser replies to Cli at its local time T3, piggybacking T1 and T2
• Cli receives the reply at its local time T4
• dTres is the network delay for response transmission
• Now Cli has the information T1, T2, T3 and T4
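• Assuming that the transmission delays from Cli → Ser and from Ser → Cli are the same:
T2 - T1 ≈ T4 - T3
dTres ≈ ((T2 - T1) + (T4 - T3))/2
• Cli estimates its offset θ relative to Ser:
θ = T3 + dTres - T4
  = T3 + ((T2 - T1) + (T4 - T3))/2 - T4
  = ((T2 - T1) + (T3 - T4))/2
• If θ > 0, the client clock is behind and should be advanced by θ seconds; if θ < 0, it is
ahead and should be set back by |θ| seconds (in practice by slowing the clock down, so
that time never runs backwards)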
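The offset formula can be checked with a small Python function. The function name `cristian_offset` and the sample timestamps (in milliseconds) are assumptions made for illustration: the client clock is taken to be 5000 ms behind the server, with a one-way delay of 100 ms in each direction.

```python
def cristian_offset(t1, t2, t3, t4):
    """Return (theta, one_way_delay) from the four timestamps T1..T4."""
    delay = ((t2 - t1) + (t4 - t3)) / 2   # dTres, assuming symmetric delays
    theta = t3 + delay - t4               # offset of the client relative to the server
    return theta, delay

# T1 and T4 are read on the client clock, T2 and T3 on the server clock.
print(cristian_offset(100000, 105100, 105150, 100250))  # (5000.0, 100.0)
```

The positive offset tells the client to advance its clock by 5000 ms, which matches the assumed skew.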
Berkeley Algorithm
Berkeley algorithm is a distributed approach for time synchronization
Approach:
1. A time server periodically (approx. once in 4 minutes) sends its time to all the
computers and polls them for the time difference
2. The computers compute the time difference and then reply
3. The server averages the reported differences (including its own, which is zero) and
computes the adjustment that each computer needs
4. The server commands all the computers to update their time (by gradual time
synchronization)
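The averaging step can be sketched in Python as follows. The sketch assumes that the server includes its own clock in the average and that times are plain numbers (minutes here); the function name `berkeley_adjustments` is hypothetical.

```python
def berkeley_adjustments(server_time, client_times):
    """Return the adjustment each machine should apply, server first."""
    # Steps 1-2: differences relative to the server (the server's own is 0).
    diffs = [0] + [t - server_time for t in client_times]
    # Step 3: average difference over all machines.
    average = sum(diffs) / len(diffs)
    # Step 4: each machine is told how far to move towards the average.
    return [average - d for d in diffs]

# Server reads 3:00 (180 min), clients read 3:25 (205) and 2:50 (170).
print(berkeley_adjustments(180, [205, 170]))  # [5.0, -20.0, 15.0]
```

After applying the adjustments all three machines read 3:05, even though none of them may be close to the true time.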
Network Time Protocol:
• NTP defines an architecture for a time service and a protocol to distribute time
information over the Internet
• In NTP, servers are connected in a logical hierarchy called synchronization subnet
• The levels of the synchronization subnet are called strata
• Stratum 1 servers have the most accurate time information (connected to a UTC
receiver)
• Servers in each stratum act as time servers to the servers in the next stratum (the one
with the next higher number)
Operation:
• When a time server A contacts time server B for synchronization
• If stratum(A) <= stratum(B), then A does not synchronize with B
• If stratum(A) > stratum(B), then:
• Time server A synchronizes with B
• An algorithm similar to Cristian’s algorithm is used to synchronize.
However, larger statistical samples are taken before updating the
clock
• Time server A updates its stratum
stratum(A) = stratum(B) + 1
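The stratum rule can be written as a tiny Python function. It covers only the stratum bookkeeping described above, not the clock adjustment itself, and the function name `stratum_after_contact` is hypothetical.

```python
def stratum_after_contact(stratum_a, stratum_b):
    """Return A's stratum after A has contacted B."""
    if stratum_a <= stratum_b:
        return stratum_a        # B is no closer to the reference: A does not synchronize
    return stratum_b + 1        # A synchronizes with B and moves one level below it

print(stratum_after_contact(4, 1))  # 2
print(stratum_after_contact(2, 3))  # 2
```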
Logical Clock Synchronization algorithms:
• Synchronization based on “relative time”.
• Note that (with this mechanism) there is no requirement for “relative time” to have
any relation to the “real time”.
• What’s important is that the processes in the Distributed System agree on the
ordering in which certain events occur.
• Such “clocks” are referred to as Logical Clocks.
Lamport's Logical clocks:
• First point: if two processes do not interact, then their clocks do not need to be
synchronized – they can operate concurrently without fear of interfering with each
other.
• Second (critical) point: it does not matter whether two processes share a common
notion of what the “real” current time is. What does matter is that the processes
have some agreement on the order in which certain events occur.
• Lamport used these two observations to define the “happens-before” relation (also
often referred to within the context of Lamport’s Timestamps).
The Happens-Before Relation:
• If A and B are events in the same process, and A occurs before B, then we can state
that:
• A “happens-before” B is true.
• Equally, if A is the event of a message being sent by one process, and B is the event
of the same message being received by another process, then A “happens-before” B
is also true.
• (Note that a message cannot be received before it is sent, since it takes a finite,
nonzero amount of time to arrive … and, of course, time is not allowed to run
backwards).
• Obviously, if A “happens-before” B and B “happens-before” C, then it follows that
A “happens-before” C.
• If the “happens-before” relation holds, deductions about the clock “value” assigned
to each event can then be made.
• It therefore follows that if C(A) is the time assigned to event A and A “happens-before” B,
then C(A) < C(B), and so on.
• If two events on separate sites have same time, use unique PIDs to break the tie.
• Now, assume three processes are in a DS: A, B and C.
• All have their own physical clocks (which run at slightly different rates, “clock drift”,
so their readings diverge over time, “clock skew”).
• A sends a message to B and includes a “timestamp”.
• If this sending timestamp is less than the time of arrival at B, things are OK, as the
“happens-before” relation still holds (i.e., A “happens-before” B is true).
• However, if the timestamp is more than the time of arrival at B, things are NOT OK
(as A “happens-before” B is not true, and this cannot be as the receipt of a message
has to occur after it was sent).
• The question to ask is:
– How can some event that “happens-before” some other event possibly have
occurred at a later time??
• The answer is: it can’t!
• So, Lamport’s solution is to have the receiving process adjust its clock forward to
one more than the sending timestamp value. This allows the “happens-before”
relation to hold, and also keeps all the clocks running in a synchronized state. The
clocks are all kept in sync relative to each other.
• The "happens-before" relation → can be observed directly in two situations:
1. If a and b are events in the same process, and a occurs before b, then a → b is true.
2. If a is the event of a message being sent by one process, and b is the event of the
message being received by another process, then a → b.
• Updating counter Ci for process Pi :
1. Before executing an event Pi executes
Ci ← Ci + 1.
2. When process Pi sends a message m to Pj,
it sets m’s timestamp ts(m) equal to Ci
after having executed the previous step.
3. Upon the receipt of a message m, process Pj adjusts its own local counter as
Cj ← max{Cj , ts(m)}, after which it then executes the first step and delivers the
message to the application.
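The three update rules translate almost directly into Python. The class below is a minimal sketch; the class name `LamportClock` and its method names are hypothetical, and message transport is reduced to passing the timestamp by hand.

```python
class LamportClock:
    """Logical counter Ci of one process Pi."""

    def __init__(self):
        self.counter = 0

    def tick(self):
        """Rule 1: increment before executing any event."""
        self.counter += 1
        return self.counter

    def send(self):
        """Rule 2: a send is an event; the message carries ts(m) = Ci."""
        return self.tick()

    def receive(self, ts):
        """Rule 3: take the maximum, then apply rule 1 before delivery."""
        self.counter = max(self.counter, ts)
        return self.tick()

p1, p2 = LamportClock(), LamportClock()
for _ in range(5):
    p1.tick()              # five local events at P1
ts = p1.send()             # ts(m) = 6
print(p2.receive(ts))      # 7, so the receive is ordered after the send
```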
Vector Clock Synchronization Algorithm:
• Vector clocks are constructed by letting each process Pi maintain a vector VCi with
the following two properties:
1. VCi [ i ] is the number of events that have occurred so far at Pi. In other words, VCi [
i ] is the local logical clock at process Pi .
2. If VCi [ j ] = k then Pi knows that k events have occurred at Pj. It is thus Pi’s
knowledge of the local time at Pj .
• Steps carried out to maintain property 2:
1. Before executing an event, Pi executes
VCi [ i ] ← VCi [i ] + 1.
2. When process Pi sends a message m to Pj, it sets m’s (vector) timestamp ts(m) equal
to VCi after having executed the previous step.
3. Upon the receipt of a message m, process Pj
adjusts its own vector by setting
VCj [k ] ← max{VCj [k ], ts(m)[k ]} for each k,
after which it executes the first step and delivers
the message to the application.
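The vector clock rules can be sketched in Python in the same way. The class name `VectorClock` is hypothetical, and the helper `happened_before`, which compares two timestamps component by component, is an addition for illustration that is not part of the steps above.

```python
class VectorClock:
    """Vector VCi of process Pi in a system of n processes."""

    def __init__(self, index, n):
        self.index = index
        self.vc = [0] * n

    def tick(self):
        """Step 1: increment the own entry before executing an event."""
        self.vc[self.index] += 1

    def send(self):
        """Step 2: a send is an event; the message carries a copy of VCi."""
        self.tick()
        return list(self.vc)

    def receive(self, ts):
        """Step 3: component-wise maximum, then step 1 before delivery."""
        self.vc = [max(a, b) for a, b in zip(self.vc, ts)]
        self.tick()

def happened_before(u, v):
    """True if timestamp u causally precedes timestamp v."""
    return all(a <= b for a, b in zip(u, v)) and u != v

p0, p1, p2 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
m = p0.send()                          # [1, 0, 0]
p1.receive(m)
print(p1.vc)                           # [1, 1, 0]
p2.tick()                              # unrelated local event at P2: [0, 0, 1]
print(happened_before(m, p1.vc))       # True
print(happened_before(p2.vc, p1.vc))   # False (the events are concurrent)
```

Unlike Lamport timestamps, the comparison can tell that two events are concurrent, because neither vector dominates the other.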
Agreement in faulty systems:
All non-faulty processes reach consensus on some issue, and establish that consensus within
a finite number of steps
Different assumptions about the underlying system require different solutions:
• Synchronous versus asynchronous systems
• Communication delay is bounded or not
• Message delivery is ordered or not
• Message transmission is done through unicasting or multicasting
Reaching a distributed agreement is only possible in the following circumstances:
Lamport suggests that each process i constructs a vector V of length N, such that if process i
is non-faulty, V[i] = vi. Otherwise, V[i] is undefined.
9 a) Explain about two phase commit (2PC) protocol. 8M
• First developed in 1978.
• Summarized: GET READY, OK, GO AHEAD.
1. The coordinator sends a VOTE_REQUEST message to all group members.
2. Each group member returns VOTE_COMMIT if it can commit locally, otherwise
VOTE_ABORT.
3. All votes are collected by the coordinator. A GLOBAL_COMMIT is sent if all the
group members voted to commit. If at least one group member voted to abort, a
GLOBAL_ABORT is sent.
4. The group members then COMMIT or ABORT based on the last message received
from the coordinator.
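The failure-free path of these four steps can be sketched in Python. The sketch assumes that no messages are lost and nobody crashes, so timeouts and logging are left out; the function name `two_phase_commit` and the participant names are hypothetical, and each participant is modelled as a callable that returns True if it can commit locally.

```python
def two_phase_commit(participants):
    """Run one failure-free 2PC round; return (decision, votes)."""
    # Phase 1: the coordinator sends VOTE_REQUEST and collects the votes.
    votes = {
        name: "VOTE_COMMIT" if can_commit() else "VOTE_ABORT"
        for name, can_commit in participants.items()
    }
    # Phase 2: commit only if every participant voted to commit.
    if all(v == "VOTE_COMMIT" for v in votes.values()):
        decision = "GLOBAL_COMMIT"
    else:
        decision = "GLOBAL_ABORT"
    return decision, votes

print(two_phase_commit({"db1": lambda: True, "db2": lambda: True})[0])   # GLOBAL_COMMIT
print(two_phase_commit({"db1": lambda: True, "db2": lambda: False})[0])  # GLOBAL_ABORT
```

A single VOTE_ABORT is enough to abort the whole operation, which gives the all-or-nothing behaviour required of a distributed commit.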
Actions taken by a participant P when residing in state READY and having contacted
another participant Q
Actions by Coordinator:
write START_2PC to local log;
multicast VOTE_REQUEST to all participants;
while not all votes have been collected {
    wait for any incoming vote;
    if timeout {
        write GLOBAL_ABORT to local log;
        multicast GLOBAL_ABORT to all participants;
        exit;
    }
    record vote;
}
if all participants sent VOTE_COMMIT and coordinator votes COMMIT {
    write GLOBAL_COMMIT to local log;
    multicast GLOBAL_COMMIT to all participants;
} else {
    write GLOBAL_ABORT to local log;
    multicast GLOBAL_ABORT to all participants;
}
Actions by Participants:
write INIT to local log;
wait for VOTE_REQUEST from coordinator;
if timeout {
    write VOTE_ABORT to local log;
    exit;
}
if participant votes COMMIT {
    write VOTE_COMMIT to local log;
    send VOTE_COMMIT to coordinator;
    wait for DECISION from coordinator;
    if timeout {
        multicast DECISION_REQUEST to other participants;
        wait until DECISION is received; /*remain blocked*/
        write DECISION to local log;
    }
    if DECISION == GLOBAL_COMMIT { write GLOBAL_COMMIT to local log; }
    else if DECISION == GLOBAL_ABORT { write GLOBAL_ABORT to local log; }
} else {
    write VOTE_ABORT to local log;
    send VOTE_ABORT to coordinator;
}
Actions for handling decision requests:
/*executed by separate thread*/
while true {
    wait until any incoming DECISION_REQUEST is received; /*remain blocked*/
    read most recently recorded STATE from the local log;
    if STATE == GLOBAL_COMMIT
        send GLOBAL_COMMIT to requesting participant;
    else if STATE == INIT or STATE == GLOBAL_ABORT
        send GLOBAL_ABORT to requesting participant;
    else
        skip; /*participant remains blocked*/
}