Distributed system module 1
Introduction to Distributed Systems
Department of
Computer Science & Engineering
www.cambridge.edu.in
Why Distributed Systems?
CiTech, BANGALORE
What is a Distributed System?
Components of a Distributed System
Characteristics of a Distributed System
• Concurrency: Multiple processes execute simultaneously across
different machines.
• Scalability: Ability to grow and handle increased loads by adding
more resources.
• Fault Tolerance: Ability to continue functioning despite failures
of some components.
• Transparency: The system hides the complexity of distribution
from users and applications, making it seem like a single system.
Challenges of a Distributed System
• Heterogeneity: The Internet enables users to access services and run
applications over a heterogeneous collection of computers and
networks.
• Openness: The openness of a computer system is the characteristic
that determines whether the system can be extended and
reimplemented in various ways.
• Security: Many of the information resources that are made available
and maintained in distributed systems have a high intrinsic value to
their users. Their security is therefore of considerable importance.
• Concurrency: Services and resources may be accessed by several
clients at the same time, so shared resources must be designed to
remain consistent under concurrent access.
• Transparency: Transparency is defined as the concealment from
the user and the application programmer of the separation of
components in a distributed system, so that the system is
perceived as a whole rather than as a collection of independent
components.
• Failure handling: Designing the system to handle node failures
and network issues gracefully.
• Scalability: Ensuring that the system can handle increased load
without significant performance degradation.
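• Quality of service: The main non-functional properties that affect
the service experienced by clients and users are reliability,
security and performance.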
Applications of Distributed System
Examples
Chapter 5 – Remote Invocation
Request-Reply Protocols
A protocol built over datagrams avoids unnecessary overheads
associated with the TCP stream protocol. In particular:
• Acknowledgements are redundant, since requests are followed by
replies.
• Establishing a connection involves two extra pairs of messages in
addition to the pair required for a request and a reply.
• Flow control is redundant for the majority of invocations, which
pass only small arguments and results.
Request-Reply Protocols
• The protocol we describe here is based on a trio of communication
primitives, doOperation, getRequest and sendReply, as shown in
Figure.
Request-Reply Protocols
• The doOperation method is used by clients to invoke remote
operations. Its arguments specify the remote server and which
operation to invoke, together with additional information
(arguments) required by the operation. Its result is a byte array
containing the reply.
• getRequest is used by a server process to acquire service requests.
• sendReply is used to send the reply message to the client.
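The interplay of the three primitives can be sketched as follows. This is a minimal illustration, not the protocol's actual implementation: the two in-memory queues stand in for the datagram network, the remote server reference is omitted, and the names `request_queue`, `reply_queue`, `serve_once` and the meaning of operation 1 are assumptions made for the example.

```python
import queue
import threading

# Hypothetical in-memory "network": one queue per direction (assumption).
request_queue = queue.Queue()
reply_queue = queue.Queue()


def do_operation(operation_id: int, arguments: bytes, timeout: float = 1.0) -> bytes:
    """Client side: send a request, then block until the reply arrives."""
    request_queue.put((operation_id, arguments))
    return reply_queue.get(timeout=timeout)


def get_request():
    """Server side: block until a client request is available."""
    return request_queue.get()


def send_reply(reply: bytes) -> None:
    """Server side: send the reply message back to the client."""
    reply_queue.put(reply)


def serve_once() -> None:
    """Handle exactly one request."""
    operation_id, arguments = get_request()
    # Operation 1 is assumed to mean "upper-case the argument bytes".
    if operation_id == 1:
        result = arguments.upper()
    else:
        result = b"unknown operation"
    send_reply(result)


server = threading.Thread(target=serve_once)
server.start()
reply = do_operation(1, b"hello")
server.join()
print(reply)
```

The client blocks inside `do_operation` until the server has called `send_reply`, which mirrors the synchronous nature of the exchange.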
Message identifiers
A message identifier consists of two parts:
1. a requestId, which is taken from an increasing sequence of integers
by the sending process;
2. an identifier for the sender process, for example, its port and
Internet address.
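A possible way to generate such identifiers is sketched below. The class and field names (`MessageId`, `Sender`, `next_message_id`) and the example addresses are hypothetical; only the two-part structure comes from the description above.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageId:
    request_id: int      # part 1: unique within the sending process
    sender_address: str  # part 2: identifies the sender process ...
    sender_port: int     # ... together with its port


class Sender:
    def __init__(self, address: str, port: int):
        self.address = address
        self.port = port
        self._next_request_id = itertools.count(1)  # increasing integers

    def next_message_id(self) -> MessageId:
        return MessageId(next(self._next_request_id), self.address, self.port)


client_a = Sender("192.0.2.1", 4000)
client_b = Sender("192.0.2.2", 4000)
first = client_a.next_message_id()
second = client_a.next_message_id()
other = client_b.next_message_id()
```

Here `first` and `other` carry the same `request_id`, yet the full identifiers differ because the sender part differs. The request number makes an identifier unique to its sender, and the sender part makes it unique across the system.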
Failure model of the request-reply protocol
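If the three primitives doOperation, getRequest and sendReply are
implemented over UDP datagrams, they suffer from the same
communication failures as UDP. That is: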
• They suffer from omission failures.
• Messages are not guaranteed to be delivered in sender order.
Timeouts & Discarding duplicate request messages
• When doOperation times out while waiting for a reply, either the
request message or the reply message may have been lost.
• In cases when the request message is retransmitted, the server
may receive it more than once. This can lead to the server
executing an operation more than once for the same request.
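The duplicate problem and one way to guard against it can be sketched as below. The `AccountServer` class, its `handle` method and the deposit operation are hypothetical. The sketch assumes a retransmitted request carries the same request identifier as the original, so the server can recognise it.

```python
class AccountServer:
    def __init__(self):
        self.balance = 0
        # (client, request_id) pairs whose operation has already been executed
        self.seen = set()

    def handle(self, client: str, request_id: int, amount: int) -> bool:
        """Execute a deposit unless this request was already executed.

        Returns True if the operation ran, False if the message was a duplicate.
        """
        key = (client, request_id)
        if key in self.seen:
            return False  # duplicate request message: discard it
        self.seen.add(key)
        self.balance += amount
        return True


server = AccountServer()
executed_first = server.handle("client-1", 1, 10)   # original request
executed_again = server.handle("client-1", 1, 10)   # retransmission after a timeout
print(server.balance)
```

Without the `seen` check the retransmission would deposit the amount a second time.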
Request-reply message structure
The information to be transmitted in a request message or a reply
message is shown in Figure
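Since the figure itself is not reproduced here, the following sketch assumes the field layout commonly given for this protocol: a message type, a request identifier, a remote reference, an operation identifier and the arguments. The wire encoding (field widths, field order, `HEADER`) is purely an assumption for illustration.

```python
import struct
from dataclasses import dataclass

REQUEST, REPLY = 0, 1

# Assumed header: type (1 byte), request id (4), operation id (4),
# length of remote reference (2), length of arguments (4); network byte order.
HEADER = struct.Struct("!BIIHI")


@dataclass
class RequestReplyMessage:
    message_type: int        # REQUEST or REPLY
    request_id: int
    remote_reference: bytes  # opaque encoding of the remote reference
    operation_id: int
    arguments: bytes         # marshalled arguments or results

    def encode(self) -> bytes:
        header = HEADER.pack(
            self.message_type,
            self.request_id,
            self.operation_id,
            len(self.remote_reference),
            len(self.arguments),
        )
        return header + self.remote_reference + self.arguments

    @classmethod
    def decode(cls, data: bytes) -> "RequestReplyMessage":
        message_type, request_id, operation_id, ref_len, arg_len = HEADER.unpack_from(data)
        start = HEADER.size
        remote_reference = data[start:start + ref_len]
        arguments = data[start + ref_len:start + ref_len + arg_len]
        return cls(message_type, request_id, remote_reference, operation_id, arguments)


original = RequestReplyMessage(REQUEST, 42, b"server-7", 3, b"\x01\x02")
decoded = RequestReplyMessage.decode(original.encode())
```

A reply would reuse the `request_id` of the request it answers, which lets doOperation match replies to outstanding requests.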
Timeouts
Discarding duplicate request messages
Styles of exchange protocols
Remote Procedure Call (RPC)
• In RPC, procedures on remote machines can be called as if they are
procedures in the local address space.
• The underlying RPC system then hides important aspects of
distribution, including the encoding and decoding of parameters and
results, the passing of messages and the preserving of the required
semantics for the procedure call.
• This concept was first introduced by Birrell and Nelson [1984] and
paved the way for many of the developments in distributed systems
programming.
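How an RPC system might hide encoding and message passing behind an ordinary-looking call can be sketched as follows. This is an illustration only: JSON stands in for the marshalling format, a direct function call stands in for the network, and `ClientStub`, `server_dispatch` and `PROCEDURES` are hypothetical names.

```python
import json


def add(a, b):
    """The procedure that actually lives on the server."""
    return a + b


PROCEDURES = {"add": add}


def server_dispatch(request: bytes) -> bytes:
    """Server side: decode the request, call the procedure, encode the result."""
    call = json.loads(request)
    result = PROCEDURES[call["procedure"]](*call["args"])
    return json.dumps({"result": result}).encode()


class ClientStub:
    """Client side: makes a remote procedure look like a local one."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def __getattr__(self, name):
        # Called only for attributes not found normally, i.e. procedure names.
        def remote_call(*args):
            request = json.dumps({"procedure": name, "args": list(args)}).encode()
            reply = self._transport(request)
            return json.loads(reply)["result"]
        return remote_call


stub = ClientStub(server_dispatch)
total = stub.add(2, 3)
print(total)  # 5
```

The caller writes `stub.add(2, 3)` exactly as it would for a local procedure. The marshalling and the message exchange stay inside the stub and the dispatcher.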
Design issues for RPC
Before looking at the implementation of RPC systems, we look at three
issues that are important in understanding this concept:
• the style of programming promoted by RPC – programming with
interfaces;
• the call semantics associated with RPC;
• the key issue of transparency and how it relates to remote procedure
calls.
Interfaces in distributed systems
• In a distributed program, the modules can run in separate processes.
• In the client-server model, in particular, each server provides a set of
procedures that are available for use by clients.
• The term service interface is used to refer to the specification of the
procedures offered by a server, defining the types of the arguments of
each of the procedures.
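A service interface in this sense can be sketched with an abstract base class. The `TemperatureService` interface, its two procedures and the in-memory implementation are hypothetical examples, not part of any real system.

```python
from abc import ABC, abstractmethod


class TemperatureService(ABC):
    """Service interface: names the procedures and the types of their arguments."""

    @abstractmethod
    def read_celsius(self, sensor_id: int) -> float: ...

    @abstractmethod
    def set_alarm(self, sensor_id: int, threshold: float) -> bool: ...


class InMemoryTemperatureService(TemperatureService):
    """One possible server-side implementation of the interface."""

    def __init__(self, readings):
        self._readings = dict(readings)
        self._alarms = {}

    def read_celsius(self, sensor_id: int) -> float:
        return self._readings[sensor_id]

    def set_alarm(self, sensor_id: int, threshold: float) -> bool:
        if sensor_id not in self._readings:
            return False
        self._alarms[sensor_id] = threshold
        return True


def report(service: TemperatureService, sensor_id: int) -> str:
    """Client code depends only on the interface, not on the implementation."""
    return f"sensor {sensor_id}: {service.read_celsius(sensor_id):.1f} C"


service = InMemoryTemperatureService({7: 21.5})
line = report(service, 7)
```

Because `report` is written against the interface, the implementation behind it could be replaced by a remote one without changing the client code.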
CORBA IDL example
CORBA (Common Object Request Broker Architecture)
• It is a standard defined by the Object Management Group (OMG)
that allows pieces of programs, known as objects, to communicate
with one another regardless of where they are located (locally or
across a network) and regardless of the programming language
used to write them.
• CORBA achieves this through its middleware framework, enabling
interoperability between distributed systems.
Idempotency
RPC call semantics
The main choices are:
• Retry request message: Controls whether to retransmit the request
message until either a reply is received or the server is assumed to
have failed.
• Duplicate filtering: Controls when retransmissions are used and
whether to filter out duplicate requests at the server.
• Retransmission of results: Controls whether to keep a history of
result messages to enable lost results to be retransmitted without
re-executing the operations at the server.
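How the three choices combine into maybe, at-least-once and at-most-once semantics can be sketched with a small simulation. It models only lost replies (every request is assumed to reach the server), and the names `CallSemantics` and `executions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSemantics:
    retry_request: bool      # retransmit the request after a timeout?
    filter_duplicates: bool  # recognise repeated requests at the server?
    retransmit_reply: bool   # keep a history of results for retransmission?


MAYBE = CallSemantics(False, False, False)
AT_LEAST_ONCE = CallSemantics(True, False, False)
AT_MOST_ONCE = CallSemantics(True, True, True)


def executions(semantics: CallSemantics, lost_replies: int, max_attempts: int = 5):
    """Return (times the server executed, reply seen by the client or None).

    The first `lost_replies` replies are assumed to be lost in transit.
    """
    history = {}   # request id -> stored reply (server side)
    executed = 0
    request_id = 1  # a retransmission reuses the same request id
    attempts = max_attempts if semantics.retry_request else 1
    for attempt in range(attempts):
        if semantics.filter_duplicates and request_id in history:
            reply = history[request_id]  # resend stored result, do not re-execute
        else:
            executed += 1
            reply = "done"
            if semantics.retransmit_reply:
                history[request_id] = reply
        if attempt >= lost_replies:
            return executed, reply  # this reply reached the client
    return executed, None


maybe_result = executions(MAYBE, lost_replies=2)
at_least_once_result = executions(AT_LEAST_ONCE, lost_replies=2)
at_most_once_result = executions(AT_MOST_ONCE, lost_replies=2)
```

With two lost replies, the maybe call executes once and the client never learns the outcome. The at-least-once call executes three times, and the at-most-once call executes once and still delivers the result.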
Maybe semantics
At-least-once semantics
Idempotent operation
At-most-once semantics
Implementation of RPC
Remote Method Invocation (RMI)
The commonalities between RMI and RPC are as follows:
• They both support programming with interfaces, with the resultant
benefits that stem from this approach.
• They are both typically constructed on top of request-reply
protocols and can offer a range of call semantics such as
at-least-once and at-most-once.
• They both offer a similar level of transparency – that is, local and
remote calls employ the same syntax but remote interfaces
typically expose the distributed nature of the underlying call, for
example by supporting remote exceptions.
The advantages of RMI are as follows:
• The programmer is able to use the full expressive power of
object-oriented programming in the development of distributed systems
software, including the use of objects, classes and inheritance, and can
also employ related object-oriented design methodologies and associated
tools.
• Building on the concept of object identity in object-oriented systems, all
objects in an RMI-based system have unique object references (whether
they are local or remote). Such object references can also be passed as
parameters, thus offering significantly richer parameter-passing semantics
than in RPC.
Design issues for RMI
The key added design issue relates to the object model and, in
particular, achieving the transition from objects to distributed objects.
The object model
• Object references
• Interfaces
• Actions
• Exceptions
• Garbage collection
The distributed object model
Implementation of RMI
Communication module: The two cooperating communication modules
carry out the request-reply protocol, which transmits request and reply
messages between the client and server.
Remote reference module
To support its responsibilities, the remote reference module in each process
has a remote object table that records the correspondence between local
object references in that process and remote object references (which are
system-wide).
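A remote object table of this kind might be sketched as below. The layout of `RemoteObjectRef` (host, port, object number) and the method names `export` and `lookup` are assumptions for illustration. Real RMI systems define their own reference format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteObjectRef:
    """System-wide reference: where the object lives plus a local number."""
    host: str
    port: int
    object_number: int


class RemoteObjectTable:
    def __init__(self, host: str, port: int):
        self._host = host
        self._port = port
        self._next_number = 1
        self._ref_to_object = {}  # remote reference -> local object
        self._object_to_ref = {}  # id(local object) -> remote reference

    def export(self, obj) -> RemoteObjectRef:
        """Return the remote reference for a local object, creating it if needed."""
        existing = self._object_to_ref.get(id(obj))
        if existing is not None:
            return existing
        ref = RemoteObjectRef(self._host, self._port, self._next_number)
        self._next_number += 1
        # The table keeps the object alive, so its id() stays valid as a key.
        self._ref_to_object[ref] = obj
        self._object_to_ref[id(obj)] = ref
        return ref

    def lookup(self, ref: RemoteObjectRef):
        """Translate an incoming remote reference to the local object."""
        return self._ref_to_object[ref]


table = RemoteObjectTable("192.0.2.10", 5000)
account = {"owner": "alice"}
ref = table.export(account)
same_ref = table.export(account)
```

Exporting the same object twice yields the same reference, so the correspondence between local and remote references stays one-to-one.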
The RMI software
This consists of a layer of software between the application-level objects
and the communication and remote reference modules.
Distributed garbage collection
The aim of a distributed garbage collector is to ensure that if a local or
remote reference to an object is still held anywhere in a set of distributed
objects, the object itself will continue to exist, but as soon as no object any
longer holds a reference to it, the object will be collected and the memory it
uses recovered.
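One common way to approach this aim is reference counting: the server tracks which client processes hold a remote reference to each object. The sketch below assumes that approach. The names `ExportedObjects`, `add_ref` and `remove_ref` are hypothetical, and references held locally within the server are ignored for simplicity.

```python
class ExportedObjects:
    def __init__(self):
        self.objects = {}  # object id -> the object itself
        self.holders = {}  # object id -> set of client processes holding a reference

    def export(self, object_id: str, obj) -> None:
        self.objects[object_id] = obj
        self.holders[object_id] = set()

    def add_ref(self, object_id: str, client: str) -> None:
        """A client process has obtained a remote reference to the object."""
        self.holders[object_id].add(client)

    def remove_ref(self, object_id: str, client: str) -> None:
        """A client process no longer holds a remote reference to the object."""
        holders = self.holders[object_id]
        holders.discard(client)
        if not holders:
            # No remote holder is left: drop the object so that its memory
            # can be recovered by the local garbage collector.
            del self.objects[object_id]
            del self.holders[object_id]


exported = ExportedObjects()
exported.export("printer", object())
exported.add_ref("printer", "client-A")
exported.add_ref("printer", "client-B")
exported.remove_ref("printer", "client-A")
still_alive = "printer" in exported.objects
exported.remove_ref("printer", "client-B")
collected = "printer" not in exported.objects
```

The object survives while at least one client still holds a reference and is released once the last holder has been removed.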