Chapter 3 Communication in Distributed Systems
Distributed Systems
CHAPTER 3:
COMMUNICATION IN DISTRIBUTED SYSTEMS
Wilbert P. Umadhay
Instructor
email: Wilbert.Umadhay@antiquespride.edu.ph
CHAPTER GOALS
Wilbert P. Umadhay P a g e 1 | 14
UA –Main Campus-BS Information Technology NAS 5: Distributed Systems
This lesson covers communication between distributed objects by means of two models: remote
method invocation (RMI) and remote procedure call (RPC).
Both RMI and RPC are implemented on top of request and reply primitives, which are in turn
implemented on top of the network protocol (e.g. TCP or UDP in the case of the Internet).
Network Protocol
Middleware and distributed applications are implemented on top of a network protocol. Such a
protocol is implemented as several layers.
TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are both transport
protocols implemented on top of the Internet Protocol (IP).
-TCP is a reliable protocol: it guarantees that all data sent by the sending process is
delivered to the receiving process, in the same order.
-TCP implements mechanisms on top of IP to meet reliability guarantees.
-UDP does not guarantee delivery. Under IP, packets may be dropped because of
congestion or network errors, and UDP adds no reliability mechanism of its own.
-UDP provides a means of transmitting messages with minimal additional cost or transmission
delay beyond that of IP itself.
UDP is therefore suited to applications and services that do not require reliable delivery of
messages. If reliable delivery is required over UDP, reliability mechanisms have to
be implemented on top of the network protocol (in the middleware).
TCP meets its reliability guarantees through the following mechanisms:
-Sequencing: A sequence number is attached to each transmitted segment (packet). At the
receiver side, packets are delivered in order of this number.
-Flow control: The sender takes care not to overwhelm the receiver. This is based on periodic
acknowledgements that the sender receives from the receiver.
-Retransmission and duplicate handling: If a segment is not acknowledged within a timeout, it
is retransmitted. Using the sequence number, the receiver detects and rejects duplicates.
-Buffering: Buffering balances the flow. If the receiving buffer is full, incoming segments are
dropped. They will be retransmitted by the sender.
-Checksum: Each segment carries a checksum. If the received segment does not match its
checksum, it is dropped (and will be retransmitted).
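The receiver side of sequencing, duplicate handling and checksum verification can be sketched in a few lines. This is a simplified illustration, not real TCP: the names `make_segment` and `Receiver` are hypothetical, each segment gets one integer sequence number, and CRC-32 stands in for the checksum (real TCP numbers bytes and uses a different checksum).

```python
import zlib

def make_segment(seq, payload):
    """Build a (seq, payload, checksum) tuple. CRC-32 is an assumption for this sketch."""
    return (seq, payload, zlib.crc32(payload))

class Receiver:
    """Toy receiver: verifies checksums, rejects duplicates, delivers in order."""

    def __init__(self):
        self.next_seq = 0     # next sequence number to hand to the application
        self.pending = {}     # out-of-order segments waiting for earlier ones
        self.delivered = []   # payloads delivered to the application, in order

    def receive(self, segment):
        seq, payload, checksum = segment
        if zlib.crc32(payload) != checksum:
            return None       # corrupted: drop it, no acknowledgement is sent
        if seq >= self.next_seq and seq not in self.pending:
            self.pending[seq] = payload   # new segment: buffer it
        # Anything else is a duplicate and is silently ignored.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return self.next_seq  # acknowledgement: next sequence number expected
```

In this sketch a sender that gets no acknowledgement for a segment within its timeout would simply send the same segment again; the receiver either delivers it or recognises it as a duplicate.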
The Client:
-Sends a request to the server reference.
-Receives the reply from the server.
The Server:
-Receives a request from the client reference.
-Executes the requested operations.
-Sends the reply to the client reference.
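The request and reply steps can be tried out with a small sketch. It assumes a connected socket pair inside one process as a stand-in for a real network connection, and a made-up text request format ("add 2 3"); the function names `server`, `client` and `demo` are hypothetical.

```python
import socket
import threading

def server(conn):
    """Receive one request, execute the requested operation, send the reply."""
    request = conn.recv(1024).decode()        # receive request from the client
    op, a, b = request.split()
    result = int(a) + int(b) if op == "add" else "unknown operation"
    conn.sendall(str(result).encode())        # send reply to the client
    conn.close()

def client(conn, request):
    """Send a request and block until the reply arrives."""
    conn.sendall(request.encode())            # send request to the server
    reply = conn.recv(1024).decode()          # receive reply from the server
    conn.close()
    return reply

def demo():
    # socketpair() gives two connected endpoints; it replaces the network here.
    client_end, server_end = socket.socketpair()
    worker = threading.Thread(target=server, args=(server_end,))
    worker.start()
    reply = client(client_end, "add 2 3")
    worker.join()
    return reply
```

Calling `demo()` runs the server in a separate thread, so the client really does wait for the reply message before continuing.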
RMI (Remote Method Invocation) is a Java API that provides a mechanism for creating
distributed applications. RMI allows an object to invoke methods on an object
running in another JVM. It provides remote communication between
applications using two objects: the stub (the client-side proxy) and the skeleton (on the server side).
-Object A and Object B belong to the application.
-The remote reference module and the communication module belong to the middleware.
-The proxy for B and the skeleton for B represent the so-called RMI software. They are
situated at the border between middleware and application and are generated
automatically with the help of tools that are delivered together with the middleware
software.
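The division of work between proxy and skeleton can be illustrated with a minimal sketch. It is written in Python rather than Java and is not the Java RMI API: the classes `Calculator`, `Skeleton` and `Proxy` are hypothetical, JSON is an assumed message format, and a direct function call replaces the communication module.

```python
import json

class Calculator:
    """Object B: the remote object that lives on the server."""

    def add(self, a, b):
        return a + b

class Skeleton:
    """Server side: unmarshals the request, invokes B, marshals the reply."""

    def __init__(self, target):
        self.target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode())
        method = getattr(self.target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode()

class Proxy:
    """Client side: looks like B to the caller, but forwards each call as a message."""

    def __init__(self, send):
        self.send = send   # function that carries bytes to the server and back

    def __getattr__(self, name):
        # Called only for attributes the proxy lacks, i.e. the remote methods.
        def remote_call(*args):
            request = json.dumps({"method": name, "args": args}).encode()
            reply = self.send(request)
            return json.loads(reply.decode())["result"]
        return remote_call

skeleton = Skeleton(Calculator())
b = Proxy(skeleton.handle)   # assumption: a direct call stands in for the network
```

Object A would simply call `b.add(2, 3)` and never see the marshalling; a real system would generate the proxy from the interface instead of accepting any method name.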
Question 1
What if the two computers use different representations for data (integers, chars, floating
point)?
-The most elegant and flexible solution is to have a standard representation used for all
values sent through the network; the proxy and skeleton convert to/from this
representation during marshalling/unmarshalling.
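A minimal sketch of such a standard representation, using Python's `struct` module. The choice of fields (one integer, one floating point value, one char) and the names `WIRE_FORMAT`, `marshal` and `unmarshal` are assumptions made for illustration.

```python
import struct

# "!" selects network byte order (big-endian) with standard sizes and no padding:
# i = 4-byte integer, d = 8-byte IEEE 754 double, c = 1-byte char.
WIRE_FORMAT = "!idc"

def marshal(count, value, flag):
    """Convert local values to the agreed wire representation."""
    return struct.pack(WIRE_FORMAT, count, value, flag)

def unmarshal(data):
    """Convert the wire representation back to local values."""
    return struct.unpack(WIRE_FORMAT, data)
```

Because both sides use the same format string, the bytes on the wire are identical regardless of the byte order or word size of either machine.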
Question 2
Who generates the classes for proxy and skeleton?
-In advanced middleware systems (e.g. CORBA) the classes for proxies and skeletons can be
generated automatically. Given the specification of the server interface and the standard
representations, an interface compiler can generate the classes for proxies and skeletons.
RPC (Remote Procedure Call) is a protocol that one program can use to request a service
from a program located on another computer in a network without having to understand
the network's details. With RPC, a procedure on a remote system is called in the same way as a
local procedure.
DANGER: Some operations can be executed more than once without any problem; they are
called idempotent operations, and there is no danger in executing a duplicate request.
Other operations cannot be executed repeatedly without changing the effect (e.g.
transferring an amount of money between two accounts); for these, a history can be used to avoid
re-execution.
History: stores a record of the reply messages that have been transmitted, together with the
message identifier and the client to which each reply was sent.
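The use of a history can be sketched as follows. This is an illustration under assumptions: the class name `TransferServer`, the "ok" reply and the dictionary of balances are hypothetical, and a real history would also need a policy for discarding old entries.

```python
class TransferServer:
    """Executes each transfer request at most once by consulting a reply history."""

    def __init__(self, balances):
        self.balances = balances   # account name -> balance
        self.history = {}          # (client_id, message_id) -> reply already sent

    def handle(self, client_id, message_id, src, dst, amount):
        key = (client_id, message_id)
        if key in self.history:
            # Duplicate request: resend the stored reply, do not re-execute.
            return self.history[key]
        self.balances[src] -= amount
        self.balances[dst] += amount
        reply = "ok"
        self.history[key] = reply
        return reply
```

If the client retransmits request 1 because the reply was lost, the second call returns the recorded reply and the money is transferred only once.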
Server Crash
Client Crash
The client sends a request to a server and crashes before the server replies.
The computation which is active in the server becomes an “orphan” (a computation nobody
is waiting for).
Problems:
-wasting of CPU time
-locked resources (files, peripherals, etc.)
-if the client reboots and repeats the RMI, confusion can result.
The solution is based on identifying and killing the "orphans".
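One possible way to identify orphans is sketched below. It is an assumption chosen for illustration, not necessarily the scheme this chapter has in mind: each client attaches an epoch number that it increases on every reboot, and the server treats computations from an older epoch of the same client as orphans. The class name `OrphanTrackingServer` and the job labels are hypothetical.

```python
class OrphanTrackingServer:
    """Kills computations that were started by a client before its last reboot."""

    def __init__(self):
        self.epoch_of = {}   # client_id -> latest epoch seen from that client
        self.active = []     # (client_id, epoch, job) computations still running
        self.killed = []     # jobs that were terminated as orphans

    def request(self, client_id, epoch, job):
        latest = self.epoch_of.get(client_id, epoch)
        if epoch < latest:
            return False     # stale request from before the reboot: refuse it
        if epoch > latest:
            # The client has rebooted: its older computations are orphans.
            orphans = [c for c in self.active
                       if c[0] == client_id and c[1] < epoch]
            for c in orphans:
                self.active.remove(c)
                self.killed.append(c[2])
        self.epoch_of[client_id] = epoch
        self.active.append((client_id, epoch, job))
        return True
```

Killing an orphan here only means removing it from the list; a real server would also have to stop the computation and release the resources it had locked.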
Indirect communication
Indirect communication is defined as communication between entities in a distributed
system through an intermediary with no direct coupling between the sender and the
receiver(s).
Loose coupling
- Resilient relationship between two or more systems or organizations with some kind
of exchange relationship
- Each end of the transaction makes its requirements explicit, e.g. as an interface
description, and makes few assumptions about the other end
- Enhanced flexibility; a change in one module will not require a change in the
implementation of another module (+)
- Example: (Web) Services, which are called via an interface; the service behind the interface
can be replaced
Decoupled
- Decoupled in space and time using (event) messages (e.g. via Message-oriented
Middleware (MoM), publish-subscribe)
- Often asynchronous stateless indirect communication (e.g. publish-subscribe or complex
event processing systems)
- Asynchronous communication (+)
- Parallel processing (+)
- Difficult to ensure transactional integrity (-)
- Issues in maintaining synchronisation (-)
- Example: Event-driven Publish/Subscribe; events are received and sent
Key properties
Space uncoupling - The sender does not know or need to know the identity of the
receiver(s), and vice versa.
Time uncoupling - The sender and the receiver(s) can have independent lifetimes.
Indirect communication is often used in distributed systems where change is anticipated.
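Space and time uncoupling can both be seen in a small sketch of an intermediary. The class name `Intermediary` and the idea of addressing messages by topic name are assumptions made for illustration.

```python
from collections import defaultdict, deque

class Intermediary:
    """Holds messages per topic until some receiver asks for them."""

    def __init__(self):
        self.mailboxes = defaultdict(deque)   # topic -> queued messages

    def send(self, topic, message):
        # Space uncoupling: the sender names a topic, not a receiver.
        self.mailboxes[topic].append(message)

    def receive(self, topic):
        # Time uncoupling: the message may have been sent long before this call,
        # possibly by a sender that no longer exists.
        box = self.mailboxes[topic]
        return box.popleft() if box else None
```

A sender can call `send("alerts", "disk full")` before any receiver has started; whichever receiver later calls `receive("alerts")` gets the message without either side knowing the other.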
Group Communication
Group communication offers a service whereby a message is sent to a group and then this
message is delivered to all members of the group.
Client-server communication and RMI (RPC) assume that two parties are
involved: the client and the server. Sometimes communication involves multiple processes,
not only two. One solution is to perform a separate message-passing operation or RMI for each
receiver.
With group communication, a message is sent to the group once and is then delivered to all
members of the group: multiple receivers are reached in one operation.
Characteristics
• Sender is not aware of the identities of the receivers
• Represents an abstraction over multicast communication
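A minimal sketch of the group abstraction follows. The class name `Group` and the use of callback functions as members are assumptions; a real implementation would sit on top of network multicast rather than a loop in one process.

```python
class Group:
    """One send operation by the sender; delivery to every current member."""

    def __init__(self):
        self.members = []   # one delivery callback per member

    def join(self, deliver):
        self.members.append(deliver)

    def send(self, message):
        # The sender calls send() once and never sees the member list.
        for deliver in self.members:
            deliver(message)
        return len(self.members)   # number of deliveries, for illustration only
```

The sender holds only a reference to the group, so members can join without the sender's code changing.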
Publish-Subscribe Systems
The general objective of publish-subscribe systems is to let information propagate from
publishers to interested subscribers in an anonymous, decoupled fashion.
Publishers publish events. Subscribers subscribe to and receive the events they are
interested in.
Subscribers are not targeted directly by publishers but indirectly, via the notification
service.
Subscribers express their interest by issuing subscriptions for specific notifications,
independently of the publishers that produce them; they are asynchronously notified of
all notifications, submitted by any publisher, that match their subscriptions.
Notification Service: a propagation mechanism that acts as a logical intermediary between
publishers and subscribers, so that publishers do not have to know all the subscriptions of
each possible subscriber.
Both publishers and subscribers communicate only with a single entity, the notification
service, that:
- stores the subscriptions associated with each subscriber;
- receives all the notifications from publishers;
- dispatches the notifications to the correct subscribers.
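The three tasks of the notification service can be sketched as a small centralized class. It is a simplification under assumptions: subscriptions are matched by exact topic name, subscribers are plain callback functions, and notifications are delivered by a direct call rather than asynchronously. The class name `NotificationService` is hypothetical.

```python
class NotificationService:
    """Centralized intermediary between publishers and subscribers."""

    def __init__(self):
        self.subscriptions = {}   # topic -> list of subscriber callbacks

    def subscribe(self, topic, callback):
        # Stores the subscription associated with a subscriber.
        self.subscriptions.setdefault(topic, []).append(callback)

    def publish(self, topic, event):
        # Receives a notification from a publisher and dispatches it
        # to every subscriber whose subscription matches the topic.
        for callback in self.subscriptions.get(topic, []):
            callback(event)
```

A publisher only ever calls `publish`, and a subscriber only ever calls `subscribe`, so neither holds a reference to the other.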
One of the main problems with publish-subscribe systems is achieving scalability of the
notification service.
Centralized implementations: these are the simplest; however, scalability is limited by the
processing power of the machine that hosts the service.
Distributed implementations: the notification service is realized as a network of distributed
processes, called brokers; the brokers interact among themselves with the common aim of
dispatching notifications to all interested subscribers.
-Such a solution is scalable but more challenging to implement; it requires complex
protocols for coordinating the various brokers and disseminating the information.
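The broker idea can be illustrated with the simplest possible coordination scheme, flooding: every broker delivers to its own subscribers and forwards the notification to all neighbouring brokers. This is only a sketch under assumptions. The class name `Broker` is hypothetical, all brokers run in one process, and the shared `seen` set is a shortcut for what a real system would do with message identifiers; practical systems use more selective routing than flooding.

```python
class Broker:
    """One node in a network of brokers that floods notifications."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []   # directly connected brokers
        self.local = {}        # topic -> callbacks of locally attached subscribers

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)

    def subscribe(self, topic, callback):
        self.local.setdefault(topic, []).append(callback)

    def publish(self, topic, event, seen=None):
        seen = set() if seen is None else seen
        if self.name in seen:
            return             # this broker already handled the notification
        seen.add(self.name)
        for callback in self.local.get(topic, []):
            callback(event)
        for broker in self.neighbours:
            broker.publish(topic, event, seen)
```

Even when the brokers form a cycle, each subscriber receives the notification once, because a broker that has already handled it does not forward it again.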
Reference: George Coulouris, Jean Dollimore, Tim Kindberg, Gordon Blair: "Distributed
Systems - Concepts and Design", Addison Wesley Publ. Comp., 5th edition, 2011.