
Lecture Notes Distributed System

This document provides lecture notes on distributed systems. It defines a distributed system as a collection of independent computers that appears as a single coherent system to users. The goals of distributed systems are making resources accessible across multiple computers in a transparent way, being open through standard interfaces, and being scalable. Techniques for improving scalability include hiding communication latency, distributing components across multiple machines, and replicating components. Challenges in developing distributed systems include unreliable networks and lack of a global view. The document discusses hardware concepts like multiprocessors and multicomputers, and software concepts like distributed operating systems and middleware. It also covers architectural styles for distributed systems like layered, object-based, data-centered, and event-based architectures

Distributed Systems Lecture Notes
Himayatullah sharief
UNIT-I
Definition: A distributed system is a collection of independent computers that appears to its users
as a single coherent system.

This definition has two aspects. The first is that a distributed system consists of
components (i.e., computers) that are autonomous. The second is that users (be they
people or programs) think they are dealing with a single system.

The above diagram shows a distributed system organized as middleware. The middleware layer
extends over multiple machines, and offers each application the same interface.

GOALS of Distributed Systems


1. Making Resources Accessible
The main goal of a distributed system is to make it easy for the users (and applications)
to access remote resources, and to share them in a controlled and efficient way.
Resources can be just about anything, but typical examples include printers,
computers, storage facilities, data, files, Web pages, and networks.
2. Distribution Transparency
An important goal of a distributed system is to hide the fact that its processes and
resources are physically distributed across multiple computers. A distributed system
that is able to present itself to users and applications as if it were only a single
computer system is said to be transparent.
Transparency in a Distributed System

Different forms of transparency in a distributed system

3. Openness
An open distributed system is a system that offers services according to standard rules
that describe the syntax and semantics of those services. For example, in computer
networks, standard rules govern the format, contents, and meaning of messages sent
and received. Such rules are formalized in protocols. In distributed systems, services
are generally specified through interfaces.

4. Scalability
Scalability problems in DSs appear as performance problems caused by the limited
capacity of servers and networks. There are basically three techniques for scaling:

1. Hiding communication latencies: avoid waiting for responses from remote services
as much as possible, for example by using asynchronous communication.

2. Distribution: taking a component, splitting it into smaller parts, and subsequently
spreading those parts across the system. Example: the Internet Domain Name System
(DNS).

3. Replication: replicating components across a DS to increase availability and balance
the load between components for better performance.
Scalability Problems

Examples of scalability limitations

Characteristics of decentralized algorithms:


1. No machine has complete information about the system state.

2. Machines make decisions based only on local information.

3. Failure of one machine does not ruin the algorithm.

4. There is no implicit assumption that a global clock exists.

Scaling Techniques -1


The difference between letting (a) a server or (b) a client check forms as they are being filled.

Scaling Techniques (2)

An example of dividing the DNS name space into zones

5. Pitfalls when Developing Distributed Systems


False assumptions made by first-time developers:

1. The network is reliable.


2. The network is secure.
3. The network is homogeneous.
4. The topology does not change.
5. Latency is zero.
6. Bandwidth is infinite.
7. Transport cost is zero.
8. There is one administrator.
Hardware Concepts

There are several ways to organize hardware in multiple-CPU systems, especially in
terms of how they are interconnected and how they communicate.

1. Based on memory organization:
• Multiprocessor: uses shared memory; communication is through shared variables.
• Multicomputer: no shared memory; each machine uses private memory.
2. Based on the architecture of the interconnection network:

Multicomputers
Homogeneous multicomputers (usually used in parallel systems): a single
interconnection network; all processors are the same and generally have access to the same
amount of private memory.
Heterogeneous multicomputers (usually used in distributed systems): a variety of
different, independent computers connected through different networks.
Because of the large scale, inherent heterogeneity, and lack of a global system view in
heterogeneous multicomputers, sophisticated software is needed to build applications
for them. DSs therefore usually have a software layer (middleware) to provide
transparency.

Homogeneous Multicomputer Systems

(a) 2D mesh (grid) (b) Hypercube

Software Concepts: Distributed OS


Distributed systems act as resource managers (like a traditional OS) and, more
importantly, attempt to hide the heterogeneous nature of the underlying machines to
provide a virtual single system on which applications can be easily executed.

Distributed operating systems (DOS): manage multiprocessors and homogeneous
multicomputers. A DOS aims to support high performance through multiple processors.

On a multiprocessor, a DOS supports multiple processors that have access to a shared
memory and protects data against simultaneous access to the same shared memory
locations.

On a multicomputer, a DOS offers message-passing facilities to applications.

The main goal of DOS is to hide the intricacies of managing the underlying hardware
such that it can be shared by multiple processes.

Software Concepts: Network OS


Network operating system (NOS): used for heterogeneous multicomputer systems. In
addition to managing the underlying hardware and other resources, it makes the
local services available to remote clients. A NOS does not provide a single system image
(transparency) to the users.
Example: some non-transparent services a NOS may provide (in UNIX):
• rlogin machine
• rcp machine1:file2 machine2:file2
The lack of transparency in a NOS has some drawbacks: such systems are harder
to use and manage, and they introduce some security problems.

Middleware as a communication facilitator


Middleware: a software layer between applications and the NOS that provides a
higher-level abstraction and masks the heterogeneity of the underlying components.
Modern DSs are generally built in this way.

In such a DS, the local OS on each computer manages its own resources while the
middleware offers a more-or-less complete collection of services used by the
applications.

Most middleware is based on some model (or paradigm) for describing distribution
and communication, such as distributed file systems, remote procedure calls (RPC),
distributed objects, or distributed documents.

Important styles of architecture for distributed systems


Architectural Styles -1


1. Layered architectures

2. Object-based architectures

3. Data-centered architectures

4. Event-based architectures

Architectural Styles-2

(a) The layered architectural style

Architectural Styles -3


Architectural Styles -4

(a) The event-based architectural style

Architectural Styles (5)


(b) The shared data-space architectural style.


Centralized Architectures


General interaction between a client and a server

Application Layering -1
Recall the previously mentioned layered architectural style; the layers are:

1. The user-interface level

2. The processing level

3. The data level

Application Layering -2

The simplified organization of an Internet search engine into three different layers

Multitier Architectures -1
The simplest organization is to have only two types of machines:

1. A client machine containing only the programs implementing (part of) the user-interface level.
2. A server machine containing the rest, that is, the programs implementing the processing and data levels.


Alternative client-server organizations (a)–(e)

An example of a server acting as client


UNIT-II
Communication
Inter-process communication is at the heart of all distributed systems. Communication
in distributed systems is always based on low-level message passing as offered by the
underlying network.

In the OSI model, communication is divided up into seven layers, as shown below.
Each layer deals with one specific aspect of the communication. In this way, the
problem can be divided up into manageable pieces, each of which can be solved
independent of the others. Each layer provides an interface to the one above it. The
interface consists of a set of operations that together define the service the layer is
prepared to offer its users.
Layered Protocols -1

Layers, interfaces and protocols in the OSI model.

Working
When process-A on machine 1 wants to communicate with process-B on machine 2, it
builds a message and passes the message to the application layer on its machine.

The application layer software then adds a header to the front of the message and passes
the resulting message across the layer 6/7 interface to the presentation layer. The
presentation layer in turn adds its own header and passes the result down to the session
layer, and so on. Some layers add not only a header to the front, but also a trailer to the
end. When the message reaches the bottom, the physical layer actually transmits it, as
shown below.
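
The header wrapping described above can be sketched in a few lines of Python. This is only a toy illustration, not a real protocol stack: the layer names follow the OSI model, but the bracketed header and trailer strings are invented for the example, and the physical layer is left out because it only transmits the bits.

```python
# Toy illustration of layered encapsulation; header/trailer formats are invented.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link"]
TRAILER = b"[trailer]"

def encapsulate(message: bytes) -> bytes:
    """Pass a message down the stack: each layer prepends its own header."""
    for layer in LAYERS:
        message = f"[{layer}]".encode() + message
        if layer == "data link":
            # Some layers also append a trailer (for example, a checksum field).
            message += TRAILER
    return message

def decapsulate(frame: bytes) -> bytes:
    """Pass a frame up the stack: each layer strips what its peer added."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if layer == "data link":
            assert frame.endswith(TRAILER)
            frame = frame[:-len(TRAILER)]
        assert frame.startswith(header)
        frame = frame[len(header):]
    return frame

frame = encapsulate(b"hello")
print(frame)
print(decapsulate(frame))
```

The outermost header belongs to the lowest layer, which is why the receiving side strips headers in the reverse order.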

A typical message as it appears on the network.

Middleware Protocols
Middleware is an application that logically lives (mostly) in the application layer, but
which contains many general-purpose protocols that warrant their own layers,
independent of other, more specific applications. A distinction can be made between
high-level communication protocols and protocols for establishing various middleware
services.
As an example, consider a distributed locking protocol by which a resource can be
protected against simultaneous access by a collection of processes that are distributed
across multiple machines.


An adapted reference model for networked communication

Types of Communication

Viewing middleware as an intermediate (distributed) service in application-level communication

The characteristic feature of asynchronous communication is that a sender
continues immediately after it has submitted its message for transmission. This means
that the message is (temporarily) stored immediately by the middleware upon
submission. With synchronous communication, the sender is blocked until its request
is known to be accepted.
There are essentially three points where synchronization can take place.

1. The sender may be blocked until the middleware notifies that it will take over
transmission of the request.
2. The sender may synchronize until its request has been delivered to the intended
recipient.
3. Synchronization may take place by letting the sender wait until its request has been
fully processed, that is, up to the time that the recipient returns a response.

Remote Procedure Call (RPC)


When a process on machine A calls a procedure on machine B, the calling
process on A is suspended, and execution of the called procedure takes place on B.
Information can be transported from the caller to the callee in the parameters and can
come back in the procedure result. No message passing at all is visible to the
programmer. This method is known as Remote Procedure Call, or often just RPC.

Conventional RPC Call

(a) Parameter passing in a local procedure call: the stack before the call to read

(b) The stack while the called procedure is active.

A value parameter, such as fd or nbytes, is simply copied to the stack as shown
in Fig. (b). To the called procedure, a value parameter is just an initialized local variable.
The called procedure may modify it, but such changes do not affect the original value
at the calling side.

Client and Server Stubs

Principle of RPC between a client and server program

When the message arrives at the server, the server's operating system passes it
up to a server stub. A server stub is the server-side equivalent of a client stub: it is a
piece of code that transforms requests coming in over the network into local procedure
calls. Typically the server stub will have called receive and be blocked waiting for
incoming messages. The server stub unpacks the parameters from the message and then
calls the server procedure in the usual way (i.e., as in the figure under Conventional RPC Call).
From the server's point of view, it is as though it is being called directly by the client.

Remote Procedure Calls -1


A remote procedure call occurs in the following steps:

1. The client procedure calls the client stub in the normal way.
2. The client stub builds a message and calls the local operating system.
3. The client’s OS sends the message to the remote OS.
4. The remote OS gives the message to the server stub.
5. The server stub unpacks the parameters and calls the server.
6. The server does the work and returns the result to the stub.
7. The server stub packs it in a message and calls its local OS.
8. The server’s OS sends the message to the client’s OS.
9. The client’s OS gives the message to the client stub.
10. The stub unpacks the result and returns to the client.
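
The ten steps above can be traced in a small in-process sketch. Everything here is an assumption made for illustration: JSON is used as the message format, the "network" is just a function call, and the names `add_client_stub`, `server_stub`, and `send_to_server` are hypothetical. A real RPC system generates the stubs and sends the messages between machines.

```python
import json

# --- server side ---
def add(i, j):
    """The actual server procedure."""
    return i + j

PROCEDURES = {"add": add}  # procedures the server exports

def server_stub(request: bytes) -> bytes:
    call = json.loads(request)                        # step 5: unpack parameters
    result = PROCEDURES[call["proc"]](*call["args"])  # steps 5-6: call server, get result
    return json.dumps({"result": result}).encode()    # step 7: pack the result

# --- "network" ---
def send_to_server(request: bytes) -> bytes:
    """Stands in for steps 3-4 and 8-9 (both operating systems and the network)."""
    return server_stub(request)

# --- client side ---
def add_client_stub(i, j):
    request = json.dumps({"proc": "add", "args": [i, j]}).encode()  # step 2: build message
    reply = send_to_server(request)
    return json.loads(reply)["result"]                # step 10: unpack the result

print(add_client_stub(4, 7))  # step 1: looks like an ordinary local call
```

The caller never sees a message; only the stubs know that the call crosses a (simulated) machine boundary.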

Passing Value Parameters -1


Packing parameters into a message is called marshaling.
As a very simple example, consider a remote procedure, add(i, j), that takes two integer
parameters i and j and returns their arithmetic sum as a result. (As a practical
matter, one would not normally make such a simple procedure remote due to the
overhead, but as an example it will do.) The call to add is shown in the left-hand
portion (in the client process) of the figure below. The client stub takes its two parameters
and puts them in a message as indicated. It also puts the name or number of the
procedure to be called in the message because the server might support several different
calls, and it has to be told which one is required.
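
A minimal sketch of marshaling the add request, using Python's standard `struct` module. The message layout is an assumption for illustration only: a 32-bit procedure number followed by two 32-bit integers, all in network byte order. The procedure number `ADD_PROC = 1` is hypothetical.

```python
import struct

ADD_PROC = 1  # hypothetical procedure number for add

# "!" = network byte order, "I" = unsigned 32-bit, "i" = signed 32-bit.
REQUEST_FORMAT = "!Iii"

def marshal_add(i: int, j: int) -> bytes:
    """Client stub side: pack the procedure number and both parameters."""
    return struct.pack(REQUEST_FORMAT, ADD_PROC, i, j)

def unmarshal_request(message: bytes):
    """Server stub side: recover the procedure number and the parameters."""
    proc, i, j = struct.unpack(REQUEST_FORMAT, message)
    return proc, i, j

message = marshal_add(4, 7)
print(len(message))                 # 12 bytes: three 32-bit fields
print(unmarshal_request(message))   # (1, 4, 7)
```

Fixing the byte order in the format is what lets machines with different native integer representations agree on the message contents.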

The steps involved in doing a remote computation through RPC

Asynchronous RPC -1
As in conventional procedure calls, when a client calls a remote procedure, the
client will block until a reply is returned. This strict request-reply behavior is
unnecessary when there is no result to return, and only leads to blocking the client while
it could have proceeded and have done useful work just after requesting the remote
procedure to be called. Examples of where there is often no need to wait for a reply
include: transferring money from one account to another, adding entries into a database,
starting remote services, batch processing, and so on. To support such situations, RPC
systems may provide facilities for what are called asynchronous RPCs, by which a client
immediately continues after issuing the RPC request. With asynchronous RPCs, the
server immediately sends a reply back to the client the moment the RPC request is
received, after which it calls the requested procedure. The reply acts as an
acknowledgment to the client that the server is going to process the RPC.
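
The acknowledge-then-process behavior can be sketched with a thread and a queue. This is a single-process stand-in under stated assumptions: the class `AsyncRpcServer` and its methods are hypothetical names, and the "reply" is simply the return value of `request`.

```python
import queue
import threading

class AsyncRpcServer:
    """Accepts a request, acknowledges at once, and runs the procedure afterwards."""

    def __init__(self):
        self.log = []
        self._pending = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def request(self, procedure, *args):
        self._pending.put((procedure, args))  # accept the request
        return "ACK"                          # reply before the procedure runs

    def _run(self):
        while True:
            procedure, args = self._pending.get()
            procedure(*args)                  # the actual (possibly slow) work
            self._pending.task_done()

    def wait_until_idle(self):
        self._pending.join()

server = AsyncRpcServer()

def add_entry(entry):
    server.log.append(entry)

ack = server.request(add_entry, "row-1")  # the client continues right after the ACK
print(ack)
server.wait_until_idle()                  # only needed here to observe the effect
print(server.log)
```

The client learns only that the request was accepted, not that it has completed, which matches the case where no result needs to be returned.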


(a) The interaction between client and server in a traditional RPC

Asynchronous RPC -2

(b) The interaction using asynchronous RPC

Asynchronous RPC -3

A client and server interacting through two asynchronous RPCs

Writing a Client and a Server (1)

The DCE RPC system consists of a number of components, including languages,
libraries, daemons, and utility programs, among others. Together these make it possible
to write clients and servers. The entire process of writing and using an RPC client and
server is summarized in Figure below.

The steps in writing a client and a server in DCE RPC

Writing a Client and a Server -2

Three files output by the IDL compiler:

1. A header file (e.g., interface.h, in C terms).

2. The client stub.

3. The server stub.

Binding a Client to a Server -1

1. Registration of a server makes it possible for a client to locate the server and bind to it.

2. Server location is done in two steps:

1. Locate the server’s machine.

2. Locate the server on that machine.

Binding a Client to a Server -2

Client-to-server binding in DCE

MESSAGE-ORIENTED COMMUNICATION
Remote procedure calls and remote object invocations contribute to hiding
communication in distributed systems, that is, they enhance access transparency.
Unfortunately, neither mechanism is always appropriate. In particular, when it cannot
be assumed that the receiving side is executing at the time a request is issued, alternative
communication services are needed. Likewise, the inherent synchronous nature of
RPCs, by which a client is blocked until its request has been processed, sometimes
needs to be replaced by messaging.

Message-Oriented Transient Communication


Berkeley Sockets

The socket primitives for TCP/IP
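
As a rough illustration of how the socket primitives fit together, the sketch below runs a one-shot TCP server and a client in the same process over the loopback interface, using Python's standard `socket` module. The uppercase-echo behavior and the use of a thread are assumptions made to keep the example self-contained.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    conn, _addr = listener.accept()   # accept: block until a client connects
    with conn:                        # close: done when the block is left
        data = conn.recv(1024)        # receive
        conn.sendall(data.upper())    # send

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
listener.bind(("127.0.0.1", 0))       # bind: port 0 lets the OS pick a free port
listener.listen(1)                    # listen
port = listener.getsockname()[1]

server_thread = threading.Thread(target=serve_once, args=(listener,))
server_thread.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket
client.connect(("127.0.0.1", port))   # connect
client.sendall(b"hello")              # send
reply = client.recv(1024)             # receive
client.close()                        # close

server_thread.join()
listener.close()
print(reply)
```

The server side uses socket, bind, listen, and accept before exchanging data; the client side needs only socket and connect.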


General Architecture of a Message-Queuing System -1

The relationship between queue-level addressing and network-level addressing

Queues are managed by queue managers. Normally, a queue manager interacts
directly with the application that is sending or receiving a message. However, there are
also special queue managers that operate as routers, or relays: they forward incoming
messages to other queue managers. In this way, a message-queuing system may
gradually grow into a complete, application-level, overlay network, on top of an existing
computer network. This approach is similar to the construction of the early MBone over
the Internet, in which ordinary user processes were configured as multicast routers.

Each queue manager needs a copy of the queue-to-location mapping. Needless to say,
in large-scale queuing systems this approach can easily lead to
network-management problems.
One solution is to use a few routers that know about the network topology.
When a sender A puts a message for destination B in its local queue, that message is
first transferred to the nearest router, say R1, as shown in Figure below.
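
The router-based organization can be sketched as follows. The classes `QueueManager` and `Router`, the `routes` table, and the two-router topology are all hypothetical; the point is only that ordinary queue managers know their nearest router, while the routers hold the topology knowledge.

```python
from collections import deque

class QueueManager:
    def __init__(self, name, router=None):
        self.name = name
        self.router = router         # nearest router; None for a router itself
        self.local_queue = deque()   # messages delivered to this manager

class Router(QueueManager):
    def __init__(self, name):
        super().__init__(name)
        self.routes = {}             # destination name -> next hop

    def forward(self, destination, message, path):
        path.append(self.name)
        next_hop = self.routes[destination]
        if isinstance(next_hop, Router):
            next_hop.forward(destination, message, path)
        else:
            next_hop.local_queue.append(message)  # final delivery
            path.append(next_hop.name)

def send(sender, destination, message):
    """The sender only hands the message to its nearest router."""
    path = [sender.name]
    sender.router.forward(destination, message, path)
    return path

r1, r2 = Router("R1"), Router("R2")
a = QueueManager("A", router=r1)
b = QueueManager("B", router=r2)
r1.routes["B"] = r2
r2.routes["B"] = b

print(send(a, "B", "hello"))   # ['A', 'R1', 'R2', 'B']
print(list(b.local_queue))     # ['hello']
```

When queues are added or removed, only the routers' tables need updating, which is the management benefit this organization aims for.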

General Architecture of a Message-Queuing System -2


The general organization of a message-queuing system with routers

STREAM-ORIENTED COMMUNICATION
There are also forms of communication in which timing plays a crucial role.
Consider, for example, an audio stream built up as a sequence of 16-bit samples, each
representing the amplitude of the sound wave as is done through Pulse Code Modulation
(PCM).

Streams and Quality of Service


Properties for Quality of Service:

1. The required bit rate at which data should be transported.
2. The maximum delay until a session has been set up.
3. The maximum end-to-end delay.
4. The maximum delay variance or jitter.
5. The maximum round-trip delay.

Enforcing QoS -1
Given that the underlying system offers only a best-effort delivery service, a distributed
system can try to conceal as much as possible of the lack of quality of service.

A distributed system should help in getting data across to receivers. Although there are
generally not many tools available, one that is particularly useful is to use buffers to
reduce jitter. The principle is simple, as shown in Figure below.

Assuming that packets are delayed with a certain variance when transmitted over the
network, the receiver simply stores them in a buffer for a maximum amount of time.
This will allow the receiver to pass packets to the application at a regular rate, knowing
that there will always be enough packets entering the buffer to be played back at that
rate.
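
The effect of the buffer size can be worked through with a small calculation. The function name `playout_schedule`, the arrival times, and the one-second packet interval are all made-up values for illustration; they are not taken from the figure.

```python
def playout_schedule(arrival_times, interval, buffer_delay):
    """Return (playout_times, late) for packets sent every `interval` seconds.

    arrival_times[k] is when packet k reaches the receiver. Playback starts
    `buffer_delay` seconds after the first packet arrives and then proceeds
    at a fixed rate of one packet per `interval`.
    """
    start = arrival_times[0] + buffer_delay
    playout_times = [start + k * interval for k in range(len(arrival_times))]
    late = [k for k, (arrived, due) in enumerate(zip(arrival_times, playout_times))
            if arrived > due]        # packet missed its playout time
    return playout_times, late

# Hypothetical arrival times (seconds) for five packets with variable delay.
arrivals = [2.0, 3.5, 6.0, 6.5, 7.0]

print(playout_schedule(arrivals, interval=1.0, buffer_delay=1.0))
print(playout_schedule(arrivals, interval=1.0, buffer_delay=2.0))
```

With these numbers, a one-second buffer leaves packets 2 and 3 arriving after their playout time, while a two-second buffer absorbs the jitter at the cost of a later start.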

Using a buffer to reduce jitter

One problem that may occur is that a single packet contains multiple audio and video
frames, so that losing the packet leaves a large gap in playback. To reduce this effect,
frames can be interleaved, as shown in the figure below. In this way, when a
packet is lost, the resulting gap in successive frames is distributed over time.
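
A small sketch of the difference, assuming 16 numbered frames packed four to a packet (the sizes and the helper name `packetize` are illustrative assumptions, and the frame count is assumed to be a multiple of the packet size):

```python
def packetize(frames, per_packet, interleave):
    """Group frame numbers into packets, consecutively or interleaved."""
    n_packets = len(frames) // per_packet
    if interleave:
        # Packet p carries frames p, p + n_packets, p + 2 * n_packets, ...
        return [frames[p::n_packets] for p in range(n_packets)]
    return [frames[p * per_packet:(p + 1) * per_packet] for p in range(n_packets)]

frames = list(range(1, 17))   # 16 frames, 4 per packet
plain = packetize(frames, 4, interleave=False)
mixed = packetize(frames, 4, interleave=True)

# Frames missing at the receiver if the third packet is lost:
print(plain[2])   # [9, 10, 11, 12]  one gap of four consecutive frames
print(mixed[2])   # [3, 7, 11, 15]   four single-frame gaps spread over time
```

The price of interleaving is that the receiver must collect all four packets before it can play the first four frames in order, so it needs a larger buffer and a longer start delay.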

Enforcing QoS (2)

The effect of packet loss in (a) noninterleaved transmission and (b) interleaved transmission

Stream Synchronization

The simplest form of synchronization is that between a discrete data stream and
a continuous data stream. Consider, for example, a slide show on the Web that has been
enhanced with audio. Each slide is transferred from the server to the client in the form
of a discrete data stream. At the same time, the client should play out a specific (part of
an) audio stream that matches the current slide that is also fetched from the server. In
this case, the audio stream is to be synchronized with the presentation of slides.

Mechanisms -1 (the basic mechanisms for synchronizing two streams)

The principle of explicit synchronization at the level of data units

For example, consider a movie that is presented as two input streams. The video
stream contains uncompressed low-quality images of 320x240 pixels, each encoded by
a single byte, leading to video data units of 76,800 bytes each. Assume that images are
to be displayed at 30 Hz, or one image every 33 msec. The audio stream is assumed to
contain audio samples grouped into units of 11760 bytes, each corresponding to 33 ms
of audio, as explained above. If the input process can handle 2.5 MB/sec, we can achieve
lip synchronization by simply alternating between reading an image and reading a block
of audio samples every 33 ms.
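
The figures in this example can be checked with a short calculation. The audio unit size is taken as given in the text; everything else follows from the stated image size and frame rate.

```python
WIDTH, HEIGHT = 320, 240
BYTES_PER_PIXEL = 1
FRAME_RATE = 30            # images per second
AUDIO_UNIT_BYTES = 11760   # per unit, as stated in the notes

video_unit_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
unit_period_ms = 1000 / FRAME_RATE
bytes_per_second = (video_unit_bytes + AUDIO_UNIT_BYTES) * FRAME_RATE

print(video_unit_bytes)                     # 76800
print(round(unit_period_ms, 1))             # 33.3
print(bytes_per_second)                     # 2656800
print(round(bytes_per_second / 2**20, 2))   # 2.53 (units of 2**20 bytes per second)
```

Reading one image and one audio block per period therefore requires roughly 2.5 MB/sec of input bandwidth when MB is read as 2^20 bytes.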

Mechanisms -2 (the distribution of those mechanisms in a networked environment)


The principle of synchronization as supported by high-level interfaces

This approach to synchronization is followed for MPEG streams. The MPEG
(Moving Picture Experts Group) standards form a collection of algorithms for
compressing video and audio. Several MPEG standards exist. MPEG-2, for example,
was originally designed for compressing broadcast quality video into 4 to 6 Mbps. In
MPEG-2, an unlimited number of continuous and discrete streams can be merged into
a single stream. Each input stream is first turned into a stream of packets that carry a
timestamp based on a 90-kHz system clock. These streams are subsequently
multiplexed into a program stream then consisting of variable length packets.

UNIT-III
Processes
Introduction to Threads
To execute a program, an operating system creates a number of virtual processors, each
one for running a different program. To keep track of these virtual processors, the
operating system has a process table, containing entries to store CPU register values,
memory maps, open files, accounting information, privileges, etc.

A process is often defined as a program in execution, that is, a program that is currently
being executed on one of the operating system's virtual processors.

Thread Usage in Non-distributed Systems


Multithreading is also useful in the context of large applications.

The major drawback of all IPC mechanisms is that communication often requires
extensive context switching, shown at three different points in the figure below.

Context switching as the result of IPC

Like a process, a thread executes its own piece of code, independently from other
threads. However, in contrast to processes, no attempt is made to achieve a high degree
of concurrency transparency if this would result in performance degradation.

Thread Implementation

Threads are often provided in the form of a thread package. There are basically two
approaches to implement a thread package.

1. The first approach is to construct a thread library that is executed entirely in user
mode.
2. The second approach is to have the kernel be aware of threads and schedule
them.
A user-level thread library has a number of advantages. First, it is cheap to create
and destroy threads. Because all thread administration is kept in the user's address space,
the price of creating a thread is primarily determined by the cost for allocating memory
to set up a thread stack. Analogously, destroying a thread mainly involves freeing
memory for the stack, which is no longer used. Both operations are cheap.

A second advantage of user-level threads is that switching thread context can
often be done in just a few instructions. Basically, only the values of the CPU registers
need to be stored and subsequently reloaded with the previously stored values of the
thread to which it is being switched. There is no need to change memory maps, flush
the TLB, do CPU accounting, and so on. Switching thread context is done when two
threads need to synchronize, for example, when entering a section of shared data.
A major drawback of user-level threads is that invocation of a blocking system
call will immediately block the entire process to which the thread belongs, and thus also
all the other threads in that process.

Combining kernel-level lightweight processes and user-level threads

The thread package has a single routine to schedule the next thread. When
creating an LWP (which is done by means of a system call), the LWP is given its own
stack, and is instructed to execute the scheduling routine in search of a thread to
execute. If there are several LWPs, then each of them executes the scheduler.

Threads in Distributed Systems


To understand the benefits of threads for writing server code, consider the
organization of a file server that occasionally has to block waiting for the disk. The file
server normally waits for an incoming request for a file operation, subsequently carries
out the request, and then sends back the reply. One possible and particularly popular
organization is shown in Figure below. Here one thread, the dispatcher, reads incoming
requests for a file operation. The requests are sent by clients to a well-known end point
for this server. After examining the request, the server chooses an idle (i.e., blocked)
worker thread and hands it the request.
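
The dispatcher/worker organization can be sketched with Python threads and queues. The queue names, the three-worker pool, and the string "contents of ..." standing in for a disk read are assumptions for illustration; a real file server would read requests from a network end point.

```python
import queue
import threading

requests = queue.Queue()   # stands in for the server's well-known end point
handoff = queue.Queue()    # dispatcher -> idle workers
replies = queue.Queue()

def worker():
    while True:
        request = handoff.get()       # idle workers block here
        if request is None:           # shutdown signal
            break
        client, filename = request
        # A real worker would perform a blocking disk read here.
        replies.put((client, f"contents of {filename}"))

def dispatcher():
    while True:
        request = requests.get()      # read the next incoming request
        if request is None:
            break
        handoff.put(request)          # hand it to whichever worker is idle

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()
d = threading.Thread(target=dispatcher)
d.start()

for client, filename in [("c1", "a.txt"), ("c2", "b.txt"), ("c3", "c.txt")]:
    requests.put((client, filename))
requests.put(None)                    # stop the dispatcher
d.join()
for _ in workers:
    handoff.put(None)                 # stop each worker
for w in workers:
    w.join()

results = sorted(replies.get() for _ in range(3))
print(results)
```

While one worker is blocked on its request, the dispatcher and the other workers keep running, which is where the performance gain over a single-threaded server comes from.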

A multithreaded server organized in a dispatcher/worker model

Multithreaded Servers
The single-threaded server retains the ease and simplicity of blocking system
calls, but gives up some amount of performance. The finite-state machine approach
achieves high performance through parallelism, but uses non-blocking calls and is thus
hard to program. These models are summarized in the figure below.

Three ways to construct a server

Networked User Interfaces

A major task of client machines is to provide the means for users to interact with remote
servers. There are roughly two ways to do this. In the first, shown in Figure (a), for each
remote service the client machine has a separate counterpart that can contact the service
over the network. A typical example is an agenda running on a user's PDA that needs to
synchronize with a remote agenda.
A second solution, shown in Figure (b), is to provide direct access to remote
services by offering only a convenient user interface. Effectively, this means that the
client machine is used only as a terminal with no need for local storage, leading to an
application-neutral solution.

(a) A networked application with its own protocol (b) A general solution to allow access to remote applications

Client-Side Software for Distribution Transparency


Client software comprises more than just user interfaces. In many cases, parts of
the processing and data level in a client-server application are executed on the client
side as well. A special class is formed by embedded client software, such as for
automatic teller machines (ATMs), cash registers, barcode readers, TV set-top boxes,
etc. In these cases, the user interface is a relatively small part of the client software,
in contrast to the local processing and communication facilities.

Transparent replication of a server using a client-side solution

General Design Issues -1
A concurrent server does not handle the request itself, but passes it to a separate thread
or another process, after which it immediately waits for the next incoming request. A
multithreaded server is an example of a concurrent server.

(a) Client-to-server binding using a daemon

Clients contact a server. In all cases, clients send requests to an end point, also called a
port, at the machine where the server is running. Each server listens to a specific end
point. How do clients know the end point of a service?

(b) Client-to-server binding using a super server

Distributed Servers
The basic idea behind a distributed server is that clients benefit from a robust,
high-performing, stable server. These properties can often be provided by high-end
mainframes.
The main idea is to make use of available networking services, notably mobility
support for IP version 6 (MIPv6). In MIPv6, a mobile node is assumed to have a home
network where it normally resides and for which it has an associated stable address,
known as its home address (HoA). This home network has a special router attached,
known as the home agent, which will take care of traffic to the mobile node when it is
away.

Route optimization in a distributed server

Naming
Names, Identifiers, And Addresses

Names: A name in a distributed system (DS) is a string of bits or characters that is used to refer to an
entity. Examples of entities are resources such as hosts, printers, and disks, as well as explicitly named
items such as processes, web pages, and network connections.

Address: To operate on an entity, it must be accessed at an access point (AP). An access point is a special
kind of entity in a DS. The name of an access point is called an address.

Properties of a true identifier:

1. An identifier refers to at most one entity.

2. Each entity is referred to by at most one identifier.

3. An identifier always refers to the same entity (it is never reused).

Forwarding Pointers -1

Each forwarding pointer is implemented as a (client stub, server stub) pair, as shown in
the figure below. A server stub contains either a local reference to the actual object or a
local reference to a remote client stub for that object.

The principle of forwarding pointers using (client stub, server stub) pairs

Forwarding Pointers -2 and 3

To short-cut a chain of (client stub, server stub) pairs, an object invocation carries the
identification of the client stub from where that invocation was initiated. A client-stub
identification consists of the client's transport-level address, combined with a locally
generated number to identify that stub. When the invocation reaches the object at its
current location, a response is sent back to the client stub where the invocation was
initiated. The current location is piggybacked with this response, and the client stub
adjusts its companion server stub to the one in the object's current location. This
principle is shown in the figure below.
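
A minimal sketch of this shortcut mechanism, under simplifying assumptions: every location is a plain Python object, a "server stub" is modelled by the `Location` it lives in, and the classes `Location`, `ClientStub` and `Counter` are hypothetical names used only for illustration.

```python
class Location:
    """One address space; holds the object or a forwarding pointer."""
    def __init__(self, name):
        self.name = name
        self.obj = None       # the actual object, if it currently lives here
        self.forward = None   # forwarding pointer to the next location

def move(obj, src, dst):
    # Moving an object leaves a forwarding pointer behind.
    src.obj, src.forward = None, dst
    dst.obj = obj

class ClientStub:
    def __init__(self, first_hop):
        self.server_stub = first_hop   # companion server stub
        self.hops_last_call = 0

    def invoke(self, method, *args):
        loc, hops = self.server_stub, 0
        while loc.obj is None:         # follow the chain of pointers
            loc, hops = loc.forward, hops + 1
        result = getattr(loc.obj, method)(*args)
        # The current location is piggybacked on the response; the client
        # stub stores it as a shortcut for later invocations.
        self.server_stub = loc
        self.hops_last_call = hops
        return result

class Counter:
    def __init__(self):
        self.n = 0
    def increment(self):
        self.n += 1
        return self.n

a, b, c = Location("A"), Location("B"), Location("C")
counter = Counter()
a.obj = counter
stub = ClientStub(a)
move(counter, a, b)
move(counter, b, c)
first, hops_first = stub.invoke("increment"), stub.hops_last_call
second, hops_second = stub.invoke("increment"), stub.hops_last_call
```

The first invocation follows two forwarding pointers; the second goes directly to the object's current location.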

(a), (b) Redirecting a forwarding pointer by storing a shortcut in a client stub

Home-Based Approaches
An approach which supports mobile entities in large-scale networks is to
introduce a home location, which keeps track of the current location of an entity. Special
techniques may be applied to safeguard against network or process failures. In practice,
the home location is often chosen to be the place where an entity was created. The home-
based approach is used as a fall-back mechanism for location services based on
forwarding pointers.
Another example where the home-based approach is followed is in Mobile IP.

The principle of Mobile IP

Hierarchical Approach
In a hierarchical scheme, a network is divided into a collection of domains. There
is a single top-level domain that spans the entire network. Each domain can be
subdivided into multiple, smaller sub-domains. A lowest-level domain, called a leaf
domain, typically corresponds to a local-area network in a computer network or a cell
in a mobile telephone network. Each domain D has an associated directory node dir(D)
that keeps track of the entities in that domain. This leads to a tree of directory nodes.
The directory node of the top-level domain, called the root (directory) node, knows
about all entities. This general organization of a network into domains and directory
nodes is illustrated in the figure below.
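
The tree of directory nodes can be sketched as below. The notes only state that each directory node keeps track of the entities in its domain; the way records are stored here (an address at the leaf, a pointer to the child node everywhere else) and the names `DirNode`, `insert` and `lookup` are assumptions made for illustration.

```python
class DirNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # entity -> address (at a leaf) or child DirNode (higher up)
        self.records = {}

def insert(entity, leaf, address):
    # Record the address at the leaf and a pointer at every ancestor,
    # so that the root ends up knowing about all entities.
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        node.records[entity] = child
        child, node = node, node.parent

def lookup(entity, start_leaf):
    # Go up until a directory node knows the entity, then follow pointers down.
    node, visited = start_leaf, [start_leaf.name]
    while entity not in node.records:
        node = node.parent
        if node is None:
            raise KeyError(entity)
        visited.append(node.name)
    record = node.records[entity]
    while isinstance(record, DirNode):
        visited.append(record.name)
        record = record.records[entity]
    return record, visited

root = DirNode("root")
d1, d2 = DirNode("d1", root), DirNode("d2", root)
leaf1, leaf3 = DirNode("leaf1", d1), DirNode("leaf3", d2)
insert("E", leaf3, "addr-3")
address, path = lookup("E", leaf1)
```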


Hierarchical organization of a location service into domains, each having an associated directory
node.

Name spaces
Names in distributed systems are organized into what is commonly referred to as a name space.
A name space can be represented as a labeled directed graph with two types of nodes.

A leaf node represents a named entity and has the property that it has no outgoing edges. A leaf node generally
stores information on the entity it is representing, for example its address; alternatively, it can also store the
state of the entity.

A directory node has a number of outgoing edges, each labeled with a name. Each node in a naming
graph is considered as yet another entity in the DS and, in particular, has an associated identifier. A directory
node stores a table in which an outgoing edge is represented as a pair (edge label, node identifier). Such a
table is called a directory table.

Name Spaces -1

A general naming graph with a single root node

Name Spaces -2


The general organization of the UNIX file system implementation on a logical disk of
contiguous disk blocks

Name Resolution:

Name spaces offer a convenient mechanism for storing and retrieving information about entities
by means of names. More generally, given a path name, it should be possible to look up any information
stored in the node referred to by that name. The process of looking up a name is called name resolution.
Ex. N:<label-1, label-2, ..., label-n>. Resolution of the name starts at node N.
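
As a rough illustration of name resolution over directory tables, the sketch below walks a path name one label at a time. The node identifiers (`n0` to `n5`) and edge labels are made up for the example.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
# Nodes without a table are leaf nodes.
directory_tables = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n2", "max": "n3"},
    "n2": {"mbox": "n4", "keys": "n5"},
}

def resolve(start, labels):
    """Resolve start:<label-1, ..., label-n> and return the node reached."""
    node = start
    for label in labels:
        table = directory_tables.get(node)
        if table is None or label not in table:
            raise KeyError(f"cannot resolve {label!r} at node {node}")
        node = table[label]
    return node

mbox_node = resolve("n0", ["home", "steen", "mbox"])
# Two different path names may lead to the same node in a naming graph.
same_node = resolve("n0", ["keys"]) == resolve("n0", ["home", "steen", "keys"])
```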

Closure mechanism:
Name resolution can take place only if we know how and where to start. Knowing how and where to
start name resolution is referred to as a closure mechanism.

Information required to mount a foreign name space in a distributed system:

1. The name of an access protocol.
2. The name of the server.
3. The name of the mounting point in the foreign name space.
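
One common way to carry these three items together is a URL-like name. The sketch below assumes such a notation; the example name `nfs://flits.cs.vu.nl/home/steen` and the helper `parse_mount` are illustrative assumptions, not something prescribed by the notes.

```python
from typing import NamedTuple
from urllib.parse import urlparse

class MountInfo(NamedTuple):
    protocol: str      # 1. name of the access protocol
    server: str        # 2. name of the server
    mount_point: str   # 3. mounting point in the foreign name space

def parse_mount(name):
    parts = urlparse(name)
    if not (parts.scheme and parts.netloc and parts.path):
        raise ValueError(f"incomplete mount name: {name!r}")
    return MountInfo(parts.scheme, parts.netloc, parts.path)

info = parse_mount("nfs://flits.cs.vu.nl/home/steen")
```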

Mounting remote name spaces through a specific access protocol

Implementation of Name space

A name space forms the heart of a naming service, i.e., a service that allows users and
processes to add, remove, and look up names. A naming service is implemented by name servers.

If a DS is restricted to a LAN, it is often feasible to implement a naming service by means of
only a single name server.

In a large-scale DS with many entities, possibly spread across a large geographical area, it is
necessary to distribute the implementation of the name space over multiple name servers.

Name Space Distribution

Name spaces for a large-scale, possibly worldwide, DS are usually organized hierarchically.

Name Space Distribution -1

An example partitioning of the DNS name space, including Internet-accessible files, into three layers

Name Space Distribution -2

A comparison between name servers for implementing nodes from a large-scale name space
partitioned into a global layer, an administrational layer, and a managerial layer.

Implementation of Name Resolution

Assume the (absolute) path name root:<nl, vu, cs, ftp, pub, globe, index.html> is to
be resolved. Using a URL notation, this path name would correspond to
ftp://ftp.cs.vu.nl/pub/globe/index.html. There are two ways to implement name resolution.

Implementation of Name Resolution (1)

Iterative name resolution

The principle of iterative name resolution

Implementation of Name Resolution -2

Recursive name resolution

The principle of recursive name resolution
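
The difference between the two schemes can be sketched by logging who contacts whom. The server names (`ns-nl`, `ns-vu`, `ns-cs`) and the address string are hypothetical; each server is assumed to resolve exactly one label.

```python
# Each name server resolves one label: it returns either the next
# name server or, for the last label, the address of the entity.
SERVERS = {
    "root":  {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "address-of-ftp"},
}

def iterative(path):
    # The client contacts every name server itself.
    server, log = "root", []
    for label in path:
        log.append(("client", server))
        server = SERVERS[server][label]
    return server, log

def recursive(server, path, log, caller="client"):
    # Each name server passes the remaining path to the next server.
    log.append((caller, server))
    nxt = SERVERS[server][path[0]]
    if len(path) == 1:
        return nxt
    return recursive(nxt, path[1:], log, caller=server)

path = ["nl", "vu", "cs", "ftp"]
addr_it, log_it = iterative(path)
log_rec = []
addr_rec = recursive("root", path, log_rec)
client_msgs_it = sum(1 for sender, _ in log_it if sender == "client")
client_msgs_rec = sum(1 for sender, _ in log_rec if sender == "client")
```

In this sketch the client sends four requests with iterative resolution but only one with recursive resolution; the remaining work moves to the name servers.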

Drawback of recursive name resolution:

Higher performance demand on each name server.

Advantage:

Caching results is more effective compared to iterative name resolution, and communication
costs may be reduced.

Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for
subsequent lookups.

The benefit of this approach is that eventually lookup operations can be handled extremely efficiently.
For example, suppose that another client later requests resolution of the path name root:<nl, vu, cs, flits>.
This name is passed to the root, which can immediately forward it to the name server for the cs node
and request it to resolve the remaining path name cs:<flits>.
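
The caching effect can be sketched as follows. The class `NameServer`, the server names and the addresses are hypothetical, and for brevity a server reads the cache of the server it called directly; in a real system this information would be piggybacked on the reply.

```python
class NameServer:
    def __init__(self, name, table):
        self.name = name
        self.table = table   # label -> NameServer or final address
        self.cache = {}      # path prefix (tuple) -> NameServer

    def resolve(self, path, hops):
        hops.append(self.name)
        path = tuple(path)
        # Use the longest cached prefix to skip intermediate servers.
        for k in range(len(path) - 1, 0, -1):
            if path[:k] in self.cache:
                return self.cache[path[:k]].resolve(path[k:], hops)
        target = self.table[path[0]]
        if len(path) == 1:
            return target
        self.cache[path[:1]] = target
        address = target.resolve(path[1:], hops)
        # Learn the intermediate results the next server has cached.
        for prefix, server in list(target.cache.items()):
            self.cache[path[:1] + prefix] = server
        return address

ns_cs = NameServer("ns-cs", {"ftp": "addr-ftp", "flits": "addr-flits"})
ns_vu = NameServer("ns-vu", {"cs": ns_cs})
ns_nl = NameServer("ns-nl", {"vu": ns_vu})
root = NameServer("root", {"nl": ns_nl})

hops1, hops2 = [], []
first = root.resolve(["nl", "vu", "cs", "ftp"], hops1)
second = root.resolve(["nl", "vu", "cs", "flits"], hops2)
```

After the first lookup the root forwards the second request straight to the server for the cs node.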

The Domain Name System

The comparison between recursive and iterative name resolution with respect to communication costs

The DNS Name Space

The most important types of resource records forming the contents of nodes in the DNS name space

DNS Implementation

An excerpt from the DNS database for the zone cs.vu.nl

UNIT-IV
Distributed Object-Based Systems
Overview of CORBA
• CORBA: Common Object Request Broker Architecture

• Background:

– Developed by the Object Management Group (OMG) in response to
industrial demands for object-based middleware

– Currently in versions 2.4 and 3

– CORBA is a specification: different implementations of CORBA exist

• CORBA provides a simple distributed object model, with specifications for
many supporting services; it may be here to stay (for a couple of years)

Architecture of CORBA
– The Object Request Broker (ORB) forms the core of any CORBA
distributed system.

– Horizontal facilities consist of general-purpose high-level services that
are independent of application domains.

• User interface
• Information management
• System management
• Task management
– Vertical facilities consist of high-level services that are targeted to a
specific application domain such as electronic commerce, banking, and
manufacturing.

The General Architecture of CORBA


The Global Architecture of CORBA

Object Model
• CORBA follows an interface based approach to objects:

– Not the objects, but interfaces are the really important entities
– An object may implement one or more interfaces
– Interface descriptions can be stored in an interface repository, and looked
up at runtime
– Mappings from IDL to specific programming languages are part of the CORBA
specification (languages include C, C++, Smalltalk, Cobol, Ada, and Java).

• In DCOM and Globe, interfaces can be specified at a lower level in the form of
tables, called binary interfaces.

• Object Request Broker (ORB): CORBA's object broker that connects clients,
objects, and services

• Proxy/Skeleton: Precompiled code that takes care of (un)marshaling
invocations and results

• Dynamic Invocation/Skeleton Interface (DII/DSI): To allow clients to
construct invocation requests at runtime instead of calling methods at a proxy, and
having the server side reconstruct those requests into regular method invocations

• Object adapter: Server-side code that handles incoming invocation requests.

• Interface repository:

– Database containing interface definitions and which can be queried at
runtime
– Whenever an interface definition is compiled, the IDL compiler assigns a
repository identifier to that interface.

• Implementation repository:

– Database containing the implementation (code, and possibly also state)
of objects.
– Given an object reference, an object adapter could contact the
implementation repository to find out exactly what needs to be done.

Invocation models supported in CORBA

Request type          Failure semantics     Description
Synchronous           At-most-once          Caller blocks until a response is returned or an exception is raised
One-way               Best effort delivery  Caller continues immediately without waiting for any response from the server
Deferred synchronous  At-most-once          Caller continues immediately and can later block until response is delivered
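
The three request types can be imitated with ordinary Python futures. This is only an analogy, not CORBA code: a thread pool plays the role of the ORB, and `remote_add` is a hypothetical stand-in for a remote method.

```python
from concurrent.futures import ThreadPoolExecutor

def remote_add(a, b):
    return a + b

log = []
with ThreadPoolExecutor(max_workers=2) as orb:
    # Synchronous: the caller blocks until the result (or an exception) arrives.
    sync_result = orb.submit(remote_add, 1, 2).result()

    # One-way: the caller continues immediately and never looks at a reply.
    orb.submit(log.append, "notified")

    # Deferred synchronous: continue with other work, block for the result later.
    pending = orb.submit(remote_add, 20, 22)
    other_work = sum(range(10))
    deferred_result = pending.result()
```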

Communication Models
• CORBA supports the message-queuing model through the messaging service.

– In the callback model, a client provides an object with an interface
containing callback methods, which can be called by the underlying communication
system to pass the result of an asynchronous invocation.

– In the polling model, the client is offered a collection of operations to poll
its ORB for incoming results.

• General Inter-ORB Protocol (GIOP) is a standard communication protocol
between the client and server.

• Internet Inter-ORB Protocol (IIOP) is GIOP on top of TCP
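
The callback and polling models for asynchronous invocations can be contrasted with a small sketch. As before, this is an analogy using Python futures rather than the CORBA messaging service; `remote_square` is a hypothetical remote method.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def remote_square(x):
    return x * x

received = []
with ThreadPoolExecutor(max_workers=1) as orb:
    # Callback model: the client hands over a callback that the
    # communication system calls once the result is available.
    future = orb.submit(remote_square, 7)
    future.add_done_callback(lambda f: received.append(f.result()))

    # Polling model: the client repeatedly asks whether a result has arrived.
    future2 = orb.submit(remote_square, 9)
    while not future2.done():
        time.sleep(0.001)
    polled = future2.result()
```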

Processes
• CORBA distinguishes two types of processes: clients and servers.

• An interceptor is a mechanism by which an invocation can be intercepted on its
way from client to server, and adapted as necessary before letting it continue.

– It is designed to allow proxies to adapt the client-side software.

– Request-level: Allows you to modify invocation semantics (e.g.,
multicasting)

– Message-level: Allows you to control message passing between client and
server (e.g., handle reliability and fragmentation)
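
A sketch of the two interceptor levels, under stated assumptions: requests are dictionaries marshaled as JSON, the "network" is a list, and the function names are hypothetical. It does not use the real CORBA interceptor interfaces.

```python
import json

def multicast_interceptor(replicas):
    # Request-level: turn one invocation into one request per replica.
    def intercept(request):
        return [dict(request, target=r) for r in replicas]
    return intercept

def fragment_interceptor(max_size):
    # Message-level: split a marshaled message into fragments.
    def intercept(message):
        return [message[i:i + max_size] for i in range(0, len(message), max_size)]
    return intercept

def invoke(method, args, request_level, message_level, network):
    request = {"method": method, "args": args}
    for req in request_level(request):
        message = json.dumps(req, sort_keys=True).encode()
        for fragment in message_level(message):
            network.append((req["target"], fragment))

network = []
invoke("deposit", [100],
       multicast_interceptor(["replica-1", "replica-2"]),
       fragment_interceptor(16),
       network)
rebuilt = json.loads(b"".join(f for t, f in network if t == "replica-1"))
```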

Logical placement of interceptors in CORBA

Naming
• In CORBA, it is essential to distinguish specification-level and implementation-level
object references

– Specification level: An object reference is considered to be the same as a
proxy for the referenced object. Having an object reference means you can directly
invoke methods. There is no separate client-to-object binding phase.

– Implementation level: When a client gets an object reference, the
implementation ensures that, one way or the other, a proxy for the referenced object is
placed in the client's address space.

• Conclusion: Object references in CORBA used to be highly implementation
dependent. Different implementations of CORBA could normally not exchange their
references.

Synchronization
The two most important services that facilitate synchronization in CORBA are its concurrency
control service and its transaction service.

The two services collaborate to implement distributed and nested transactions using two-
phase locking.

There are two types of objects that can be part of a transaction:

– A recoverable object is an object that is executed by an object server capable
of participating in a two-phase commit protocol.

– Transactional objects are executed by servers that do not participate in a
transaction’s two-phase commit protocol.

Caching and Replication


CORBA offers no support for generic caching and replication.

CASCADE is built to provide a generic, scalable mechanism that allows any kind of CORBA
object to be cached.

CASCADE offers a caching service implemented as a large collection of object servers, each referred
to as a Domain Caching Server (DCS).

Each DCS is an object server running on a CORBA ORB. The collection of DCSs may be
spread across the Internet.

The (simplified) organization of a DCS

Fault Tolerance
In CORBA version 3, fault tolerance is addressed. The basic approach for fault tolerance is to
replicate objects into object groups.

Masking failures is achieved through replication by putting objects into object groups. Object
groups are transparent to clients. They appear as normal objects.

This approach requires a separate type of object reference: Interoperable Object Group
Reference (IOGR).

IOGRs have the same structure as IORs. The main difference is that they are used differently.
In an IOR, an additional profile is used as an alternative; in an IOGR, it denotes another replica.

Security
The underlying idea is to allow the client and object to be mostly unaware of all the security
policies, except perhaps at binding time. The ORB does the rest.

Specific policies are passed to the ORB as (local) policy objects and are invoked when
necessary.

Examples: Type of message protection, lists of trusted parties.

A replaceable security service is a service which can be specified by means of standard
interfaces that hide the implementation.

The general organization for secure object invocation in CORBA

Distributed COM
DCOM: Distributed Component Object Model

– Microsoft's solution to establishing interprocess communication, possibly
across machine boundaries.

– DCOM uses the RPC mechanism to transparently send and receive
information between COM components (i.e., clients and servers) on the same network.

– Supports a primitive notion of distributed objects

– Evolved from early Windows versions to current NT-based systems (including
Windows 2000/XP)

– Comparable to CORBA's object request broker (Microsoft’s CORBA).

DCOM Overview
DCOM is related to many things that have been introduced by Microsoft in the past couple of
years:

– DCOM: Adds facilities to communicate across process and machine
boundaries.

– SCM: Service Control Manager, responsible for activating objects (cf.
CORBA's implementation repository).

– Proxy marshaler: Handles the way that object references are passed between
different machines.

The general organization of ActiveX, OLE, and COM

Naming in DCOM
Observation: DCOM can handle only objects as temporary instances of a class. To
accommodate objects that can outlive their client, something else is needed.

Moniker: A name that uniquely identifies a Microsoft COM (persistent) object, similar to a
directory path name.

– A moniker associates data (e.g., a file) with an application or program.

– Monikers can be stored.

– A moniker can contain a binding protocol, specifying how the associated
program should be ‘launched’ with respect to the data.

Fault Tolerance in DCOM


Automatic transactions: Each class object (from which objects are created), has a transaction
attribute that determines how its objects behave as part of a transaction.

Note: Transactions are essentially executed at the level of a method invocation.

Transaction attribute values for DCOM objects

Attribute value   Description
REQUIRES_NEW      A new transaction is always started at each invocation
REQUIRED          A new transaction is started if not already done so
SUPPORTED         Join a transaction only if caller is already part of one
NOT_SUPPORTED     Never join a transaction
DISABLED          Never join a transaction, even if told to do so
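
The table can be read as a decision rule applied at each method invocation. The sketch below encodes that rule; the enum `TxAttr`, the function `transaction_for_call` and the `new_tx` factory are hypothetical names, not part of the DCOM API.

```python
from enum import Enum
from itertools import count

class TxAttr(Enum):
    REQUIRES_NEW = 1
    REQUIRED = 2
    SUPPORTED = 3
    NOT_SUPPORTED = 4
    DISABLED = 5

def transaction_for_call(attr, caller_tx, new_tx):
    """Return the transaction the invoked method runs in, or None."""
    if attr is TxAttr.REQUIRES_NEW:
        return new_tx()                       # always start a new one
    if attr is TxAttr.REQUIRED:
        return caller_tx if caller_tx is not None else new_tx()
    if attr is TxAttr.SUPPORTED:
        return caller_tx                      # join only if the caller has one
    return None                               # NOT_SUPPORTED, DISABLED

ids = count(1)
def new_tx():
    return f"tx-{next(ids)}"

r1 = transaction_for_call(TxAttr.REQUIRES_NEW, "tx-caller", new_tx)
r2 = transaction_for_call(TxAttr.REQUIRED, "tx-caller", new_tx)
r3 = transaction_for_call(TxAttr.REQUIRED, None, new_tx)
r4 = transaction_for_call(TxAttr.SUPPORTED, None, new_tx)
r5 = transaction_for_call(TxAttr.DISABLED, "tx-caller", new_tx)
```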


Security in DCOM
Declarative security: Register per object what the system should enforce with respect to
authentication. Authentication is associated with users and user groups. There are different
authentication levels.
Delegation: A server can impersonate a client depending on a level.
Note: There is also support for programmatic security by which security levels can be set by
an application, as well as the required security services (see book).

Authentication levels in DCOM.

Authentication level Description

NONE No authentication is required

CONNECT Authenticate client when first connected to server

CALL Authenticate client at each invocation

PACKET Authenticate all data packets

PACKET_INTEGRITY Authenticate data packets and do integrity check

PACKET_PRIVACY Authenticate, integrity-check, and encrypt data packets

Globe
• A Globe object is a physically distributed shared object: the object's state may be physically
distributed across several machines

• Local object: A non-distributed object residing in a single address space, often representing a
distributed shared object

• Contact point: A point where clients can contact the distributed object; each contact point is
described through a contact address

• Observation: Globe attempts to separate functionality from distribution by distinguishing
different local sub-objects:

Globe Object Model


Semantics sub object: Contains the methods that implement the functionality of the
distributed shared object

Communication sub object: Provides a (relatively simple), network-independent
interface for communication between local objects

Replication sub object: Contains the implementation of an object-specific consistency
protocol that controls exactly when a method on the semantics sub object may be
invoked

Control sub object: Connects the user-defined interfaces of the semantics sub object to
the generic, predefined interfaces of the replication sub object

The organization of a Globe distributed shared object

Communication in GLOBE

Invoking an object in Globe that uses active replication


Globe Naming Service


Iterative DNS-based name resolution in Globe

Caching and Replication


Observation: Here's where Globe differs from many other systems:

– The organization of a local object is such that replication is inherently
part of each distributed shared object

– All replication sub objects have the same interface

– This approach makes it possible to implement any object-specific
caching/replication strategy

The behavior of the control sub object as a finite state machine


Examples of Replication in Globe (1)


State transitions and actions for active replication

Read method

State    Action to take              Method call  Next state
START    None                        Start        INVOKE
INVOKE   Invoke local method         Invoked      RETURN
RETURN   Return results to caller    None         START

Modify method

State    Action to take              Method call  Next state
START    None                        Start        SEND
SEND     Pass marshaled invocations  Send         INVOKE
INVOKE   Invoke local method         Invoked      RETURN
RETURN   Return results to caller    None         START

State transitions and actions with primary-backup replication

Read method

State    Action to take              Method call  Next state
START    None                        Start        INVOKE
INVOKE   Invoke local method         Invoked      RETURN
RETURN   Return results to caller    None         START

Modify method at backup replica

State    Action to take              Method call  Next state
START    None                        Start        SEND
SEND     Pass marshaled invocation   Send         RETURN
RETURN   Return results to caller    None         START

Modify method at primary replica

State    Action to take              Method call  Next state
START    None                        Start        INVOKE
INVOKE   Invoke local method         Invoked      RETURN
RETURN   Return results to caller    None         START
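
The state tables above can be encoded directly as data and walked by a small driver. The sketch below only follows the "next state" column of the modify tables; the dictionary layout and the function `trace` are illustrative choices.

```python
# state -> (action to take, method call, next state), copied from the tables
ACTIVE_MODIFY = {
    "START":  ("none", "start", "SEND"),
    "SEND":   ("pass marshaled invocations", "send", "INVOKE"),
    "INVOKE": ("invoke local method", "invoked", "RETURN"),
    "RETURN": ("return results to caller", "none", "START"),
}
BACKUP_MODIFY = {
    "START":  ("none", "start", "SEND"),
    "SEND":   ("pass marshaled invocation", "send", "RETURN"),
    "RETURN": ("return results to caller", "none", "START"),
}
PRIMARY_MODIFY = {
    "START":  ("none", "start", "INVOKE"),
    "INVOKE": ("invoke local method", "invoked", "RETURN"),
    "RETURN": ("return results to caller", "none", "START"),
}

def trace(table):
    """Follow the transitions from START until the machine returns to START."""
    state, visited = "START", []
    while True:
        visited.append(state)
        _action, _call, state = table[state]
        if state == "START":
            return visited

active = trace(ACTIVE_MODIFY)
backup = trace(BACKUP_MODIFY)
primary = trace(PRIMARY_MODIFY)
```

The traces make the difference visible: with active replication a modify is both sent out and invoked locally, whereas a backup replica only forwards it and the primary only invokes it.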

Security
An additional security sub object checks for authorized communication, invocation, and
parameter values. Globe can be integrated with existing security services.
The position of a security sub object in a Globe local object

Comparison of CORBA, DCOM and Globe

Issue                  CORBA            DCOM              Globe
Design goals           Interoperability Functionality     Scalability
Object model           Remote objects   Remote objects    Distributed objects
Services               Many of its own  From environment  Few
Interfaces             IDL based        Binary            Binary
Sync. communication    Yes              Yes               Yes
Async. communication   Yes              Yes               No
Callbacks              Yes              Yes               No
Events                 Yes              Yes               No
Messaging              Yes              Yes               No
Object server          Flexible (POA)   Hard-coded        Object dependent
Directory service      Yes              Yes               No
Trading service        Yes              No                No

UNIT-V

Distributed Multimedia Systems


What is Distributed Multimedia?

Distributed multimedia involves a large quantity of distributed data, typically streamed out to one or many
receivers of the data, running over general-purpose infrastructure. The data is time
sensitive, but not necessarily real time.

The working of distributed multimedia systems can be broadly classified into
4 phases:
1. Encoding
2. Storage (not always required)
3. Transport
4. Decoding

Transport: Quality of Service

• Issue: Gracefully and dynamically manage against the underlying
infrastructure's changing behavior
• Approaches: caching, priorities, resource availability modeling,
compression
• Similar to locking: QoS tries to guarantee that a set of resources will be
available

The major QoS concerns are latency, bandwidth, loss rate, bursting, and jitter.

QoS:
QoS guarantees require that resources are allocated and scheduled to multimedia
applications under real-time requirements. QoS-driven resource management is needed
when resources are shared between several applications and some of these have real-time
deadlines.


A typical Distributed Multimedia System

[Figure labels: video camera and mike; local network (two); wide area gateway; video server; digital TV/radio server]

Characteristics of typical multimedia streams

                                       Data rate        Sample or frame size     Sample or frame
                                       (approximate)                             frequency
Telephone speech                       64 kbps          8 bits                   8000/sec
CD-quality sound                       1.4 Mbps         16 bits                  44,000/sec
Standard TV video (uncompressed)       120 Mbps         up to 640 x 480 pixels   24/sec
                                                        x 16 bits
Standard TV video (MPEG-1 compressed)  1.5 Mbps         variable                 24/sec
HDTV video (uncompressed)              1000–3000 Mbps   up to 1920 x 1080 pixels 24–60/sec
                                                        x 24 bits
HDTV video (MPEG-2 compressed)         10–30 Mbps       variable                 24–60/sec
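
The uncompressed rates in the table follow from sample (or frame) size multiplied by frequency. A quick check is sketched below; the two channels for CD-quality sound are an assumption (stereo), since the table does not state the channel count.

```python
def audio_rate_bps(bits_per_sample, samples_per_sec, channels=1):
    return bits_per_sample * samples_per_sec * channels

def video_rate_bps(width, height, bits_per_pixel, frames_per_sec):
    return width * height * bits_per_pixel * frames_per_sec

telephone = audio_rate_bps(8, 8000)               # 64 kbps
cd_sound = audio_rate_bps(16, 44_000, channels=2) # about 1.4 Mbps (stereo assumed)
tv_raw = video_rate_bps(640, 480, 16, 24)         # about 118 Mbps, i.e. roughly 120 Mbps
hdtv_low = video_rate_bps(1920, 1080, 24, 24)     # about 1194 Mbps
hdtv_high = video_rate_bps(1920, 1080, 24, 60)    # about 2986 Mbps
```

Compared with the 1.5 Mbps MPEG-1 stream, the raw standard TV signal is roughly 80 times larger, which is why compression appears among the QoS approaches.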


Typical infrastructure components for multimedia applications

[Figure: two PC/workstations linked by network connections. Labeled components include camera,
microphones, codecs, mixer, window systems, screen, video file system, and video store; the letters
A, B, C, D, G, H, K, L, M identify individual components. Arrows denote multimedia streams.]
White boxes represent media processing components, many of which are implemented in software, including:
codec: coding/decoding filter
mixer: sound-mixing component

QoS specifications for components of the application shown in the “Typical
infrastructure components for multimedia applications” diagram above

Component             Bandwidth                       Latency      Loss rate  Resources required
Camera                Out: 10 frames/sec, raw video   -            Zero       -
                      640x480x16 bits
A Codec               In: 10 frames/sec, raw video    Interactive  Low        10 ms CPU each 100 ms;
                      Out: MPEG-1 stream                                      10 Mbytes RAM
B Mixer               In: 2 x 44 kbps audio           Interactive  Very low   1 ms CPU each 100 ms;
                      Out: 1 x 44 kbps audio                                  1 Mbytes RAM
H Window system       In: various                     Interactive  Low        5 ms CPU each 100 ms;
                      Out: 50 frame/sec framebuffer                           5 Mbytes RAM
K Network connection  In/Out: MPEG-1 stream,          Interactive  Low        1.5 Mbps, low-loss
                      approx. 1.5 Mbps                                        stream protocol
L Network connection  In/Out: Audio 44 kbps           Interactive  Very low   44 kbps, very low-loss
                                                                              stream protocol


The QoS manager’s task


1. QoS specification
2. QoS parameter translation and distribution
3. QoS negotiation admission control/reservation
4. QoS monitoring
5. QoS renegotiation/resource adaptation
6. QoS adaptation resource de-allocation

[Flowchart: admission control and QoS negotiation]
1. Application components specify their QoS requirements to the QoS manager (flow spec).
2. The QoS manager evaluates the new requirements against the available resources. Sufficient?
   Yes: Reserve the requested resources (resource contract) and allow the application to proceed.
   No: Negotiate reduced resource provision with the application. Agreement?
       Yes: Reserve the resources and allow the application to proceed.
       No: Do not allow the application to proceed.
3. The application runs with resources as per the resource contract.
4. The application notifies the QoS manager of increased resource requirements.
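
The negotiation flow can be sketched for a single resource. This is a simplification: only bandwidth (in kbps) is managed, and the class `QoSManager` with its `admit` and `release` methods is a hypothetical interface.

```python
class QoSManager:
    def __init__(self, capacity_kbps):
        self.free = capacity_kbps
        self.contracts = {}     # application -> reserved kbps (resource contract)

    def admit(self, app, requested, minimum):
        """Return the granted bandwidth, or None if the application may not proceed."""
        if requested <= self.free:
            granted = requested          # sufficient: reserve as requested
        elif minimum <= self.free:
            granted = self.free          # negotiated, reduced provision
        else:
            return None                  # no agreement possible
        self.free -= granted
        self.contracts[app] = granted
        return granted

    def release(self, app):
        self.free += self.contracts.pop(app)

manager = QoSManager(10_000)
g1 = manager.admit("video-1", requested=6_000, minimum=1_500)
g2 = manager.admit("video-2", requested=6_000, minimum=1_500)
g3 = manager.admit("audio", requested=44, minimum=44)
manager.release("video-1")
g4 = manager.admit("audio", requested=44, minimum=44)
```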

Admission control
QoS values must be mapped to resource requirements.

Admission tests for:

1. Schedulability: can the CPU slots be assigned to tasks such that all tasks
receive sufficient slots?
2. Buffer space: e.g., for encoding/decoding, jitter removal buffer, ...
3. Bandwidth: e.g., an MPEG-1 stream with VCR quality generates about 1.5 Mbps
4. Availability/capabilities of devices
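
A schedulability test can be as simple as checking total CPU utilization. The sketch below uses the CPU figures from the QoS specification table (codec, mixer, window system) and assumes a single CPU with periodic tasks scheduled by earliest-deadline-first, for which a total utilization of at most 1 is the admission condition; that choice of test is an assumption, not something the notes prescribe.

```python
from fractions import Fraction

# (CPU time in ms, period in ms), taken from the QoS specification table
tasks = {
    "codec":         (10, 100),
    "mixer":         (1, 100),
    "window system": (5, 100),
}

def utilization(task_set):
    return sum(Fraction(cpu, period) for cpu, period in task_set.values())

def admit(task_set, new_name, cpu_ms, period_ms):
    # Admit the new task only if total utilization stays at or below 1.
    candidate = dict(task_set, **{new_name: (cpu_ms, period_ms)})
    return utilization(candidate) <= 1

current = utilization(tasks)
ok_small = admit(tasks, "second codec", 10, 100)
ok_large = admit(tasks, "heavy filter", 90, 100)
```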

Resource Management
• Resource Scheduling

o To provide Quality of Service (QoS) to an application, the system not only
must have sufficient resources (performance), it also needs to make these
resources available to the application when they are needed (scheduling).

• Types of Resource Scheduling

1. Fair Scheduling
If several streams compete for the same resource, it is necessary to consider
fairness and to prevent ill-behaved streams from taking too much bandwidth.
A round-robin method is used on a bit-by-bit basis, which provides more fairness
with respect to varying packet sizes and arrival times.

2. Real-time Scheduling
The scheduling algorithm assigns CPU time slots to a set of processes in a
manner that ensures that they complete their tasks on time.
Example: earliest-deadline-first (EDF).
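
A minimal EDF sketch with unit time slots follows. The task tuples (name, release time, duration, deadline) and the example values are hypothetical; at every slot the ready task with the earliest deadline runs, so a newly released task can preempt a running one.

```python
import heapq

def edf_schedule(tasks):
    """tasks: list of (name, release, duration, deadline) in integer slots."""
    pending = sorted(tasks, key=lambda t: t[1])
    remaining = {name: duration for name, _, duration, _ in tasks}
    ready, schedule, missed = [], [], []
    time, i = 0, 0
    while i < len(pending) or ready:
        # Move every task released by now into the ready queue.
        while i < len(pending) and pending[i][1] <= time:
            name, _, _, deadline = pending[i]
            heapq.heappush(ready, (deadline, name))
            i += 1
        if not ready:
            time = pending[i][1]      # idle until the next release
            continue
        deadline, name = ready[0]     # earliest deadline first
        schedule.append(name)         # run it for one slot
        time += 1
        remaining[name] -= 1
        if remaining[name] == 0:
            heapq.heappop(ready)
            if time > deadline:
                missed.append(name)
    return schedule, missed

schedule, missed = edf_schedule([
    ("video", 0, 3, 6),
    ("audio", 1, 1, 2),
])
```

Here the audio task arrives while video is running and is scheduled first because its deadline is earlier.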
