Distributed Systems-Question Bank
UNIT I
PART A
A distributed system is a collection of independent computers that appears to its users as a single
coherent system. Equivalently, a distributed system is one in which components located at networked
computers communicate and coordinate their actions only by passing messages.
4. Define heterogeneity.
The Internet enables users to access services and run applications over a heterogeneous collection
of computers and networks. Heterogeneity (that is, variety and difference) applies to all of the
following:
• Networks
• Computer hardware
• Operating systems
• Programming languages
• Implementations by different developers
7. Define transparencies.
Transparency is defined as the concealment from the user and the application programmer of
the separation of components in a distributed system, so that the system is perceived as a
whole rather than as a collection of independent components. The implications of
transparency are a major influence on the design of the system software.
9. Define Middleware.
The term middleware applies to a software layer that provides a programming abstraction as well
as masking the heterogeneity of the underlying networks, hardware, operating systems and
programming languages. In addition to solving the problems of heterogeneity, middleware
provides a uniform computational model for use by the programmers of servers and distributed
applications.
If the well-defined interfaces of a system are published, it is easier for developers to add new
features or replace sub-systems in the future. Example: Twitter and Facebook publish APIs that allow
developers to build their own software that interacts with these services.
• Performance issues
o Throughput
o Balancing computational loads
• Quality of service
• Use of caching and replication
• Dependability issues
• Fault tolerance
• Security
UNIT II
(1) Overlay networks allow both networking developers and application users to easily design
and implement their own communication environment and protocols on top of the Internet,
such as data routing and file sharing management.
(2) Data routing in overlay networks can be very flexible, quickly detecting and avoiding
network congestion by adaptively selecting paths based on different metrics, such as probed
latency.
(3) The end-nodes in overlay networks are highly connected to each other due to flexible
routing. As long as the physical network connections exist, one end-node can always
communicate with another end-node via the overlay network. Thus, scalability and robustness
are two attractive features of overlay networks.
(4) As more and more end-nodes join overlay networks, their high connectivity enables
effective sharing of the huge amount of information and resources available on the Internet.
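Point (2) above, adaptive path selection based on probed latency, can be illustrated with a minimal sketch. It assumes a hypothetical table of probed one-way latencies between overlay nodes and considers only the direct path and paths through a single relay. The function name, node names and latency values are illustrative assumptions, not part of any particular overlay system.

```python
def best_overlay_path(src, dst, latency):
    """Pick the lowest-latency path from src to dst.

    latency: dict mapping (from_node, to_node) -> probed latency in ms
             (hypothetical measurements).
    Only the direct path and one-relay paths are considered.
    Returns (path, total_latency), or (None, inf) if no path is known.
    """
    nodes = {n for pair in latency for n in pair}
    best_path, best_cost = None, float("inf")

    # Direct path, if it has been probed.
    direct = latency.get((src, dst))
    if direct is not None:
        best_path, best_cost = [src, dst], direct

    # Paths through one intermediate overlay node.
    for relay in sorted(nodes - {src, dst}):
        first = latency.get((src, relay))
        second = latency.get((relay, dst))
        if first is not None and second is not None:
            if first + second < best_cost:
                best_path, best_cost = [src, relay, dst], first + second

    return best_path, best_cost
```

With probed latencies A→B = 120 ms, A→C = 30 ms and C→B = 40 ms, the sketch routes A→C→B (70 ms) instead of using the congested direct link.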
17. Give the advantages of using name caches in the file system.
• Better performance, since repeated accesses to the same information are handled without additional
network accesses and disk transfers. This is due to locality in file access patterns.
• It contributes to the scalability and reliability of the distributed file system, since remote data can
be cached on the client node.
Examples of Middleware
Interprocess communication (IPC) is a set of programming interfaces that allow a programmer to
coordinate activities among different program processes that can run concurrently in an operating
system. This allows a program to handle many user requests at the same time.
• port numbers for addressing different functions at the source and destination of the
datagram.
Remote Method Invocation (RMI) is an API which allows an object to invoke a method on an object
that exists in another address space, which could be on the same machine or on a remote machine.
Through RMI, an object running in a JVM on one computer (client side) can invoke methods on
an object in another JVM (server side). RMI creates a public remote server object that
enables client-side and server-side communication through simple method calls on the server object.
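The idea of invoking a method on an object in another address space can be sketched without a Java toolchain. The example below is an analogy only: it uses Python's standard `xmlrpc` modules, which implement remote procedure calls over HTTP rather than Java RMI. The `Calculator` class, its `add` method and the `remote_add_demo` helper are hypothetical names chosen for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class Calculator:
    """Hypothetical server-side object whose methods are exposed remotely."""

    def add(self, a, b):
        return a + b


def remote_add_demo(a, b):
    """Start a local server, call add() through a proxy, then shut down."""
    # Port 0 asks the OS for any free port on the loopback interface.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(Calculator())
    port = server.server_address[1]

    # Serve requests in a background thread (the "server side").
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The proxy plays the role of the client-side stub: the call
        # looks local but is marshalled and sent to the server.
        with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
            return proxy.add(a, b)
    finally:
        server.shutdown()
        server.server_close()
```

As in RMI, the caller writes an ordinary method call, and the proxy (stub) hides the marshalling and the network transfer.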
UNIT III
Logical Clock
Logical clocks are useful in computation analysis, distributed algorithm design, individual
event tracking, and exploring computational progress.
Some noteworthy logical clock algorithms are:
• Lamport timestamps, which are monotonically increasing software counters.
• Vector clocks, which allow for partial ordering of events in a distributed system.
• Version vectors, which order replicas according to updates in an optimistically replicated system.
• Matrix clocks, an extension of vector clocks that also contains information about other
processes' views of the system.
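The first two algorithms in the list can be sketched in a few lines. This is a minimal illustration, assuming the usual update rules: a Lamport clock increments on every local event and jumps past the sender's timestamp on receipt, and vector timestamps are compared component by component. The class and function names are illustrative assumptions.

```python
class LamportClock:
    """Scalar logical clock: a monotonically increasing counter."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance the counter."""
        self.time += 1
        return self.time

    def receive(self, sent_time):
        """Message receipt: move past the timestamp carried by the message."""
        self.time = max(self.time, sent_time) + 1
        return self.time


def vc_leq(a, b):
    """True if vector timestamp a is component-wise <= b."""
    return all(x <= y for x, y in zip(a, b))


def vc_happened_before(a, b):
    """a -> b: a <= b in every component and a differs from b."""
    return vc_leq(a, b) and list(a) != list(b)


def vc_concurrent(a, b):
    """Neither timestamp precedes the other (the order is only partial)."""
    return not vc_leq(a, b) and not vc_leq(b, a)
```

For example, if p1 sends a message with Lamport timestamp 1 and p2 has already counted two local events, p2's clock becomes max(2, 1) + 1 = 3 on receipt. The vector timestamps [2, 0] and [0, 1] are concurrent because neither dominates the other.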
The directory service provides a mapping between text names for files and their UFIDs. A client may
obtain the UFID of a file by quoting its text name to the directory service. The directory service
provides the functions needed to generate directories, to add new file names to directories and to obtain
UFIDs from directories. It is a client of the flat file service; its directories are stored in files of the flat
file service. When a hierarchic file-naming scheme is adopted, as in UNIX, directories hold references to
other directories.
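The name-to-UFID mapping described above can be sketched as a small in-memory structure. This is an illustration under stated assumptions: directories are held in a Python dictionary rather than in files of the flat file service, and the identifiers (`dir-0`, `ufid-42`) and method names are hypothetical.

```python
class DirectoryService:
    """Maps text names to UFIDs; directories may reference other directories."""

    def __init__(self):
        self._directories = {}   # directory id -> {name: UFID or directory id}
        self._next_id = 0

    def create_directory(self):
        """Generate a new, empty directory and return its identifier."""
        dir_id = f"dir-{self._next_id}"
        self._next_id += 1
        self._directories[dir_id] = {}
        return dir_id

    def add_name(self, dir_id, name, ufid):
        """Add a new name to a directory; reject duplicates."""
        entries = self._directories[dir_id]
        if name in entries:
            raise KeyError(f"{name!r} already exists in {dir_id}")
        entries[name] = ufid

    def lookup(self, dir_id, name):
        """Return the UFID (or directory id) bound to name."""
        return self._directories[dir_id][name]

    def resolve(self, root_id, path):
        """Hierarchic naming: follow 'a/b/c' one component at a time."""
        current = root_id
        for component in path.split("/"):
            current = self.lookup(current, component)
        return current
```

Resolving `home/notes.txt` first looks up `home` in the root directory, which yields another directory's identifier, and then looks up `notes.txt` there to obtain the file's UFID.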
LDAP is a protocol that runs over TCP/IP. The LDAP protocol standard includes low-level
network protocol definitions plus data representations and handling functionality. A directory
that is accessible through LDAP is commonly referred to as an LDAP directory.
41. What is Tapestry?
Tapestry is a peer-to-peer overlay network which provides a distributed hash table, routing, and
multicasting infrastructure for distributed applications. The Tapestry peer-to-peer system offers
efficient, scalable, self-repairing, location-aware routing to nearby resources.
• Pastry
• Tapestry
A peer-to-peer (P2P) network is created when two or more PCs are connected and share resources
without going through a separate server computer. A P2P network can be an ad hoc connection—a
couple of computers connected via a Universal Serial Bus to transfer files.
UNIT IV
Mutual exclusion means that each process accessing the shared data excludes all others from
doing so simultaneously.
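The principle can be illustrated on a single machine with threads and a lock. This is an analogy only: distributed mutual exclusion algorithms achieve the same guarantee by message passing, with no shared lock object. The function name and parameters are illustrative assumptions.

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Critical section: only one thread at a time may update
            # the shared data; all others are excluded until release.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every update happens inside the critical section, the final value always equals `num_threads * increments`.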
50. Why is clock synchronization necessary?
It is necessary to coordinate independent clocks. Even when initially set accurately, real clocks will
differ after some amount of time due to clock drift, caused by clocks counting time at slightly
different rates. Several problems occur as a result of clock rate differences, and there are several
solutions, some being more appropriate than others in certain contexts.
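A short calculation shows how quickly drift accumulates. The sketch assumes each clock's drift rate is bounded by some value ρ (seconds of error per second of real time), so two clocks drifting in opposite directions diverge by at most 2ρ per second. The drift rate and skew limit used below are illustrative numbers, not measurements.

```python
def max_skew(drift_rate, elapsed_seconds):
    """Worst-case difference between two clocks after elapsed_seconds,
    assuming each drifts by at most drift_rate in opposite directions."""
    return 2 * drift_rate * elapsed_seconds


def resync_interval(drift_rate, max_allowed_skew):
    """Longest time between synchronizations that keeps the worst-case
    skew within max_allowed_skew."""
    return max_allowed_skew / (2 * drift_rate)


# Assumed drift rate of 1e-6 (about one microsecond per second).
skew_after_one_day = max_skew(1e-6, 24 * 60 * 60)   # roughly 0.17 seconds
interval_for_1ms = resync_interval(1e-6, 0.001)     # roughly 500 seconds
```

Under these assumptions, two clocks can be about 0.17 s apart after one day, and they must be resynchronized roughly every 500 s to stay within 1 ms of each other.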
File sharing can be done using several methods. The most common techniques for file storage,
distribution and transmission include the following:
The happened-before relation (denoted: →) is a relation between the results of two
events, such that if one event should happen before another event, the result must reflect that, even if
those events are in reality executed out of order (usually to optimize program flow).
A cut C is consistent if, for each event it contains, it also contains all the events that
happened-before that event:
for all events e ∈ C, f → e ⇒ f ∈ C
Consider the events occurring at processes p1 and p2 shown in the figure. The figure shows two cuts,
one with frontier <e_1^0, e_2^0> and another with frontier <e_1^2, e_2^2>. The leftmost cut is inconsistent. This is
because at p2 it includes the receipt of the message m1, but at p1 it does not include the sending of
that message.
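The consistency test can be sketched as a check over message send and receive events. Since each process contributes a prefix of its own history, a cut is consistent exactly when no message is received inside the cut but sent outside it. The event layout below is an assumption, because the figure is not reproduced here: events are indexed from 0 on each process, m1 is sent at p1's event 1 and received at p2's event 0 (as the text implies), and the second message m2 is hypothetical.

```python
def is_consistent_cut(frontier, messages):
    """Check a cut for consistency.

    frontier: dict mapping process -> index of its last event in the cut
    messages: list of ((send_proc, send_idx), (recv_proc, recv_idx))
    """
    for (send_proc, send_idx), (recv_proc, recv_idx) in messages:
        receive_in_cut = recv_idx <= frontier[recv_proc]
        send_in_cut = send_idx <= frontier[send_proc]
        # A receipt without its send violates happened-before.
        if receive_in_cut and not send_in_cut:
            return False
    return True


# Assumed layout (event indices start at 0 on each process).
messages = [
    (("p1", 1), ("p2", 0)),   # m1: sent at p1 event 1, received at p2 event 0
    (("p2", 2), ("p1", 3)),   # m2: hypothetical placement
]

leftmost = {"p1": 0, "p2": 0}    # contains receipt of m1 but not its send
rightmost = {"p1": 2, "p2": 2}   # contains both send and receipt of m1
```

Here `is_consistent_cut(leftmost, messages)` is false and `is_consistent_cut(rightmost, messages)` is true. A cut may contain the send of a message without its receipt, as with m2 in the rightmost cut.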
UNIT V
• Availability
• Utilizing special capabilities
In task assignment approach, each process submitted by a user for processing is viewed as a
collection of related tasks and these tasks are scheduled to suitable nodes so as to improve
performance.
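One simple way to schedule tasks onto nodes is a greedy heuristic: take tasks in decreasing order of cost and give each to the currently least-loaded node. This sketch is only one possible heuristic and ignores inter-task communication costs, which a full task assignment algorithm would also weigh. The task names, costs and node names are hypothetical.

```python
import heapq


def assign_tasks(task_costs, nodes):
    """Greedy assignment: largest task first, to the least-loaded node.

    task_costs: dict mapping task name -> estimated execution cost
    nodes: list of node names
    Returns a dict mapping node name -> list of assigned tasks.
    """
    # Min-heap of (current load, node) so the lightest node pops first.
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    assignment = {node: [] for node in nodes}

    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, node = heapq.heappop(heap)
        assignment[node].append(task)
        heapq.heappush(heap, (load + cost, node))

    return assignment
```

For tasks with costs a=4, b=3, c=2, d=1 and two nodes, the sketch places a and d on one node and b and c on the other, giving each node a total load of 5.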