Advanced Operating Systems
TEXT BOOK:
1. Advanced Concepts in Operating Systems, Mukesh Singhal, Niranjan G. Shivaratri, Tata McGraw-Hill, 2001.
REFERENCE:
1. Distributed Systems, Andrew S. Tanenbaum, Maarten Van Steen, Pearson Prentice Hall, 2nd Edition, 2007.
In centralized deadlock detection, each machine maintains a resource graph for its own
processes and resources, while a central coordinator maintains the resource utilization graph for the entire system.
In the centralized approach to deadlock detection, two techniques are used, namely the
completely centralized algorithm and the Ho-Ramamoorthy algorithm (one-phase and two-phase).
In a network of n sites, one site is chosen as the control site. This site is responsible for deadlock
detection and has control over all resources of the system. If a site requires a resource, it
requests the control site; the control site allocates and de-allocates resources and maintains a
wait-for graph (WFG). At regular intervals it checks the wait-for graph for a cycle. If a
cycle exists, it declares the system deadlocked; otherwise the system continues working.
The major drawbacks of this technique are as follows:
[1] A site has to send a request to the control site even to use its own resources.
[2] There is a possibility of phantom deadlock (a deadlock is "detected" that does not actually exist, for example because release messages are delayed).
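The core of the control site's periodic check is a cycle search over the wait-for graph. The sketch below is only a minimal illustration of that check; the graph representation and names are assumptions for the example, not the textbook's notation.

# Minimal wait-for-graph cycle check (illustrative sketch).
# wfg maps each process to the set of processes it is waiting for.
def has_cycle(wfg):
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / finished
    color = {p: WHITE for p in wfg}

    def visit(p):
        color[p] = GREY
        for q in wfg.get(p, ()):
            if color.get(q, WHITE) == GREY:      # back edge -> cycle -> deadlock
                return True
            if color.get(q, WHITE) == WHITE and visit(q):
                return True
        color[p] = BLACK
        return False

    return any(color[p] == WHITE and visit(p) for p in list(wfg))

# Example: P1 waits for P2, P2 waits for P3, P3 waits for P1 -> deadlock.
print(has_cycle({"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}))   # True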
Ho-Ramamoorthy Two-Phase Algorithm: a resource status table is maintained by the central (control) site. If a cycle is
detected, the system is not immediately declared deadlocked; because the system is distributed
and resources may be freed by sites at any instant, the cycle is checked again after a delay.
Only if the cycle is detected again is the system declared deadlocked.
This technique reduces the possibility of phantom deadlock, but on the other hand it consumes
more time.
Ho-Ramamoorthy One-Phase Algorithm: both a resource status table and a process table are maintained by the central
(control) site. If the same cycle is detected in both the process and resource tables, the system is
declared deadlocked. This technique reduces time consumption, but space complexity
increases.
• Distributed Approach
– Detect cycles using probes.
– If process pi is blocked on pj, it launches probe (pi, pj).
– pj forwards the probe (pi, pj, pk) along all its request edges, and so on.
– When the probe returns to pi, a cycle is detected.
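A minimal sketch of this edge-chasing idea follows. In a real system each probe would be a message sent between sites; here the forwarding is simulated with a recursive call, and all names are illustrative assumptions.

# Illustrative edge-chasing probe: each blocked process forwards the probe
# along its wait-for edges; if the probe ever reaches its initiator, a cycle
# (deadlock) involving the initiator has been found.
def probe_detects_deadlock(wfg, initiator):
    def forward(probe_path, current):
        for nxt in wfg.get(current, ()):
            if nxt == initiator:          # probe came back to its initiator
                return True
            if nxt not in probe_path:     # extend the probe and keep forwarding
                if forward(probe_path + [nxt], nxt):
                    return True
        return False
    return forward([initiator], initiator)

wfg = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
print(probe_detects_deadlock(wfg, "P1"))   # True: P1 -> P2 -> P3 -> P1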
In hierarchical deadlock detection algorithms, sites are arranged in a hierarchy and a
site detects deadlocks involving only its descendant sites. Distributed deadlock detection algorithms
delegate the responsibility of deadlock detection to individual sites, whereas in the hierarchical approach there
are local detectors at each site which communicate their local wait-for graphs (WFGs) with one
another.
Approach:
Deadlocks that are local to a single site are detected at that site using its local WFG. Each
site also sends its local WFG to the deadlock detector at the next level up. Thus, distributed deadlocks
involving two or more sites are detected by the lowest-level deadlock detector that has
control over all of those sites. Two algorithms follow this approach:
1. Ho-Ramamoorthy Algorithm:
* Uses only two levels: master control nodes and cluster control nodes.
* Cluster control nodes detect deadlocks among their members and report
dependencies outside their cluster to the master control node.
* The master control node is responsible for detecting inter-cluster deadlocks.
2. Menasce-Muntz Algorithm:
* Leaf controllers are responsible for allocating resources, whereas branch controllers detect
deadlocks among the resources that their children span in the tree.
* Network congestion can be managed, and node failure is less critical than in a fully centralized scheme.
* Detection can be done with either continuous allocation reporting or periodic allocation
reporting.
Advantages:
If the hierarchy coincides with resource access patterns that are local to a cluster of sites, this approach
can provide more efficient deadlock detection than either the centralized or the distributed
methods.
It reduces the dependence on central sites, thus reducing communication cost.
Disadvantages:
If deadlocks span several clusters, this approach is inefficient.
It is more complicated to implement and would involve nontrivial modifications to the lock and transaction manager algorithms.
Introduction
A multiprocessor system is defined as "a system with more than one processor".
Its main benefits are:
Enhanced Performance
Fault Tolerance
A multiprocessor system consists of multiple processors, which execute different programs (or
different parts of the same program) simultaneously.
The main memory is typically shared by all the processors. Based on whether a memory
location can be directly accessed by a processor or not, there are two types of multiprocessor
system:
Tightly Coupled
Loosely Coupled
Loosely Coupled
In loosely coupled systems, not only is the main memory partitioned and attached to the
processors, but each processor also has its own address space. Therefore, a processor cannot
directly access the memory attached to other processors, and loosely coupled systems
use only message passing for inter-processor communication and synchronization. An example of a
loosely coupled system is a collection of workstations connected by a network.
Tightly Coupled
In tightly coupled systems, all processors share the same memory address space and all
processors can directly access a common main memory.
Tightly coupled systems can use the main memory for inter-processor communication and
synchronization.
cc-NUMA system
Hybrid system – shared system memory for global data and local memory for local data
Based on the vicinity and accessibility of the main memory to the processors, multiprocessor
systems are further classified as UMA, NUMA, and NORMA architectures.
UMA (Uniform Memory Access):
The main memory is located at a central location such that it is equidistant from all the
processors; that is, all the processors have the same access time to the main memory. In addition to this
centralized shared memory, processors may also have private memories, where they can cache data.
Some examples of UMA architectures are the Multimax of Encore Corporation and the Balance of Sequent.
NUMA (Non-Uniform Memory Access):
The main memory is physically partitioned and the partitions are attached to the processors.
A processor can directly access the memory attached to any other processor, but the time to
access a remote partition is greater than the time to access local memory.
Examples of NUMA architectures are the cm* of CMU and the Butterfly machine of BBN Laboratories.
NORMA (No Remote Memory Access):
The main memory is physically partitioned and the partitions are attached to the processors.
A processor cannot directly access the memory of any other processor; the processors must
communicate through message passing.
Multiprocessor operating system refers to the use of two or more central processing units
(CPUs) within a single computer system. These CPUs are in close communication, each
processor runs an identical copy of the operating system, and these copies communicate with each
other as needed.
In order to employ a multiprocessing operating system effectively, the computer system must
have motherboard support for multiple processors, which means additional sockets or slots for the extra chips and a chipset capable of handling the
multiprocessing arrangement. The software must also be designed to take advantage of the multiple processors in the
system.
A distributed file system (DFS) allows multi-computer systems to share files, even when no other IPC or RPC is needed.
Goal: a transparent DFS hides where in the network a file is stored.
Location Transparency – the file name does not reveal the file's physical storage location.
Location Independence – the file name does not need to be changed when the file's physical
storage location changes.
1. Mount remote directories onto local directories, giving the appearance of a coherent local
directory tree.
• A single global name structure spans all the files in the system; if a server goes down, the
portion of the name structure it stores becomes unavailable.
• Reduce network traffic by retaining recently accessed disk blocks in a local cache.
• If the needed data are not already cached, a copy of the data is brought from the server to the local cache.
• Cache-consistency problem – keeping the cached copies consistent with the master file.
Cache location: in client memory or on the client disk.
Cache update policy: when is cached data written from the cache back to the master file?
With a delayed-write policy, write operations complete quickly, and some data may be overwritten in the cache, saving
it from ever being written to the server. The drawbacks are poor reliability (unwritten data are lost
if the client crashes) and temporarily inconsistent data.
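As an illustration of the delayed-write idea, the sketch below keeps dirty blocks in a client-side dictionary and flushes them to the server lazily. The block granularity, the flush trigger, and the server interface are assumptions made for the example, not part of any particular DFS.

# Illustrative delayed-write (write-back) client cache.
# `server` is assumed to expose read_block(path, n) and write_block(path, n, data).
class DelayedWriteCache:
    def __init__(self, server):
        self.server = server
        self.blocks = {}        # (path, block_no) -> data
        self.dirty = set()      # blocks modified locally but not yet written back

    def read(self, path, block_no):
        key = (path, block_no)
        if key not in self.blocks:                      # miss: fetch from server
            self.blocks[key] = self.server.read_block(path, block_no)
        return self.blocks[key]

    def write(self, path, block_no, data):
        self.blocks[(path, block_no)] = data            # completes quickly, locally
        self.dirty.add((path, block_no))                # written back later

    def flush(self):
        for path, block_no in list(self.dirty):         # e.g. periodically or on close
            self.server.write_block(path, block_no, self.blocks[(path, block_no)])
        self.dirty.clear()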
Homework: is the locally cached copy of the data consistent with the master copy? Consistency can be verified with either:
Client-initiated approach – the client checks with the server whether its cached data are still valid.
Server-initiated approach – the server keeps track of the data each client caches and notifies clients when cached data become invalid.
Remote file access can be provided through RPC functions (remote service) or through a cached system.
Stateless Service versus Stateful Service:
Stateful service: the server fetches information about the file from disk, stores it in server memory,
and gives the client a connection identifier to use for subsequent accesses. Windows file sharing is a
well-known example of a stateful service.
Stateless service: each request is self-contained and identifies the file and position explicitly.
Failure Recovery: stateless server failure and recovery are almost unnoticeable to clients.
NFS is a stateless service; each of its operations
(read, write, link, symlink, mkdir, rename, rmdir, readdir, readlink, getattr, setattr, create, remove)
carries all the information needed to complete the request.
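Because a stateless server keeps no per-client state, every NFS-style request must carry the file handle, offset and length itself. The sketch below only illustrates that idea; the request structure and the send_rpc helper are invented for the example and are not the actual NFS wire format.

# Illustrative "stateless" read request: the client, not the server, remembers
# the current position, so each call names the file handle, offset and count.
def stateless_read(send_rpc, file_handle, offset, count):
    request = {"op": "read", "fh": file_handle, "offset": offset, "count": count}
    reply = send_rpc(request)          # assumed transport; returns {"data": ...}
    return reply["data"]

# The client advances its own offset between calls:
# data1 = stateless_read(rpc, fh, 0, 4096)
# data2 = stateless_read(rpc, fh, 4096, 4096)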
A file system is a refinement of the more general abstraction of permanent storage; databases are another such refinement.
A file system defines the naming structure, the characteristics of the files, and the set of operations allowed on them.
The classification of computer systems and the corresponding file system requirements are
given below.
Each level subsumes the functionality of the levels below it in addition to the new functionality it introduces;
the levels range from single-user, single-site systems, through multi-user/multi-task systems (which add
security requirements), up to systems spanning multiple sites.
Distributed file systems constitute the highest level of this classification. Multiple users using
multiple machines connected by a network use a common file system that must be efficient, secure, and robust.
Encapsulation: File systems view data in files as a collection of un-interpreted bytes whereas
databases contain information about the type of each data item and relationships with other
data items.
Naming: File systems organize files into directory hierarchies. The purpose of these hierarchies
is to make it convenient to name and locate files.
The ratio of search time to usage time is the factor that determines whether access by name is
adequate.
Mounting: Mount mechanisms allow the binding together of different file namespaces to form a
single hierarchical namespace.
Client machines can maintain mount information individually as is done in Sun's Network File
System.
Mount information can be maintained at servers in which case it is possible that every client
sees an identical namespace.
Client Caching is the architectural feature which contributes the most to performance in a
distributed file system.
Bulk Data Transfer: every data transfer in a network requires the execution of various layers
of communication protocols, so transferring data in bulk reduces how often this fixed overhead is paid.
Caching amortizes the cost of accessing remote servers over several local references to the
same information; bulk transfer amortizes the fixed communication protocol
overheads over large amounts of data.
A client operates on files during a session with a file server. A session is an established
connection for a sequence of requests and responses between the client and server. During this
session a number of items of data may be maintained by the parties, such as the set of open
files and their clients, file descriptors and handles, current position pointers, and mounting information.
This information is required partly to avoid repeating authentication protocols for each request
and repeated directory lookup operations for each file access during the session.
Anonymous users are represented as "system:anyuser".
The path prefix identifies a cell.
Fetch: return the status and optionally the data of a file or directory, and place a callback on it.
RemoveCallBack: specify a file that the client has flushed from its local machine.
Issues to consider: behaviour under network partition, and the semantics of reads and writes.
Understand the worst-case scenario of no state on the server and see whether acceptable
consistency guarantees can still be provided.
End of Unit IV
Distributed Scheduling:
Load Distributing Algorithms, Requirements for Load Distributing, Task Migration, Issues in Task
Migration.
Distributed Shared Memory:
Architecture and Motivation, Algorithms for Implementing DSM, Memory Coherence, Coherence Protocols.
Distributed scheduling refers to the chaining of different jobs into a coordinated workflow that
spans several computers. For example, you schedule a processing job on machine1 and machine2, and when these are
finished you schedule a job on machine3: this is distributed scheduling.
Scheduling is a decision-making process that is used on a regular basis in many manufacturing and services industries.
It deals with the allocation of resources to tasks over given time periods and its goal is to optimize one or more
objectives.
Implementing a distributed system requires cost for hardware support and agreements on service expectations.
Optimizing tasks through proper scheduling helps reduce the overall cost of computation while increasing the value
customers receive.
Parallel Systems
Distributed Systems
Dedicated Systems
Shared Systems
Homogeneous Systems
Heterogeneous Systems
Online/Dynamic Scheduling
Offline/Static Scheduling
Application Taxonomy
Performance criteria
A distributed system may have a mix of heavily and lightly loaded nodes.
Nodes may be heterogeneous in terms of CPU speed and resources, so the distributed system can
benefit from moving work to the nodes best able to handle it.
Even in a homogeneous distributed system, a node may be idle while a task is waiting for
service at another node.
Consider a system of N identical, independent servers. Let P be the probability that the system
is in a state in which at least one task is waiting for service while at least one server is idle, and let ρ be
the utilization of each server.
We can estimate P using probabilistic analysis and plot it against system utilization ρ; at typical
utilizations P is high, so there is considerable scope for improving performance by moving waiting tasks
to idle servers.
LOAD: the load on a node can be measured by the queue length of tasks/processes that
need to be processed.
The queue length of waiting tasks is proportional to task response time, hence a good indicator of
system load.
However, if a task transfer from another node takes a long time, the node may accept more tasks
during the transfer time, which hurts performance.
Remedy: artificially increment the queue length when a task is accepted for transfer from a remote node
(to account for the anticipated increase in load), and use timeouts so that the increment is undone if the task transfer fails.
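The sketch below illustrates this bookkeeping: the local load estimate is bumped as soon as a transfer is accepted, and the reservation expires on a timeout if the task never arrives. The class and all its names are inventions for illustration only.

import time

# Illustrative load estimate: local queue length plus "reserved" slots for
# tasks that have been accepted for transfer but have not arrived yet.
class LoadEstimate:
    def __init__(self, timeout=5.0):
        self.queue_len = 0          # tasks actually queued locally
        self.reservations = []      # deadlines for accepted-but-not-arrived tasks
        self.timeout = timeout

    def accept_transfer(self):
        # Artificially increment the load as soon as we agree to take a task.
        self.reservations.append(time.time() + self.timeout)

    def task_arrived(self):
        if self.reservations:
            self.reservations.pop(0)
        self.queue_len += 1

    def load(self):
        now = time.time()
        # Expired reservations (failed transfers) no longer count toward load.
        self.reservations = [d for d in self.reservations if d > now]
        return self.queue_len + len(self.reservations)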
Types of Algorithms:
Static load distribution algorithms: decisions are hard-coded into the algorithm using a priori knowledge
of the system.
Dynamic load distribution algorithms: use system state information such as task queue length and processor
utilization.
Adaptive load distribution algorithms: adapt the approach based on system state. For example, dynamic
distribution algorithms collect load information from nodes even at very high system loads, although the
collection itself adds load on the system because messages need to be exchanged; adaptive distribution
algorithms may stop collecting state information at such high loads.
Balancing vs. Sharing
Load balancing: transfer tasks even if a node is not heavily loaded, so that the queue lengths on all
nodes are approximately equal.
Load sharing: transfer tasks only when a node's queue length exceeds a certain threshold.
Anticipatory task transfer: transfer tasks from overloaded nodes to nodes that are likely to become
idle or lightly loaded, before they actually do; see the sketch below.
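A minimal sketch of the difference between sharing and balancing, using a single queue-length threshold for load sharing and a "move toward the average" rule for load balancing. The threshold value and the node structure are illustrative assumptions.

# Illustrative transfer decisions for load sharing vs. load balancing.
THRESHOLD = 4   # queue length above which a node is considered overloaded

def should_transfer_sharing(my_queue_len):
    # Load sharing: transfer only when this node exceeds the threshold.
    return my_queue_len > THRESHOLD

def should_transfer_balancing(my_queue_len, all_queue_lens):
    # Load balancing: transfer whenever this node is noticeably above the
    # average, even if it is not heavily loaded in absolute terms.
    avg = sum(all_queue_lens) / len(all_queue_lens)
    return my_queue_len > avg + 1

queues = [1, 2, 3, 6]
print(should_transfer_sharing(3))             # False: below threshold
print(should_transfer_balancing(3, queues))   # False: close to the average
print(should_transfer_balancing(6, queues))   # True: well above the average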
Distributed scheduling is concerned with distributing the load of a system among the available
resources in a manner that improves overall system performance and maximizes resource
utilization. The basic idea is to transfer tasks from heavily loaded machines to idle or lightly
loaded ones.
Static means decisions on assignment of processes to processors are hardwired into the
algorithm, using a priori knowledge, perhaps gained from analysis of a graph model of the
application.
Dynamic algorithms gather system state information to make scheduling decisions and so can
exploit under-utilized resources of the system at run time but at the expense of having to
gather system information.
An Adaptive algorithm changes the parameters of its algorithm to adapt to system loading
conditions. It may reduce the amount of information required for scheduling decisions when
system load or communication is high.
Most models assume that the system is fully connected, i.e. that every processor can
communicate with every other processor in a bounded number of hops. In practice, because of
communication latency, load scheduling is most effective on tightly coupled networks of
homogeneous processors when the workload involves communicating tasks.
Process migration involves the transfer of a running process from one host to another.
Non migratory algorithms involve the transfer of tasks which have not yet begun, and so do not
have this state information, which reduces the complexities of maintaining transparency.
Resource queue lengths, and particularly the CPU queue length, are good indicators of load because
they correlate with task response time and are simple to measure; they must, however, be used carefully.
For example, a number of remote hosts could observe simultaneously that a particular site has
a small CPU queue length and could each initiate a process transfer. This may result in
that site becoming flooded with processes, and its first reaction might be to try to move them
elsewhere. An algorithm can thus waste resources (CPU time and communication bandwidth) by making
poor choices which result in increased migration activity.
A load distributing algorithm has four components: Transfer, Selection, Location, and
Information policies.
Transfer Policy
Sender Initiated
Receiver Initiated
Information Policy
Demand Driven
Periodic
State-Change Driven
Transfer Policy
When a process is about to be created, it could run on the local machine or be started
elsewhere.
Bearing in mind that migration is expensive, a good initial choice of location for a process can avoid
later migrations. A common transfer policy uses a threshold on the local load:
If the machine's load is below the threshold, it acts as a potential receiver for remote
tasks.
If the load is above the threshold, it acts as a sender for new tasks.
Local algorithms using thresholds are simple but may be far from optimal.
Selection Policy
This decision is based on the requirement that the overhead involved in the transfer be
compensated by an improvement in response time for the task and/or the system.
A means of knowing that the task is long-lived is necessary to avoid needless migration; this
could be based on past history, and a number of other factors can influence the decision:
The size of the task's memory space is the main cost of migration.
For efficiency, the number of location-dependent calls made by the chosen task should be small.
Resources such as a window or an input device may only be available at the task's originating site.
Location Policy
Once the transfer policy has decided to send a particular task, the location policy must decide
where the task is to be sent. This will be based on information gathered by the information
policy.
Polling is a widely used sender-initiated technique. A site polls other sites serially or in
parallel to determine whether they are suitable sites for a transfer and/or whether they are willing to accept a
transfer. Nodes can be selected at random for polling, or chosen more selectively based on
information gathered during previous polls. The number of sites polled may vary.
In a receiver-initiated approach, an idle or lightly loaded site polls other sites looking for
work. The goal of the idle site is to find some work to do. An interesting idea is for it to offer to
do work at a price, leaving the sender to make a cost/performance decision in relation to the
task to be migrated.
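The sketch below illustrates sender-initiated polling: an overloaded node polls a few randomly chosen peers and transfers a task to the first one willing to accept. The poll limit, the threshold, and the peer interface are assumptions for the example.

import random

POLL_LIMIT = 3     # how many peers to poll before giving up
THRESHOLD = 4      # a peer below this queue length is willing to accept

# Illustrative sender-initiated location policy.
# `peer_loads` stands in for polling: a dict mapping peer name -> queue length.
def find_receiver(peer_loads):
    candidates = random.sample(list(peer_loads), min(POLL_LIMIT, len(peer_loads)))
    for peer in candidates:                 # poll serially
        if peer_loads[peer] < THRESHOLD:    # peer says it is willing to accept
            return peer
    return None                             # no receiver found: run the task locally

print(find_receiver({"A": 6, "B": 2, "C": 5, "D": 1}))   # e.g. "B" or "D", or None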
Information Policy
The information policy decides what information about the states of other nodes should be collected
and where it should be collected from. There are a number of approaches:
Demand Driven
A node collects the state of other nodes only when it wishes to become involved in either sending or
receiving tasks, using sender initiated or receiver initiated polling schemes.
Demand driven policies are inherently adaptive and dynamic as their actions depend on the
system state.
Periodic
Nodes exchange information at fixed intervals. These policies do not adapt their activity to system
state, but each site will have a substantial history over time of global resource usage to guide location
algorithms. Note that the benefits of load distribution are minimal at high system loads and the
periodic exchange of information therefore may be an unnecessary overhead.
State-Change Driven
Nodes disseminate state information whenever their state changes by a certain amount. This state
information could be sent to a centralized load scheduling point or to peers.
A system is termed unstable if the CPU queues grow without bound, which happens when the long-term
arrival rate of work to the system is greater than the rate at which the system can perform work.
If an algorithm can perform fruitless actions indefinitely with finite probability, the algorithm itself is
said to be unstable.
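In queueing terms, the first (queue-length) notion of stability simply says that the long-term arrival rate must stay below the system's total service rate. The sketch below states that condition for N identical servers; the rate values are illustrative assumptions.

# Illustrative queue-length stability check for N identical servers:
# the system is stable only if total arrival rate stays below total service rate.
def is_stable(arrival_rate, service_rate_per_server, num_servers):
    return arrival_rate < num_servers * service_rate_per_server

print(is_stable(arrival_rate=9.0, service_rate_per_server=2.0, num_servers=5))   # True
print(is_stable(arrival_rate=11.0, service_rate_per_server=2.0, num_servers=5))  # False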
Sender-Initiated Algorithms
Receiver-Initiated Algorithms
Adaptive Algorithms
ALGORITHMS: Static vs. Dynamic; Cooperative vs. Non-Cooperative.
Static schemes are those in which the algorithm uses a priori information about the system, based
on which the assignment decisions are fixed in advance.
The disadvantage of this approach is that it cannot exploit short-term fluctuations in the
system load, because static algorithms do not collect the state information of the system. These
algorithms are essentially graph-theory driven or based on some mathematical programming formulation.
Dynamic scheduling collects system state information and makes scheduling decisions based on this
state information; an extra overhead of collecting and storing system state information is incurred.
In dynamic load distribution for homogeneous systems, the scenario of a task waiting at one server while
another server is idle is referred to as the "wait and idle" (WI) condition. Significantly, for a
distributed system of 20 nodes with a system load between 0.33 and 0.89, the
probability of the WI state is greater than 0.9. Thus, at typical system loads there is always a
potential for improvement in performance, even when nodes and process arrival rates are
homogeneous.
Adaptive load balancing algorithms are a special class of dynamic load distribution algorithms,
in that they adapt their activities by dynamically changing the parameters of the algorithm to suit
the changing system state.
Adaptive algorithms use previously gathered information when querying a new node and also adjust
their behaviour accordingly.
A pre-emptive transfer involves the transfer of tasks which are partially executed. These transfers
are expensive because the state of the task also needs to be transferred to the new location.
Non-pre-emptive transfers involve the transfer of tasks which have not yet started executing. For a system
that experiences wide fluctuations in load and has a high cost for the migration of partly
executed tasks, non-pre-emptive transfers are more appropriate.
Comparison
Sender-initiated algorithms work well at low system load, but at high system load, when
most of the nodes are senders, they send queries to each other, wasting CPU cycles
and incurring extra delay, due to which the system becomes unstable.
This instability happens with receiver-initiated algorithms when the system load is low and most
nodes are receivers polling for work.
Symmetrically initiated algorithms cannot use the previously gathered information and
so are stateless.
Adaptive algorithms use the previous information to query a new node and also adjust their
behaviour to the system state, which helps them remain stable over a wide range of loads.
Load Balancing: empirical evidence from some studies has shown that a small subset of the processes
running on a multiprocessor system often accounts for much of the load, and a small amount of effort spent
off-loading these processes can yield a large performance improvement.
Load sharing algorithms avoid idle time on individual machines when other machines have non-empty work queues.
Communication: network saturation can be caused by heavy communication traffic induced by data transfers
between the tasks of a distributed application.
Fault Tolerance: long-running processes may be considered valuable elements in any system because of the
computation already invested in them, and migration allows them to survive the shutdown or failure of a host.
Application Concurrency: the divide-and-conquer, or crowd, approach to problem solving decomposes a problem
into a set of smaller problems, similar in nature, and solves them separately.
Task migration concerns the transfer of tasks that have already begun execution within a
distributed system, i.e. preemptive task transfer. Some references to task migration include the
transfer of processes before execution begins, but the most difficult issues are those related to
preemption.
Task migration is the movement of an executing task from one host processor (source) in a
distributed system to another host processor (destination).
Task placement is the selection of a host for a new task and the creation of the task on that host.
Motivations for task migration include:
Load Balancing – improving the performance of the system, or of a distributed application, by spreading
the load more evenly over a set of hosts.
Resource Access – Not all resources are available across the network; a task may need to
migrate in order to access a special device, or to satisfy a need for a large amount of physical
memory.
Fault-Tolerance – Allowing long running processes to survive the planned shutdown or failure of
a host.
Migration involves freezing the task, transferring its state, deleting the task on the source, and
resuming the task's execution on the destination.
Issues
State Transfer
Location Transparency
State Transfer
The main concerns are the cost of transferring the task's state and any residual dependencies left
behind on the source host; such dependencies are undesirable for three reasons: reliability, performance and complexity.
Precopying the State - the bulk of the task state is copied to the new host before the task is frozen,
so the freeze time is short.
Copy-On-Reference - copy only what the migrated task needs for its execution, on demand, as it references it.
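A minimal sketch of the precopy idea follows: pages are copied while the task keeps running, pages dirtied in the meantime are re-sent, and only the final small set is copied while the task is frozen. The data structures, callbacks, and the single stopping rule are assumptions for illustration only.

# Illustrative precopy state transfer: copy pages while the task runs, then
# freeze it and copy only the pages dirtied since the last round.
def precopy_migrate(pages, get_dirty_pages, send_page, freeze, resume_on_dest,
                    max_rounds=3):
    to_send = set(pages)                    # start by sending every page
    for _ in range(max_rounds):
        for p in to_send:
            send_page(p)                    # task keeps executing during this
        to_send = get_dirty_pages()         # pages modified while we were copying
        if len(to_send) < 8:                # dirty set small enough: stop iterating
            break
    freeze()                                # short freeze: only the residue remains
    for p in to_send:
        send_page(p)
    resume_on_dest()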
Location Transparency
Task migration should hide the locations of tasks. Location transparency in principle requires
that names (process names, file names) be independent of their locations (host names); messages and
signals must therefore be redirected transparently to a task's new location after it migrates.
There will be interaction between the task migration mechanism, the memory management system,
and the interprocess communication mechanism.
These mechanisms can be designed to be independent of one another, so that if one mechanism's
protocol changes the others need not change, and so that the migration mechanism can be turned off
without affecting the rest of the system.
Distributed shared memory (DSM) is a resource management component of a distributed
operating system that implements the shared memory model in distributed systems, which
have no physically shared memory. The shared memory model provides a virtual address space
that is shared among all nodes. In DSM, data is accessed from this shared address space in much
the same way that virtual memory is accessed.
Data moves between secondary and main memory, as well as between the distributed
main memories of different nodes. Ownership of pages in memory starts out in some pre-
defined state but changes during the course of normal operation. Ownership changes take place
when data moves from one node to another due to an access by a particular process.
Hide data movement and provide a simpler abstraction for sharing data. Programmers don't
need to worry about memory transfers between machines like when using the message
passing model.
Takes advantage of "locality of reference" by moving the entire page containing the referenced data
to the node that accessed it.
Cheaper to build than multiprocessor systems. Ideas can be implemented using normal
hardware and do not require anything complex to connect the shared memory to the
processors.
Larger memory sizes are available to programs, by combining all physical memory of all
nodes. This large memory will not incur disk latency due to swapping like in traditional
distributed systems.
Unlimited number of nodes can be used. Unlike multiprocessor systems where main memory
is accessed via a common bus, thus limiting the size of the multiprocessor system.
Programs written for shared memory multiprocessors can be run on DSM systems.
DSM Organization
Challenges in DSM
How to keep track of the location of remote data?
How to overcome the communication delays and high overhead associated with the references
to remote data?
How to allow "controlled" concurrent accesses to shared data?
Algorithms:
Central-server approach – reads and writes are sent to a server that holds the data; for a write, the
server updates the data and sends an acknowledgement to the client.
Migration approach – data is shipped to the location of the data access request, so subsequent accesses
are local; for both reads and writes, the remote page is brought to the local machine and the operation
is then performed there.
Keeping track of memory locations: a location service, a home machine for each page, or broadcast.
All locations must be tracked by the location service or home machine.
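The sketch below illustrates the migration idea at the level of a single page access: on a fault the page is located via its (assumed) home record, shipped to the faulting node, and ownership is updated so that subsequent accesses are local. All structures here are simplifications invented for the example.

# Illustrative migration-style DSM: a home/location table always knows the
# page's current owner; a faulting node fetches the page and becomes its owner.
class SimpleDSM:
    def __init__(self, num_pages):
        self.owner = {p: 0 for p in range(num_pages)}   # home/location service
        self.memory = {p: {"data": bytes(4096), "at": 0} for p in range(num_pages)}

    def access(self, node, page, write=False, data=None):
        if self.memory[page]["at"] != node:          # page fault: page is remote
            self.memory[page]["at"] = node           # ship the page to this node
            self.owner[page] = node                  # record the new owner
        if write:
            self.memory[page]["data"] = data         # local write after migration
        return self.memory[page]["data"]             # local read

dsm = SimpleDSM(num_pages=4)
dsm.access(node=2, page=1, write=True, data=b"hello")   # page 1 migrates to node 2
print(dsm.owner[1])                                      # 2: subsequent accesses are local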
Memory coherence is an issue that affects the design of computer systems in which two or
more processors or processing elements share a common area of memory.
In DSM there are two or more processing elements working at the same time, and so it is
possible that they simultaneously access the same memory location. Provided none of them
changes the data in this location, they can share it indefinitely and cache it as they please. But
as soon as one updates the location, the others might work on an out-of-date copy that, e.g.,
resides in their local cache. Consequently, some scheme is required to notify all the processing
elements of changes to shared values; such a scheme is known as a Coherence
Protocol, and if such a protocol is employed the system is said to have a Coherent Memory.
The set of allowable memory access orderings forms the memory consistency
model.
Sequential Consistency: the result of any execution is the same as if the operations of all the
processors were executed in some sequential order, and the operations of each individual processor
appear in this sequence in the order specified by its program.
General Consistency: all the copies of a memory location eventually contain the
same data when all the writes issued by every processor have completed.
The intention is that two clients must never see different values for the same
shared data.
The protocol must implement the basic requirements for coherence. It can be a
Write-Invalidate Protocol: a write to shared data causes the invalidation of all copies except one
before the write is performed.
Write-Update Protocol: a write to shared data causes all copies of that data to
be updated.
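The difference between the two protocols can be sketched as follows; the per-node copy table and the function names are invented for illustration.

# Illustrative write-invalidate vs. write-update handling of a write to x.
copies = {"x": {"node1": 5, "node2": 5, "node3": 5}}   # nodes holding a copy of x

def write_invalidate(var, writer, value):
    # Invalidate every copy except the writer's before performing the write.
    copies[var] = {writer: value}

def write_update(var, writer, value):
    # Propagate the new value to every node that holds a copy.
    for node in copies[var]:
        copies[var][node] = value

write_invalidate("x", "node1", 7)
print(copies["x"])    # {'node1': 7}: other copies were invalidated

copies["x"] = {"node1": 7, "node2": 7, "node3": 7}
write_update("x", "node2", 9)
print(copies["x"])    # {'node1': 9, 'node2': 9, 'node3': 9}: all copies updated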
Case Study: Cache Coherence & using Coherence Protocols in the system.
General consistency can be provided by designating one copy of each shared data item as the
master copy.
For a write: first update the master copy, and then propagate the update to the other copies.
On a write fault, if the address indicates a remote node, the update request
is sent to the remote node. If the copy there is not the master copy, the update
request is forwarded to the node containing the master copy for updating and then
further propagation.
The write is nonblocking.
Distributed shared memory (DSM) systems could overcome the major obstacles of communicating and
sharing data between machines with explicit message passing code. Due to its potential advantages,
DSM has received increasing attention.
The specific question that we are trying to answer is: "Can we determine a set of system
design parameters that defines an efficient realization of
a distributed shared memory system?"