Distributed Systems

Chapter 1
Introduction to Distributed Systems
1.1 Introduction to Distributed Systems
1.2 Examples of Distributed Systems
1.3 Main Characteristics
1.4 Advantages and Disadvantages of Distributed Systems
1.5 Design Goals
1.6 Main Problems
1.7 Models of Distributed Systems
1.8 Resource Sharing and the Web Challenges
1.9 Types of Distributed System: Grid, Cluster, Cloud
1.1 Introduction to Distributed Systems

What is distributed?
- Data is distributed: data may have to reside on multiple computers
for administrative and ownership reasons.
- Computation is distributed: applications take advantage of the
parallelism, multiple processors, scalability and heterogeneity
of a distributed system.
- Users are distributed: users communicate and interact through the
application.
Q. What is a distributed system?
- A distributed system is a collection of independent computers that appears to its users
as a single coherent system. – Andrew Tanenbaum

- A distributed system is one that prevents you from working because of the failure of a
machine that you had never heard of. – Veríssimo & Rodrigues

- Weak definition: A distributed system is a collection of independent computers that are used
jointly to perform a single task or to provide a single service.

- A collection of logically related data that is distributed over different processing nodes of
a computer network.

- A distributed system is a collection of independent computers that appear to the users of
the system as a single computer.

- A distributed system is one in which hardware or software components located at
networked computers communicate and coordinate their actions only by message passing.

- Whatever the definition, sharing of resources is a main motivation for constructing distributed
systems.

- One that looks like an ordinary system to its users, but runs on a set of autonomous
processing elements (PEs), where each PE has a separate physical memory space and
the message transmission delay is not negligible.

- There is close cooperation among these PEs. The system should support an arbitrary
number of processes and dynamic extension of PEs.
1.2 Examples of Distributed Systems
a. Cloud computing: Cloud computing is a distributed system that provides access to shared resources,
such as servers, storage, applications, and services, over the internet. Examples include Amazon Web
Services, Microsoft Azure, and Google Cloud.
b. Content delivery networks (CDNs): CDNs are distributed systems that help deliver web content, such
as images, videos, and other media, to users around the world. Examples include Akamai, Cloudflare,
and Amazon CloudFront.
c. Peer-to-peer (P2P) networks: P2P networks are distributed systems that allow users to share files and
resources directly with each other, without a central server. Examples include BitTorrent and Bitcoin.
d. Distributed databases: Distributed databases are systems that store data across multiple computers,
allowing for high availability and scalability. Examples include Apache Cassandra, Apache HBase
(which runs on Hadoop), and MongoDB.
e. Internet of Things (IoT): IoT is a distributed system that connects everyday objects, such as
appliances, cars, and wearable devices, to the internet. Examples include smart homes, smart cities, and
industrial automation systems.
f. Grid computing: Grid computing is a distributed system that allows organizations to share computing
resources, such as processing power, storage, and network bandwidth, across multiple locations.
Examples include the Worldwide LHC Computing Grid and SETI@home.
1.3 Main Characteristics of Distributed Systems
a. Resource Sharing

Resources: hardware (disks and printers)
           software (files, windows and data objects)

Hardware is shared for convenience and to reduce cost.

Data is shared for consistency, exchange of information and cooperation.

b. Resource Manager

A software module that manages a set of resources.

Each resource requires its own management policies and methods.

Client-server model: server processes act as resource managers for a set of resources and a set of clients.

c. Openness

Openness is concerned with extensions and improvements of a distributed system.

New components have to be integrated with existing components.
d. Concurrency
- Executing two or more tasks at the same time, in parallel:
– multiprogramming
– multiprocessing
– parallel execution in a distributed system

e. Scalability
- The capacity to change in size or scale.
- Adaptation of a distributed system to accommodate more users and respond faster.
- Usually done by adding more or faster processors.
- It should be easy to add components to the system.

f. Fault Tolerance
- Even if one system fails, other systems can continue to work.

g. Transparency
- A transparency is some aspect of the distributed system that is hidden from the user.
1.4 Advantages and Disadvantages of Distributed Systems
Advantages
1. Cost effective: Distributed systems are cost effective in the long run. A mainframe is a single
system composed of several processors, whereas a distributed system is made up of several computers
working together.
2. Speed: A distributed system may have more total computing power than a mainframe. Performance is
enhanced through load distribution.
3. Inherent distribution: Some applications are inherently distributed. Examples: a computerized worldwide
airline reservation system, or a computerized banking system in which a customer can deposit or withdraw
money from his/her account at any branch of the bank.
4. Reliability: If one machine crashes, the system as a whole can still survive. This gives higher availability
and improved reliability.
5. Scalability: Distributed systems are made to be scalable. Whenever the workload increases, more
workstations can be added. No restriction is placed on the number of machines, so the system can handle
high-demand workloads easily.
6. Data sharing: Users can share data through a common database.
Disadvantages
1. Start-up cost
- Compared to a single system, the implementation cost is higher.

2. Security
- A DS always comes with security risks because it has the characteristics of an open system.
- The data of a user is stored on different workstations.
- Thus the user needs to make sure that the data is secured on each of these computers.

3. Complexity
- The difficulty involved in implementation, maintenance and troubleshooting makes a distributed
system a complex undertaking.
- Besides hardware complexity, a distributed system poses difficulties in software too.
- The software used in a DS needs to be well designed to handle communication and security.
4. Overheads
- Overhead arises when all the workstations try to operate at once.
- Computing time increases, which affects the system's response time.

5. Network errors
- A DS is prone to network errors, which lead to communication breakdown.
- Information may fail to be delivered, or may not arrive in the correct sequence.
- Troubleshooting such errors is a difficult task because the data is distributed across various
nodes.

6. Coordination
- Exchanging information between components requires coordination, which creates
processing overhead.
1.5 Design Goals/Issues/Challenges in Distributed Systems
The construction of a DS presents many challenges. These are:
a. Heterogeneity: A DS must be constructed from a variety of different networks, operating systems,
hardware and programming languages.
b. Openness: The openness of a DS is determined by the degree to which new resource-sharing
services can be added and made available for use by a variety of client
programs.
c. Security: Has three components:

Confidentiality: protection against disclosure to unauthorized individuals.

Integrity: protection against alteration or corruption.

Availability: protection against interference with the means to access the resources.
d. Scalability: A system is described as scalable if it remains effective when there is a significant
increase in the number of resources and the number of users.

e. Failure Handling: Any process, computer or network may fail independently of the others.
Therefore each component needs to be aware of the possible ways in which the components it
depends on may fail, and must be designed to deal with each of these failures appropriately.

f. Concurrency: Both services and applications provide resources that can be shared by clients in
a DS. Therefore there is a possibility that several clients will attempt to access a shared resource
at the same time. Any object that represents a shared resource in a DS must be responsible for
ensuring that it operates correctly in a concurrent environment.

g. Transparency: The main aim of transparency is to make certain aspects of distribution invisible to
application programmers.
Layers of Transparency
a. Access Transparency

Deals with hiding differences in data representation and the way that
resources can be accessed by users.

At a basic level, we wish to hide differences in machine architectures, but more
important is that we reach agreement on how data is to be represented by
different machines and operating system.

For example, a DS may have computer systems that run different operating systems, each
having its own file naming conventions. Differences in naming conventions, as
well as in how files can be manipulated, should all be hidden from users and
applications.
b. Location Transparency

Refers to the fact that users cannot tell where a resource is physically located in the system.

Naming plays an important role in achieving location transparency.

Location transparency can be achieved by assigning only logical names to resources, that is,
names in which the location of a resource is not secretly encoded.

Example: http://prenhall.com/index.html which gives no clue about the location of Prentice
Hall’s main web server.

Basically, it enables resources to be accessed without knowledge of their locations.
c. Migration Transparency

DS in which resources can be moved without affecting how these resources can
be accessed are said to provide migration transparency.

d. Replication Transparency

Replication plays a very important role in DS.

For example, resources may be replicated to increase availability or to improve performance by
placing a copy close to the place where it is accessed.

Replication transparency deals with hiding the fact that several copies of a resource exist.

To hide the replication from the users, it is necessary that all replicas have the same name.
e. Concurrency transparency

An important goal of DS is to allow sharing of resources.

In many cases, sharing resources is done in a cooperative way.

Example: two independent users may each have stored their files on the same file
server or may be accessing the same tables in a shared database.

In such cases, it is important that each user does not notice that the other is making
use of the same resources.

This phenomenon is called concurrency transparency.
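
One common way to keep a shared resource correct while several users access it is mutual exclusion. The sketch below is only an illustration under assumptions: `SharedCounter` is a hypothetical shared resource, and threads in one process stand in for independent users of a remote server.

```python
import threading


class SharedCounter:
    """Hypothetical shared resource that stays correct under concurrent use."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one caller at a time may run the read-modify-write step,
        # so no update is lost and no caller sees a half-finished change.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def user_session(counter: SharedCounter, updates: int) -> None:
    """One user issuing a series of updates, unaware of the other users."""
    for _ in range(updates):
        counter.increment()


counter = SharedCounter()
users = [threading.Thread(target=user_session, args=(counter, 1000)) for _ in range(4)]
for user in users:
    user.start()
for user in users:
    user.join()
print(counter.value)
```

Each user calls `increment` as if it were alone; the lock inside the resource is what hides the sharing.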
f. Failure Transparency

It means that a user does not notice that a resource fails to
work properly and that the system subsequently recovers from
that failure.
Transparency    Description

Access          Hide differences in data representation and how a resource is accessed
Location        Hide where a resource is located
Migration       Hide that a resource may move to another location or be migrated to a newer version
Relocation      Hide that a resource may be moved to another location while in use
Replication     Hide that a resource is replicated
Concurrency     Hide that a resource may be shared by several competitive users
Failure         Hide the failure and recovery of a resource
Persistence     Hide whether a (software) resource is in memory or on disk


Q. What do you mean by models of distributed systems?

- In computer science and information technology, a model of a distributed system is a
conceptual representation or abstraction that helps describe and understand the structure,
behavior, and interactions of components within a distributed system.

- Models of distributed systems provide a way to analyze, design, and reason about the
behavior and properties of such systems.
1.7 Models of Distributed Systems
a. Architectural Model
i. Client-server model
ii. Multi-tier architecture
iii. Peer-to-peer architecture

b. Fundamental Models
i. Interaction Model
ii. Failure Model
iii. Security Model
a. Architectural Model

The most evident aspect of distributed system design is the division of responsibilities
between system components (applications, servers, and other processes) and the
placement of the components on computers in the network.

The overall aim is to ensure that the structure will meet present and likely future
demands on it.

The major concerns are to make the system reliable, manageable, adaptable and cost-effective.

An architectural model of DS is concerned with placement of its parts and the
relationship between them.

An architectural model defines the way in which the components of system interact with
one another and the way in which they are mapped onto an underlying network of
computer.
- An architectural model first simplifies and abstracts the functions of the individual components
of a distributed system.
- It then considers the placement of the components across a network of computers, seeking to
define useful patterns for the distribution of data and workload.
- Most distributed systems are arranged according to one of a variety of architectural models.
Examples: peer-to-peer model, client-server model.
i. Client-server model

Client processes interact with individual server processes on separate
host computers in order to access shared resources.

In the basic client-server model, processes in a DS are divided into two
groups:
- servers
- clients

A server is a process implementing a specific service, for example a
file system service or a database service.

A client is a process that requests a service from a server by sending it
a request and waiting for the server's reply.
- In the client-server model, any process can act as a server or a client.
- It is not the type, size or computing power of a machine that makes it
a server; it is the ability to serve requests.
- A system can act as server and client simultaneously, that is, one process acts as a
server while another acts as a client.
- The client and server processes may also reside on the same machine.
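
The request and reply exchange described above can be sketched with TCP sockets. This is a minimal illustration, not a production server: the function names and the `echo:` reply format are assumptions, the server handles a single request, and both processes are simulated as threads on the same machine.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server role: accept one client, read its request, send a reply."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"echo: {request}".encode("utf-8"))


def request_service(host: str, port: int, message: str) -> str:
    """Client role: send a request and wait for the server's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(message.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo() -> str:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 lets the operating system pick any free port.
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    host, port = listener.getsockname()
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    try:
        return request_service(host, port, "hello")
    finally:
        server.join()
        listener.close()


reply = demo()
print(reply)
```

What makes `serve_once` the server is only that it waits for and answers requests; it runs on the same machine as the client.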
Variants of the client-server model

The problem with the client-server model is that placing a service in
a server at a single address does not scale well
beyond the capacity of the host computer and the bandwidth of
its network connections.

To address this problem, several variations of the client-server
model have been proposed.

Some of these variations are discussed below.

Services provided by multiple servers : Services
may be implemented as several server
processes in separate host computers
interacting as necessary to provide a service to
client processes.

E.g. a cluster of servers used for a search engine.

Proxy server and caches
a. Proxy server:

A proxy is a dedicated computer, or software running on a computer, that
acts as an intermediary between an endpoint device such as a computer
and another server from which a user or client is requesting a service.

The proxy server may exist on the same machine, or it may be on a separate
server which forwards requests.

It provides a single point of access and control.

It is an intermediary system that accepts requests from clients and forwards
them to the original servers.
Q. How does a proxy server work?

When a proxy server receives a request for a resource, it looks in
its local cache.

If it finds the page, it returns it to the client without needing to
forward the request to the original server.

If the page is not in the cache, the proxy server acting as a client on
behalf of the user uses one of its IP addresses to request the page
from the server out on the internet.

When the page is returned, the proxy server forwards it on to the
user.
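[Figure: the client sends an HTTP request to the proxy server, which forwards the HTTP request to the original server; the HTTP response returns from the original server through the proxy server to the client.]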
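
The lookup-then-forward behaviour of a proxy can be sketched as follows. This is an illustration under assumptions: `CachingProxy` and `fake_origin` are hypothetical names, the origin server is a plain function rather than a real HTTP server, and cached pages never expire.

```python
from typing import Callable, Dict, List


class CachingProxy:
    """Answers from its local cache when it can, otherwise asks the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self._cache:
            # Cache hit: reply without contacting the original server.
            self.hits += 1
            return self._cache[url]
        # Cache miss: act as a client on behalf of the user.
        self.misses += 1
        page = self._fetch(url)
        self._cache[url] = page
        return page


origin_calls: List[str] = []


def fake_origin(url: str) -> str:
    """Stand-in for the original server; records every request it receives."""
    origin_calls.append(url)
    return f"<html>content of {url}</html>"


proxy = CachingProxy(fake_origin)
first = proxy.get("/index.html")   # forwarded to the origin
second = proxy.get("/index.html")  # served from the proxy's cache
print(proxy.hits, proxy.misses, len(origin_calls))
```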
b. Cache

A cache is a store of recently used data objects that is
closer to the client process than those remote objects.

When a new object is received at a computer it is added
to the cache store, replacing some existing objects if
necessary.

When a client process needs an object, the
caching service first checks the cache and supplies the
object from there if an up-to-date copy is
available. If not, an up-to-date copy is fetched.
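
The text does not say which existing object is replaced when the cache is full. The sketch below assumes a least-recently-used (LRU) policy, one common choice; `LRUCache` is a hypothetical name and freshness checks are left out.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Fixed-size store of recently used objects (LRU policy is an assumption)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None  # the caller must fetch the object remotely
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            # Replace an existing object: drop the least recently used one.
            self._items.popitem(last=False)


cache = LRUCache(capacity=2)
cache.put("a", b"A")
cache.put("b", b"B")
cache.get("a")        # "a" is now more recently used than "b"
cache.put("c", b"C")  # the cache is full, so "b" is replaced
```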
ii. Multi-tier architecture

Also known as n-tier architecture.

In this architecture, functions such as presentation, application processing and data
management are physically separated into tiers.

By separating an application into tiers, developers obtain the option of changing or
adding a specific layer, instead of reworking the entire application.

It provides a model by which developers can create flexible and reusable applications.

Software architectures may be one-tier, two-tier, three-tier or n-tier.

A tier can also be referred to as a layer.

Three layers are involved in the application: presentation layer, business layer and data
layer.
Presentation layer

Also known as client layer

Top most layer of an application

This is the layer we see when we use a software

By using this layer we can access the web page

The main functionality of this layer is to communicate with application layer

This layer passes the information which is given by the user in terms of
keyboard actions, mouse clicks to application layer.

Example: the login page of Gmail, where an end user sees text boxes and
buttons to enter a user id and password and to click on sign in.
Business layer/ application layer

In the Gmail login page example, once the user clicks on
the login button, the application layer interacts with the
database layer and sends the required information to the
presentation layer.

It controls an application’s functionality by performing
detailed processing.

This layer acts as a mediator between the presentation
and the data layer.
Database layer

The data is stored in database layer.

Application layer communicates with database
layer to retrieve the data.

It contains methods that connect to the database
and perform the required actions.

Example: insert , update, delete etc.
Types of software architecture
a. One tier architecture
b. Two tier architecture
c. Three tier architecture
a. One tier architecture

Basically, one tier architecture keeps all of the elements of an application, including the
presentation layer, business layer and database layer, in one place.

Developers see these types of systems as the simplest and most direct.

The file you want to work with must be accessible from a local or shared drive.

This is the simplest of all the architecture but also the least secure.

Since users have direct access to the file they could accidentally move, modify or
even worse delete the file by accident or on purpose.

There is also usually an issue when multiple users access the same file at the same
time.

In many cases only one can edit the file while others only have read only access.
b. Two tier architecture

A two tier architecture is similar to a client-server application.

Direct communication occurs between the client and server.

There is no middleman between the client and the server.

Separating these two components into different locations represents a
two tier architecture.

This architecture is also called client- server architecture because of the
two components:
- the client that runs the application.
- the server that handles the database back end.

When the client starts it establishes a connection to the server and
communicates as needed with the server while running the client.

The client computer usually cannot see the database directly and
can only access the data by starting the client.

This means that the data on the server is much more secure.

Now users are unable to change or delete data unless they have
specific user rights to do so.

The client server solution allows multiple users to access the
database at the same time as long as they are accessing data in
different parts of the database.
c. Three tier architecture

This involves one more layer, called the business layer or application layer.

In the client-server solution the client handles the business logic, which makes the
client thick.

By introducing the middle layer, the client handles only the presentation layer. This
means that only a little communication is needed between the client and the middle tier.

As more users access the system, a three tier solution is more scalable than the other solutions
because you can add as many middle-tier servers as needed to ensure good
performance.

Security is also the best because the middle layer protects the database tier.

The drawback of n-tier architecture is that the additional tiers increase the complexity and cost of the
installation.
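
The separation into three tiers can be sketched in one file. This is an illustration under assumptions: `UserStore`, `LoginService` and `login_page` are hypothetical names, an in-memory SQLite database stands in for the database tier, and the password is stored in plain text only to keep the sketch short.

```python
import sqlite3
from typing import Optional


class UserStore:
    """Data tier: the only code that talks to the database."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE users (user_id TEXT PRIMARY KEY, password TEXT)"
        )

    def insert(self, user_id: str, password: str) -> None:
        self._db.execute("INSERT INTO users VALUES (?, ?)", (user_id, password))
        self._db.commit()

    def find_password(self, user_id: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT password FROM users WHERE user_id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


class LoginService:
    """Business tier: applies the rules and mediates between the other tiers."""

    def __init__(self, store: UserStore) -> None:
        self._store = store

    def sign_in(self, user_id: str, password: str) -> bool:
        stored = self._store.find_password(user_id)
        return stored is not None and stored == password


def login_page(service: LoginService, user_id: str, password: str) -> str:
    """Presentation tier: passes user input on and turns the result into text."""
    if service.sign_in(user_id, password):
        return f"Welcome, {user_id}"
    return "Sign-in failed"


store = UserStore()
store.insert("asha", "s3cret")  # a real system would store a salted hash
service = LoginService(store)
print(login_page(service, "asha", "s3cret"))
```

The presentation function never touches the database; it reaches the data only through the business tier, which is what lets each tier be changed or scaled separately.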
Comparison between one-tier, two-tier and three-tier architecture

            1-tier                        2-tier                3-tier

Benefits    - very simple, inexpensive    - good security       - exceptional security
            - no server needed            - faster execution    - very scalable

Issues      - poor security               - more costly         - very costly
            - multi-user issues           - more complex        - very complex

Users       usually 1 or a few            2-100                 50-2000+

iii. Peer-to-peer architecture

Peer to peer architecture is a commonly used computer networking architecture in which
each workstation or node has the same capabilities and responsibilities.

P2P may also be used to refer to a single software program designed so
that each instance of the program may act as both client and server with the same
responsibilities and status.

Peer-to-peer (P2P) architecture is a decentralized computing model where multiple
computers, known as peers, are connected to each other and can communicate and
share resources directly without relying on a central server.

In a P2P network, each peer has equal capabilities and can act as both a client and a
server.
Here are some key features and characteristics of a peer-to-peer architecture:

1. Decentralization: P2P networks do not have a central server or authority controlling the
network. Instead, each peer has equal status and can communicate directly with other
peers.

2. Distributed resources: Peers in a P2P network can share and access resources such
as files, processing power, or bandwidth. Each peer can contribute its resources to the
network and utilize the resources available on other peers.

3. Scalability: P2P networks can scale easily as new peers can join or leave the network
without affecting its overall operation. The more peers that join, the more resources
become available for sharing.

4. Redundancy and fault tolerance: Due to the decentralized nature of P2P networks, if
one peer fails or leaves the network, the remaining peers can continue to operate and
provide services. Resources and data are distributed across multiple peers, reducing the
impact of failures.
5. Self-organization: Peers in a P2P network can discover and connect to other peers
autonomously without relying on a central directory. Various mechanisms like distributed
hash tables (DHTs) or peer discovery protocols enable the identification and connection of
peers.

6. Privacy and security: P2P networks can provide enhanced privacy since there is no
centralized authority that can monitor or control the communication between peers.
However, it also poses challenges in terms of security, as malicious peers or content can
be distributed within the network.
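
The idea that every peer is both client and server can be sketched in memory. This is a toy illustration under assumptions: `Peer` is a hypothetical class, peers ask only their direct neighbours, and method calls stand in for network messages. Real P2P systems use discovery mechanisms such as DHTs.

```python
from typing import Dict, List, Optional


class Peer:
    """A node that can both request files (client) and provide them (server)."""

    def __init__(self, name: str, files: Optional[Dict[str, bytes]] = None) -> None:
        self.name = name
        self.files: Dict[str, bytes] = dict(files or {})
        self.neighbours: List["Peer"] = []

    def connect(self, other: "Peer") -> None:
        # Connections are symmetric: neither side is the designated server.
        if other not in self.neighbours:
            self.neighbours.append(other)
        if self not in other.neighbours:
            other.neighbours.append(self)

    def serve(self, filename: str) -> Optional[bytes]:
        """Server role: hand out a local copy if this peer has one."""
        return self.files.get(filename)

    def fetch(self, filename: str) -> Optional[bytes]:
        """Client role: use the local copy, otherwise ask direct neighbours."""
        if filename in self.files:
            return self.files[filename]
        for peer in self.neighbours:
            data = peer.serve(filename)
            if data is not None:
                self.files[filename] = data  # this peer can now serve it too
                return data
        return None


a = Peer("a", {"notes.txt": b"lecture 1"})
b = Peer("b")
c = Peer("c")
a.connect(b)
b.connect(c)

miss_before = c.fetch("notes.txt")  # c's only neighbour, b, has no copy yet
from_a = b.fetch("notes.txt")       # b acts as a client of a
from_b = c.fetch("notes.txt")       # b now acts as a server for c
```

Once `b` holds a copy, `c` can still obtain the file even if `a` leaves the network, which is the redundancy property listed above.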
Comparison between client-server and peer-to-peer

1. Basic
   Client-server: There is a specific server and specific clients connected to the server.
   Peer-to-peer: Clients and servers are not distinguished; each node acts as a client and a server.
2. Service
   Client-server: The client requests a service and the server responds with the service.
   Peer-to-peer: Each node can request services and can also provide services.
3. Focus
   Client-server: Sharing information.
   Peer-to-peer: Connectivity.
4. Data
   Client-server: The data is stored in a centralized server.
   Peer-to-peer: Each peer has its own data.
5. Server
   Client-server: When several clients request services simultaneously, the server can become a bottleneck.
   Peer-to-peer: As services are provided by several servers distributed in the P2P system, a server is not a bottleneck.
6. Expense
   Client-server: Expensive to implement.
   Peer-to-peer: Less expensive to implement.
7. Stability
   Client-server: More stable and scalable.
   Peer-to-peer: Suffers if the number of peers in the system increases.
b. Fundamental Models

Fundamental models are concerned with a more formal description of the
properties that are common to all of the architectural models.

In general, a fundamental model contains only the essential
ingredients that are needed to understand and reason about system behavior.

The purpose of such a model is:
- to make explicit all the relevant assumptions about the system we are
modeling.
- to make generalizations concerning what is possible or impossible, given
those assumptions.

There are three fundamental models:
- interaction model
- failure model
- security model
i. Interaction model

Computation occurs within processes that interact by passing messages,
resulting in communication and coordination between processes. Each process
has its own state.

A DS can be composed of many processes interacting in complex ways.
For example, multiple server processes may cooperate with one another
to provide a service such as DNS, which partitions and replicates its data
at servers throughout the internet.

Interacting processes in a DS are affected by two significant factors:
a. Performance of communication channels
- Communication over a computer network has performance
characteristics relating to latency, bandwidth and jitter.
- Latency is the delay between the sending of a message by one process and its receipt by
another.
- The bandwidth of a computer network is the total amount of information that can be
transmitted over it in a given time.
- Jitter is the variation in the time taken to deliver a series of messages.
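
The three quantities above can be combined in a short calculation. This is a sketch under assumptions: the function names are hypothetical, delivery time is approximated as latency plus size divided by bandwidth, and jitter is measured as the spread between the slowest and fastest delivery, which is only one of several possible measures.

```python
from typing import Sequence


def transfer_time(size_bits: float, latency_s: float, bandwidth_bps: float) -> float:
    """Approximate time to deliver one message: latency + size / bandwidth."""
    return latency_s + size_bits / bandwidth_bps


def jitter(delays_s: Sequence[float]) -> float:
    """Variation in delivery time, taken here as max delay minus min delay."""
    return max(delays_s) - min(delays_s)


# A 1 MB message (8,000,000 bits) over a 100 Mbit/s link with 20 ms latency:
# 0.020 s of latency plus 0.080 s of transmission, about 0.1 s in total.
one_megabyte = transfer_time(8_000_000, latency_s=0.020, bandwidth_bps=100_000_000)

# Three messages observed with delays of 20 ms, 25 ms and 31 ms
# differ by about 11 ms between the fastest and the slowest.
observed_jitter = jitter([0.020, 0.025, 0.031])

print(one_megabyte, observed_jitter)
```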
b. computer clocks and timing events

Each computer in a DS has its own internal clock, which can be
used by local processes to obtain the value of the current time.

Therefore, two processes running on different computers can
associate timestamps with their events.

However, even if two processes read their clocks at the same
time, their local clocks may supply different time values.
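
How far two local clocks move apart can be worked out with a simple linear model. This is a sketch under assumptions: `local_clock` and `skew` are hypothetical names, each clock has a constant drift rate, and the 10 parts-per-million rates are illustrative values only.

```python
def local_clock(true_time_s: float, offset_s: float, drift_rate: float) -> float:
    """Reading of a clock that started offset_s off and gains drift_rate s per second."""
    return offset_s + true_time_s * (1.0 + drift_rate)


def skew(true_time_s: float, clock_a: tuple, clock_b: tuple) -> float:
    """Difference between two clock readings taken at the same true time."""
    return local_clock(true_time_s, *clock_a) - local_clock(true_time_s, *clock_b)


fast = (0.0, +1e-5)  # (offset, drift rate): gains 10 microseconds per second
slow = (0.0, -1e-5)  # loses 10 microseconds per second

one_day = 24 * 60 * 60
# After one day the clocks differ by 86400 * 2e-5 seconds, about 1.7 seconds.
skew_after_one_day = skew(one_day, fast, slow)
print(skew_after_one_day)
```

Timestamps taken from the two clocks at the same instant would then disagree by well over a second, so they cannot be compared directly.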
Variants of the interaction model

In a DS it is hard to set time limits for process
execution, message delivery or clock drift.

Two variants of the interaction model are
a. synchronous distributed system
b. asynchronous distributed system
a. Synchronous distributed systems are defined to be systems in which:

Time to execute each step of a process has known lower and upper bounds.

Each message transmitted over a channel is received within a known bounded time

Each process has a local clock whose drift rate from perfect time has a known bound.

It is possible to suggest likely upper and lower bounds for process execution time,
messages delay and clock drift rates in a DS.

But it is difficult to arrive at realistic values and to provide guarantees of the chosen
values.

Unless the values of the bounds can be guaranteed, any design based on the chosen
values will not be reliable.
b. Asynchronous distributed systems:

Have no bounds on process execution speeds, message
transmission delays or clock drift rates.

No time limits can be assumed.

A message may be delayed for an arbitrarily long time.

Actual DS tend to be asynchronous in nature.

Example: the Internet
ii. Failure Model

Failure model defines and classifies the faults.

Failure model defines the way in which failure may occur in order to
provide an understanding of the effects of failures.

Masking is a technique by which a more reliable service is built from a
less reliable one by masking some of the failures it exhibits.

Types of failures:
-omission failure
-Arbitrary failure
-timing failure
-fail stop
- crash

Omission failure: a process or channel fails to do something it is expected
to do. For example, a message inserted in an outgoing message buffer never arrives at
the other end's incoming message buffer.

Arbitrary failure: any type of error can occur in processes or
channels.

Timing failure: applicable only to synchronous DS, where a time
limit may not be met. For example, a clock's drift exceeds the allowable bound.

Fail stop: a process halts and remains halted, and other processes can
detect that the process has failed.

Crash: a process halts and remains halted, and other processes
cannot detect that the process has failed.
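
Masking an omission failure can be sketched with retransmission. This is an illustration under assumptions: `DropFirstChannel` is a hypothetical channel that loses a fixed number of transmissions so the behaviour is reproducible, the return value of `send` stands in for an acknowledgement, and lost acknowledgements (which would cause duplicates) are ignored.

```python
from typing import List


class DropFirstChannel:
    """Unreliable channel: silently drops the first `drops` transmissions."""

    def __init__(self, drops: int) -> None:
        self._remaining_drops = drops
        self.delivered: List[str] = []

    def send(self, message: str) -> bool:
        """Return True if the message arrived (stands in for an acknowledgement)."""
        if self._remaining_drops > 0:
            self._remaining_drops -= 1
            return False  # omission failure: the message never arrives
        self.delivered.append(message)
        return True


def reliable_send(channel: DropFirstChannel, message: str, max_attempts: int = 5) -> int:
    """Mask omission failures by retransmitting until acknowledged.

    Returns the number of attempts used; raises TimeoutError if all fail.
    """
    for attempt in range(1, max_attempts + 1):
        if channel.send(message):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


channel = DropFirstChannel(drops=2)
attempts_used = reliable_send(channel, "m1")
print(attempts_used, channel.delivered)
```

The caller of `reliable_send` sees a service that delivers the message, although the channel underneath lost it twice. When every attempt fails, the failure is no longer masked and is reported instead.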
iii. Security Model

The security of a distributed system can be achieved by securing the
processes and the channels used in their interactions and by protecting
the objects that they encapsulate against unauthorized access.

Protecting Objects:
- Access rights: Access rights specify who is allowed to perform the operations
on an object. Who is allowed to read or write its state.
– Principal: Principal is the authority associated with each invocation and each
result.
A principal may be a user or a process. The invocation comes from a user and
the result from a server.
The server is responsible for:

Verifying the identity of the principal (user) behind
each invocation.

Checking that they have sufficient access rights to
perform the requested operation on the particular
object invoked.

Rejecting those that do not.
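
The access-rights check can be sketched as follows. This is an illustration under assumptions: `ProtectedObject` is a hypothetical class, principals are plain strings, and the identity of the principal is taken as already verified, so only the second and third responsibilities are shown.

```python
from typing import Dict, Iterable, Set


class ProtectedObject:
    """An object whose operations are guarded by per-principal access rights."""

    def __init__(self, state: str, access_rights: Dict[str, Iterable[str]]) -> None:
        self._state = state
        self._rights: Dict[str, Set[str]] = {
            principal: set(ops) for principal, ops in access_rights.items()
        }

    def _check(self, principal: str, operation: str) -> None:
        # Reject the invocation unless the principal holds the required right.
        if operation not in self._rights.get(principal, set()):
            raise PermissionError(f"{principal} may not {operation} this object")

    def read(self, principal: str) -> str:
        self._check(principal, "read")
        return self._state

    def write(self, principal: str, new_state: str) -> None:
        self._check(principal, "write")
        self._state = new_state


document = ProtectedObject(
    "draft", {"alice": ["read", "write"], "bob": ["read"]}
)
document.write("alice", "final")
print(document.read("bob"))
```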
The enemy

To model security threats, we assume an enemy that is capable of
sending any message to any process and reading or copying any
message between a pair of processes.

Threats from a potential enemy are classified as:
- Threats to processes
- Threats to communication channels
- Denial of service
Defeating security threats

Secure systems are based on the following main techniques:
i. Cryptography and shared secrets
• Cryptography is the science of keeping messages secure.
• Encryption is the process of scrambling a message in such a way as to hide
its contents.
ii. Authentication
• The use of shared secrets and encryption provides the basis for the
authentication of messages.
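
Message authentication with a shared secret can be sketched using a keyed hash (HMAC) from the Python standard library. This is an illustration under assumptions: the function names and the sample key and messages are hypothetical, and HMAC is one specific technique, not necessarily the one the source has in mind. It authenticates a message but does not hide its contents.

```python
import hashlib
import hmac


def sign(shared_secret: bytes, message: bytes) -> bytes:
    """Sender: compute an authentication tag from the message and the secret."""
    return hmac.new(shared_secret, message, hashlib.sha256).digest()


def verify(shared_secret: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver: recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(shared_secret, message), tag)


secret = b"known only to the two processes"
message = b"transfer 100 to account 42"
tag = sign(secret, message)

genuine = verify(secret, message, tag)
tampered = verify(secret, b"transfer 900 to account 42", tag)
wrong_key = verify(b"guessed by the enemy", message, tag)
print(genuine, tampered, wrong_key)
```

Only a process that knows the shared secret can produce a tag that verifies, so a valid tag shows both who sent the message and that it was not altered.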

Secure channels
- Encryption and authentication are used to build secure
channels as a service layer on top of the existing communication
services.
- A secure channel is a communication channel connecting a
pair of processes, each of which acts on behalf of a principal.
- VPN (Virtual Private Network) and Secure Sockets Layer (SSL, now succeeded by TLS)
protocols are instances of secure channels.
Other possible threats from an enemy

Denial of service:
- This is a form of attack in which the enemy interferes with the
activities of authorized users by making excessive and
pointless invocations on services or message transmissions in
a network.
- It results in overloading of physical resources (network
bandwidth, server processing capacity).

Mobile code
- Mobile code is a security problem for any process
that receives and executes program code from
elsewhere, such as an email attachment.
- Such an attachment may include code that accesses
or modifies resources that are available to the host
process but not to the originator of the code.
Types of Distributed Systems

1. Distributed Computing System
   - Cluster computing
   - Grid computing
   - Cloud computing
     - Cloud computing service models: SaaS, IaaS, PaaS
     - Cloud computing deployment models: public cloud, private cloud, hybrid cloud
2. Distributed Information System
3. Distributed Pervasive System
1. Distributed Computing System

A distributed computer system is made up of multiple software
components that run on different computers but work together as one.

A distributed system’s computers can be physically close together and
connected by a local network or geographically separated and
connected by a wide area network.

It uses a group of computers that share a common computational problem
among them so as to produce a result efficiently in a short time span.

Typically deployed for high-performance applications, often originating
from the field of parallel computing.
a. Cluster computing systems

Cluster computing refers to a type of computing system that consists of a group of
interconnected computers or servers working together as a single entity.

These individual computers, known as nodes, are connected via a local area network (LAN)
or a high-speed interconnect, allowing them to communicate and collaborate on various
computational tasks.

The main objective of cluster computing is to harness the combined processing power and
resources of multiple machines to solve complex problems more efficiently and quickly than a
single computer could achieve.

By distributing the workload across multiple nodes, cluster computing enables parallel
processing, which is essential for handling computationally intensive tasks such as scientific
simulations and large-scale data analysis.

Each node runs the same OS.
Example of Cluster Computing:
One example of cluster computing is a high-performance computing (HPC) cluster used for
scientific simulations. Let's take the example of a weather forecasting application.

In weather forecasting, complex mathematical models are used to simulate and predict
weather patterns. These simulations require significant computational power and can benefit
from parallel processing. A cluster computing system can be used to distribute the workload
and accelerate the simulation process.

In this scenario, the cluster would consist of multiple interconnected nodes, each equipped
with processors, memory, and storage. The nodes are connected via a high-speed network,
allowing them to communicate and share data efficiently.

The weather forecasting application would be designed to divide the computational tasks
into smaller units that can be processed independently. These units, known as "jobs" or
"tasks," are distributed across the nodes in the cluster. Each node works on its assigned
tasks simultaneously, leveraging parallel processing capabilities.

As the nodes process their respective tasks, they exchange information and data with each
other over the network. Once all the tasks are completed, the results are combined to
produce the final weather forecast.
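The divide, distribute, and combine steps described above can be sketched as follows. This is a toy illustration only: worker threads on one machine stand in for cluster nodes, the "simulation" is just an average, and the names `simulate_region`, `split`, and `forecast` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_region(cells):
    """Toy 'model': average the temperature readings of one region."""
    return sum(cells) / len(cells)

def split(data, parts):
    """Divide the data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def forecast(readings, nodes=4):
    """Give one task to each node, run the tasks concurrently, then combine the results."""
    tasks = split(readings, nodes)  # assumes len(readings) >= nodes
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial = list(pool.map(simulate_region, tasks))
    return sum(partial) / len(partial)

readings = [float(t) for t in range(10, 26)]  # 16 made-up temperature readings
print(forecast(readings))
```

On a real cluster the tasks would run on separate machines and exchange data over the network, typically through a framework built for that purpose.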
The use of a cluster allows for faster execution of the weather simulations compared to
running them on a single machine. By harnessing the combined computing power of multiple
nodes, the cluster can handle larger and more complex simulations, providing more accurate
and detailed weather predictions.

Cluster computing is not limited to weather forecasting but is also applied in various domains,
such as bioinformatics, financial modeling, computational chemistry, and many other scientific
and engineering fields that require intensive computational resources.
b. Grid Computing System

A grid computing system is a network of computers or resources that work
together to solve complex computational problems or perform large-scale data
processing tasks.

It enables the sharing and coordination of computing power, storage, and
other resources across multiple locations and organizations.

In a grid computing system, individual computers or resources are
interconnected through a network, forming a virtual supercomputer.

Each resource retains its autonomy and can operate independently, but they
can also collaborate and share their computing power when required.

Grid computing systems are designed to handle computationally intensive
tasks that cannot be easily processed by a single computer or within a single
organization's infrastructure.

They provide a scalable and flexible solution for scientific research,
data analysis, simulations, and other high-performance computing
needs.

The nodes may be different in hardware, software and network
technology.

Every node has access to enormous processing power and storage
capacity.

It works on the principle of pooled resources (sharing the load across
multiple nodes to complete tasks more efficiently and effectively).
Figure: A layered architecture of grid computing
The architecture consists of four layers:
i. Fabric layer: The fabric layer comprises the physical resources that are shared within the grid:
network resources, computational resources, storage systems, sensors,
software modules, and other system resources.
ii. Connectivity layer: The connectivity layer provides the core communication and security
functions: identification, data transfer, routing, and support for secure communication. The
main security requirements are support for login (single sign-on), support for delegation, so
that a program can run and access resources with the user’s authority, and
support for interoperability with local security solutions and rules.
iii. Resource layer: The resource layer builds on the communication and security functions
of the connectivity layer to support operations on individual resources, such as
accounting, monitoring, and computing the total charge for using a resource.
It mainly comprises information protocols and management protocols. Information protocols are
used to obtain information about the structure and state of a particular resource.
Management protocols are used to negotiate access to a resource and act as a
policy application point, ensuring that resource usage is consistent with the
policy under which the resource is to be shared.
iv. Collective layer: The collective layer handles access to multiple resources at once, with
services such as resource discovery, allocation and scheduling of tasks onto multiple
resources, and data replication.
Example of Grid Computing
One example of grid computing is the Large Hadron Collider (LHC) Computing Grid used by
CERN (European Organization for Nuclear Research) for particle physics research.

The LHC is the world's largest and most powerful particle accelerator, located at CERN near
Geneva, on the border between France and Switzerland. It generates vast amounts of data from particle collisions, and analyzing this
data requires enormous computational resources. The LHC Computing Grid is a distributed
computing infrastructure that enables researchers around the world to collaborate and
process this massive amount of data.

The grid consists of thousands of computing resources, including clusters, servers, and
storage systems, distributed across different countries and institutions. These resources are
connected through a high-speed network, allowing for efficient data transfer and processing.

When particle collisions occur in the LHC, detectors capture the resulting data, which is then
divided into smaller chunks called "data sets." These data sets are distributed to various
computing nodes within the grid for processing.
Researchers and scientists from different institutions submit their computing jobs to the grid,
specifying the required resources and the data sets they need to process. The grid scheduler
allocates the appropriate resources and assigns the jobs to available computing nodes.

Each computing node processes its assigned tasks independently, performing data analysis,
simulations, and calculations. Once the processing is complete, the results are collected and
combined for further analysis and interpretation.
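The job submission and scheduling steps above can be sketched as a simple first-fit scheduler. This is an illustrative assumption, not how the LHC Computing Grid actually schedules work; the `Node` and `Job` classes, the site names, and the data set names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    free_cores: int
    datasets: set = field(default_factory=set)

@dataclass
class Job:
    name: str
    cores: int
    dataset: str

def schedule(jobs, nodes):
    """Assign each job to the first node that holds its data set and has enough free cores."""
    assignment = {}
    for job in jobs:
        for node in nodes:
            if job.dataset in node.datasets and node.free_cores >= job.cores:
                node.free_cores -= job.cores   # reserve the cores for this job
                assignment[job.name] = node.name
                break
        else:
            assignment[job.name] = None        # no suitable node; the job stays queued
    return assignment

nodes = [Node("site-a", 8, {"run1"}), Node("site-b", 4, {"run1", "run2"})]
jobs = [Job("j1", 6, "run1"), Job("j2", 4, "run1"), Job("j3", 2, "run2")]
print(schedule(jobs, nodes))
```

Each job states the resources and data set it needs, and the scheduler matches it to a node that can satisfy both; a job with no match waits until resources are released.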

Grid computing is crucial for the LHC research community as it allows scientists worldwide to
access and utilize the computing power required for analyzing the vast amount of data
generated by the particle accelerator. It enables efficient data sharing, collaboration, and
resource utilization, facilitating groundbreaking discoveries in the field of particle physics.

Grid computing is also utilized in other domains, such as astronomy, genomics, drug
discovery, and other scientific and research-intensive areas where large-scale data processing
and collaboration across multiple organizations or institutions are required.
c. Cloud Computing

Cloud computing refers to the on-demand delivery of computing power, databases, storage,
applications, and other IT resources over the internet with pay-as-you-go pricing.

The characteristics of cloud computing are:
i. On-demand self-service:
-All computing services like storage, applications, networking, etc. can be accessed
whenever required and without any interaction with service providers.
-Users or organizations can use the web self-service portal to access the required resources.
ii. Broad network access
- All computing resources offered by cloud servers are available over the network and users
can access them from anywhere and at any time with the help of their devices and an internet
connection.
iii. Resource pooling
-To serve multiple customers, service providers create a pool of resources.
-This pool should be large and flexible enough to meet all the requirements of multiple clients.
-These resources can be assigned and reassigned on the customer’s demand.
iv. Rapid elasticity
-Cloud computing services can be elastically provisioned or released.
-Cloud computing can assign resources to customers when they need them
and release them when they no longer need them.
-The usage, capacity, and cost can be scaled up or down automatically with no additional
contract or penalty.
v. Measured service
-Cloud computing is based on the pay-per-use principle, i.e., users are charged for the resources
that they use.
-A cloud system leverages a metering capability to measure the resources used.
-Measurement helps the service provider allocate resources to customers in the best
possible way.
- Cost is variable and is based on the consumption of resources.
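A small sketch of measured service: the bill is simply metered usage multiplied by unit prices. The resource names and prices below are made up for illustration and do not reflect any real provider's rates.

```python
# Hypothetical unit prices; real providers publish their own rates.
PRICES = {"cpu_hours": 0.05, "storage_gb_month": 0.02, "egress_gb": 0.09}

def bill(usage: dict) -> float:
    """Charge only for metered consumption: quantity used times unit price."""
    return round(sum(PRICES[item] * qty for item, qty in usage.items()), 2)

january = {"cpu_hours": 720, "storage_gb_month": 100, "egress_gb": 50}
february = {"cpu_hours": 200, "storage_gb_month": 100, "egress_gb": 10}
print(bill(january), bill(february))
```

Because the charge follows consumption, using fewer CPU hours in the second month lowers the bill without any change of contract.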
Advantages of Cloud Computing
i. Economical
-No need to buy and maintain expensive IT infrastructure
-Moving to cloud computing provides access to various computing
resources at a low cost.
-Upfront costs are avoided, and administrative and operational costs are reduced.
ii. Universal access
-Cloud computing enables you to work and gain access from anywhere and
any computer or device with just an internet connection.
iii. Scaling
- A business can scale (increase or decrease) performance, functionalities,
and resources of cloud computing as per the requirements.
iv. Collaboration
-Cloud environments enable better collaboration across teams: developers,
QA, operations, security and product architects are all exposed to the same
infrastructure and can operate simultaneously without stepping on each other’s
toes.
v. Ensures backup and recovery of data
-In cloud systems, various copies of data are maintained by service providers on
different machines/nodes so as to provide universal access and recovery of data
in case any failure occurs at any data center or the data from one machine gets
lost.
-The feature of cloud storage provides you with the facility of automatic data
backups and access from any device at any time and any place.
vi. Reliability

In cloud systems, data are stored on multiple nodes. When one or more
nodes fail, the whole system can still work.
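The backup and reliability points above rest on replication, which can be sketched as follows. This is a deliberately simplified model that ignores consistency and recovery of failed nodes; the `Node` and `ReplicatedStore` classes are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name, self.alive, self.data = name, True, {}

class ReplicatedStore:
    """Writes go to every live node; reads succeed if any live node holds the key."""
    def __init__(self, nodes):
        self.nodes = nodes

    def put(self, key, value):
        for node in self.nodes:
            if node.alive:
                node.data[key] = value

    def get(self, key):
        for node in self.nodes:
            if node.alive and key in node.data:
                return node.data[key]
        raise KeyError(key)

nodes = [Node("n1"), Node("n2"), Node("n3")]
store = ReplicatedStore(nodes)
store.put("report", "v1")
nodes[0].alive = False            # simulate the failure of one node
print(store.get("report"))        # still served by a surviving replica
```

The data stays available as long as at least one replica survives; only the loss of every copy makes a read fail.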
Disadvantages of Cloud Computing
i. Dependency on Internet Connectivity:

Cloud computing heavily relies on stable and reliable internet connectivity.

If there is a network outage or internet disruption, it can result in reduced or complete loss of access to
cloud services.

This dependency can be problematic in areas with limited internet infrastructure or during situations
where network connectivity is compromised.
ii. Potential Security and Privacy Risks:

Storing data and running applications in the cloud means entrusting sensitive information to a third-
party service provider.

This can raise concerns about data security and privacy.

Organizations must carefully evaluate the security measures implemented by cloud providers and
ensure that appropriate data protection mechanisms are in place.
iii. Limited Control and Customization:

Cloud computing often involves using shared resources and standardized services provided by the
cloud provider.

This can limit the level of control and customization that organizations have over their computing
environment.

Certain configurations or software requirements may not be supported by the cloud provider's
infrastructure, leading to limitations in tailoring the system to specific needs.
iv. Vendor lock-in :

When transferring services from one vendor to another, organizations may run into issues.

Because different vendors offer different platforms, moving from one cloud to another can be
difficult.
v. Cost Considerations:

While cloud computing offers the potential for cost savings through pay-as-you-go models and
resource scalability, it is essential to carefully manage cloud usage to avoid unexpected costs.

If resources are not properly optimized or if there is excessive demand, cloud expenses can quickly
escalate.

Additionally, long-term or predictable workloads may sometimes be more cost-effective to run on
dedicated on-premises infrastructure.
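A back-of-the-envelope sketch of the last point: comparing pay-as-you-go cost with an on-premises setup that has an upfront cost plus a lower running cost. All figures and function names are hypothetical, and real comparisons involve many more factors (staff, power, depreciation).

```python
def cloud_cost(hours: float, rate_per_hour: float) -> float:
    """Pay-as-you-go: cost grows linearly with usage."""
    return hours * rate_per_hour

def on_prem_cost(hours: float, upfront: float, running_per_hour: float) -> float:
    """On-premises: fixed upfront cost plus a running cost per hour."""
    return upfront + hours * running_per_hour

def break_even_hours(upfront: float, running_per_hour: float, rate_per_hour: float):
    """Hours of use after which on-premises becomes cheaper; None if it never does."""
    saving_per_hour = rate_per_hour - running_per_hour
    if saving_per_hour <= 0:
        return None
    return upfront / saving_per_hour

# Made-up figures: 6000 upfront, 0.25/hour to run on-premises, 0.75/hour in the cloud.
print(break_even_hours(6000, 0.25, 0.75))
```

Below the break-even point the cloud is cheaper; a steady workload that runs well beyond it may cost less on dedicated hardware.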

Difference between cluster, cloud and grid computing
1. Cluster Computing:
Cluster computing involves the use of multiple interconnected computers or servers, referred
to as nodes, that work together to perform a specific task or provide a unified computing
resource. The nodes in a cluster are typically physically close to each other and are tightly
coupled, meaning they have high-speed interconnects and share resources such as memory
and storage. Cluster computing is commonly used for high-performance computing (HPC)
applications, scientific simulations, and parallel processing tasks.

2. Cloud Computing:
Cloud computing refers to the delivery of on-demand computing resources, such as servers,
storage, databases, and software applications, over the internet. It enables users to access
and use computing resources remotely, without the need for local infrastructure or
management. Cloud computing services are typically provided by third-party service
providers, who maintain and manage the underlying infrastructure. Users can scale their
resource usage up or down based on their needs and are billed based on their usage. Cloud
computing offers flexibility, scalability, and cost-effectiveness, making it popular for various
applications ranging from web hosting to data analytics.
3. Grid Computing:
Grid computing involves the coordination and sharing of distributed computing resources
across multiple administrative domains. Unlike cluster computing, where nodes are typically
within a single organization, grid computing aims to utilize resources from multiple
organizations or institutions that may be geographically dispersed. Grid computing focuses
on large-scale resource sharing and collaboration, allowing organizations to leverage idle
resources and increase overall computing power. Grids are often used for complex and data-
intensive applications, such as scientific research, data mining, and distributed simulations.
Types of Cloud Computing
a. Deployment Model

A cloud deployment model represents a specific type of cloud
environment, primarily distinguished by ownership, size, and access.

Types of deployment model:
i. Public Cloud
ii. Private Cloud
iii. Hybrid Cloud
i. Public Cloud

Computing services are provided over the public internet by a third-party
provider (the cloud service provider). They are available to anyone who wants to use
them and may be free or paid.

Public cloud provides a shared platform that is accessible to the general
public through an internet connection. The same storage is used by
multiple users at the same time, i.e., multitenancy.
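A minimal sketch of multitenancy: one shared store serves several tenants, and each object is keyed by tenant so that tenants do not see each other's data. The `SharedStorage` class and tenant names are hypothetical.

```python
class SharedStorage:
    """One physical store shared by all tenants; keys are namespaced per tenant."""
    def __init__(self):
        self._blobs = {}

    def put(self, tenant: str, name: str, data: bytes) -> None:
        self._blobs[(tenant, name)] = data

    def get(self, tenant: str, name: str) -> bytes:
        return self._blobs[(tenant, name)]   # KeyError if this tenant has no such object

    def list_objects(self, tenant: str):
        return sorted(n for (t, n) in self._blobs if t == tenant)

storage = SharedStorage()
storage.put("alice", "notes.txt", b"a")
storage.put("bob", "notes.txt", b"b")
print(storage.list_objects("alice"), storage.get("bob", "notes.txt"))
```

Both tenants use the same storage and even the same object name, yet each one reads back only its own data.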

Public cloud is owned, managed, and operated by businesses, universities,
government organizations, or a combination of them.

Examples: Amazon, Microsoft Azure, IBM’s Blue Cloud, Sun Cloud, and
Google Cloud.
Advantages of Public Cloud

Low Cost: shares the same resources with a large
number of consumers.

Location Independent: services are offered through
the internet

Scalability and reliability: offers scalable (easy to add
and remove) and reliable (24 × 7 available) services
to the users at an affordable cost.
Disadvantages of Public Cloud

Low Security: less secure because resources
are shared publicly

Performance: depends upon the speed of
internet connectivity

Less customizable: less customizable than the
private cloud
ii. Private Cloud

Cloud that is privately owned, managed, and operated by the business or organization

Not open to the public and exclusively owned by one business or organization; also
known as internal or corporate cloud

Infrastructures and services in a private cloud can be accessed only within the
organization

An organization that is using a private cloud is solely responsible for its management,
maintenance, and regular updates

Servers can be physically located on the organization’s premises or can be hosted by third-party
service providers

Examples: Dell, Microsoft, Apache, and OpenStack
Advantages of Private Cloud

Customizable: organizations have more control over resources and hardware than
in public clouds

Security & privacy: a higher level of security, as data is protected
behind a firewall

Improved performance: offers better performance with improved
speed and space capacity
Disadvantages of Private Cloud

High cost: setting up and maintaining hardware resources
is costly

Restricted area of operations: accessible only within the
organization, so the area of operations is limited

Limited scalability: can be scaled only within the capacity
of internally hosted resources
iii. Hybrid Cloud

Hybrid cloud is a combination of public and private clouds

By merging the benefits of private and public cloud services, a hybrid cloud provides the
private cloud’s security and the public cloud’s speed to the organization or business

The main aim of combining these clouds is to create a unified, automated, and well-managed
computing environment

In the Hybrid cloud:
-non-critical activities −performed by→ public cloud
-critical activities −performed by→ private cloud
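The placement rule above can be sketched as a small routing function. The workload names are made up, and sending unlabeled workloads to the private cloud is a conservative assumption of this sketch, not something the rule itself specifies.

```python
def place(workload: dict) -> str:
    """Route critical workloads to the private cloud and non-critical ones to the public cloud."""
    # Assumption: a workload with no 'critical' label is treated as critical.
    return "private" if workload.get("critical", True) else "public"

workloads = [
    {"name": "patient-records", "critical": True},
    {"name": "marketing-site", "critical": False},
    {"name": "batch-report"},   # unlabeled
]
placement = {w["name"]: place(w) for w in workloads}
print(placement)
```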

Hybrid clouds are mainly used in finance, healthcare, and universities

Examples: Amazon, Microsoft, Google, Cisco, IBM, and NetApp
Advantages of Hybrid Cloud

Flexible and secure: provides flexible resources (public cloud)
and secure resources (private cloud)

Cost effective: costs less than the private cloud. Also helps
organizations to save costs for both infrastructure and application
support

Optimize Workload Resources: Process complex workloads in
the public cloud where additional capacity is low-cost and easy to
access, but keep your simpler workloads in private cloud
infrastructure
Disadvantages of Hybrid Cloud

Networking issues: networking becomes complex because
both the private and the public cloud are involved

Infrastructure compatibility: there are two levels
of infrastructure; the company controls the private
cloud but not the public cloud
b. Service Model

A specific, pre-packaged combination of IT
resources offered by a cloud provider

Types of cloud service (delivery) model:
i. Software-as-a-Service (SaaS)
ii. Platform-as-a-Service (PaaS)
iii. Infrastructure-as-a-Service (IaaS)
i. Software as a Service (SaaS)

A way of delivering services and applications over the
internet, also known as "on-demand software"

E.g.: Dropbox (file sharing), Gmail (sending and
receiving email)

Works on a shared model, i.e., a multitenancy
environment

E.g.: Google Drive, Office 365, Google Play
Characteristics of SaaS

Hosted and maintained by the cloud service provider

Users are not responsible for hardware and software updates

Updates are applied automatically

The services are purchased on a pay-per-use basis

On-demand availability

Easily scalable as per need

Works on a shared model
Benefits of SaaS

Easily accessible: SaaS applications can be
accessed from anywhere at any time,
across operating systems

Efficient use of software license

Multi-tenancy

Centralized management
Issues of SaaS

Limited customization: Most SaaS applications offer little in the way of
customization from the vendor

Browser-based risks: If a user visits a malicious website, malicious
code may attach itself to the browser. If the same browser then accesses a SaaS
application in the cloud, that application may be infected by the malicious code

Network dependence: The SaaS model is based on web delivery; if your
internet service fails, you will lose access to your software or data

Portability issues: Transferring an application from
one SaaS cloud to another SaaS cloud can be difficult
ii. Platform as a Service

Allows organizations to build, run and manage applications
without managing the underlying IT infrastructure

Allows programmers to easily create, test, run, and deploy web
applications on the runtime environment

PaaS delivers a framework for developers and IT architects to
create web or mobile apps that are scalable, without worrying
about setting up or managing the underlying infrastructure of
servers, storage, network, and databases needed for
development
Characteristics of PaaS

Secured and scalable web services

Easy workflow and approval process

Easy integration with other applications on the
same cloud
Benefits of PaaS

Cost: Customers do not need to purchase hardware and software

Scalability: PaaS can be seen as a great scalability solution, as it delivers an
environment with highly scalable spaces, tools, and resources

Availability & Mobility: PaaS makes it possible to centralize team communication
in a single environment, forming a unified communication structure (without
losing sync, even if they are in different locations). This can help solve problems
faster and bring agility to the company’s activities.

Boosts productivity: PaaS allows you to develop and implement new applications
without the need to spend time creating your own work environment. This can
speed up the application development, testing, and delivery cycle.

Updates are applied automatically

Less administrative overhead
Issues of PaaS

Security risk: The provider’s cloud database
houses all of the application data. Since the
provider can see private and sensitive information,
this raises concerns about confidentiality

Integration problems: Each PaaS provider has its own
integration method, which leads to compatibility problems.
Merging two PaaS products may not be possible

The PaaS model comprises the following services:
-Platform: OS and Middleware
-Infrastructure: Servers, Storage, Network, Security

Examples of PaaS include Google App Engine,
Microsoft Azure, Red Hat’s OpenShift Platform, and
OpenStack
iii. Infrastructure as a Service

A cloud service model that provides pay-as-you-go or pay-per-use access to
computing resources such as storage, databases, servers, and networking

Users don’t have to deal with expenses incurred in buying and maintaining physical
servers and other data center resources.

Businesses and organizations rely on the vendor’s infrastructure to build software
applications on their own platforms.

IaaS model provides the following services to businesses or organizations:
-Servers
-Storage
-Network
-Security

Examples are Amazon Web Services (AWS) and Google Compute Engine (GCE)
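The three service models can be compared by who manages which layer. The sketch below encodes the service lists given above (infrastructure for IaaS, infrastructure plus platform for PaaS) and assumes, for illustration, that SaaS adds the application itself on top; the layer names and the `customer_manages` helper are hypothetical.

```python
INFRASTRUCTURE = ["servers", "storage", "network", "security"]
PLATFORM = ["os", "middleware"]
APPLICATION = ["application"]

# What the provider manages under each service model.
PROVIDER_MANAGES = {
    "IaaS": INFRASTRUCTURE,
    "PaaS": INFRASTRUCTURE + PLATFORM,
    "SaaS": INFRASTRUCTURE + PLATFORM + APPLICATION,
}

def customer_manages(model: str):
    """Whatever the provider does not manage is left to the customer."""
    everything = INFRASTRUCTURE + PLATFORM + APPLICATION
    return [layer for layer in everything if layer not in PROVIDER_MANAGES[model]]

for model in ("IaaS", "PaaS", "SaaS"):
    print(model, "->", customer_manages(model))
```

Moving from IaaS to PaaS to SaaS, the provider takes over more layers and the customer is left with less to manage.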
2. Distributed Information System

The aim of a distributed information system is to distribute
information across several servers.

Remote processes called clients access the servers to
manipulate the information.

Different communication models are used to serve this purpose.

The most common are remote procedure call (RPC) and remote method invocation (RMI).

Example: Transaction Processing Systems, Enterprise
Application Integration.
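A minimal sketch of the RPC idea: a client stub marshals a call into a message, the server unmarshals it, runs the procedure, and sends back a reply. To keep the sketch self-contained, the network transport is replaced by a direct function call, and the procedures, message format, and account data are all hypothetical.

```python
import json

# Server side: procedures that remote clients may call (hypothetical examples).
PROCEDURES = {
    "add": lambda a, b: a + b,
    "balance": lambda account: {"acc-1": 250}.get(account, 0),
}

def handle_request(raw: str) -> str:
    """Server dispatcher: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(raw)
    proc = PROCEDURES.get(request["method"])
    if proc is None:
        return json.dumps({"error": "unknown method"})
    return json.dumps({"result": proc(*request["params"])})

def call(method: str, *params):
    """Client stub: marshal the call, 'send' it, and unmarshal the reply."""
    raw_reply = handle_request(json.dumps({"method": method, "params": list(params)}))
    reply = json.loads(raw_reply)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]

print(call("add", 2, 3), call("balance", "acc-1"))
```

To the caller, `call("add", 2, 3)` looks like a local function call, which is the main point of RPC; in a real system the message would travel over the network to a server process.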
3. Distributed Pervasive System

In a distributed pervasive system, devices such as
smartphones, wearables, smart appliances, and IoT (Internet
of Things) devices communicate with each other and the cloud
to gather data, exchange information, and perform tasks.

These systems leverage technologies like wireless
communication protocols, cloud computing, edge computing,
and sensor networks to enable pervasive computing
scenarios.
Differences between centralized and distributed
systems
Centralized System | Distributed System
------------------ | ------------------
All computation is done on one particular computer. | Computation is distributed across multiple computers.
There is a single point of failure. | There is no single point of failure.
They have a global state (a single, central location or entity stores and manages the complete state information of the system). | They do not have a global state.
Characteristics: presence of a global clock, one single central unit, and dependent failure of components. | Characteristics: concurrency of components, lack of a global clock, and independent failure of components.
Advantages: easy to physically secure, smooth and elegant personal experience, dedicated resources, and quick updates. | Advantages: higher performance than a centralized system, higher reliability, and easier sharing of data/resources.
