Parallel and Distributed Computing Complete Notes
Parallel Computing:
Focus: Speeding up computations.
Hardware: Single computer with multiple processors (cores) or
tightly-coupled multi-computer systems.
Tasks: A single large task is divided into smaller subtasks that are
executed simultaneously by multiple processors.
Distributed Computing:
Distributed computing involves multiple
autonomous computers that appear as a single system to users,
communicating through message passing without shared memory.
Key Points:
System components are located at different locations.
Uses multiple computers communicating through message passing.
Allows for data sharing, resource sharing, and geographic
flexibility.
Speedup:
Speedup refers to the performance gain achieved when a task
is executed in parallel compared to sequentially. It is calculated as the
ratio of the time taken to complete a task on a single processor to the
time taken on multiple processors. The formula for speedup (S) is:
S = T_serial / T_parallel
Where:
T_serial is the time taken to execute the task on a single
processor.
T_parallel is the time taken to execute the task on multiple
processors.
Amdahl’s Law:
Amdahl’s Law provides a way to estimate the
maximum possible speedup for a task based on the portion of the task
that can be parallelized. It takes into account the fact that not all parts of
a program can be parallelized. The law is expressed as:
S = 1 / ((1 - P) + P / N)
Where P is the fraction of the program that can be parallelized and N is
the number of processors.
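Both formulas can be sketched in C (the function names here are illustrative, not from the notes):

```c
/* Speedup: ratio of serial to parallel execution time. */
double speedup(double t_serial, double t_parallel) {
    return t_serial / t_parallel;
}

/* Amdahl's Law: maximum speedup when a fraction p of the work
 * is parallelizable and n processors are available. */
double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

For example, with 90% of a program parallelizable (P = 0.9), even infinitely many processors cap the speedup at 1 / (1 - 0.9) = 10.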
Architectures of Distributed Systems:
1. Client-Server Architecture:
In this architecture, there are one or more central servers that
provide services to multiple clients.
Examples include web servers serving web pages to clients
over the internet.
2. Peer-to-Peer (P2P) Architecture:
In P2P architecture, individual nodes (peers) act both as
clients and servers, sharing resources directly with other
peers without the need for centralized servers.
Peers collaborate to perform tasks collectively, such as file
sharing in BitTorrent networks.
3. Three-Tier Architecture:
This architecture divides the system into three layers:
presentation layer, application layer, and data layer.
The presentation layer interacts with users, the application
layer contains the business logic, and the data layer manages
data storage and retrieval.
4. Grid Computing Architecture:
Grid computing involves the coordinated use of distributed
resources from multiple administrative domains to solve
large-scale computational problems.
It typically relies on middleware for resource management,
scheduling, and authentication.
5. Cloud Computing Architecture:
It typically involves virtualization technology to create and
manage virtual instances of resources.
Cloud computing architectures may include public, private,
or hybrid clouds, depending on deployment models.
Applications of Multiprocessors:
1. As a uniprocessor, such as single instruction, single data stream
(SISD).
2. As a multiprocessor, such as single instruction, multiple data
stream (SIMD), which is usually used for vector processing.
Advantages:
Improved performance
Better scalability
Increased reliability
Reduced cost
Disadvantages:
Increased complexity
Higher power consumption
Difficult programming
Synchronization issues
Challenges:
Networks of Workstations (NOWs) face challenges such as ensuring
efficient communication, managing load balancing, and maintaining fault
tolerance.
Classification of Cluster:
1. Open Cluster:
Every node has its own IP address, and the nodes are accessed
only through the internet or web. This type of cluster raises greater
security concerns.
2. Closed Cluster:
The nodes are hidden behind a gateway node, which
provides increased protection. They need fewer IP addresses and are
well suited to computational tasks.
8: Software Architectures
Software architectures
in Parallel and Distributed Computing are designed to manage the
complexity of parallel processing and the distribution of computations
across multiple computing nodes.
Parallel Computing:
Parallel computing involves processing multiple
tasks simultaneously on multiple processors, known as parallel
processing.
Distributed Computing:
Distributed computing involves software
components spread over different computers but functioning as a single
entity.
Architecture Types:
Client-Server Computing:
Allows separation, decomposition, and
potential distribution of system and application functionality.
Grid Architecture:
Connects computers that do not fully share
resources, operating more like a computing utility.
Peer-to-Peer Architectures:
Relies on the computing power and bandwidth
of participants without a central server-client structure.
Threads:
A thread is a lightweight unit of execution within a process. A
single process can have multiple threads that share the same memory
space and resources. This allows threads to communicate and access
data efficiently.
Benefits:
Faster execution: By dividing tasks among threads, parallel
programs can leverage multiple cores or processors on a single
machine, leading to faster execution.
Efficient communication: Threads within a process can
directly access and modify shared data, reducing the need for
complex communication mechanisms compared to distributed
systems.
Shared Memory:
Shared memory is a memory space that can be accessed
by all threads within a process. This allows threads to directly read and
write data, facilitating efficient communication and collaboration.
Advantages:
Simplicity: Programming with shared memory offers a simpler
model compared to distributed memory, as threads don't need to
explicitly manage data transfer.
Performance: Direct access to shared memory can lead to
faster data exchange between threads, especially for frequently
accessed data.
Disadvantages:
Scalability: Shared memory systems become complex to
manage with a large number of threads due to increased
synchronization overhead.
Limited scope: Shared memory is restricted to a single
machine, limiting the ability to harness the power of multiple
computers in a distributed network.
Processes:
A process is an instance of a program that is actively running. In
parallel and distributed computing, multiple processes cooperate
on a single task.
Each process has its own private address space, meaning it can
access its own local memory directly.
Message Passing:
Since processes have separate address spaces, they cannot directly
access each other's memory. Message passing is the mechanism by
which processes communicate and exchange data.
Processes send and receive messages through communication
channels.
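As a minimal sketch of this idea, two processes can exchange a message through a POSIX pipe (a simple communication channel within one machine; the function and message here are illustrative, and real distributed systems would use sockets or a messaging library):

```c
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

/* The child process writes a message into the pipe; the parent
 * reads it. Returns the number of bytes received and copies the
 * message into buf (of size len). */
ssize_t exchange_message(char *buf, size_t len) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid == 0) {                      /* child: the sender */
        close(fds[0]);
        const char *msg = "hello from child";
        write(fds[1], msg, strlen(msg) + 1);
        close(fds[1]);
        _exit(0);
    }
    /* parent: the receiver */
    close(fds[1]);
    ssize_t n = read(fds[0], buf, len);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```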
Advantages of DSM (Distributed Shared Memory):
Simplified Programming: DSM hides the complexities of
message passing, making it easier to develop parallel and
distributed applications.
Larger Memory Space: The combined memory of all nodes
becomes accessible, offering a larger virtual memory space for
applications.
Disadvantages of DSM:
Overhead: Maintaining data consistency across multiple nodes
adds overhead compared to local memory access.
Limited Control: Programmers might have less control over
data placement and communication compared to explicit message
passing.
Definition:
Distributed shared data refers to data that is accessible and
modifiable by multiple processing units or nodes in a distributed
computing system.
It allows multiple nodes to share access to the same data structure,
enabling concurrent processing and computation.
Characteristics:
Shared: The data is shared among multiple processing units or
nodes.
Distributed: The data is distributed across different nodes in
the system, rather than being centralized.
Concurrent: Multiple nodes can access and modify the data
simultaneously, allowing for parallel computation.
Challenges:
Consistency
Concurrency Control
Scalability
Fault Tolerance
Applications:
Distributed databases
Cloud computing platforms
High-performance computing clusters
Distributed file systems
2. Distributed Systems:
Streaming Computations
Mobile Edge Computing
Digital Virtual Environment
Wireless Urban Computing
3. Newest Topics:
Real-Time Parallel Computing Plan
MPI Scaling Up for Powerlists
Adaptive Barrier Algorithm in MPI
Parallelizing Machine Learning Optimization Algorithms
Parallel Model Checking based on Pushdown Systems
Types of Parallelism:
There are different types of parallelism:
Thread-level parallelism: Using multiple threads of
execution within a single process to perform tasks concurrently.
Task-level parallelism: Dividing a computational task into
smaller independent tasks that can be executed concurrently.
Data parallelism: Distributing data across multiple processing
units and performing the same operation on different parts of the
data simultaneously.
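Data parallelism can be sketched with Pthreads: the input array is split into per-thread slices, each thread sums its own slice, and the partial sums are combined (the thread count and helper names are illustrative):

```c
#include <pthread.h>

#define NTHREADS 4

struct slice { const int *data; int lo, hi; long sum; };

/* Each thread sums its contiguous slice of the array. */
static void *sum_slice(void *arg) {
    struct slice *s = (struct slice *)arg;
    s->sum = 0;
    for (int i = s->lo; i < s->hi; i++)
        s->sum += s->data[i];
    return NULL;
}

/* Sum n ints by giving each of NTHREADS threads one slice. */
long parallel_sum(const int *data, int n) {
    pthread_t tid[NTHREADS];
    struct slice s[NTHREADS];
    int chunk = (n + NTHREADS - 1) / NTHREADS;

    for (int t = 0; t < NTHREADS; t++) {
        s[t].data = data;
        s[t].lo = t * chunk;
        s[t].hi = (t + 1) * chunk < n ? (t + 1) * chunk : n;
        if (s[t].lo > n) s[t].lo = n;
        pthread_create(&tid[t], NULL, sum_slice, &s[t]);
    }
    long total = 0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);   /* wait, then combine */
        total += s[t].sum;
    }
    return total;
}
```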
Concurrency:
Concurrency refers to the ability of a system to execute
multiple computations or tasks simultaneously.
Synchronization:
Synchronization is the coordination of concurrent
processes to ensure they execute in a controlled manner and maintain
consistency.
Data Partitioning:
Data partitioning involves splitting a large dataset
into smaller subsets that can be processed in parallel.
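A common partitioning scheme splits n items across p workers as evenly as possible, giving the first n mod p workers one extra item; a sketch (names are illustrative):

```c
/* Compute the half-open range [*lo, *hi) of items owned by
 * worker `rank` when n items are split across p workers as
 * evenly as possible (the first n % p workers get one extra). */
void partition(long n, int p, int rank, long *lo, long *hi) {
    long base  = n / p;     /* items every worker gets  */
    long extra = n % p;     /* leftover items to spread */
    if (rank < extra) {
        *lo = rank * (base + 1);
        *hi = *lo + base + 1;
    } else {
        *lo = extra * (base + 1) + (rank - extra) * base;
        *hi = *lo + base;
    }
}
```

With n = 10 and p = 4, the ranges are [0,3), [3,6), [6,8), [8,10): no worker differs from another by more than one item.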
Work Partitioning:
Work partitioning refers to the division of
computational tasks among multiple processors or nodes in a parallel or
distributed system. The objective is to ensure that each processing unit
has an equal amount of work, maximizing the efficiency of the system.
Parallelization Strategies:
Parallelization is a fundamental technique
in high-performance computing that aims to divide a computational task
into smaller, independent subtasks that can be executed concurrently on
multiple processors or computers.
3. Task Parallelism:
In task parallelism, a computational task is
broken down into smaller, independent subtasks that can be executed
concurrently.
Advantages:
o Allows for fine-grained control over task scheduling and
execution.
Disadvantages:
o May involve overhead in creating and managing tasks,
especially for large numbers of small tasks.
18: Granularity
Granularity in parallel
and distributed computing refers to the size of tasks into which a
computation is divided to be processed concurrently.
Granularity Levels:
Fine-Grained: Tasks are very small, and the computation is
broken down into many tiny pieces.
Coarse-Grained: Larger tasks are processed, which means
less frequent communication and synchronization between tasks.
Impact of Granularity:
Load Balancing: Fine-grained parallelism can lead to better
load balancing across processors but may incur higher
communication overhead.
Performance: Coarse-grained tasks might lead to
underutilization of resources if the workload is not evenly
distributed.
Challenges:
As tasks become finer, the cost of communication can outweigh
the benefits of parallelism.
Ensuring that tasks are properly synchronized, especially in fine-
grained systems, can be complex.
Application availability:
Server failure or maintenance can
increase application downtime, making your application unavailable to
visitors. Load balancers reduce this risk by letting you:
Run application server maintenance or upgrades without
application downtime.
Provide automatic disaster recovery to backup sites.
Application scalability:
You can use load balancers to direct
network traffic intelligently among multiple servers.
Prevents traffic bottlenecks at any one server.
Predicts application traffic so that you can add or remove different
servers, if needed.
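The simplest policy a load balancer can use to direct traffic among servers is round-robin, rotating through the pool on each request; a minimal sketch (the struct and names are illustrative, and real load balancers also weigh health checks and current load):

```c
/* Round-robin selection: each call returns the next server
 * index in the pool, wrapping around at the end. */
struct balancer { int nservers; int next; };

int pick_server(struct balancer *b) {
    int chosen = b->next;
    b->next = (b->next + 1) % b->nservers;
    return chosen;
}
```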
Application security:
Load balancers come with built-in security
features to add another layer of security to your internet applications.
Monitor traffic and block malicious content.
Application performance:
Load balancers improve application
performance by reducing response time and network latency.
Distribute the load evenly between servers to improve application
performance.
Ensure the reliability and performance of physical and virtual
computing resources.
23: Threads
24: Pthreads
Pthreads (POSIX threads) is a standardized C API for creating and
managing threads on POSIX-compliant operating systems.
Benefits of Pthreads:
Improved performance: By dividing tasks into smaller,
concurrent threads, Pthreads can leverage multi-core or multi-
processor systems to execute computations faster.
Efficient resource utilization: Threads are lightweight
compared to processes, so creating and managing them incurs less
overhead.
Portability: Pthreads are a standard API, making code written
for one POSIX-compliant system (like Linux, macOS) often
portable to others.
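A minimal Pthreads sketch: spawn a thread, pass it an argument through a struct, and collect its result after join (the squaring task is illustrative):

```c
#include <pthread.h>

/* Each thread squares its input; the result comes back
 * through the same struct. */
struct task { int input; int result; };

static void *square(void *arg) {
    struct task *t = (struct task *)arg;
    t->result = t->input * t->input;
    return NULL;
}

int run_square_in_thread(int x) {
    struct task t = { x, 0 };
    pthread_t tid;
    pthread_create(&tid, NULL, square, &t);  /* start the thread   */
    pthread_join(tid, NULL);                 /* wait for it to end */
    return t.result;
}
```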
Locks:
Locks are mechanisms that allow only one thread or process
to access a critical section of code or data at a time, preventing
interference and conflicts.
Types:
Mutex: A binary lock that can be either locked or unlocked by a
single thread or process.
Read-Write Lock: Allows multiple threads to read data
concurrently but only one thread to write exclusively.
Drawbacks:
Spinlocks can waste CPU cycles and increase power consumption.
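A mutex can be sketched with Pthreads: several threads increment one shared counter, and the lock makes each increment a critical section so no updates are lost (the thread and iteration counts are illustrative):

```c
#include <pthread.h>

#define NTHREADS 4
#define ITERS    100000

static long counter;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock);    /* enter critical section */
        counter++;
        pthread_mutex_unlock(&lock);  /* leave critical section */
    }
    return NULL;
}

/* Run NTHREADS incrementing threads; with the mutex the final
 * count is exactly NTHREADS * ITERS. */
long locked_count(void) {
    pthread_t tid[NTHREADS];
    counter = 0;
    for (int t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, bump, NULL);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
    return counter;
}
```

Without the lock, concurrent `counter++` operations can interleave and the final count would typically fall short of the expected total.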
Semaphores:
Semaphores are synchronization primitives used to
control access to shared resources by multiple threads or processes.
Types:
Binary Semaphore: Acts as a mutex, allowing only one
thread to access a resource.
Counting Semaphore: Allows a specified number of threads
to access a resource simultaneously.
Benefits:
Prevents deadlock and starvation when used correctly.
Enables controlled access to shared resources in parallel
environments.
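A counting semaphore can be sketched with POSIX sem_t: the semaphore is initialized to 2, so at most two worker threads can be inside the guarded region at once (the thread counts and names are illustrative):

```c
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

#define NTHREADS 6

static sem_t slots;                 /* counting semaphore, value 2 */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int active, max_active;

static void *worker(void *arg) {
    (void)arg;
    sem_wait(&slots);               /* acquire one of the 2 slots */
    pthread_mutex_lock(&m);
    if (++active > max_active) max_active = active;
    pthread_mutex_unlock(&m);

    usleep(1000);                   /* simulate work in the region */

    pthread_mutex_lock(&m);
    active--;
    pthread_mutex_unlock(&m);
    sem_post(&slots);               /* release the slot */
    return NULL;
}

/* Returns the peak number of threads observed inside the
 * guarded region; the semaphore guarantees it never exceeds 2. */
int run_guarded_workers(void) {
    pthread_t tid[NTHREADS];
    sem_init(&slots, 0, 2);
    active = max_active = 0;
    for (int t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, worker, NULL);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
    sem_destroy(&slots);
    return max_active;
}
```

Initializing the semaphore to 1 instead would make it a binary semaphore, behaving like the mutex described above.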
Advantages:
Scalability
Flexibility
Challenges:
Complexity
Performance
28: MPI
MPI stands for Message
Passing Interface. It's a widely used, standardized library that facilitates
communication and collaboration between processes in parallel and
distributed computing environments.
29: PVM
Parallel Virtual Machine (PVM)
is a software tool designed for parallel networking of computers,
allowing a network of heterogeneous Unix and/or Windows machines
to function as a single distributed parallel processor.
31: Aurora
There are two main things
that "Aurora" can refer to in parallel and distributed computing:
Abstract Data Types (ADTs):
Benefits:
o Abstraction: Hides implementation details, making code easier
to understand and modify.
o Data Integrity: Enforces allowed operations, preventing
accidental data corruption.
o Security: By controlling access to data operations, ADTs can
help prevent unauthorized modifications.
Examples:
Distributed Arrays
Distributed Hash Tables (DHTs)
Parallel Queues
Scoped Behavior:
Scoped behavior is a programming technique that allows for fine-
grained control over how data objects are shared and accessed in a
parallel or distributed system.
Benefits:
o Performance Optimization: Scoped behavior allows
programmers to tailor communication patterns to specific program
needs, potentially improving performance.
o Reduced Communication Overhead: By optimizing
communication patterns, scoped behavior can help reduce the
amount of data that needs to be sent between processes, leading to
performance gains.
Parallel Systems:
SD-Clouds: Utilizing cloud computing for parallel processing.
Multi-Core and Multi-Processor: Exploring systems with
multiple cores and processors for parallel computing.
Hadoop Files: Implementing parallel processing using the
Hadoop framework.
Distributed Systems:
Streaming Computations: Processing data streams in a
distributed manner.
Mobile Edge Computing: Computing at the edge of
networks for distributed systems.
Digital Virtual Environment: Creating virtual
environments for distributed computing.