
6CS005 High Performance Computing

Lecture 3
Parallel Computing
Contents
• Parallel Computing Overview
• Key Components of Parallel Computing
• Serial and Parallel Computing
• Sequential and Parallel Programming
• Relationship Between Tasks
• Classification of Computing Systems: Flynn’s Classification
• Enhancing Computational Efficiency: Key Objectives
• Classification of Computer Architecture by Memory Organization
• Heterogeneous Computing
• Heterogeneous Architecture
Parallel Computing
Overview

• The primary goal of parallel computing is to improve
the speed of computation by performing many
calculations simultaneously
• Definition:
– Calculation perspective: Large problems are divided
into smaller tasks, solved concurrently
– Programmer perspective: Concurrent tasks are
mapped onto multiple computing resources (cores or
computers) for parallel execution
• Parallel computing involves the close integration
of hardware and software to achieve efficient
performance in solving complex problems
Key Components of Parallel
Computing
• Hardware (Computer Architecture)
– Focuses on supporting parallelism at the architectural
level
• Software (Parallel Programming)
– Focuses on solving a problem concurrently by fully
using the computational power of the computer
architecture
• Note: To enable parallel execution in software,
the hardware must support concurrent execution
of multiple processes or threads.
Hardware (Computer Architecture)

• Most modern processors use a modified Harvard
architecture (separate instruction and data caches)
rather than the classic Von Neumann architecture

Harvard Architecture vs. Von Neumann Architecture


Contd…
• CPU(Core):
– The primary component for processing
tasks
– Early computers operated with a single
core on a chip, known as uniprocessor
– Modern chip designs integrate multiple
cores (individual processing units within a
processor) into a single processor, referred
to as a multicore architecture.
– This design supports parallelism,
allowing multiple tasks to be processed
simultaneously
– Example: a quad-core processor (e.g., Intel Core
i7-7700 or AMD Ryzen 7) has four cores
(Core 0, Core 1, Core 2, Core 3) that can
run separate threads simultaneously

Core Processor Architecture

DIMM (Dual Inline Memory Module): memory modules that supply the main memory in a core
processor architecture; a type of volatile memory
Level 3 Cache: cache memory, typically shared by the cores, that improves CPU performance
by storing frequently accessed data and instructions
Can a quad-core processor
handle four tasks at the same
time?
• The ability of a quad-core processor to handle four tasks simultaneously is
not automatic; it depends on several factors:
• Multithreading:
– Single-Threaded Applications:
• If an application is single-threaded, it can only use one core at a
time, even on a quad-core processor, activating only one core for
that task.
– Multi-Threaded Applications:
• Multi-threaded applications can utilize multiple cores by splitting
workloads into separate threads, enabling concurrent execution. A
well-optimized multi-threaded application can effectively use all
four cores of a quad-core processor.
• Operating System and Task Management
– The operating system manages tasks and allocates them to available
cores, distributing processes or threads based on their availability and
workload.
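The distinction above can be sketched in Python. This is a minimal illustration of structure only: the work function and job sizes are made up, and in CPython the GIL limits CPU-bound threads, so a ProcessPoolExecutor would be needed to truly occupy four cores at once.

```python
from concurrent.futures import ThreadPoolExecutor

def work(n):
    # Hypothetical CPU-bound unit of work.
    return sum(range(n))

jobs = [100_000] * 4

# Single-threaded style: the four jobs run one after another,
# using only one core at a time.
serial_results = [work(n) for n in jobs]

# Multi-threaded style: four worker threads that the operating
# system may schedule onto separate cores.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded_results = list(pool.map(work, jobs))

# Both orderings compute the same answers; only the scheduling differs.
```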
Serial and Parallel
Computing
Sequential and Parallel
Programming
• Sequential Programming
– Involves writing code where tasks are executed one
after the other
– Each calculation or instruction waits for the previous
one to complete
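A minimal Python sketch of this style, using three hypothetical steps where each one consumes the previous step's result:

```python
def step1():
    return list(range(10))

def step2(data):
    return [x * x for x in data]

def step3(data):
    return sum(data)

data = step1()          # runs first
squares = step2(data)   # must wait for step1
total = step3(squares)  # must wait for step2
```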
Relationship Between Tasks

• Based on execution constraints:


• Sequential/Dependent Tasks:
– Dependent tasks are tasks that have a direct relationship where
the outcome of one task is necessary for the next task to
execute
• Concurrent/Independent Tasks:
– Independent tasks are tasks that can operate on their own
without needing to wait for the results of other tasks
• Understanding data dependencies is crucial for implementing
parallel algorithms, as they are major barriers to achieving
parallelism
• Often, multiple independent chains of dependent tasks provide
the best opportunities for effective parallelization.
Contd…

• Multiple independent chains of dependent tasks


• Example:
• Chain of dependent task:
– Tas1Task2Task3
– In this chain, Task2 can only start once Tasks is finished, and
Task3 can only start after Task2 is completed
• Multiple independent chains:
– Chain 1: TaskA1TaskA2TaskA3
– Chain 2: TaskB1TaskB2TaskB3
– Here, Chain 1 and Chain 2 can run concurrently because the
do not share any dependencies
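A small Python sketch of this pattern: each chain is sequential internally (every step uses the previous step's result), but the two chains share nothing, so they can be submitted concurrently. The arithmetic inside each task is made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_chain(start):
    a = start + 1   # TaskX1
    b = a * 2       # TaskX2 (needs TaskX1's result)
    c = b - 3       # TaskX3 (needs TaskX2's result)
    return c

# The two chains have no shared data, so they may run concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    chain_a = pool.submit(run_chain, 10)    # TaskA1 → TaskA2 → TaskA3
    chain_b = pool.submit(run_chain, 100)   # TaskB1 → TaskB2 → TaskB3
    results = (chain_a.result(), chain_b.result())
```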
Contd…

• Parallel Programming
– Involves writing code that enables multiple tasks to be
executed concurrently, often by splitting a problem into
smaller tasks
– Takes advantage of multi-core or distributed systems to
improve performance
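The idea can be sketched as decomposing one large problem into four independent partial sums. Threads are used here for brevity; because of CPython's GIL, a ProcessPoolExecutor would be required for a real CPU-bound speedup on a multicore machine.

```python
from concurrent.futures import ThreadPoolExecutor

N = 1_000_000

def partial_sum(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))

# Split one large problem into four smaller, independent tasks.
chunks = [(i * N // 4, (i + 1) * N // 4) for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, chunks))

# Same answer as the purely sequential sum(range(N)).
```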
Parallelism Overview

• Parallelism is essential in modern computing


• It improves performance by allowing multiple tasks to run at the same time
• Types of Parallelism:
– Task Parallelism
• Multiple independent tasks or functions running at once
• Tasks or functions are distributed across different CPU cores
• E.g.: database servers (multiple users can query and update data
simultaneously), web browsers (manage different browser tabs),
operating systems (run multiple programs at the same time)
– Data Parallelism
• Many pieces of data processed simultaneously
• Data is divided among multiple cores for faster processing
• E.g.: image processing: each pixel can be processed independently
with the same filter operation such as blur, sharpen, or edge
detection
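The pixel-filter example can be sketched as below. The sharpen function is a made-up stand-in for a real filter; the point is that one operation is applied to many independent data elements, which a pool can spread across cores.

```python
from concurrent.futures import ThreadPoolExecutor

def sharpen(pixel):
    # Hypothetical per-pixel filter: same instruction, different data.
    return min(255, pixel + 50)

pixels = [0, 100, 200, 250]

# Each pixel is independent, so the same filter can run on
# different pixels at the same time.
with ThreadPoolExecutor(max_workers=4) as pool:
    filtered = list(pool.map(sharpen, pixels))
```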
Classification of Computing
Systems: Flynn's Taxonomy
• Computing systems can be classified into four major categories based
on the number of instruction and data streams they can process
simultaneously
• Instruction Stream: A sequence of instructions executed by a processor
• Data Stream: A sequence of data required by an instruction stream
Contd…

• SISD (Single Instruction, Single Data)
– Traditional serial architecture with a single core
– At any time, only one instruction stream is executed, and
operations are performed on one data stream
– Examples: most PCs, single CPU workstations and
mainframes
Contd…

• SIMD (Single Instruction, Multiple Data)
– A type of parallel computer
– Best suited for specialized problems such as image processing
and vector computation
Contd…

• MISD (Multiple Instruction, Single Data)
– Uncommon architecture
– Each core operates on the same data stream but uses a different
instruction stream
Contd…

• MIMD (Multiple Instruction, Multiple Data)
– Advanced parallel architecture
– Multiple cores operate on multiple data streams, each
executing independent instructions
– Multi-core processors and distributed computing systems
Enhancing Computational
Efficiency: Key Objectives
• At the architectural level, significant advancements have been
made to accomplish the following goals:
– Reduce Latency: Minimizing the time taken for an operation
to start and complete
– Enhance Bandwidth: Increasing the volume of data that can
be processed per unit of time
– Increase Throughput: Maximizing the number of operations
executed within a given timeframe
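With assumed toy numbers, the three metrics relate as simple ratios (the figures below are illustrative, not measurements):

```python
# Assumed workload figures, for illustration only.
operations = 1_000        # operations completed
elapsed_s = 2.0           # wall-clock time in seconds
bytes_moved = 8_000_000   # data transferred

latency_s = elapsed_s / operations    # average time per operation
throughput = operations / elapsed_s   # operations per second
bandwidth = bytes_moved / elapsed_s   # bytes per second
```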
Classification of Computer
Architectures by Memory
Organization
• Multi-node with distributed memory
– Many processors, each with local memory, communicate over
a network
– Suitable for clusters
Contd…

• Multiprocessor with Shared Memory


– Multiple processors sharing a common memory space,
enabling direct data access and faster inter-processor
communication
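Threads within one process share an address space, which loosely mirrors processors sharing a common memory: this sketch shows both the direct access and the synchronization that shared memory demands to stay correct.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # Direct access to shared memory, but concurrent writes
        # must be serialized by a lock to avoid lost updates.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```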
Heterogeneous Computing

• Homogeneous Computing
– Involves one or more processors of the same architecture to execute
applications
– In these systems, all processing units are identical and perform the same
tasks in a similar manner
– Developers can write programs that assume all processors behave
identically, leading to simpler software design
• Heterogeneous Computing
– Utilizes a variety of processor architectures to execute applications,
allowing tasks to be assigned to the most suitable architecture for
performance improvements
– Systems can include CPUs, GPUs, FPGAs, and other specialized
processors, each optimized for specific tasks
– Programming for heterogeneous systems can be more complex as
developers must manage different architectures and ensure efficient task
distribution
Heterogeneous Architecture

Terminology:
Host: CPU
Device: GPU
PCIe: Peripheral Component Interconnect Express
End of Lecture 3
