Parallel Computing Lab 4
EXPERIMENT NO. 4
Lab Title: Introduction to Parallel Programming with CUDA C: Exploring 2D Operations in CUDA C
LAB ASSESSMENT:
• Ability to Conduct Experiment
• Data Presentation
• Experiment Results
• Conclusion
Objective:
Implement and analyze various 2D array/matrix operations in CUDA C.
Introduction:
The aim of this lab was to explore parallel computing techniques by implementing basic 2D
matrix operations using CUDA C. These operations include matrix addition, matrix
multiplication, matrix transposition, and scalar multiplication. CUDA (Compute Unified
Device Architecture) provides a platform for parallel computing on NVIDIA GPUs, allowing
developers to write code that exploits data-level parallelism for large datasets, such as 2D
matrices.
Experiment Setup:
• Software: CUDA toolkit, NVIDIA CUDA Compiler (NVCC), C/C++ for code
implementation.
• Hardware: A machine with an NVIDIA GPU compatible with CUDA.
Each operation was implemented on a 16x16 matrix, using a block size of 16x16 threads.
This configuration allowed one thread to compute one element of the matrix.
Matrix Addition:
The task is to add two 2D matrices element-wise. Each thread computes the sum for one element of
the resulting matrix.
Key steps:
1. Matrices A and B are initialized on the host.
2. Memory is allocated on the device, and data is transferred from the host to the
device.
3. A CUDA kernel is launched, where each thread adds corresponding elements of
matrices A and B.
4. The result is copied back from the device to the host.
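A minimal, self-contained sketch of this flow is given below. The matrix size N, the kernel name matAdd, and the initialization values are illustrative assumptions, not taken from the original lab code:

#include <cuda_runtime.h>
#include <stdio.h>

#define N 16  // matrix dimension (assumed, matching the 16x16 setup above)

// Each thread adds one pair of corresponding elements.
__global__ void matAdd(const float *A, const float *B, float *C) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < N && col < N)
        C[row * N + col] = A[row * N + col] + B[row * N + col];
}

int main(void) {
    size_t bytes = N * N * sizeof(float);
    float hA[N * N], hB[N * N], hC[N * N];
    for (int i = 0; i < N * N; i++) { hA[i] = 1.0f; hB[i] = 2.0f; }  // sample data

    float *dA, *dB, *dC;
    cudaMalloc((void **)&dA, bytes);
    cudaMalloc((void **)&dB, bytes);
    cudaMalloc((void **)&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);  // one 16x16 block covers the whole matrix
    dim3 grid(1, 1);
    matAdd<<<grid, block>>>(dA, dB, dC);
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    printf("C[0][0] = %.1f\n", hC[0]);  // expect 3.0
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}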
Matrix Multiplication:
Matrix multiplication involves computing the dot product of the rows of the first matrix with
the columns of the second matrix.
Key steps:
1. Each thread computes the value of one element in the resulting matrix.
2. For each thread, the dot product of one row of matrix A and one column of matrix B
is computed and assigned to the corresponding element of the result matrix C.
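A kernel sketch for this computation, under the same assumptions as above (square n x n matrices in row-major order; the name matMul is illustrative):

// Each thread computes one element C[row][col] as the dot product
// of row `row` of A and column `col` of B.
__global__ void matMul(const float *A, const float *B, float *C, int n) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; k++)
            sum += A[row * n + k] * B[k * n + col];
        C[row * n + col] = sum;
    }
}

The host-side setup (allocation, copies, launch configuration) mirrors the addition example above.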
Matrix Transposition:
Matrix transposition involves switching the rows and columns of a matrix. In this case, each
thread transposes one element of the matrix.
Key steps:
1. A CUDA kernel is launched in which each thread swaps the row and column indices
of its element to transpose the matrix.
2. Every element A[i][j] is assigned to B[j][i].
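A corresponding kernel sketch (the name matTranspose is assumed):

// Each thread writes its element to the swapped position: B[j][i] = A[i][j].
__global__ void matTranspose(const float *A, float *B, int n) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // j
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // i
    if (row < n && col < n)
        B[col * n + row] = A[row * n + col];
}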
Scalar Multiplication:
Scalar multiplication involves multiplying each element of a matrix by a constant scalar
value.
Key steps:
1. Each thread multiplies one element of matrix A by a scalar k.
2. The result is stored in matrix C.
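A kernel sketch for this operation (the name scalarMul is assumed; k is passed as a kernel argument):

// Each thread scales one element of A by k and stores it in C.
__global__ void scalarMul(const float *A, float *C, float k, int n) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < n && col < n)
        C[row * n + col] = k * A[row * n + col];
}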
Performance Considerations:
For all of the operations:
1. Thread Management: The grid and block dimensions were chosen to optimize the
number of threads per block, ensuring efficient parallelism.
2. Memory Transfer: Efficient transfer of data between host and device is crucial.
Using pinned memory or memory pools may further optimize this (see the sketch after this list).
3. Thread Synchronization: No explicit synchronization is required in these operations
since each thread works independently on separate elements of the matrix.
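To illustrate the memory-transfer point, the sketch below swaps a pageable host buffer for pinned (page-locked) memory via cudaMallocHost. This is an assumption-level example, not part of the original lab code:

#include <cuda_runtime.h>

// Pinned (page-locked) host memory usually transfers to/from the
// device faster than pageable memory and enables asynchronous copies.
void copyWithPinnedMemory(float *dA, size_t bytes) {
    float *hA = NULL;
    cudaMallocHost((void **)&hA, bytes);  // page-locked allocation
    // ... fill hA with input data ...
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaFreeHost(hA);                     // must pair with cudaMallocHost
}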
Lab Tasks:
Code and Output:
Task 2: