Parallel Programming Syllabus
Course Content
Unit I
A Short History of Supercomputing: Von Neumann Architecture, Cray, Multinode Computing, NVIDIA
and CUDA, Alternatives to CUDA, Types of Parallelism.
Pedagogy/Course delivery tools: Chalk and talk, PowerPoint presentation, videos.
Links: https://onlinecourses.nptel.ac.in/noc20_cs92/preview
Unit II
GPUs and the History of GPU Computing: Flynn’s Taxonomy, Some Common Parallel
Patterns, Reduced Instruction Set Computers, Multiple-Core Processors, Vector Processors, Limits
to Parallelizability, Amdahl’s Law on Parallelism.
Pedagogy/Course delivery tools: Chalk and talk, PowerPoint presentation, videos.
Links: https://onlinecourses.nptel.ac.in/noc20_cs92/preview
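For orientation on the Amdahl’s law topic in this unit, the usual textbook statement is sketched below. The symbols are assumptions introduced here and do not come from the syllabus: p is the fraction of the run time that can be parallelized, N is the number of processors, and S(N) is the resulting speedup.

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

As an illustrative case, p = 0.9 and N = 10 give S = 1 / (0.1 + 0.09) ≈ 5.26, and no number of processors can push the speedup beyond 1 / (1 - 0.9) = 10.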
Unit III
Introduction: GPUs as Parallel Computers, Architecture of a Model GPU, Why More Speed or
Parallelism? GPU Computing. Introduction to CUDA: Data Parallelism, CUDA Program Structure, A
Vector Addition Kernel, Device Global Memory and Data Transfer, Kernel Functions and Threading.
Pedagogy/Course delivery tools: Chalk and talk, PowerPoint presentation, videos.
Links: https://onlinecourses.nptel.ac.in/noc20_cs92/preview
Unit IV
CUDA Threads: CUDA Thread Organization, Mapping Threads To Multidimensional Data,
Synchronization and Transparent Scalability, Assigning Resources to Blocks, Thread Scheduling and
Latency Tolerance.
Pedagogy/Course delivery tools: Chalk and talk, PowerPoint presentation, videos.
Links: https://www.youtube.com/watch?v=xDtitNlLByQ
Unit V
Implementation of Algorithms in CUDA: Matrix-Matrix Multiplication, a program to implement
sorting using CUDA, a program to calculate a histogram using CUDA, a program to create threads using
the default stream in CUDA. CUDA for Deep Learning: A Case Study.
Pedagogy/Course delivery tools: Chalk and talk, PowerPoint presentation, videos.
Links: https://www.youtube.com/watch?v=IiKhXC6NFDg
Laboratory Session:
1. OpenMP parallel programs using #pragma directives in C.
2. OpenMP parallel programs using #pragma directives with work-sharing constructs in C.
3. OpenMP programs using constructs such as omp for and omp single.
4. OpenMP programs on parallel constructs.
5. OpenMP programs on the task construct.
6. OpenMP programs using threadprivate directives.
7. OpenMP programs using threadprivate directives.
8. OpenMP programs on thread scheduling.
9. OpenMP programs using the lastprivate, reduction, copyin and shared clauses.
10. Programs for point-to-point MPI calls.
11. Programs for message-passing MPI calls.
12. CUDA programs on message passing.
13. CUDA programs on broadcasting.
14. Graph processing with GPU.
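As an illustration of the kind of program the OpenMP sessions above call for, here is a minimal sketch in C that combines a work-sharing loop with a reduction clause. The function name parallel_sum and its parameters are hypothetical and are not prescribed by the syllabus. The sketch assumes a compiler with OpenMP support (for example, GCC with -fopenmp). Without that flag the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Hypothetical helper: sum an integer array using an OpenMP
 * work-sharing loop. Each thread accumulates a private copy of
 * `total`; the reduction clause combines the copies when the loop
 * ends, so no explicit locking is needed. */
long parallel_sum(const int *data, size_t n)
{
    long total = 0;

    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; ++i) {
        total += data[i];
    }

    return total;
}
```

Calling parallel_sum on the array {1, 2, 3, 4, 5} with n = 5 returns 15 regardless of the number of threads, because integer addition is associative and the reduction combines the per-thread partial sums.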
Suggested Learning Resources
Textbooks:
1. Introduction to Parallel Computing by Ananth Grama, Pearson Education, second
edition, 2003.
2. CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs by Shane Cook,
Morgan Kaufmann, 2013, ISBN: 978-0-12-415933-4.
Reference:
1. GPU Parallel Program Development Using CUDA by Tolga Soyata, CRC Press, 2018.