Parallel Computing Terminology
• Save time and/or money: In theory, throwing more resources at a task will
shorten its time to completion, with potential cost savings. Parallel clusters can be
built from cheap, commodity components.
• Solve larger problems: Many problems are so large and/or complex that it is
impractical or impossible to solve them on a single computer, especially given
limited computer memory. For example:
o "Grand Challenge" (en.wikipedia.org/wiki/Grand_Challenge) problems
requiring PetaFLOPS and PetaBytes of computing resources.
o Web search engines/databases processing millions of transactions per
second
• Provide concurrency: A single compute resource can only do one thing at a time.
Multiple computing resources can be doing many things simultaneously. For
example, the Access Grid (www.accessgrid.org) provides a global collaboration
network where people from around the world can meet and conduct work
"virtually".
• Use of non-local resources: Using compute resources on a wide area network, or
even the Internet when local compute resources are scarce. For example:
o SETI@home (setiathome.berkeley.edu) uses over 330,000 computers for a
combined compute power of over 528 TeraFLOPS (as of August 4, 2008)
o Folding@home (folding.stanford.edu) uses over 340,000 computers for a
compute power of 4.2 PetaFLOPS (as of November 4, 2008)
• Limits to serial computing: Both physical and practical reasons pose significant
constraints to simply building ever faster serial computers:
o Transmission speeds - the speed of a serial computer is directly dependent
upon how fast data can move through hardware. Absolute limits are the
speed of light (30 cm/nanosecond) and the transmission limit of copper
wire (9 cm/nanosecond). Increasing speeds necessitate increasing
proximity of processing elements.
o Limits to miniaturization - processor technology is allowing an increasing
number of transistors to be placed on a chip. However, even with
molecular or atomic-level components, a limit will be reached on how
small components can be.
o Economic limitations - it is increasingly expensive to make a single
processor faster. Using a larger number of moderately fast commodity
processors to achieve the same (or better) performance is less expensive.
The von Neumann Architecture
• Named after the Hungarian mathematician John von Neumann, who first authored
the general requirements for an electronic computer in his 1945 papers.
• Since then, virtually all computers have followed this basic design, which differed
from earlier computers programmed through "hard wiring".
o Comprised of four main components:
Memory
Control Unit
Arithmetic Logic Unit
Input/Output
o Read/write, random access memory is used to store both program
instructions and data
Program instructions are coded data which tell the computer to do
something
Data is simply information to be used by the program
o Control unit fetches instructions/data from memory, decodes the
instructions and then sequentially coordinates operations to accomplish
the programmed task.
o Arithmetic Logic Unit performs basic arithmetic operations (a minimal
fetch-decode-execute sketch follows this list)
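Below is a minimal, illustrative sketch in C of the stored-program cycle these components implement: a single memory array holds both instructions and data, a loop plays the role of the control unit (fetch and decode), and the add operation stands in for the ALU. The toy opcodes and instruction encoding are assumptions made up for this example, not any real machine's instruction set.

/* Toy von Neumann machine: one memory for program and data,
   and a fetch-decode-execute loop. */
#include <stdio.h>

enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

int main(void) {
    /* Read/write memory holding both instructions and data. */
    int memory[16] = {
        /* program: load mem[10], add mem[11], store to mem[12], halt */
        (OP_LOAD  << 8) | 10,
        (OP_ADD   << 8) | 11,
        (OP_STORE << 8) | 12,
        (OP_HALT  << 8) | 0,
        0, 0, 0, 0, 0, 0,
        /* data */
        40, 2, 0, 0, 0, 0
    };

    int pc = 0;          /* program counter kept by the control unit */
    int accumulator = 0; /* single register operated on by the ALU   */
    int running = 1;

    while (running) {
        int instruction = memory[pc++];     /* fetch  */
        int opcode  = instruction >> 8;     /* decode */
        int operand = instruction & 0xFF;

        switch (opcode) {                   /* execute */
        case OP_LOAD:  accumulator = memory[operand];                break;
        case OP_ADD:   accumulator += memory[operand];  /* ALU op */ break;
        case OP_STORE: memory[operand] = accumulator;                break;
        case OP_HALT:  running = 0;                                  break;
        }
    }

    printf("mem[12] = %d\n", memory[12]);   /* prints 42 */
    return 0;
}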
• There are different ways to classify parallel computers. One of the more widely
used classifications, in use since 1966, is called Flynn's Taxonomy.
• Flynn's taxonomy distinguishes multi-processor computer architectures according
to how they can be classified along the two independent dimensions of
Instruction and Data. Each of these dimensions can have only one of two
possible states: Single or Multiple.
• The matrix below defines the 4 possible classifications according to Flynn:
                       Single Data    Multiple Data
Single Instruction     SISD           SIMD
Multiple Instruction   MISD           MIMD
Handler classification
The events are created by the framework based on interpreting lower-level inputs, which
may be lower-level events themselves. For example, mouse movements and clicks are
interpreted as menu selections. The events initially originate from actions on the
operating system level, such as interrupts generated by hardware devices, software
interrupt instructions, or state changes in polling. On this level, interrupt handlers and
signal handlers correspond to event handlers.
Created events are first processed by an event dispatcher within the framework. It
typically manages the associations between events and event handlers, and may queue
event handlers or events for later processing. Event dispatchers may call event handlers
directly, or wait for events to be dequeued with information about the handler to be
executed.
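The following C sketch illustrates the dispatcher pattern just described: handlers are registered per event type, and the dispatcher walks a queue of events and invokes the handler associated with each one. The event types and handler names are assumptions made up for this example.

/* Minimal event dispatcher: a table maps event types to handlers,
   and dispatch() dequeues events and calls the matching handler. */
#include <stdio.h>

enum event_type { EVENT_MOUSE_CLICK, EVENT_KEY_PRESS, EVENT_TYPE_COUNT };

struct event {
    enum event_type type;
    int data;                      /* e.g. button number or key code */
};

typedef void (*event_handler)(const struct event *);

static event_handler handlers[EVENT_TYPE_COUNT];   /* event -> handler map */

static void on_mouse_click(const struct event *e) {
    printf("mouse click, button %d\n", e->data);
}

static void on_key_press(const struct event *e) {
    printf("key press, code %d\n", e->data);
}

/* The dispatcher: process each queued event with its registered handler. */
static void dispatch(const struct event *queue, int count) {
    for (int i = 0; i < count; i++) {
        event_handler h = handlers[queue[i].type];
        if (h != NULL)
            h(&queue[i]);
    }
}

int main(void) {
    handlers[EVENT_MOUSE_CLICK] = on_mouse_click;  /* register handlers */
    handlers[EVENT_KEY_PRESS]   = on_key_press;

    struct event queue[] = {                       /* queued low-level input */
        { EVENT_MOUSE_CLICK, 1 },
        { EVENT_KEY_PRESS,  65 },
    };
    dispatch(queue, 2);
    return 0;
}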
Handling signals
Signal handlers can be installed with the signal() system call. If a signal handler is not
installed for a particular signal, the default handler is used. Otherwise the signal is
intercepted and the signal handler is invoked. Instead of providing its own handler, a
process can also request one of two predefined dispositions: ignore the signal (SIG_IGN)
or use the default signal handler (SIG_DFL). There are two signals which cannot be intercepted and
handled: SIGKILL and SIGSTOP.
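As a sketch of the mechanics just described, the C program below installs a custom handler for one signal and uses SIG_IGN and SIG_DFL for two others; the particular signals chosen (SIGINT, SIGTERM, SIGQUIT) are illustrative.

#include <signal.h>
#include <unistd.h>

static void handle_sigint(int signum) {
    (void)signum;
    /* Keep to async-signal-safe calls inside a handler. */
    write(STDOUT_FILENO, "caught SIGINT\n", 14);
}

int main(void) {
    signal(SIGINT, handle_sigint);   /* intercept SIGINT with a custom handler */
    signal(SIGTERM, SIG_IGN);        /* ignore SIGTERM                          */
    signal(SIGQUIT, SIG_DFL);        /* explicitly keep the default behavior    */

    /* signal(SIGKILL, ...) and signal(SIGSTOP, ...) would fail:
       these two signals cannot be intercepted or handled. */

    pause();                         /* block until a signal arrives */
    return 0;
}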
Risks
Signals can cause the interruption of a system call in progress, leaving it to the
application to manage a non-transparent restart.
Signal handlers should be written in a way that doesn't result in any unwanted side-
effects, e.g. errno alteration, signal mask alteration, signal disposition change, and other
global process attribute changes. Use of non-reentrant functions, e.g. malloc or printf,
inside signal handlers is also unsafe.
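The following sketch shows one common way to keep a handler within these constraints: it saves and restores errno, its only side effect is setting a volatile sig_atomic_t flag, and non-reentrant calls such as printf are left to the main loop. The use of SIGUSR1 and the names here are illustrative.

/* A handler written to avoid the risks above: preserve errno,
   set a flag, and do the real work outside the handler. */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

static void safe_handler(int signum) {
    int saved_errno = errno;   /* don't let the handler clobber errno */
    (void)signum;
    got_signal = 1;            /* the only side effect: set a flag */
    errno = saved_errno;
}

int main(void) {
    signal(SIGUSR1, safe_handler);

    while (!got_signal) {
        pause();               /* returns when a handled signal arrives */
    }
    /* Non-reentrant functions are safe to call here, outside the handler. */
    printf("SIGUSR1 received\n");
    return 0;
}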
A process's execution may result in the generation of a hardware exception, for instance,
if the process attempts to divide by zero or incurs a TLB miss. In Unix-like operating
systems, this event automatically changes the processor context to start executing a
kernel exception handler. For some exceptions, such as a page fault, the kernel has
sufficient information to fully handle the event and resume the process's execution. For
other exceptions, however, the kernel cannot resolve the fault on its own and must instead defer
the exception handling operation to the faulting process. This deferral is achieved via the
signal mechanism, wherein the kernel sends to the process a signal corresponding to the
current exception. For example, if a process attempted to divide by zero on an x86 CPU,
a divide error exception would be generated and cause the kernel to send the SIGFPE
signal to the process. Similarly, if the process attempted to access a memory address
outside of its virtual address space, the kernel would notify the process of this violation
via a SIGSEGV signal. The exact mapping between signal names and exceptions is
obviously dependent upon the CPU, since exception types differ between architectures.
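As an illustration of this path on a Unix-like system, the sketch below deliberately writes to an unmapped address; the resulting hardware fault is delivered to the process as SIGSEGV, and the handler restricts itself to async-signal-safe calls before exiting. The specific bad address is an arbitrary choice for the example, and the exact behavior is platform-dependent.

/* An invalid memory access becomes a SIGSEGV delivered by the kernel. */
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

static void segv_handler(int signum) {
    (void)signum;
    /* Only async-signal-safe calls here: write() and _exit(). */
    const char msg[] = "caught SIGSEGV: invalid memory access\n";
    write(STDERR_FILENO, msg, sizeof msg - 1);
    _exit(EXIT_FAILURE);          /* resuming after SIGSEGV is not safe */
}

int main(void) {
    struct sigaction sa = {0};
    sa.sa_handler = segv_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    volatile int *bad = (volatile int *)0x10;  /* address outside the mapped space */
    *bad = 42;                                 /* the hardware fault becomes SIGSEGV */
    return 0;
}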
Amdahl's law and Gustafson's law
Most grid computing applications use middleware, software that sits between the
operating system and the application to manage network resources and standardize the
software interface. The most common grid computing middleware is the Berkeley Open
Infrastructure for Network Computing (BOINC). Often, grid computing software makes
use of "spare cycles", performing computations at times when a computer is idling.
S = 1 / (1 - P)
where S is the speed-up of the program (as a factor of its original sequential runtime), and
P is the fraction that is parallelizable. If the sequential portion of a program is 10% of the
runtime, we can get no more than a 10× speed-up, regardless of how many processors are
added. This puts an upper limit on the usefulness of adding more parallel execution units.
"When a task cannot be partitioned because of sequential constraints, the application of
more effort has no effect on the schedule. The bearing of a child takes nine months, no
matter how many women are assigned."[12]
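As a small worked example, the sketch below evaluates the common N-processor form of Amdahl's law, S(N) = 1 / ((1 - P) + P/N), which approaches the 1 / (1 - P) limit quoted above as N grows; P = 0.90 is assumed here to match the 10%-sequential example.

#include <stdio.h>

static double amdahl_speedup(double p, double n) {
    /* S(N) = 1 / ((1 - P) + P/N): the serial part stays, the parallel part shrinks. */
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void) {
    const double p = 0.90;                        /* assumed parallel fraction */
    const int procs[] = { 1, 2, 4, 8, 16, 64, 1024 };
    const int n = (int)(sizeof procs / sizeof procs[0]);

    for (int i = 0; i < n; i++)
        printf("N = %4d  speedup = %5.2f\n", procs[i],
               amdahl_speedup(p, procs[i]));

    printf("limit as N grows: %.2f\n", 1.0 / (1.0 - p));   /* 10.00 */
    return 0;
}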
Gustafson's law is another law in computer engineering, closely related to Amdahl's law.
It can be formulated as:
S(P) = P - α(P - 1)
where P is the number of processors, S is the speed-up, and α is the non-parallelizable
fraction of the process.[13] Amdahl's law assumes a fixed problem size and that the size
of the sequential section is independent of the number of processors, whereas Gustafson's
law does not make these assumptions.
To see why the sequential fraction dominates, assume that a task has two independent
parts, A and B, where A takes roughly 75% and B roughly 25% of the total runtime. With
effort, a programmer may be able to make part B five times faster, but this only reduces
the total runtime to 75% + 25%/5 = 80% of the original, a speed-up of just 1.25×. In
contrast, less work may be needed to make part A merely twice as fast, yet this reduces
the total runtime to 75%/2 + 25% = 62.5%, a speed-up of 1.6×. Optimizing the larger part
A therefore speeds up the whole computation more than optimizing part B, even though B
received the greater individual speed-up (5× versus 2×).
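For comparison with the Amdahl example above, the sketch below evaluates Gustafson's law as formulated earlier in this section; α = 0.10 is assumed so the numbers line up with the same 10% non-parallelizable fraction.

#include <stdio.h>

static double gustafson_speedup(double processors, double alpha) {
    /* S(P) = P - alpha * (P - 1) */
    return processors - alpha * (processors - 1.0);
}

int main(void) {
    const double alpha = 0.10;                    /* assumed non-parallelizable part */
    const int procs[] = { 1, 2, 4, 8, 16, 64, 1024 };
    const int n = (int)(sizeof procs / sizeof procs[0]);

    for (int i = 0; i < n; i++)
        printf("P = %4d  scaled speedup = %8.2f\n", procs[i],
               gustafson_speedup(procs[i], alpha));
    return 0;
}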