COA Chapter 9


Chapter 9

PIPELINE AND VECTOR PROCESSING


Outline
Parallel processing
Pipelining
Vector Processing
Array Processors
Parallel processing

Parallel processing is a term used to denote a large
class of techniques that are used to provide simultaneous
data processing tasks for the purpose of increasing the
computational speed of a computer system.
A parallel processing system is able to perform
concurrent data processing to achieve faster execution
time.
Example: While an instruction is being executed
in the ALU, the next instruction can be read
from memory.
A system may have two or more ALUs to execute
two or more instructions at the same time.
The purpose of parallel processing is to speed
up the computer processing capability and
increase its throughput, which is the amount of
processing that can be accomplished during a given
interval of time.
Parallel processing is established by
distributing the data among the multiple
functional units.
The figure below shows one possible way of
separating the execution unit into eight
functional units operating in parallel.
Figure: Processor with Multiple Functional Units
Parallel processing can be considered under the
following topics:
Pipeline processing
Vector processing
Array processors
Pipelining
Pipelining is a technique of decomposing a
sequential process into suboperations, with each
suboperation being executed in a special dedicated segment
that operates concurrently with all other segments.
Example: To perform the combined multiply and add
operations with a stream of numbers.
A_i * B_i + C_i for i = 1, 2, 3, ..., 7
The sub-operations performed in each segment of the
pipeline are as follows:
Figure: Example of Pipeline Processing
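The multiply-and-add example can be sketched in Python as a three-segment pipeline. The register assignment used here (R1 and R2 load A_i and B_i, R3 and R4 hold the product and C_i, R5 holds the sum) is an assumption based on the usual textbook layout, since the figure itself is not reproduced in this text. The function name multiply_add_pipeline is hypothetical.

```python
def multiply_add_pipeline(A, B, C):
    """Simulate A[i] * B[i] + C[i] on an assumed three-segment pipeline."""
    n = len(A)
    R1 = R2 = R3 = R4 = R5 = None   # segment registers, empty at start
    results = []
    # n inputs need n + 2 clock pulses to pass through three segments
    for clock in range(n + 2):
        # Update back to front so each segment reads the values
        # its predecessor held before this clock pulse.
        if R3 is not None:                     # segment 3: add
            R5 = R3 + R4
            results.append(R5)
        if R1 is not None:                     # segment 2: multiply, load C
            R3, R4 = R1 * R2, C[clock - 1]
        else:
            R3 = R4 = None
        if clock < n:                          # segment 1: load A and B
            R1, R2 = A[clock], B[clock]
        else:
            R1 = R2 = None
    return results


print(multiply_add_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

After the first two clock pulses fill the pipeline, one result leaves segment 3 on every pulse, which is where the speed gain comes from.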
Arithmetic Pipeline
Used to implement floating-point operations,
multiplication of fixed-point numbers and similar
computations encountered in scientific problems.
Example: Consider a pipeline used for the
floating-point addition of two normalized binary numbers:
• X = A * 2^a
• Y = B * 2^b
where A and B are two fractions that represent the
mantissas and a and b are the exponents.
The floating-point addition and subtraction can
be performed in four segments as follows:
Compare the exponents
Align the mantissas
Add or subtract the mantissas
Normalize the result.
This procedure is outlined in the figure below.
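The four segments listed above can be illustrated with a small Python sketch. It represents each number as a (mantissa, exponent) pair in base 2 and uses exact fractions so the alignment shifts lose no bits. The function name fp_add and the normalization range 1/2 <= |mantissa| < 1 are assumptions made for illustration, and the segments run one after another here rather than overlapping as they would in hardware.

```python
from fractions import Fraction


def fp_add(x, y):
    """Add two numbers given as (mantissa, exponent) pairs, base 2."""
    (A, a), (B, b) = x, y

    # Segment 1: compare the exponents
    diff = a - b
    exponent = max(a, b)

    # Segment 2: align the mantissa that has the smaller exponent
    if diff > 0:
        B = B / 2 ** diff
    elif diff < 0:
        A = A / 2 ** (-diff)

    # Segment 3: add the mantissas
    mantissa = A + B

    # Segment 4: normalize the result (assumed range: 1/2 <= |m| < 1)
    if mantissa == 0:
        return Fraction(0), 0
    while abs(mantissa) >= 1:
        mantissa /= 2
        exponent += 1
    while abs(mantissa) < Fraction(1, 2):
        mantissa *= 2
        exponent -= 1
    return mantissa, exponent


X = (Fraction(3, 4), 3)   # 0.75 * 2^3 = 6
Y = (Fraction(1, 2), 2)   # 0.5  * 2^2 = 2
print(fp_add(X, Y))
```

Subtraction follows the same path with the sign of the second mantissa reversed before segment 3.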
Instruction Pipeline

An instruction pipeline is a technique for overlapping the
execution of several instructions to reduce the
execution time of a set of instructions.
Six Phases in an Instruction Cycle:
Fetch the instruction from memory
Decode the instruction.
Calculate the effective address
Fetch the operands from memory
Execute the instruction
Store the result in the proper place.
Example: Four-segment instruction pipeline
The abbreviated symbols used in the figure are:
FI is the segment that fetches an instruction.
DA is the segment that decodes the instruction
and calculates the effective address.
FO is the segment that fetches the operand.
EX is the segment that executes the
instruction.
Figure: Timing of Instruction Pipeline
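The timing diagram can be approximated with a short Python sketch that builds a space-time table for the four segments FI, DA, FO and EX. It assumes an ideal pipeline with no conflicts or branches, so instruction i enters FI at clock cycle i and advances one segment per cycle. The function names and the speedup formula, which assumes a non-pipelined instruction takes k cycles, are illustrative assumptions.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def timing_table(num_instructions, segments=SEGMENTS):
    """Return one row per instruction; each row lists the segment
    occupied in every clock cycle ('' when the instruction is idle)."""
    k = len(segments)
    total_cycles = k + num_instructions - 1
    table = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(segments):
            row[i + s] = name          # instruction i is in segment s at cycle i + s
        table.append(row)
    return table


def ideal_speedup(k, n):
    """Speedup of a k-segment pipeline over k cycles per instruction."""
    return (n * k) / (k + n - 1)


for number, row in enumerate(timing_table(3), start=1):
    print(number, " ".join(f"{cell:2}" for cell in row))
```

As the number of instructions n grows, ideal_speedup(k, n) approaches k, the number of segments.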
In general there are three major difficulties that cause
the instruction pipeline to deviate from its normal
operation:
Resource conflicts caused by access to memory by two
segments at the same time. Most of these conflicts can
be resolved by using separate instruction and data
memories.
Data dependency conflicts arise when an instruction
depends on the result of a previous instruction but this
result is not yet available.
Branch difficulties arise from branch and other
instructions that change the value of PC.
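A data dependency conflict of the kind described above can be detected with a simple check. The sketch below assumes a hypothetical program representation in which each instruction is a (destination, sources) pair, and it flags an instruction that reads a register written by one of the few instructions just ahead of it. The window size is an assumed parameter, not a property of any particular machine.

```python
def find_data_dependencies(program, window=2):
    """Return (producer, consumer, register) triples where the consumer
    reads a register written by an instruction at most `window` earlier."""
    conflicts = []
    for j, (_, sources) in enumerate(program):
        for i in range(max(0, j - window), j):
            destination = program[i][0]
            if destination in sources:
                conflicts.append((i, j, destination))
    return conflicts


program = [
    ("R1", ("R2", "R3")),   # R1 <- R2 + R3
    ("R4", ("R1", "R5")),   # R4 <- R1 + R5, needs R1 from the line above
    ("R6", ("R7", "R8")),   # independent
]
print(find_data_dependencies(program))
```

A real pipeline would respond to each flagged pair by delaying the consumer, forwarding the result, or relying on the compiler to reorder instructions.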
RISC Pipeline

Among the characteristics attributed to the reduced
instruction set computer (RISC) is
its ability to use an efficient instruction pipeline.
RISC is a machine with a very fast clock cycle
that executes at the rate of one instruction per
cycle.
Simple Instruction Set
Fixed Length Instruction Format
Register-to-Register Operations
Example: Three-segment instruction pipeline
The instruction cycle can be divided into three
sub-operations and implemented in three
segments:
I: Instruction Fetch
A: ALU operation
E: Execute instruction
Vector Processing

Computers with vector processing capabilities
are in demand in specialized applications. Some
of the major application areas of vector
processing are:
Long-range weather forecasting
Petroleum explorations
Seismic data analysis
Medical diagnosis
Aerodynamics and space flights simulations
Artificial intelligence and expert systems
Mapping the human genome
Image processing
Vector operations
A vector is an ordered, one-dimensional array of data
items. A vector V of length n is represented as a row vector by
V = [V1, V2, V3, ..., Vn]. It may be represented as a column vector if the
data items are listed in a column. A conventional scalar processor
handles operands one at a time. Consequently, operations on
vectors must be broken down into single computations with
subscripted variables. The element Vi of vector V is written as V(I),
and the index I refers to a memory address or register where the
number is stored. To examine the difference between a
conventional scalar processor and a vector processor, consider the
following Fortran DO loop:
• DO 20 I = 1,100
• 20 C(I) = B(I) + A(I)
This is a program for adding two
vectors A and B of length 100 to produce a vector C.
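The contrast between the two processor types can be sketched in Python. The first function mirrors the DO loop, issuing one addition per iteration. The second stands in for a single vector instruction that names whole operands. Both function names are hypothetical, and plain Python lists only model the idea; they do not execute in parallel.

```python
def scalar_add(a, b):
    """Element-by-element addition, one operation per loop iteration."""
    c = [0] * len(a)
    for i in range(len(a)):
        c[i] = b[i] + a[i]
    return c


def vector_add(a, b):
    """Models one vector instruction of the form C(1:100) = A(1:100) + B(1:100)."""
    return [x + y for x, y in zip(a, b)]


A = list(range(1, 101))        # 1, 2, ..., 100
B = [10 * v for v in A]        # 10, 20, ..., 1000
C = vector_add(A, B)
print(C[:3], C[-1])
```

On a vector processor the loop overhead of fetching and decoding the add and branch instructions 100 times disappears, because the whole operation is specified once.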
Matrix Multiplication

Matrix multiplication is one of the most
computationally intensive operations performed in
computers with vector processors. The multiplication
of two n x n matrices consists of n^2 inner products
or n^3 multiply-add operations. An n x m matrix of
numbers has n rows and m columns and may be
considered as constituting a set of n row vectors or a
set of m column vectors. Consider, for example,
the multiplication of two 3 x 3 matrices A and B.
The product matrix C is a 3 x 3 matrix whose
elements are related to the elements of A and B
by the inner product:
c_ij = a_i1 * b_1j + a_i2 * b_2j + a_i3 * b_3j
Figure: Instruction format for vector processor

For example, the number in the first row and
first column of matrix C is calculated by letting
i = 1, j = 1, to obtain
c_11 = a_11 * b_11 + a_12 * b_21 + a_13 * b_31
In general, the inner product consists of the sum
of k product terms of the form
C = A_1 * B_1 + A_2 * B_2 + A_3 * B_3 + ... + A_k * B_k
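The inner-product view of matrix multiplication can be written as a short Python sketch. Each element of the product is the inner product of one row of the first matrix with one column of the second. The function names inner_product and matmul are chosen for illustration.

```python
def inner_product(row, column):
    """Sum of k product terms: row[0]*column[0] + ... + row[k-1]*column[k-1]."""
    return sum(a * b for a, b in zip(row, column))


def matmul(A, B):
    """Multiply matrices given as lists of rows."""
    columns = list(zip(*B))                      # column vectors of B
    return [[inner_product(row, col) for col in columns] for row in A]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
I = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
print(matmul(A, I))
print(matmul(A, A))
```

For two 3 x 3 matrices this computes 9 inner products of 3 terms each, 27 multiply-add operations in all, matching the n^2 and n^3 counts.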
Memory Interleaving

Memory interleaving is a technique that supports simultaneous
access to memory from two or more sources.
An instruction pipeline may require the fetching of an
instruction and an operand at the same time from two
different segments.
Similarly, an arithmetic pipeline usually requires two or
more operands to enter the pipeline at the same time.
Instead of using two memory buses for simultaneous access,
the memory can be partitioned into a number of modules
connected to common memory address and data buses.
The advantage of a modular memory is that it allows the use of a
technique called interleaving.
Figure: Multiple module memory organization
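One common way to assign addresses to modules is low-order interleaving, in which the module number is the address modulo the number of modules. The Python sketch below assumes that scheme and four modules. The slides do not state which mapping is used, so both choices and the function name locate are assumptions.

```python
def locate(address, num_modules=4):
    """Map an address to (module, word within module),
    assuming low-order interleaving."""
    module = address % num_modules      # low-order bits select the module
    offset = address // num_modules     # remaining bits select the word
    return module, offset


for address in range(8):
    print(address, locate(address))
```

Consecutive addresses fall in different modules, so a run of sequential accesses can be overlapped instead of waiting on a single module.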
Array Processors

An array processor is a processor that performs
computations on large arrays of data.
The term is used to refer to two different types of processors.
An attached array processor is an auxiliary processor
attached to a general-purpose computer.
It is intended to improve the performance of the host
computer in specific numerical computational tasks.
An SIMD array processor is a processor that has a single
instruction, multiple-data organization.
It manipulates vector instructions by means of multiple
functional units responding to a common instruction.
Attached Array Processor

An attached array processor is designed as a peripheral
for a conventional host computer, and its purpose is to
enhance the performance of the computer by
providing vector processing for complex scientific
applications. It achieves high performance by
means of parallel processing with multiple
functional units. It includes an arithmetic unit
containing one or more pipelined floating-point adders and
multipliers. The array processor can be
programmed by the user to accommodate a variety of
arithmetic problems.
Figure: Attached array processor with host computer
SIMD Array Processor

An SIMD array processor is a computer with
multiple processing units operating in parallel. The
processing units are synchronized to perform the same
operation under the control of a common control
unit, thus providing a single instruction stream,
multiple data stream (SIMD) organization. A general
block diagram of an array processor is shown in
the figure below. It contains a set of identical
processing elements (PEs), each having a local
memory M.
Figure: SIMD array processor organization
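The SIMD organization can be modeled with a small Python sketch in which a control unit broadcasts one instruction to a set of identical processing elements, each working on its own local memory M. The class and function names are hypothetical. The loop visits the PEs one after another, whereas real processing elements would execute the instruction in lockstep.

```python
import operator


class ProcessingElement:
    """A PE with its own local memory M, modeled as a dictionary."""

    def __init__(self, local_memory):
        self.M = dict(local_memory)

    def execute(self, op, dest, src1, src2):
        self.M[dest] = op(self.M[src1], self.M[src2])


def broadcast(pes, op, dest, src1, src2):
    """Control unit: send the same instruction to every PE."""
    for pe in pes:
        pe.execute(op, dest, src1, src2)


# Each PE holds one element of vectors a and b in its local memory.
pes = [ProcessingElement({"a": i, "b": 10 * i}) for i in range(4)]
broadcast(pes, operator.add, "c", "a", "b")
print([pe.M["c"] for pe in pes])
```

One broadcast instruction produces a whole vector result, one element per processing element.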
Completed all chapters

Thank you
