Parallel processing is a term used to denote a large class of techniques that provide simultaneous data processing tasks for the purpose of increasing the computational speed of a computer system. A parallel processing system is able to perform concurrent data processing to achieve faster execution time.

Example: While an instruction is being executed in the ALU, the next instruction can be read from memory. The system may have two or more ALUs so that it can execute two or more instructions at the same time.

The purpose of parallel processing is to speed up the computer's processing capability and increase its throughput, which is the amount of processing that can be accomplished during a given interval of time.

Parallel processing is established by distributing the data among the multiple functional units. The figure below shows one possible way of separating the execution unit into eight functional units operating in parallel.

Figure: Processor with Multiple Functional Units

Parallel processing can be considered under the following topics:
• Pipeline processing
• Vector processing
• Array processors

Pipelining
Pipelining is a technique of decomposing a sequential process into sub-operations, with each sub-operation being executed in a special dedicated segment that operates concurrently with all other segments.

Example: To perform the combined multiply and add operations with a stream of numbers:
Ai * Bi + Ci for i = 1, 2, 3, ..., 7
The sub-operations performed in each segment of the pipeline are shown in the figure below.

Figure: Example of Pipeline Processing

Arithmetic Pipeline
An arithmetic pipeline is used to implement floating-point operations, multiplication of fixed-point numbers, and similar computations encountered in scientific problems.

Example: Consider a floating-point adder pipeline that operates on two binary numbers:
• X = A × 2^a
• Y = B × 2^b
where A and B are two fractions that represent the mantissas and a and b are the exponents.
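As a small illustration of the mantissa/exponent form above, the sketch below uses Python's standard math.frexp to split a value into a fraction and a power-of-two exponent, and math.ldexp to rebuild it. The helper name decompose and the sample values are illustrative assumptions, not part of the original notes.

```python
import math


def decompose(x):
    """Split x into (mantissa, exponent) so that x == mantissa * 2**exponent.

    For nonzero x, math.frexp returns a mantissa with 0.5 <= |mantissa| < 1,
    which matches the "fraction" form used for A and B above.
    """
    return math.frexp(x)


# Illustrative values only.
A, a = decompose(6.0)      # 6.0     = 0.75  * 2**3
B, b = decompose(0.15625)  # 0.15625 = 0.625 * 2**-2

print(A, a)
print(B, b)

# math.ldexp(m, e) computes m * 2**e, which recovers the original numbers.
assert math.ldexp(A, a) == 6.0
assert math.ldexp(B, b) == 0.15625
```

Because the two exponents differ (3 and -2 here), the mantissas cannot be added directly; the exponents have to be brought to a common value first.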
Floating-point addition and subtraction can be performed in four segments, as follows:
• Compare the exponents.
• Align the mantissas.
• Add or subtract the mantissas.
• Normalize the result.
This procedure is outlined in the figure below.

Instruction Pipeline
An instruction pipeline is a technique for overlapping the execution of several instructions to reduce the execution time of a set of instructions.

Six phases in an instruction cycle:
• Fetch the instruction from memory.
• Decode the instruction.
• Calculate the effective address.
• Fetch the operands from memory.
• Execute the instruction.
• Store the result in the proper place.

Example: Four-segment instruction pipeline
In the figure, the segments are abbreviated as follows:
• FI is the segment that fetches an instruction.
• DA is the segment that decodes the instruction and calculates the effective address.
• FO is the segment that fetches the operand.
• EX is the segment that executes the instruction.

Figure: Timing of Instruction Pipeline

In general, there are three major difficulties that cause the instruction pipeline to deviate from its normal operation:
• Resource conflicts are caused by access to memory by two segments at the same time. Most of these conflicts can be resolved by using separate instruction and data memories.
• Data dependency conflicts arise when an instruction depends on the result of a previous instruction, but this result is not yet available.
• Branch difficulties arise from branch and other instructions that change the value of the PC.

RISC Pipeline
Among the characteristics attributed to the reduced instruction set computer (RISC) is its ability to use an efficient instruction pipeline. RISC is a machine with a very fast clock cycle that executes at the rate of one instruction per cycle. Its characteristics include:
• Simple instruction set
• Fixed-length instruction format
• Register-to-register operations

Example: Three-segment instruction pipeline
The instruction cycle can be divided into three sub-operations and implemented in three segments:
• I: Instruction fetch
• A: ALU operation
• E: Execute instruction

Vector Processing
Computers with vector processing capabilities are in demand in specialized applications. Some of the major application areas of vector processing are:
• Long-range weather forecasting
• Petroleum exploration
• Seismic data analysis
• Medical diagnosis
• Aerodynamics and space flight simulations
• Artificial intelligence and expert systems
• Mapping the human genome
• Image processing

Vector Operations
A vector is an ordered set of a one-dimensional array of data items. A vector V of length n is represented as a row vector by V = [V1, V2, V3, ..., Vn]. It may be represented as a column vector if the data items are listed in a column. A conventional sequential computer processes operands one at a time. Consequently, operations on vectors must be broken down into single computations with subscripted variables. The element Vi of vector V is written as V(I), and the index I refers to a memory address or register where the number is stored. To examine the difference between a conventional scalar processor and a vector processor, consider the following Fortran DO loop:

      DO 20 I = 1, 100
   20 C(I) = B(I) + A(I)

This is a program for adding two vectors A and B of length 100 to produce a vector C.

Matrix Multiplication
Matrix multiplication is one of the most computationally intensive operations performed in computers with vector processors. The multiplication of two n × n matrices consists of n^2 inner products or n^3 multiply-add operations. An n × m matrix of numbers has n rows and m columns and may be considered as constituting a set of n row vectors or a set of m column vectors. Consider, for example, the multiplication of two 3 × 3 matrices A and B. The product matrix C is a 3 × 3 matrix whose elements are related to the elements of A and B by the inner product:

$c_{ij} = \sum_{k=1}^{3} a_{ik} \times b_{kj}$

Figure: Instruction format for vector processor
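The inner-product view of matrix multiplication can be sketched in a few lines of plain Python. This is a minimal illustration, not how a vector processor is actually programmed; the function names inner_product and matmul and the sample matrices are assumptions introduced here.

```python
def inner_product(row, col):
    """Sum of element-wise products; each term is one multiply-add."""
    total = 0
    for a, b in zip(row, col):
        total += a * b
    return total


def matmul(A, B):
    """Multiply A by B using one inner product per element of the result."""
    cols_of_B = list(zip(*B))  # the column vectors of B
    return [[inner_product(row, col) for col in cols_of_B] for row in A]


# Illustrative 3 x 3 matrices (values chosen arbitrarily).
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
B = [[9, 8, 7],
     [6, 5, 4],
     [3, 2, 1]]

C = matmul(A, B)
for row in C:
    print(row)
```

For these 3 × 3 inputs, matmul evaluates 9 inner products of 3 terms each, that is 27 multiply-add steps, in line with the n^2 and n^3 counts given above.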
For example, the number in the first row and first column of matrix C is calculated by letting i = 1, j = 1, to obtain

$c_{11} = a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31}$

In general, the inner product consists of the sum of k product terms of the form

$C = A_1 B_1 + A_2 B_2 + A_3 B_3 + \cdots + A_k B_k$

Memory Interleaving
Memory interleaving is the technique of using memory from two or more sources. An instruction pipeline may require the fetching of an instruction and an operand at the same time from two different segments. Similarly, an arithmetic pipeline usually requires two or more operands to enter the pipeline at the same time. Instead of using two memory buses for simultaneous access, the memory can be partitioned into a number of modules connected to common memory address and data buses. The advantage of a modular memory is that it allows the use of a technique called interleaving, in which different sets of addresses are assigned to different memory modules.

Figure: Multiple module memory organization

Array Processors
An array processor is a processor that performs computations on large arrays of data. The term is used to refer to two different types of processors:
• An attached array processor is an auxiliary processor attached to a general-purpose computer. It is intended to improve the performance of the host computer in specific numerical computation tasks.
• An SIMD array processor is a processor that has a single-instruction, multiple-data organization. It manipulates vector instructions by means of multiple functional units responding to a common instruction.

Attached Array Processor
An attached array processor is designed as a peripheral for a conventional host computer, and its purpose is to enhance the performance of the computer by providing vector processing for complex scientific applications. It achieves high performance by means of parallel processing with multiple functional units. It includes an arithmetic unit containing one or more pipelined floating-point adders and multipliers. The array processor can be programmed by the user to accommodate a variety of arithmetic problems.

Figure: Attached array processor with host computer

SIMD Array Processor
An SIMD array processor is a computer with multiple processing units operating in parallel. The processing units are synchronized to perform the same operation under the control of a common control unit, thus providing a single instruction stream, multiple data stream (SIMD) organization. A general block diagram of an array processor is shown in the figure below. It contains a set of identical processing elements (PEs), each having a local memory M.

Figure: SIMD array processor organization
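The SIMD organization described above can be modeled with a toy simulation: a common control unit broadcasts one instruction, and every PE applies it to operands held in its own local memory. This is only a rough sketch under stated assumptions; the class names ProcessingElement and ControlUnit, the dictionary used for the local memory, and the sample vectors are all hypothetical, and the Python loop stands in for hardware that would run the PEs simultaneously.

```python
import operator


class ProcessingElement:
    """One PE with its own local memory M (modeled here as a dict)."""

    def __init__(self, a, b):
        self.memory = {"A": a, "B": b, "C": None}

    def execute(self, op, dst, src1, src2):
        """Apply the broadcast operation to this PE's local operands."""
        self.memory[dst] = op(self.memory[src1], self.memory[src2])


class ControlUnit:
    """Common control unit that issues the same instruction to every PE."""

    def __init__(self, pes):
        self.pes = pes

    def broadcast(self, op, dst, src1, src2):
        # Real PEs operate in lockstep at the same time; this loop only
        # models that each PE receives the identical instruction.
        for pe in self.pes:
            pe.execute(op, dst, src1, src2)


# Illustrative vectors: element i of each vector lives in PE i.
vec_a = [1, 2, 3, 4]
vec_b = [10, 20, 30, 40]
pes = [ProcessingElement(a, b) for a, b in zip(vec_a, vec_b)]

control = ControlUnit(pes)
control.broadcast(operator.add, "C", "A", "B")  # one instruction, many data

result = [pe.memory["C"] for pe in pes]
print(result)
```

A single broadcast here performs the whole vector addition C = A + B, one element per PE, which is the sense in which one instruction stream acts on multiple data streams.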