SIMD Array Processor

The document describes two SIMD array processors - ILLIAC-IV and Burroughs Scientific Processor (BSP). ILLIAC-IV consists of multiple processing elements under a single control unit. Each processing element contains an ALU, registers and local memory. Vector instructions are sent to processing elements for distributed execution to achieve spatial parallelism. A masking scheme is used to control the status of processing elements during instruction execution. BSP has fewer processing units than ILLIAC-IV but with all processors having equal access to a common logical address space divided into separate memory modules. Each processing element is an arithmetic unit with input/output registers. The document also discusses various interconnection network topologies used

Uploaded by Tejodeep Bose
Copyright: © All Rights Reserved

SIMD ARRAY PROCESSORS
Chapter 4

NOTE: Refer to the book by Kai Hwang and Briggs, page 325.
SIMD ARRAY PROCESSORS

• ILLIAC IV
• Burroughs Scientific Processor (BSP)

ILLIAC IV
• Consists of multiple synchronized processing elements (PEs) under one control unit
• Each processing element contains an ALU, registers, and local memory
• User programs are loaded into the control unit memory from an external source
• The control unit decodes each instruction and decides where it should be executed
ILLIAC IV
• Branching instructions are executed directly in the control unit
• Vector instructions are sent to the processing elements for distributed execution, achieving spatial parallelism through duplicated arithmetic units
• Data can be loaded into the processing elements' memory from an external source via the system bus, or via the control unit in broadcast mode
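As a rough illustration of this control flow, the Python sketch below models a control unit that executes branch instructions itself and broadcasts vector instructions to every PE. The class name `PE`, the `(opcode, argument)` instruction tuples, and the opcodes `LOAD`, `ADD`, `STORE`, and `JUMP` are hypothetical and chosen only for this example; they are not the ILLIAC IV instruction set.

```python
class PE:
    """One processing element: an accumulator plus a small local memory."""

    def __init__(self, local_memory):
        self.mem = list(local_memory)
        self.acc = 0

    def execute(self, op, addr):
        # Every PE runs the same operation on its own local data.
        if op == "LOAD":
            self.acc = self.mem[addr]
        elif op == "ADD":
            self.acc += self.mem[addr]
        elif op == "STORE":
            self.mem[addr] = self.acc
        else:
            raise ValueError(f"unknown vector op: {op}")


def run(program, pes):
    """Control unit loop: branches run here, vector ops are broadcast."""
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "JUMP":          # branch: handled by the control unit only
            pc = arg
            continue
        for pe in pes:            # vector instruction: sent to all PEs
            pe.execute(op, arg)
        pc += 1


# Four PEs; PE i holds [i, 10 * i, 0] in its local memory.
pes = [PE([i, 10 * i, 0]) for i in range(4)]
program = [
    ("LOAD", 0),
    ("JUMP", 3),     # skips the next instruction
    ("ADD", 0),      # never executed
    ("ADD", 1),
    ("STORE", 2),
]
run(program, pes)
results = [pe.mem[2] for pe in pes]
print(results)  # [0, 11, 22, 33]
```

Each PE ends up with the sum of its own two operands, which is the spatial parallelism the slide refers to: one instruction stream, many data streams.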
ILLIAC IV
• A masking scheme is used to control the status of each processing element during instruction execution
• A PE may be either enabled or disabled during an instruction cycle
• Only enabled PEs execute the instruction
• Data exchange between the PEs is done via an interconnection network that performs data routing and manipulation functions
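The masking and routing ideas can be sketched in a few lines of Python. This is a minimal model, assuming one list element per PE; the function names `masked_add` and `route_shift` and the uniform-shift routing pattern are illustrative assumptions, not taken from the slides.

```python
def masked_add(a, b, mask):
    """PE i computes a[i] + b[i] only if it is enabled (mask[i] == 1).

    Disabled PEs keep their old value of a[i].
    """
    return [x + y if m else x for x, y, m in zip(a, b, mask)]


def route_shift(values, distance):
    """Data routing: every PE i sends its value to PE (i + distance) mod N."""
    n = len(values)
    out = [None] * n
    for i, v in enumerate(values):
        out[(i + distance) % n] = v
    return out


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
mask = [1, 0, 1, 0]                # PEs 0 and 2 enabled, PEs 1 and 3 disabled

summed = masked_add(a, b, mask)
shifted = route_shift(summed, 1)
print(summed)   # [11, 2, 33, 4]
print(shifted)  # [4, 11, 2, 33]
```

The mask decides which PEs take part in an instruction, while the routing step moves results between PEs afterwards.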
ILLIAC IV
• The interconnection network is also under the supervision of the control unit
• A host computer is interfaced with the array processor through the control unit
• The host computer is a general-purpose machine that serves as the overall manager of the entire system
• Host: front-end machine; manages resources and I/O activity
• The array processor can be considered a back-end attached computer
• Note: each node has local memory
Burroughs Scientific Processor (BSP)
• Effectively a successor to the ILLIAC IV machine, but with an architecture modified to reflect the fact that the BSP was intended to be a commercial product
• It has fewer processing units than ILLIAC IV
• All processors have equal access to a common logical address space, which is divided into a number of physically separate memory modules
• Each processing element is nothing more than an arithmetic unit with input and output registers; these units are homogeneous and non-pipelined
• Note: all nodes share the same memory, although the memory is divided into modules
SIMD Interconnection Network

1D: Linear; 2D: Ring, Mesh, Star, Tree; 3D: Hypercube


SIMD Interconnection Network
• Data exchange among the PEs is done via the interconnection network
• The network performs all data routing and manipulation functions
• The architecture of an interconnection network is based on its topology
• Static: the connection pattern is fixed and cannot be reconfigured
• Dynamic: the connection pattern is not fixed and can be reconfigured
• Depending on the number of stages, networks are divided into two types
• Single stage
• Multi stage

NOTE: Refer to Kai Hwang and Briggs, page 334.
Single Stage
• Only one stage of switching is used
• A single-stage network is also known as a recirculating network
• Data may recirculate through the single stage many times before reaching its destination
• E.g.: Crossbar Network

NOTE: Refer to Kai Hwang and Briggs, page 334.
Multi Stage
• Consists of many stages of interconnected switches
• Characterized by the switch box used and the network connectivity
• The connectivity is controlled by the choice of interstage connection pattern
• A switch box may be set to any of the four patterns listed on the next slide

NOTE: Refer to Kai Hwang and Briggs, page 337.
1: Straight
2: Exchange
3: Lower Broadcast
4: Upper Broadcast

NOTE: Refer to Kai Hwang and Briggs, page 337.
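The four switch box settings can be sketched as a small Python function. This is a minimal model of a two-input, two-output switch box; the function name `switch_box` and the string state names are assumptions made for illustration.

```python
def switch_box(state, upper, lower):
    """Return (upper_out, lower_out) of a 2x2 switch box in the given state."""
    if state == "straight":          # inputs pass straight through
        return upper, lower
    if state == "exchange":          # inputs are swapped
        return lower, upper
    if state == "lower_broadcast":   # lower input is copied to both outputs
        return lower, lower
    if state == "upper_broadcast":   # upper input is copied to both outputs
        return upper, upper
    raise ValueError(f"unknown switch state: {state}")


for state in ("straight", "exchange", "lower_broadcast", "upper_broadcast"):
    print(state, switch_box(state, "A", "B"))
```

Straight and exchange are one-to-one settings; the two broadcast settings are one-to-many and are only needed when a network must deliver the same item to several destinations.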
Mesh Connected Illiac Network

No. of nodes: N = 16
No. of interconnections per node = 4; routing stride r = √N = 4
Max no. of hops ≤ √N - 1 = 3

Routing functions for connecting the ith node, 0 ≤ i ≤ N - 1:
R_{+1}(i) = (i + 1) mod N
R_{-1}(i) = (i - 1) mod N
R_{+r}(i) = (i + r) mod N
R_{-r}(i) = (i - r) mod N

Example: node 3 will be connected with:
i + 1 = node 4
i - 1 = node 2
i + 4 = node 7
i - 4 = -1 mod 16 = node 15
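The four routing functions and the hop bound can be checked with a short Python sketch. The function names `illiac_neighbors` and `max_hops` are illustrative assumptions; the breadth-first search is just one simple way to measure the longest shortest path.

```python
from collections import deque
from math import isqrt


def illiac_neighbors(i, n):
    """Nodes reachable from node i in one step: i+1, i-1, i+r, i-r (mod n)."""
    r = isqrt(n)
    if r * r != n:
        raise ValueError("n must be a perfect square")
    return [(i + 1) % n, (i - 1) % n, (i + r) % n, (i - r) % n]


def max_hops(n):
    """Longest shortest path from node 0, found by breadth-first search.

    Every node has the same connection pattern, so node 0 is representative.
    """
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in illiac_neighbors(u, n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


print(illiac_neighbors(3, 16))  # [4, 2, 7, 15]
print(max_hops(16))             # 3
```

For N = 16 the search confirms that every node is reachable within √N - 1 = 3 hops.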
Shuffle Exchange and Omega Networks:
• The class of shuffle-exchange networks is based on two routing functions: shuffle (S) and exchange (E)
• Two types of shuffle: perfect shuffle and inverse shuffle
• For the perfect shuffle:
• Let A = a_{n-1} a_{n-2} ... a_1 a_0 be a PE address
• S(a_{n-1} a_{n-2} ... a_1 a_0) = a_{n-2} ... a_1 a_0 a_{n-1}
• N = number of PEs
• n = log2 N
• S performs a cyclic shift of the bits of A to the left by one bit position
NOTE: Refer to Kai Hwang and Briggs, page 350.
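A minimal Python sketch of the perfect shuffle on n-bit PE addresses follows. The slides name the exchange function but do not define it; the sketch assumes the usual definition, in which exchange complements the least significant address bit. The function names are illustrative.

```python
def shuffle(a, n):
    """Perfect shuffle: cyclic left shift of the n-bit address a by one bit."""
    msb = (a >> (n - 1)) & 1
    return ((a << 1) & ((1 << n) - 1)) | msb


def exchange(a):
    """Exchange (assumed definition): complement the least significant bit."""
    return a ^ 1


n = 3                      # address bits
N = 1 << n                 # number of PEs, N = 2**n
shuffled = [shuffle(a, n) for a in range(N)]
print(shuffled)                        # [0, 2, 4, 6, 1, 3, 5, 7]
print([exchange(a) for a in range(N)]) # [1, 0, 3, 2, 5, 4, 7, 6]
```

For N = 8, PE 1 (001) is connected to PE 2 (010) and PE 4 (100) to PE 1 (001), while PEs 0 and 7 map to themselves because rotating all-zero or all-one addresses changes nothing.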
PERFECT SHUFFLE

NOTE: Refer to Kai Hwang and Briggs, page 351.


Shuffle Exchange and Omega Networks:
• Inverse shuffle
• Let A = a_{n-1} a_{n-2} ... a_1 a_0 be a PE address
• S^{-1}(a_{n-1} a_{n-2} ... a_1 a_0) = a_0 a_{n-1} a_{n-2} ... a_1
• N = number of PEs
• n = log2 N
• S^{-1} performs a cyclic shift of the bits of A to the right by one bit position

NOTE: Refer to Kai Hwang and Briggs, page 350.
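The inverse shuffle can be sketched in the same way, as a cyclic right shift of the address bits. The block below also checks that it undoes the perfect shuffle; the function names are illustrative assumptions.

```python
def shuffle(a, n):
    """Perfect shuffle: cyclic left shift of the n-bit address a by one bit."""
    msb = (a >> (n - 1)) & 1
    return ((a << 1) & ((1 << n) - 1)) | msb


def inverse_shuffle(a, n):
    """Inverse shuffle: cyclic right shift of the n-bit address a by one bit."""
    lsb = a & 1
    return (a >> 1) | (lsb << (n - 1))


n = 3
N = 1 << n
unshuffled = [inverse_shuffle(a, n) for a in range(N)]
round_trip = [inverse_shuffle(shuffle(a, n), n) for a in range(N)]
print(unshuffled)  # [0, 4, 1, 5, 2, 6, 3, 7]
print(round_trip)  # [0, 1, 2, 3, 4, 5, 6, 7]
```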


Inverse Shuffle

NOTE: Refer to Kai Hwang and Briggs, page 351.


OMEGA NETWORK (using perfect shuffle)

NOTE: Refer to the video.


BLOCKING STATE
• If the I/O ports are on the same side, the network is called a one-sided network, also known as a full switch
• A two-sided multistage network has an input side and an output side
• Two-sided networks can be classified further as blocking or non-blocking
• If the simultaneous connection of more than one I/O pair may result in conflicts in the use of switches or links, the multistage network is known as a blocking network. E.g.: Omega network
• If the network can perform all possible connections between sources (inputs) and destinations (outputs) by rearranging its existing connections, it is known as a rearrangeable non-blocking network. E.g.: Benes network
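Blocking in an Omega network can be demonstrated with a small Python sketch. It assumes the usual construction of n = log2 N stages, each made of a perfect shuffle followed by a column of 2x2 switch boxes, and destination-tag routing, where stage k uses bit k of the destination address (most significant first) to pick the upper (0) or lower (1) switch output. The function names `omega_path` and `conflicts` are illustrative.

```python
def omega_path(src, dst, n):
    """Link positions occupied after each of the n stages of an Omega network."""
    mask = (1 << n) - 1
    pos = src
    path = []
    for stage in range(n):
        pos = ((pos << 1) & mask) | (pos >> (n - 1))   # perfect shuffle
        bit = (dst >> (n - 1 - stage)) & 1             # destination bit, MSB first
        pos = (pos & ~1) | bit                         # take upper (0) or lower (1) output
        path.append(pos)
    return path


def conflicts(pairs, n):
    """Return the (stage, link) positions requested by more than one connection."""
    used = {}
    clashes = []
    for src, dst in pairs:
        for stage, link in enumerate(omega_path(src, dst, n)):
            key = (stage, link)
            if key in used and used[key] != (src, dst):
                clashes.append(key)
            used.setdefault(key, (src, dst))
    return clashes


n = 3  # 8 inputs, 8 outputs
print(omega_path(5, 2, n))                         # [2, 5, 2]
print(conflicts([(0, 0), (4, 1)], n))              # [(0, 0), (1, 0)]
print(conflicts([(i, i) for i in range(8)], n))    # []
```

Inputs 0 and 4 share the first switch box, and both connections need its upper output, so the pairs 0→0 and 4→1 cannot be set up at the same time even though their destinations differ. The identity permutation, by contrast, passes without conflict.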
Cube Interconnection Network
• The cube network can be implemented as either a recirculating network or a multistage network for SIMD machines
• A single 3-cube connects 8 PEs
• To connect 16 nodes, two 3-cubes are joined to form a 4-cube, and so on
• 3 bits are required to address 8 PEs
• Interconnections are made using the following rules:
• Vertical lines connect vertices (PEs) whose addresses differ in the most significant bit
• Horizontal lines connect vertices whose addresses differ in the least significant bit
• Vertices at both ends of a diagonal line differ in the middle bit position

(Refer to page 343.)
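The connection rule amounts to complementing one address bit per link, which the Python sketch below illustrates for a 3-cube. The function names and the choice to correct differing bits from least significant to most significant are assumptions made for this example; any bit order gives a valid path of the same length.

```python
def cube(i, a):
    """Cube routing function C_i: complement bit i of address a."""
    return a ^ (1 << i)


def neighbors(a, n):
    """All PEs directly connected to PE a in an n-cube."""
    return [cube(i, a) for i in range(n)]


def route(src, dst, n):
    """Path from src to dst, fixing one differing bit per hop (LSB first)."""
    path = [src]
    cur = src
    for i in range(n):
        if ((cur ^ dst) >> i) & 1:     # bit i still differs from the destination
            cur = cube(i, cur)
            path.append(cur)
    return path


print(neighbors(0b000, 3))                                 # [1, 2, 4]
print([format(x, "03b") for x in route(0b010, 0b101, 3)])  # ['010', '011', '001', '101']
```

The number of hops equals the number of bit positions in which the source and destination addresses differ, so no route in an n-cube needs more than n hops.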


Cube Interconnection Network
Assignment:
• Design a 4-cube network for an array processor consisting of 16 Processing Elements (PEs). Trace the path to route a packet of data from node 0110 to node 1101.
• Barrel Shifter Network (refer to page 345)
• With the aid of an example, explain the 'Masking' and 'Data Routing' mechanisms in an SIMD array processor.
Thank You
