Interconnection Networks

Some paper on interconnection networks describing common solutions in multiprocessor architectures.

Uploaded by

Стефан Савић

Interconnection Networks

Howard Jay Siegel
William Tsun-yuk Hsu

Part II / Topics in Multiprocessing

Some of the material in this chapter is summarized from Interconnection Networks for Large-Scale Parallel Processing, by H. J. Siegel, Lexington Books, D. C. Heath and Company, Lexington, MA, copyright 1985. This project was supported by the Rome Air Development Center, under contract number F30602-83-K-0119, the Institute for Defense Analyses Supercomputing Research Center under contract number MDA 904-85-C-5027, and the Purdue Research Foundation David Ross Grant 1985/86 number 0857.

6.1. Introduction

Many tasks require the computational power made possible by parallel processing. The demand for fast computation is usually due to a desire for real-time response and/or the need to process immense data sets. These types of tasks include aerodynamic simulations, air traffic control, chemical reaction simulations, seismic data processing, satellite-collected imagery analysis, missile guidance, ballistic missile defense, weather forecasting, map making, robot vision, and speech understanding. Systems comprising a multitude of tightly coupled, cooperating processors can help provide the computational performance required by these tasks. STARAN (7) and MPP (8) are examples of existing systems with 2° and $2^{14}$ simple processors, respectively. Ultracomputer (28) is a proposed design for a system consisting of $2^{12}$ complex processors. This chapter examines methods to provide communications among the processors and memories of such large-scale parallel/distributed systems.

Two models of interprocessor communication networks were introduced in Chapter 5. The processor-to-memory model assumes N processors on one side of a bidirectional network and N memory modules on the other side. It is also possible to organize processors and memory modules into processor/memory pairs or processing elements (PEs).
In the PE-to-PE model, PE i is connected to input i and output i of a unidirectional interconnection network. In this chapter, the PE-to-PE model will be used; however, the material presented is also applicable to processor-to-memory systems.

The taxonomy originated by Flynn (26) to describe parallel processors has already been described in Chapter 5. Two of the modes of parallelism described by Flynn are the SIMD and MIMD modes. SIMD stands for single instruction stream-multiple data stream. An SIMD machine may consist of N PEs, an interconnection network that provides communications between PEs, and a single control unit. The control unit broadcasts instructions to all the PEs, and all enabled PEs execute the same instructions simultaneously, hence forming a single instruction stream. Each PE operates on its own data from its memory. Hence, there are multiple data streams. MIMD stands for multiple instruction stream-multiple data stream. An MIMD machine may consist of N PEs linked by an interconnection network. Each PE stores and executes its own instructions and operates on its own data. Therefore, there are multiple instruction streams and multiple data streams.

In addition, there are MSIMD (multiple-SIMD) machines and partitionable SIMD/MIMD machines. MSIMD machines are systems that can be reconfigured into a number of smaller, independent SIMD machines. Partitionable SIMD/MIMD machines can be partitioned into smaller virtual machines working in SIMD or MIMD mode. These have been covered in Chapter 5.

The task of interconnecting N processors and N memory modules, where N may be in the range $2^6$ to $2^{16}$, is a nontrivial one. The interconnection scheme must provide fast and flexible communications without unreasonable cost. A single shared bus, as shown in Figure 6.1, is not sufficient because it is often desirable to allow all processors to send data to other processors simultaneously (e.g., from processor i to processor $i - 1$, $1 \le i < N$).
Figure 6.1 A single shared bus used to provide communications for N devices.

The ideal situation would be to link directly each processor to every other processor so that the system is completely connected. This is shown for N = 8 in Figure 6.2, where one could assume, for example, that each node is a processor with its own memory. Unfortunately, this is highly impractical for large N because it requires $N - 1$ unidirectional lines for each processor. For example, if $N = 2^9$, then $2^9 \times (2^9 - 1) = 261{,}632$ links would be needed.

Figure 6.2 A completely connected system for N = 8.

An alternative interconnection scheme that allows all processors to communicate simultaneously is the crossbar switch, shown in Figure 6.3. In this example, the processors communicate through the memories. The network can be viewed as a set of intersecting lines, where interconnections between processors and memories are specified by the crosspoint switches at each line intersection (75). The difficulty with crossbar networks is that the cost of the network (the number of crosspoint switches) grows with $N^2$, which, given current technology, makes it infeasible for large systems.

Figure 6.3 A crossbar switch connecting N processors to N memories.

In order to solve the problem of providing fast, efficient communications at a reasonable cost, many different networks between the extremes of the single bus and the completely connected scheme have been proposed in the literature. No single network is generally considered "best." The cost-effectiveness of a particular network design depends on such factors as the computational tasks for which it will be used, the desired speed of interprocessor data transfers, the actual hardware implementation of the network, the number of processors in the system, and any cost constraints on the construction.
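The cost comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original chapter; the function names `complete_links` and `crossbar_crosspoints` are hypothetical and chosen only for illustration. It counts the unidirectional lines of a completely connected system and the crosspoint switches of an N-by-N crossbar.

```python
def complete_links(n: int) -> int:
    """Unidirectional lines in a completely connected system.

    Each of the n processors needs n - 1 outgoing lines.
    """
    return n * (n - 1)


def crossbar_crosspoints(n: int) -> int:
    """Crosspoint switches in a crossbar joining n processors to n memories."""
    return n * n


# Hypothetical sample sizes: the N = 8 of Figure 6.2 and the N = 2^9 example.
for exponent in (3, 9):
    n = 2 ** exponent
    print(n, complete_links(n), crossbar_crosspoints(n))
```

For $N = 2^9$ the first count reproduces the 261,632 links quoted in the text. Both counts grow quadratically in N, which is why neither scheme scales to large systems.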
A variety of networks that have been proposed are overviewed in numerous survey articles and books, e.g., (4, 12, 21, 32, 34, 37, 42, 62, 74, 81). This chapter is a study of an important collection of network designs that can be used to support large-scale parallelism; i.e., these networks can provide the communications needed in a parallel processing system consisting of a large number of processors (e.g., $2^6$ to $2^{16}$) that are working together to perform a single overall task. Many of these networks can be used in dynamically reconfigurable machines that can perform independent multiple tasks, where each task is processed using parallelism.

The networks examined here are based on the "Shuffle-Exchange," "Cube," "PM2I" (plus-minus $2^i$), and "Illiac" (nearest neighbor) interconnection patterns. These networks and their single stage implementations are explored in Section 6.2. Section 6.3 is a study of the multistage Cube/Shuffle-Exchange class of networks. The Generalized Cube network will be discussed as an example of this type of network. A fault-tolerant version of the Generalized Cube network, called the Extra Stage Cube network, is the subject of Section 6.4. Data manipulator type networks, which are multistage implementations of the PM2I connection patterns, will be discussed in Section 6.5.

6.2. Interconnection Functions and Single Stage Networks

6.2.1. Introduction

Assume a parallel system with $N = 2^m$ PEs, numbered (addressed) from 0 to $N - 1$. An interconnection network can be described by a set of interconnection functions. Each interconnection function is a bijection (permutation) on the set of PE addresses. Interconnection functions represent inter-PE data transfers using mathematical mappings. When an interconnection function f is executed, PE i sends data to PE f(i).
If a system is operating in SIMD mode, this means that every PE sends data to exactly one PE, and every PE receives data from exactly one PE (assuming all PEs are active). Otherwise, the data transfer from PE i to PE f(i) may occur only for a subset of the PEs in the system.

Four types of interconnection networks will be discussed: the Cube, the Illiac, the PM2I, and the Shuffle-Exchange. Interconnection networks can be constructed from a single stage of switches or multiple stages of switches. In a single-stage network, data items may have to be passed through the switches several times before reaching their final destinations. In a multistage network, generally one pass through the multiple (usually m) stages of switches is sufficient to transfer the data items to their final destinations.

An important consideration in the selection of an interconnection network for a system is the partitionability of the network. The partitionability of an interconnection network is the ability to divide the network into independent subnetworks of different sizes (60). Each subnetwork of size $N' < N$ must have all of the interconnection capabilities of a complete network of that same type built to be of size $N'$. Multiple-SIMD systems use partitionable interconnection networks to dynamically reconfigure the system into independent SIMD machines of varying sizes. The multiple-SIMD model will be used as a framework for the partitioning analyses in this chapter. However, the results can be used to partition MIMD and partitionable SIMD/MIMD machines also.

The subject of this section is the single-stage implementation of the Cube, Illiac, PM2I, and Shuffle-Exchange interconnection networks. Each of these networks will be defined, and examples of their operation in both the SIMD and MIMD modes of parallelism will be given. The partitionability of these single stage networks will also be discussed. Further information about these topics is in (59-61, 69).
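The idea of an interconnection function as a permutation of PE addresses can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not code from the chapter: the helper names `is_interconnection_function` and `transfer`, and the sample function `shift_up` (which sends PE i to PE i + 1 modulo N), are hypothetical. It checks the bijection property and models a transfer in which only the active PEs send data.

```python
from typing import Callable, List, Optional, Set


def is_interconnection_function(f: Callable[[int], int], n: int) -> bool:
    """Return True if f is a bijection on the PE addresses 0 .. n-1."""
    return sorted(f(i) for i in range(n)) == list(range(n))


def transfer(f: Callable[[int], int], data: List[object],
             active: Optional[Set[int]] = None) -> List[object]:
    """Model one execution of f: each active PE i sends data[i] to PE f(i).

    PEs that receive nothing keep None in the result.
    """
    n = len(data)
    if active is None:
        active = set(range(n))  # all PEs enabled, as in the SIMD case
    received: List[object] = [None] * n
    for i in active:
        received[f(i)] = data[i]
    return received


N = 8  # hypothetical system size for the example


def shift_up(i: int) -> int:
    """Sample interconnection function (an assumption): PE i sends to PE i+1 mod N."""
    return (i + 1) % N


items = list("abcdefgh")
print(is_interconnection_function(shift_up, N))
print(transfer(shift_up, items))
print(transfer(shift_up, items, active={0, 1}))
```

Because `shift_up` is a bijection, no two PEs ever send to the same destination, so the all-active transfer delivers exactly one item to every PE. A mapping such as `lambda i: i // 2` fails the check and would cause destination conflicts.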
The following notation will be used: let the binary representation of an arbitrary PE address P be $p_{m-1} p_{m-2} \ldots p_1 p_0$, let $\bar{p}_i$ be the complement of $p_i$, and let the integer n be the square root of N. It is assumed throughout this chapter that $-j$ modulo N = $N - j$ modulo N, for $j > 0$; e.g., $-4$ modulo 16 = 12 modulo 16.

6.2.2. The Cube Network

The Cube network consists of m interconnection functions defined by:

$$\mathrm{cube}_i(p_{m-1} \ldots p_{i+1} p_i p_{i-1} \ldots p_0) = p_{m-1} \ldots p_{i+1} \bar{p}_i p_{i-1} \ldots p_0$$

for $0 \le i < m$.

[...]

4. Because $P + 2^{m-1} = P - 2^{m-1}$ modulo N, $\mathrm{PM2}_{+(m-1)}$ and $\mathrm{PM2}_{-(m-1)}$ are equivalent. Figure 6.7 shows the $\mathrm{PM2}_{+i}$ interconnections for N = 8; $\mathrm{PM2}_{-i}$ is the same as $\mathrm{PM2}_{+i}$ except the direction is reversed. This network is called the Plus-Minus $2^i$ because, in terms of mapping source addresses to destinations, it can add or subtract $2^i$ from the PE addresses; i.e., it allows PE P to send data to any one of PE $P + 2^i$ or PE $P - 2^i$, arithmetic modulo N, $0 \le i < m$.

Figure 6.6 Partitioning a size-eight Cube network. (A) Physical cube; (logical cubes). (B) Physical cube; (logical cubes).

Figure 6.7 PM2I network for N = 8. (A) $\mathrm{PM2}_{+0}$ connections. (B) $\mathrm{PM2}_{+1}$ connections. (C) $\mathrm{PM2}_{+2}$ connections.

A network similar to the PM2I is used in the "Novel Multiprocessor Array" (50) and is included in the network of the Omen computer (31). The interconnection network of the SIMDA machine is similar in concept to that of the PM2I (78). The data manipulator (20), ADM (66), IADM (63), and gamma (52) multistage networks are based on the PM2I connection pattern. Various properties of the single-stage PM2I network are discussed in (24, 56, 58, 67, 70).

Network control in SIMD mode can be achieved by means of a system control unit, as in the Cube network. Suppose the PM2I network is implemented in the hardware, and a $\mathrm{cube}_i$ transfer is needed. Mathematically, this means that the i-th bit of each PE address would have to be complemented using PM2I functions; i.e., data needs to be moved from PE P to PE $\mathrm{cube}_i(P)$, $0 \le P$
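The relationship between the Cube and PM2I functions can be explored numerically. The following is a sketch under stated assumptions rather than the chapter's own procedure: the function names are hypothetical, and the decomposition shown (add $2^i$ when bit i of the address is 0, subtract $2^i$ when it is 1) is a standard arithmetic identity used here only for illustration.

```python
def cube(i: int, p: int) -> int:
    """cube_i: complement bit i of PE address p."""
    return p ^ (1 << i)


def pm2_plus(i: int, p: int, n: int) -> int:
    """PM2_{+i}: map p to p + 2^i, arithmetic modulo n."""
    return (p + (1 << i)) % n


def pm2_minus(i: int, p: int, n: int) -> int:
    """PM2_{-i}: map p to p - 2^i, arithmetic modulo n.

    Python's % returns a non-negative result for positive n, which matches
    the convention that -j modulo N equals N - j modulo N.
    """
    return (p - (1 << i)) % n


def cube_via_pm2(i: int, p: int, n: int) -> int:
    """Complement bit i of p using only PM2I functions (illustrative)."""
    if (p >> i) & 1 == 0:
        return pm2_plus(i, p, n)   # bit i is 0: adding 2^i sets it, no carry
    return pm2_minus(i, p, n)      # bit i is 1: subtracting 2^i clears it


M = 3        # hypothetical number of address bits
N = 2 ** M   # N = 8, the size used in Figure 6.7

for i in range(M):
    for p in range(N):
        assert cube_via_pm2(i, p, N) == cube(i, p)

# PM2_{+(m-1)} and PM2_{-(m-1)} give the same destination for every PE.
print(all(pm2_plus(M - 1, p, N) == pm2_minus(M - 1, p, N) for p in range(N)))
```

In this sketch the PEs whose i-th address bit is 0 use $\mathrm{PM2}_{+i}$ and the remaining PEs use $\mathrm{PM2}_{-i}$. For $i = m - 1$ the two functions coincide, so a single function serves every PE.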