Hypernet: A communication-efficient architecture for constructing massively parallel computers

Authors: J. Ghosh

IEEE Transactions on Computers, Volume 36, Issue 12

Pages 1450 - 1466

https://doi.org/10.1109/TC.1987.5009497

Published: 01 December 1987

Abstract

A new class of modular networks is proposed for hierarchically constructing massively parallel computer systems for distributed supercomputing and AI applications. These networks are called hypernets. They are constructed incrementally with identical cubelets, treelets, or buslets that are well suited for VLSI implementation. Hypernets integrate positive features of both hypercubes and tree-based topologies, and maintain a constant node degree when the network size increases. This paper presents the principles of constructing hypernets and analyzes their architectural potentials in terms of message routing complexity, cost-effective support for global as well as localized communication, I/O capabilities, and fault tolerance. Several algorithms are mapped onto hypernets to illustrate their ability to support parallel processing in a hierarchically structured or data-dependent environment. The emulation of hypercube connections using less hardware is shown. The potential of hypernets for efficient support of connectionist models of computation is also explored.
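The abstract contrasts hypernets, whose node degree stays constant, with hypercubes. As a rough illustration of why that matters, the sketch below tabulates standard properties of a binary n-cube, where the per-node link count grows with the logarithm of the machine size. It covers only the hypercube baseline; the abstract does not give the hypernet construction rules, so they are not modeled here. The function names `hypercube_stats` and `hypercube_neighbors` are hypothetical and chosen for this example.

```python
def hypercube_stats(n: int) -> dict:
    """Return size metrics of a binary n-cube (standard hypercube facts)."""
    nodes = 2 ** n
    return {
        "nodes": nodes,
        "degree": n,               # one link per address dimension
        "links": n * nodes // 2,   # each link is shared by two nodes
        "diameter": n,             # worst case: all n address bits differ
    }


def hypercube_neighbors(node: int, n: int) -> list[int]:
    """Neighbors of `node` are the addresses that differ in exactly one bit."""
    return [node ^ (1 << d) for d in range(n)]


if __name__ == "__main__":
    # Node degree (ports per node) rises with n, i.e. with log2 of the
    # machine size; a constant-degree network avoids this growth.
    for n in (4, 8, 12, 16):
        s = hypercube_stats(n)
        print(f"n={n:2d}  nodes={s['nodes']:6d}  "
              f"degree={s['degree']:2d}  links={s['links']:7d}")
```

Under these assumptions, a node in a 16-dimensional hypercube needs 16 ports, whereas a constant-degree design keeps the per-node port count fixed regardless of how many nodes are added.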

Copyright © 1987.

Publisher: IEEE Computer Society, United States
