Express Cubes: Improving the Performance of k-ary n-cube Interconnection Networks

Published: 01 September 1991

Abstract

The author discusses express cubes, k-ary n-cube interconnection networks augmented by express channels that provide a short path for nonlocal messages. An express cube combines the logarithmic diameter of a multistage network with the wire-efficiency and ability to exploit locality of a low-dimensional mesh network. The insertion of express channels reduces the network diameter and thus the distance component of network latency. Wire length is increased, allowing networks to operate with latencies that approach the physical speed-of-light limitation rather than being limited by node delays. Express channels increase wire bisection in a manner that allows the bisection to be controlled independently of the choice of radix, dimension, and channel width. By increasing wire bisection to saturate the available wiring media, throughput can be substantially increased. With an express cube both latency and throughput are wire-limited and within a small factor of the physical limit on performance.
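The diameter reduction described in the abstract can be illustrated with a small sketch. It is a simplified model, not the paper's construction: it covers a single dimension without wraparound and a single level of express channels, it treats every `interval`-th node as an interchange (rather than modeling interchanges as separate switches), and it counts hops only, ignoring wire delay. The names `build_line`, `diameter`, and `interval` are hypothetical and chosen for illustration.

```python
from collections import deque


def build_line(k, interval=None):
    """Adjacency sets for a k-node linear array (one dimension, no wraparound).

    If `interval` is given, an express link is added between consecutive
    nodes whose index is a multiple of `interval` (assumed interchange sites).
    """
    adj = {n: set() for n in range(k)}
    # Local channels between nearest neighbours.
    for n in range(k - 1):
        adj[n].add(n + 1)
        adj[n + 1].add(n)
    # Express channels that skip `interval` nodes at a time.
    if interval:
        for n in range(0, k - interval, interval):
            adj[n].add(n + interval)
            adj[n + interval].add(n)
    return adj


def diameter(adj):
    """Largest shortest-path hop count over all node pairs (BFS from each node)."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    k = 64
    print("plain array diameter:  ", diameter(build_line(k)))
    print("with express, every 8: ", diameter(build_line(k, 8)))
```

Under these assumptions the plain 64-node array has a diameter of k - 1 = 63 hops, and adding express links every 8 nodes shortens the worst-case path considerably. Varying `interval` shows the trade-off between the local hops needed to reach an interchange and the express hops between interchanges.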

Published In

IEEE Transactions on Computers, Volume 40, Issue 9
September 1991
99 pages
ISSN: 0018-9340

Publisher

IEEE Computer Society

United States

Author Tags

  1. distributed processing
  2. express cubes
  3. interconnection networks
  4. k-ary n-cube
  5. multiprocessor interconnection networks
  6. nonlocal messages
  7. performance
  8. throughput
  9. wire bisection

Qualifiers

  • Research-article

Cited By

  • (2020) Predictive Reliability and Fault Management in Exascale Systems. ACM Computing Surveys 53:5 (1-32). doi:10.1145/3403956. Online publication date: 28-Sep-2020
  • (2020) Optimal low-latency network topologies for cluster performance enhancement. The Journal of Supercomputing 76:12 (9558-9584). doi:10.1007/s11227-020-03216-y. Online publication date: 1-Dec-2020
  • (2019) Express Link Placement for NoC-Based Many-Core Platforms. Proceedings of the 48th International Conference on Parallel Processing (1-10). doi:10.1145/3337821.3337877. Online publication date: 5-Aug-2019
  • (2019) Analyzing Cost-Performance Tradeoffs of HPC Network Designs under Different Constraints using Simulations. Proceedings of the 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation (1-12). doi:10.1145/3316480.3325516. Online publication date: 29-May-2019
  • (2018) Thermal-Aware Task Mapping on Dynamically Reconfigurable Network-on-Chip Based Multiprocessor System-on-Chip. IEEE Transactions on Computers 67:12 (1818-1834). doi:10.1109/TC.2018.2844365. Online publication date: 1-Dec-2018
  • (2017) There and Back Again. ACM SIGARCH Computer Architecture News 45:2 (678-690). doi:10.1145/3140659.3080251. Online publication date: 24-Jun-2017
  • (2017) There and Back Again. Proceedings of the 44th Annual International Symposium on Computer Architecture (678-690). doi:10.1145/3079856.3080251. Online publication date: 24-Jun-2017
  • (2017) FoToNoC: A Folded Torus-Like Network-on-Chip Based Many-Core Systems-on-Chip in the Dark Silicon Era. IEEE Transactions on Parallel and Distributed Systems 28:7 (1905-1918). doi:10.1109/TPDS.2016.2643669. Online publication date: 10-Jun-2017
  • (2017) An Energy-Efficient Directory Based Multicore Architecture with Wireless Routers to Minimize the Communication Latency. IEEE Transactions on Parallel and Distributed Systems 28:2 (374-385). doi:10.1109/TPDS.2016.2571282. Online publication date: 1-Feb-2017
  • (2016) Reducing Wire and Energy Overheads of the SMART NoC Using a Setup Request Network. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 24:10 (3013-3026). doi:10.1109/TVLSI.2016.2538284. Online publication date: 1-Oct-2016
