research-article

Algorithms for Supporting Compiled Communication

Authors:

Rajiv Gupta

IEEE Transactions on Parallel and Distributed Systems, Volume 14, Issue 2

Pages 107 - 118

https://doi.org/10.1109/TPDS.2003.1178875

Published: 01 February 2003

Abstract

In this paper, we investigate compiler algorithms that support compiled communication in multiprocessor environments and study the benefits of compiled communication, assuming that the underlying network is an all-optical Time-Division-Multiplexing (TDM) network. We present an experimental compiler, E-SUIF, that supports compiled communication for High Performance Fortran (HPF)-like programs on all-optical TDM networks, and we describe and evaluate the compiler algorithms used in E-SUIF. We further demonstrate the effectiveness of compiled communication on all-optical TDM networks by comparing its performance with that of a traditional communication method on a number of application programs.



Index Terms

Algorithms for Supporting Compiled Communication
1. Hardware
  1. Emerging technologies
    1. Analysis and design of emerging devices and systems
      1. Emerging languages and compilers
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
  2. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        1. Memory management
          1. Distributed memory



Published In


IEEE Transactions on Parallel and Distributed Systems Volume 14, Issue 2

February 2003

96 pages

ISSN:1045-9219


Copyright © 2003 IEEE. All Rights Reserved.

Publisher

IEEE Press


Article Metrics

Total Citations: 12
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 01 Nov 2024


Cited By

Guoyue Jiang, Zhaolin Li, Fang Wang, Shaojun Wei (2015). A Low-Latency and Low-Power Hybrid Scheme for On-Chip Networks. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 23:4 (664-677). DOI: 10.1109/TVLSI.2014.2318374. Online publication date: 1-Apr-2015
https://dl.acm.org/doi/10.1109/TVLSI.2014.2318374
Faraj A, Patarasuk P, Yuan X (2008). A study of process arrival patterns for MPI collective operations. International Journal of Parallel Programming 36:6 (543-570). DOI: 10.1007/s10766-008-0070-9. Online publication date: 1-Dec-2008
https://dl.acm.org/doi/10.1007/s10766-008-0070-9
Faraj A, Patarasuk P, Yuan X (2008). Bandwidth efficient all-to-all broadcast on switched clusters. International Journal of Parallel Programming 36:4 (426-453). DOI: 10.1007/s10766-007-0047-0. Online publication date: 1-Apr-2008
https://dl.acm.org/doi/10.1007/s10766-007-0047-0
Faraj A, Yuan X, Patarasuk P (2007). A Message Scheduling Scheme for All-to-All Personalized Communication on Ethernet Switched Clusters. IEEE Transactions on Parallel and Distributed Systems 18:2 (264-276). DOI: 10.1109/TPDS.2007.19. Online publication date: 1-Feb-2007
https://dl.acm.org/doi/10.1109/TPDS.2007.19
Fishgold L, Danalis A, Pollock L, Swany M (2006). An automated approach to improve communication-computation overlap in clusters. Proceedings of the 20th international conference on Parallel and distributed processing (290-290). DOI: 10.5555/1898699.1898816. Online publication date: 25-Apr-2006
https://dl.acm.org/doi/10.5555/1898699.1898816
Danalis A, Kim K, Pollock L, Swany M, Kramer W (2005). Transformations to Parallel Codes for Communication-Computation Overlap. Proceedings of the 2005 ACM/IEEE conference on Supercomputing. DOI: 10.1109/SC.2005.75. Online publication date: 12-Nov-2005
https://dl.acm.org/doi/10.1109/SC.2005.75
Ding Z, Hoare R, Jones A, Li D, Shao S, Tung S, Zheng J, Melhem R (2005). Switch Design to Enable Predictive Multiplexed Switching in Multiprocessor Networks. Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01. DOI: 10.1109/IPDPS.2005.416. Online publication date: 4-Apr-2005
https://dl.acm.org/doi/10.1109/IPDPS.2005.416
Hoare R, Ding Z, Tung S, Melhem R, Jones A (2005). A framework for the design, synthesis and cycle-accurate simulation of multiprocessor networks. Journal of Parallel and Distributed Computing 65:10 (1237-1252). DOI: 10.1016/j.jpdc.2005.04.022. Online publication date: 1-Oct-2005
https://dl.acm.org/doi/10.1016/j.jpdc.2005.04.022
Karwande A, Yuan X, Lowenthal D (2005). An MPI prototype for compiled communication on Ethernet switched clusters. Journal of Parallel and Distributed Computing 65:10 (1123-1133). DOI: 10.1016/j.jpdc.2005.04.014. Online publication date: 1-Oct-2005
https://dl.acm.org/doi/10.1016/j.jpdc.2005.04.014
Cristóbal-Salas A, Chernykh A, Rodríguez-Alcantar E, Gaudiot J (2004). Exploiting single-assignment properties to optimize message-passing programs by code transformations. Proceedings of the 16th international conference on Implementation and Application of Functional Languages (1-16). DOI: 10.1007/11431664_1. Online publication date: 8-Sep-2004
https://dl.acm.org/doi/10.1007/11431664_1