research-article

Algorithms for Supporting Compiled Communication

Published: 01 February 2003

Abstract

In this paper, we investigate compiler algorithms that support compiled communication in multiprocessor environments and study the benefits of compiled communication, assuming that the underlying network is an all-optical Time-Division-Multiplexing (TDM) network. We present an experimental compiler, E-SUIF, that supports compiled communication for High Performance Fortran (HPF)-like programs on all-optical TDM networks, and we describe and evaluate the compiler algorithms used in E-SUIF. We further demonstrate the effectiveness of compiled communication on all-optical TDM networks by comparing its performance with that of a traditional communication method on a number of application programs.



Published In

IEEE Transactions on Parallel and Distributed Systems  Volume 14, Issue 2
February 2003
96 pages

Publisher

IEEE Press


Author Tags

  1. Compiled communication
  2. all-optical networks
  3. communication optimization
  4. compilation techniques for distributed memory machines

Qualifiers

  • Research-article


Cited By

  • (2015) A Low-Latency and Low-Power Hybrid Scheme for On-Chip Networks. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 23:4 (664-677). DOI: 10.1109/TVLSI.2014.2318374. Online publication date: 1-Apr-2015
  • (2008) A study of process arrival patterns for MPI collective operations. International Journal of Parallel Programming 36:6 (543-570). DOI: 10.1007/s10766-008-0070-9. Online publication date: 1-Dec-2008
  • (2008) Bandwidth efficient all-to-all broadcast on switched clusters. International Journal of Parallel Programming 36:4 (426-453). DOI: 10.1007/s10766-007-0047-0. Online publication date: 1-Apr-2008
  • (2007) A Message Scheduling Scheme for All-to-All Personalized Communication on Ethernet Switched Clusters. IEEE Transactions on Parallel and Distributed Systems 18:2 (264-276). DOI: 10.1109/TPDS.2007.19. Online publication date: 1-Feb-2007
  • (2006) An automated approach to improve communication-computation overlap in clusters. Proceedings of the 20th international conference on Parallel and distributed processing (290-290). DOI: 10.5555/1898699.1898816. Online publication date: 25-Apr-2006
  • (2005) Transformations to Parallel Codes for Communication-Computation Overlap. Proceedings of the 2005 ACM/IEEE conference on Supercomputing. DOI: 10.1109/SC.2005.75. Online publication date: 12-Nov-2005
  • (2005) Switch Design to Enable Predictive Multiplexed Switching in Multiprocessor Networks. Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01. DOI: 10.1109/IPDPS.2005.416. Online publication date: 4-Apr-2005
  • (2005) A framework for the design, synthesis and cycle-accurate simulation of multiprocessor networks. Journal of Parallel and Distributed Computing 65:10 (1237-1252). DOI: 10.1016/j.jpdc.2005.04.022. Online publication date: 1-Oct-2005
  • (2005) An MPI prototype for compiled communication on Ethernet switched clusters. Journal of Parallel and Distributed Computing 65:10 (1123-1133). DOI: 10.1016/j.jpdc.2005.04.014. Online publication date: 1-Oct-2005
  • (2004) Exploiting single-assignment properties to optimize message-passing programs by code transformations. Proceedings of the 16th international conference on Implementation and Application of Functional Languages (1-16). DOI: 10.1007/11431664_1. Online publication date: 8-Sep-2004
