An evaluation of computing paradigms for N-body simulations on distributed memory architectures
Abstract
The efficiency of High Performance Fortran (HPF) with respect to irregular applications is still largely unproven. While recent work has shown that a highly irregular hierarchical n-body force calculation method can be implemented in HPF, we have found that the implementation contains inefficiencies which cause it to run up to three times slower than our hand-coded, explicitly parallel implementation. Our work examines these inefficiencies, determines that most of the extra overhead is due to a single aspect of the communication strategy, and demonstrates that fixing the communication strategy can bring the overheads of the HPF application to within 25% of those of the hand-coded version.
PPoPP '99: Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Published In
Aug. 1999
192 pages
Copyright © 1999 ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 01 May 1999
Published in SIGPLAN Volume 34, Issue 8
Qualifiers
- Article