DOI: 10.1145/2503210.2503220
Research Article | Open Access

2HOT: an improved parallel hashed oct-tree N-body algorithm for cosmological simulation

Published: 17 November 2013

Abstract

We report on improvements made over the past two decades to our adaptive treecode N-body method (HOT). A mathematical and computational approach to the cosmological N-body problem is described, with performance and scalability measured up to 256k (2^18) processors. We present error analysis and scientific application results from a series of more than ten 69 billion (4096^3) particle cosmological simulations, accounting for 4 × 10^20 floating point operations. These results include the first simulations using the new constraints on the standard model of cosmology from the Planck satellite. Our simulations set a new standard for accuracy and scientific throughput, while meeting or exceeding the computational efficiency of the latest generation of hybrid TreePM N-body methods.
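The "hashed oct-tree" of the title refers to the key-based tree representation introduced in the original HOT code: particle coordinates are mapped to integer keys by bit interleaving (Morton order), and tree nodes are stored in and retrieved from a hash table indexed by these keys. As a rough illustration only (this is not the 2HOT production code; the names, the 64-bit key layout, and the unit-box normalization are assumptions), a minimal C sketch of the key construction:

    /* Minimal sketch of the key construction behind a hashed oct-tree.
     * Coordinates in the unit box [0,1)^3 are scaled to integers and
     * their bits interleaved into a single Morton key; a leading 1-bit
     * keeps keys unique across tree levels (truncating a key yields the
     * key of the ancestor node).  Illustrative sketch only. */
    #include <stdint.h>
    #include <stdio.h>

    #define KEY_BITS_PER_DIM 21   /* 3*21 = 63 key bits + 1 leading bit = 64 */

    static uint64_t morton_key(double x, double y, double z)
    {
        uint64_t ix = (uint64_t)(x * (double)(1ULL << KEY_BITS_PER_DIM));
        uint64_t iy = (uint64_t)(y * (double)(1ULL << KEY_BITS_PER_DIM));
        uint64_t iz = (uint64_t)(z * (double)(1ULL << KEY_BITS_PER_DIM));
        uint64_t key = 1;                  /* leading placeholder bit */
        for (int b = KEY_BITS_PER_DIM - 1; b >= 0; b--) {
            key = (key << 3)
                | (((ix >> b) & 1) << 2)   /* one bit from each of the */
                | (((iy >> b) & 1) << 1)   /* three dimensions per     */
                |  ((iz >> b) & 1);        /* tree level               */
        }
        return key;   /* a hash table maps key -> tree node */
    }

    int main(void)
    {
        /* Nearby particles yield nearby keys, so sorting by key clusters
         * them along a space-filling curve. */
        printf("%016llx\n", (unsigned long long)morton_key(0.50, 0.25, 0.75));
        printf("%016llx\n", (unsigned long long)morton_key(0.51, 0.26, 0.74));
        return 0;
    }

Because keys of spatially nearby particles share long prefixes, sorting particles by key produces the space-filling-curve ordering that underlies the domain decomposition across processors.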



Published In

SC '13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
November 2013, 1123 pages
ISBN: 9781450323789
DOI: 10.1145/2503210
General Chair: William Gropp; Program Chair: Satoshi Matsuoka
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. N-body
  2. computational cosmology
  3. fast multipole method

Qualifiers

  • Research-article

Conference

SC13

Acceptance Rates

SC '13 paper acceptance rate: 91 of 449 submissions (20%)
Overall acceptance rate: 1,516 of 6,373 submissions (24%)


Cited By

  • (2024) Swift: a modern highly parallel gravity and smoothed particle hydrodynamics solver for astrophysical and cosmological applications. Monthly Notices of the Royal Astronomical Society 530(2):2378-2419. DOI: 10.1093/mnras/stae922
  • (2023) Hubble tension. The European Physical Journal Plus 138(11). DOI: 10.1140/epjp/s13360-023-04591-0
  • (2023) Optimizing the gravitational tree algorithm for many-core processors. Monthly Notices of the Royal Astronomical Society 528(1):821-832. DOI: 10.1093/mnras/stad4001
  • (2023) Massively parallelized interpolated factored Green function method. Journal of Computational Physics 475:111837. DOI: 10.1016/j.jcp.2022.111837
  • (2022) ParaTreeT: A Fast, General Framework for Spatial Tree Traversal. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 762-772. DOI: 10.1109/IPDPS53621.2022.00079
  • (2022) Large-scale dark matter simulations. Living Reviews in Computational Astrophysics 8(1). DOI: 10.1007/s41115-021-00013-z
  • (2019) FleCSPHg: A GPU Accelerated Framework for Physics and Astrophysics Simulations. High Performance Computing, 123-137. DOI: 10.1007/978-3-030-16205-4_10
  • (2018) G-ML-Octree: An Update-Efficient Index Structure for Simulating 3D Moving Objects Across GPUs. IEEE Transactions on Parallel and Distributed Systems 29(5):1075-1088. DOI: 10.1109/TPDS.2017.2787747
  • (2017) PKDGRAV3: beyond trillion particle cosmological simulations for the next era of galaxy surveys. Computational Astrophysics and Cosmology 4(1). DOI: 10.1186/s40668-017-0021-1
  • (2016) Predicting performance of smoothed particle hydrodynamics codes at large scales. Proceedings of the 2016 Winter Simulation Conference, 1825-1835. DOI: 10.5555/3042094.3042324
