Abstract
Accurate modeling of the electronic structure of atoms and molecules requires computationally intensive tensor contractions over large multidimensional arrays. Computing complex tensor contractions efficiently usually requires generating temporary intermediate arrays. These intermediates can be extremely large, but they can often be generated and used in batches through appropriate loop fusion transformations. To optimize the performance of such computations on parallel computers, the total amount of inter-processor communication must be minimized, subject to the memory available on each processor. In this paper, we address the memory-constrained communication minimization problem for this class of computations. Based on a framework that models the relationship between loop fusion and memory usage, we develop an approach to identify the combination of loop fusion and data partitioning that minimizes inter-processor communication cost without exceeding the per-processor memory limit. The effectiveness of the approach is demonstrated on a computation representative of a component used in quantum chemistry suites.
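The link between loop fusion and the size of an intermediate array can be illustrated with a small sketch. The code below is not the paper's algorithm and does not model data partitioning or inter-processor communication; it only assumes a toy two-step contraction, T[i][j] = sum_k A[i][k] * B[k][j] followed by S[i] = sum_j T[i][j] * C[j], and shows how fusing loops shrinks the storage needed for T. The function names `unfused`, `fused_i`, and `fused_ij` are hypothetical and chosen for illustration.

```python
def unfused(A, B, C):
    """Materialize the full intermediate T (n x p elements), then reduce it."""
    n, m, p = len(A), len(B), len(C)
    T = [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
         for i in range(n)]
    S = [sum(T[i][j] * C[j] for j in range(p)) for i in range(n)]
    return S, n * p  # result and number of intermediate elements stored


def fused_i(A, B, C):
    """Fuse the i loops of producer and consumer: T shrinks to one row."""
    n, m, p = len(A), len(B), len(C)
    S = []
    for i in range(n):
        t_row = [sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
        S.append(sum(t_row[j] * C[j] for j in range(p)))
    return S, p


def fused_ij(A, B, C):
    """Fuse both the i and j loops: T shrinks to a single scalar."""
    n, m, p = len(A), len(B), len(C)
    S = []
    for i in range(n):
        acc = 0
        for j in range(p):
            t = sum(A[i][k] * B[k][j] for k in range(m))  # scalar intermediate
            acc += t * C[j]
        S.append(acc)
    return S, 1


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [1, 1]
    for variant in (unfused, fused_i, fused_ij):
        result, temp_size = variant(A, B, C)
        print(variant.__name__, result, "intermediate elements:", temp_size)
```

All three variants return the same result; only the storage for the intermediate differs. In the parallel setting the abstract describes, the choice of fusion also constrains how arrays can be partitioned across processors, which is why fusion and partitioning have to be chosen together under the per-processor memory limit.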
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cociorva, D., Baumgartner, G., Lam, CC., Sadayappan, P., Ramanujam, J. (2005). Memory-Constrained Communication Minimization for a Class of Array Computations. In: Pugh, B., Tseng, CW. (eds) Languages and Compilers for Parallel Computing. LCPC 2002. Lecture Notes in Computer Science, vol 2481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596110_1
DOI: https://doi.org/10.1007/11596110_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30781-5
Online ISBN: 978-3-540-31612-1
eBook Packages: Computer Science, Computer Science (R0)