research-article

Parallel Application Signature for Performance Analysis and Prediction

Published: 01 July 2015

Abstract

Predicting the performance of parallel scientific applications is becoming increasingly complex. Our goal was to characterize the behavior of message-passing applications on different target machines. To achieve this goal, we developed a method called parallel application signature for performance prediction (PAS2P), which strives to describe an application based on its behavior. Based on the application's message-passing activity, we identified and extracted representative phases, with which we created a parallel application signature that enabled us to predict the application's performance. We experimented with using different scientific applications on different clusters. We were able to predict execution times with an average accuracy greater than 97 percent.
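The abstract does not spell out how a signature becomes a prediction, so the following is only a minimal sketch under stated assumptions. It assumes that each representative phase is timed once on the target machine and then scaled by the number of times it repeats in the full run, and that accuracy is reported as 100 percent minus the relative error. The names `Phase`, `weight`, `predict_execution_time`, and `prediction_accuracy` are hypothetical and are not taken from the PAS2P tool.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Phase:
    """One representative phase of a message-passing application (hypothetical model)."""
    name: str
    measured_seconds: float  # time of a single execution of the phase on the target machine
    weight: int              # assumed number of repetitions of the phase in the full run


def predict_execution_time(phases: Iterable[Phase]) -> float:
    """Assumed prediction rule: sum of (phase time x repetition count)."""
    return sum(p.measured_seconds * p.weight for p in phases)


def prediction_accuracy(predicted: float, actual: float) -> float:
    """Accuracy in percent, defined here as 100 * (1 - relative error)."""
    if actual <= 0:
        raise ValueError("actual execution time must be positive")
    return 100.0 * (1.0 - abs(predicted - actual) / actual)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    signature = [
        Phase("halo_exchange", measured_seconds=0.5, weight=100),
        Phase("global_reduce", measured_seconds=2.0, weight=10),
    ]
    predicted = predict_execution_time(signature)
    print(f"predicted: {predicted:.1f} s")
    print(f"accuracy vs. a 72 s run: {prediction_accuracy(predicted, 72.0):.2f} %")
```

Under these assumptions, running only the representative phases costs a small fraction of the full execution, which is the practical appeal of a signature-based approach.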


Information & Contributors

Information

Published In

IEEE Transactions on Parallel and Distributed Systems  Volume 26, Issue 7
July 2015
286 pages

Publisher

IEEE Press

Author Tags

  1. application signature
  2. Parallel application
  3. performance prediction

Qualifiers

  • Research-article

Cited By

  • (2023) Efficiency Analysis for AI Applications in HPC Systems. Case Study: K-Means. Computational Science – ICCS 2023, pp. 373-380. DOI: 10.1007/978-3-031-36021-3_39. Online publication date: 3-Jul-2023
  • (2022) Scalable performance analysis method for SPMD applications. The Journal of Supercomputing 78:17, pp. 19346-19371. DOI: 10.1007/s11227-022-04588-z. Online publication date: 1-Nov-2022
  • (2021) Performance prediction of parallel applications: a systematic literature review. The Journal of Supercomputing 77:4, pp. 4014-4055. DOI: 10.1007/s11227-020-03417-5. Online publication date: 1-Apr-2021
  • (2021) Self-tuning serverless task farming using proactive elasticity control. Cluster Computing 24:2, pp. 799-817. DOI: 10.1007/s10586-020-03158-3. Online publication date: 1-Jun-2021
  • (2019) NAPEL. Proceedings of the 56th Annual Design Automation Conference 2019, pp. 1-6. DOI: 10.1145/3316781.3317867. Online publication date: 2-Jun-2019
  • (2019) Improving the energy efficiency of SMACOF for multidimensional scaling on modern architectures. The Journal of Supercomputing 75:3, pp. 1038-1050. DOI: 10.1007/s11227-018-2285-x. Online publication date: 1-Mar-2019
  • (2018) Predicting cloud performance for HPC applications before deployment. Future Generation Computer Systems 87:C, pp. 618-628. DOI: 10.1016/j.future.2017.10.048. Online publication date: 1-Oct-2018
