DOI: 10.1109/PDP.2009.50

Verifying Causality between Distant Performance Phenomena in Large-Scale MPI Applications

Published: 18 February 2009

Abstract

In message-passing applications, the temporal or spatial distance between cause and symptom of a performance problem constitutes a major difficulty in deriving helpful conclusions from performance data. Just knowing the locations of wait states in the program is often insufficient to understand the reason for their occurrence. We present a method for verifying hypotheses on causality between temporally or spatially distant performance phenomena in message-passing applications without altering the application itself. The verification is accomplished by modifying MPI event traces and using them to simulate the hypothetical message-passing behavior. By performing a parallel real-time reenactment of the communication to be simulated using the original execution configuration, we can achieve high scalability and good predictive accuracy in relation to the measured behavior. Not relying on a potentially complex model of the message-passing subsystem, our method is also platform independent.
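
The idea of editing a trace and replaying it to test a causality hypothesis can be illustrated with a toy example. The sketch below is not the authors' tool: it swaps the parallel real-time reenactment described in the abstract for a small sequential replay with logical clocks, and the event format, the `replay` and `with_compute` helpers, and all durations are assumptions made up for illustration. Rank 2 waits in a receive, and the hypothesis is that the cause is excess computation on the distant rank 0.

```python
from collections import defaultdict, deque

def replay(trace, latency=0.0):
    """Replay per-rank event lists; return (finish_times, wait_times).

    Toy model (assumption): sends never block, receives block until
    the matching message has arrived.
    """
    clock = {r: 0.0 for r in trace}
    wait = {r: 0.0 for r in trace}
    pos = {r: 0 for r in trace}
    inflight = defaultdict(deque)  # (src, dst) -> message arrival times
    progressed = True
    while progressed:
        progressed = False
        for r, events in trace.items():
            while pos[r] < len(events):
                kind, arg = events[pos[r]]
                if kind == "compute":
                    clock[r] += arg
                elif kind == "send":
                    inflight[(r, arg)].append(clock[r] + latency)
                else:  # "recv" from rank `arg`
                    queue = inflight[(arg, r)]
                    if not queue:
                        break  # sender has not sent yet; retry next pass
                    arrival = queue.popleft()
                    if arrival > clock[r]:
                        wait[r] += arrival - clock[r]  # wait state
                        clock[r] = arrival
                pos[r] += 1
                progressed = True
    if any(pos[r] < len(trace[r]) for r in trace):
        raise ValueError("deadlock or unmatched receive in trace")
    return clock, wait

def with_compute(trace, rank, index, duration):
    """Return a copy of the trace with one compute region's duration replaced."""
    modified = {r: list(events) for r, events in trace.items()}
    kind, _ = modified[rank][index]
    if kind != "compute":
        raise ValueError("event is not a compute region")
    modified[rank][index] = ("compute", duration)
    return modified

# Hypothetical measured trace: rank 0 is overloaded, rank 2 shows the symptom.
trace = {
    0: [("compute", 5.0), ("send", 1)],
    1: [("compute", 1.0), ("recv", 0), ("compute", 1.0), ("send", 2)],
    2: [("compute", 1.0), ("recv", 1)],
}

_, baseline_wait = replay(trace)
# Hypothesis: balancing rank 0's compute region removes most of rank 2's wait.
_, hypo_wait = replay(with_compute(trace, rank=0, index=0, duration=1.0))

print("baseline wait per rank:    ", baseline_wait)
print("hypothetical wait per rank:", hypo_wait)
```

In this toy trace, rank 2's waiting time drops from 5.0 to 1.0 once rank 0's compute region is shortened, which supports the hypothesis that most of the wait originates two ranks away. The remaining 1.0 comes from rank 1's own computation between its receive and its send.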


Published In

PDP '09: Proceedings of the 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing
February 2009
425 pages
ISBN:9780769535449

Publisher

IEEE Computer Society

United States

Author Tags

  1. Causality of Performance Phenomena
  2. Large-scale
  3. Performance Prediction
  4. Performance Simulation
  5. Performance analysis

Qualifiers

  • Article

Cited By

  • (2023) An Event Model for Trace-Based Performance Analysis of MPI Partitioned Point-to-Point Communication. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1357-1367. DOI: 10.1145/3624062.3624205. Online publication date: 12-Nov-2023
  • (2015) HAEC-SIM. Proceedings of the 8th International Conference on Simulation Tools and Techniques, pp. 129-138. DOI: 10.4108/eai.24-8-2015.2261105. Online publication date: 24-Aug-2015
  • (2013) Using automated performance modeling to find scalability bugs in complex codes. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.1145/2503210.2503277. Online publication date: 17-Nov-2013
  • (2012) ScalaExtrap. ACM Transactions on Programming Languages and Systems 34:1, pp. 1-29. DOI: 10.1145/2160910.2160914. Online publication date: 4-May-2012
  • (2012) Pattern-independent detection of manual collectives in MPI programs. Proceedings of the 18th international conference on Parallel Processing, pp. 28-39. DOI: 10.1007/978-3-642-32820-6_5. Online publication date: 27-Aug-2012
  • (2010) LogGOPSim. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 597-604. DOI: 10.1145/1851476.1851564. Online publication date: 21-Jun-2010
  • (2010) Scalable Communication Trace Compression. Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, pp. 408-417. DOI: 10.1109/CCGRID.2010.111. Online publication date: 17-May-2010
  • (2009) Performance simulation of non-blocking communication in message-passing applications. Proceedings of the 2009 international conference on Parallel processing, pp. 208-217. DOI: 10.5555/1884795.1884822. Online publication date: 25-Aug-2009
