Ubiquitous Performance Analysis

Published: 24 June 2021

Abstract

In an effort to guide optimizations and detect performance regressions, developers of large HPC codes must regularly collect and analyze application performance profiles across different hardware platforms and in a variety of program configurations. However, traditional performance profiling tools mostly focus on ad hoc analysis of individual program runs. Ubiquitous performance analysis is a new approach to automate and simplify the collection, management, and analysis of large numbers of application performance profiles. In this regime, performance profiling of large HPC codes transitions from a sporadic process that often requires the help of experts into a routine activity in which the entire development team can participate. We discuss the design and implementation of an open-source ubiquitous performance analysis software stack with three major components: the Caliper instrumentation library with a new API to control performance profiling programmatically; Adiak, a library for automatic program metadata capture; and SPOT, a web-based visualization interface for comparing large sets of runs. A case study shows how ubiquitous performance analysis has helped the developers of the Marbl simulation code analyze performance and understand regressions for over a year.
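The abstract refers to Caliper's API for controlling profiling programmatically and to Adiak's metadata capture. As a rough illustration of how the two libraries fit together in application code, here is a minimal C++ sketch based on their public interfaces; the "runtime-report" configuration, the region name, and the metadata values are illustrative choices, not taken from the paper:

    #include <caliper/cali.h>
    #include <caliper/cali-manager.h>
    #include <adiak.hpp>

    int main()
    {
        // Adiak: record run metadata (user, launch date, application parameters).
        adiak::init(nullptr);              // pass an MPI communicator pointer in MPI codes
        adiak::user();
        adiak::launchdate();
        adiak::value("problem_size", 64);  // hypothetical application parameter

        // Caliper: enable a measurement configuration programmatically.
        cali::ConfigManager mgr;
        mgr.add("runtime-report");         // built-in config: per-region time report at flush
        mgr.start();

        CALI_MARK_BEGIN("main_loop");      // annotate a source-code region
        // ... application work ...
        CALI_MARK_END("main_loop");

        mgr.flush();                       // write out the collected performance data
        adiak::fini();
        return 0;
    }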

Published In

High Performance Computing: 36th International Conference, ISC High Performance 2021, Virtual Event, June 24 – July 2, 2021, Proceedings
June 2021, 484 pages
ISBN: 978-3-030-78712-7
DOI: 10.1007/978-3-030-78713-4

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

  1. Performance
  2. Measurement
  3. Instrumentation
  4. Caliper
