HPC Software for Massive Analysis of the Parallel Efficiency of Applications

Shvets, Pavel; Voevodin, Vadim; Zhumatiy, Sergey

doi:10.1007/978-3-030-28163-2_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1063)

Included in the following conference series:

International Conference on Parallel Computational Technologies


Abstract

Low efficiency is a major weakness of modern supercomputers, and the low efficiency of user applications is one of the main reasons for it. Many software tools exist for analyzing and improving the performance of parallel applications. However, supercomputer users often lack the knowledge and skills needed to apply these tools correctly to their specific case. Moreover, users often do not know that their applications run inefficiently.

The main goal of our project is to help any HPC user to detect performance flaws in their applications and find out how to deal with them. To this end, we plan to develop an open-source software solution that performs automatic massive analysis of all jobs running on a supercomputer to identify those with efficiency issues and helps users to conduct a detailed analysis of an individual program (using existing software tools) to identify and eliminate the root causes of the loss of efficiency.
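
As a rough illustration of the first part of this goal (screening every job for possible efficiency issues), the following minimal Python sketch flags jobs whose averaged monitoring metrics fall below fixed thresholds. It is not the authors' implementation: the JobStats fields, the threshold values, and the find_suspicious_jobs function are all hypothetical names and numbers chosen for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class JobStats:
    """Hypothetical per-job averages, as a monitoring system might report them."""
    job_id: str
    avg_cpu_load: float      # mean CPU utilization per core, in percent (0-100)
    avg_load_average: float  # mean OS load average per node
    cores_per_node: int


# Illustrative thresholds only (assumptions); real values need tuning per system.
CPU_LOAD_MIN = 20.0          # flag jobs using less than 20% of CPU time
LOADAVG_FRACTION_MIN = 0.5   # flag jobs keeping fewer than half the cores busy


def find_suspicious_jobs(jobs: Iterable[JobStats]) -> List[str]:
    """Return the IDs of jobs that trip at least one threshold rule."""
    flagged = []
    for job in jobs:
        low_cpu = job.avg_cpu_load < CPU_LOAD_MIN
        few_procs = job.avg_load_average < LOADAVG_FRACTION_MIN * job.cores_per_node
        if low_cpu or few_procs:
            flagged.append(job.job_id)
    return flagged


# Made-up sample data for demonstration.
jobs = [
    JobStats("job-1", avg_cpu_load=95.0, avg_load_average=14.0, cores_per_node=14),
    JobStats("job-2", avg_cpu_load=8.0, avg_load_average=13.5, cores_per_node=14),
    JobStats("job-3", avg_cpu_load=60.0, avg_load_average=2.0, cores_per_node=14),
]
print(find_suspicious_jobs(jobs))  # ['job-2', 'job-3']
```

A screening pass of this kind only narrows down which jobs deserve attention; finding the root cause would still require a detailed analysis of the individual program with existing performance tools, as the abstract describes.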

Acknowledgments

The results described in this paper were achieved at Lomonosov Moscow State University with the financial support of the Russian Science Foundation (agreement No. 17-71-20114). The research was carried out on the HPC equipment of the shared research facilities at Lomonosov Moscow State University and was supported through the project RFMEFI62117X0011.

Author information

Authors and Affiliations

Research Computing Center of Lomonosov Moscow State University, Leninskie Gory, 1, bld. 4, Moscow, Russia
Pavel Shvets, Vadim Voevodin & Sergey Zhumatiy

Corresponding author

Correspondence to Vadim Voevodin.

Editor information

Editors and Affiliations

South Ural State University, Chelyabinsk, Russia
Leonid Sokolinsky
South Ural State University, Chelyabinsk, Russia
Mikhail Zymbler

About this paper

Cite this paper

Shvets, P., Voevodin, V., Zhumatiy, S. (2019). HPC Software for Massive Analysis of the Parallel Efficiency of Applications. In: Sokolinsky, L., Zymbler, M. (eds) Parallel Computational Technologies. PCT 2019. Communications in Computer and Information Science, vol 1063. Springer, Cham. https://doi.org/10.1007/978-3-030-28163-2_1

DOI: https://doi.org/10.1007/978-3-030-28163-2_1
Published: 02 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28162-5
Online ISBN: 978-3-030-28163-2
eBook Packages: Computer Science, Computer Science (R0)