Abstract
High-performance computing (HPC) plays a key role in driving innovations in health, economics, energy, transport, networks, and other smart-society infrastructures. HPC enables large-scale simulations and the processing of big data related to smart societies in order to optimize their services. Achieving high efficiency on shared-memory and distributed HPC systems has always been challenging, and it has become essential as we move towards the exascale computing era. Therefore, the evaluation, analysis, and optimization of HPC applications and systems to improve performance on various platforms are of paramount importance. This paper reviews the performance analysis tools and techniques for HPC applications and systems. Common HPC applications used by researchers and HPC benchmarking suites are discussed. A qualitative comparison of various tools used for the performance analysis of HPC applications is provided. Conclusions are drawn, along with directions for future research.
Acknowledgements
The authors acknowledge with thanks the technical and financial support from the Deanship of Scientific Research (DSR) at the King Abdulaziz University (KAU), Jeddah, Saudi Arabia, under the grant number G-651-611-38. The work carried out in this paper is supported by the High Performance Computing Center at the King Abdulaziz University, Jeddah.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Muhammed, T., Mehmood, R., Albeshri, A., Alsolami, F. (2020). HPC-Smart Infrastructures: A Review and Outlook on Performance Analysis Methods and Tools. In: Mehmood, R., See, S., Katib, I., Chlamtac, I. (eds) Smart Infrastructure and Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-13705-2_18
DOI: https://doi.org/10.1007/978-3-030-13705-2_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13704-5
Online ISBN: 978-3-030-13705-2
eBook Packages: Engineering, Engineering (R0)