Abstract
The TAU Performance System® is an integrated suite of tools for instrumentation, measurement, and analysis of parallel programs targeting large-scale, high-performance computing (HPC) platforms. Representing over fifteen calendar years and fifty person years of research and development effort, TAU’s driving concerns have been portability, flexibility, interoperability, and scalability. The result is a performance system which has evolved into a leading framework for parallel performance evaluation and problem solving. This paper presents the current state of TAU, overviews the design and function of TAU’s main features, discusses best practices of TAU use, and outlines future development.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Malony, A.D. et al. (2008). Evolution of a Parallel Performance System. In: Resch, M., Keller, R., Himmler, V., Krammer, B., Schulz, A. (eds) Tools for High Performance Computing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68564-7_11
DOI: https://doi.org/10.1007/978-3-540-68564-7_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68561-6
Online ISBN: 978-3-540-68564-7
eBook Packages: Computer Science, Computer Science (R0)