
The TAU Parallel Performance System

Published: 01 May 2006
  • Abstract

    The ability of performance technology to keep pace with the growing complexity of parallel and distributed systems depends on robust performance frameworks that can at once provide system-specific performance capabilities and support high-level performance problem solving. Flexibility and portability in empirical methods and processes are influenced primarily by the strategies available for instrumentation and measurement, and by how effectively they are integrated and composed. This paper presents the TAU (Tuning and Analysis Utilities) parallel performance system and describes how it addresses diverse requirements for performance observation and analysis.



    Information

    Published In

    International Journal of High Performance Computing Applications  Volume 20, Issue 2
    May 2006
    148 pages

    Publisher

    Sage Publications, Inc.

    United States

    Author Tags

    1. Performance evaluation
    2. TAU
    3. analysis
    4. instrumentation
    5. measurement

    Qualifiers

    • Article

    Cited By

    • (2024) MUPPET. Proceedings of the 15th International Workshop on Programming Models and Applications for Multicores and Manycores, pp. 22-31. DOI: 10.1145/3649169.3649246. Online publication date: 3-Mar-2024
    • (2024) I/O-signature-based feature analysis and classification of high-performance computing applications. Cluster Computing 27(3):3219-3231. DOI: 10.1007/s10586-023-04139-y. Online publication date: 1-Jun-2024
    • (2023) Finding the forest in the trees. International Journal of High Performance Computing Applications 37(3-4):434-441. DOI: 10.1177/10943420231175687. Online publication date: 1-Jul-2023
    • (2023) GPUscout: Locating Data Movement-related Bottlenecks on GPUs. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1392-1402. DOI: 10.1145/3624062.3624208. Online publication date: 12-Nov-2023
    • (2023) Enabling Agile Analysis of I/O Performance Data with PyDarshan. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1380-1391. DOI: 10.1145/3624062.3624207. Online publication date: 12-Nov-2023
    • (2023) An Event Model for Trace-Based Performance Analysis of MPI Partitioned Point-to-Point Communication. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1357-1367. DOI: 10.1145/3624062.3624205. Online publication date: 12-Nov-2023
    • (2023) ZeroSum: User Space Monitoring of Resource Utilization and Contention on Heterogeneous HPC Systems. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 685-695. DOI: 10.1145/3624062.3624145. Online publication date: 12-Nov-2023
    • (2023) PEAK: a Light-Weight Profiler for HPC Systems. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 677-680. DOI: 10.1145/3624062.3624143. Online publication date: 12-Nov-2023
    • (2023) BaRRT: Buildtime and Runtime Reproducibility Tool for Software Development and Testing. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 673-676. DOI: 10.1145/3624062.3624142. Online publication date: 12-Nov-2023
    • (2023) REMORA Resource Monitor: Usability, Performance and User Interface Improvements. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 663-672. DOI: 10.1145/3624062.3624141. Online publication date: 12-Nov-2023
