Abstract
The performance skeleton of an application is a short-running program whose performance in any scenario reflects the performance of the application it represents. Specifically, the execution time of the performance skeleton is a small fixed fraction of the execution time of the corresponding application in any execution environment. Such a skeleton can be employed to quickly estimate the performance of a large application under existing network and node sharing. This paper presents a framework for automatic construction of performance skeletons of a specified execution time and evaluates their use in performance prediction with CPU and network sharing. The approach is based on capturing the execution behavior of an application and automatically generating a synthetic skeleton program that reflects that execution behavior. The paper demonstrates that performance skeletons running for a few seconds can predict the application execution time fairly accurately. The relationship of skeleton execution time, application characteristics, and the nature of resource sharing to the accuracy of skeleton-based performance prediction is analyzed in detail. The goal of this research is accurate performance estimation in heterogeneous and shared computational grids.
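The "small fixed fraction" property described in the abstract suggests a simple way to turn a skeleton measurement into an application estimate. The following is a minimal sketch of that arithmetic only, not the paper's framework: the function names and all timing values are hypothetical, and it assumes the ratio between application and skeleton execution time has been measured once on a dedicated (unshared) testbed.

```python
# Illustrative sketch only: names and numbers are hypothetical, not from the paper.

def scaling_ratio(app_time_dedicated: float, skeleton_time_dedicated: float) -> float:
    """Ratio of application time to skeleton time, measured without sharing."""
    if app_time_dedicated <= 0 or skeleton_time_dedicated <= 0:
        raise ValueError("execution times must be positive")
    return app_time_dedicated / skeleton_time_dedicated


def predict_app_time(skeleton_time_shared: float, ratio: float) -> float:
    """Estimate application time in a shared environment.

    Assumes the skeleton's time stays the same fixed fraction of the
    application's time in every environment, as the abstract states.
    """
    if skeleton_time_shared <= 0:
        raise ValueError("skeleton time must be positive")
    return skeleton_time_shared * ratio


def relative_error(predicted: float, actual: float) -> float:
    """Prediction error as a fraction of the actual execution time."""
    if actual <= 0:
        raise ValueError("actual time must be positive")
    return abs(predicted - actual) / actual


# Hypothetical measurements, in seconds.
ratio = scaling_ratio(app_time_dedicated=600.0, skeleton_time_dedicated=6.0)
estimate = predict_app_time(skeleton_time_shared=9.5, ratio=ratio)
error = relative_error(predicted=estimate, actual=1000.0)
```

With these made-up values, a skeleton that takes 6 s unshared and 9.5 s under sharing yields an estimate for the shared run of the full application without running it. How closely such an estimate tracks the real execution time is the question the paper evaluates.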
Cite this article
Sodhi, S., Subhlok, J. & Xu, Q. Performance prediction with skeletons. Cluster Comput 11, 151–165 (2008). https://doi.org/10.1007/s10586-007-0039-2