Performance prediction with skeletons

Sodhi, Sukhdeep; Subhlok, Jaspal; Xu, Qiang

doi:10.1007/s10586-007-0039-2

Published: 04 October 2007

Volume 11, pages 151–165, (2008)

Cluster Computing

Sukhdeep Sodhi²,
Jaspal Subhlok¹ &
Qiang Xu¹

Abstract

The performance skeleton of an application is a short-running program whose performance in any scenario reflects the performance of the application it represents. Specifically, the execution time of the performance skeleton is a small fixed fraction of the execution time of the corresponding application in any execution environment. Such a skeleton can be employed to quickly estimate the performance of a large application under existing network and node sharing. This paper presents a framework for automatic construction of performance skeletons of a specified execution time and evaluates their use in performance prediction with CPU and network sharing. The approach is based on capturing the execution behavior of an application and automatically generating a synthetic skeleton program that reflects that execution behavior. The paper demonstrates that performance skeletons running for a few seconds can predict the application execution time fairly accurately. The relationship of skeleton execution time, application characteristics, and the nature of resource sharing to the accuracy of skeleton-based performance prediction is analyzed in detail. The goal of this research is accurate performance estimation in heterogeneous and shared computational grids.
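The "small fixed fraction" property suggests a simple way to turn a skeleton run into a prediction. The following is a minimal sketch of that arithmetic only, not the framework described in the paper: it assumes the ratio of application time to skeleton time is calibrated once on a reference system and then applied to a skeleton run under the target sharing conditions. The names `CalibratedSkeleton`, `scaling_factor`, `predict`, and `relative_error` are hypothetical, and all numbers are illustrative rather than results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalibratedSkeleton:
    """Hypothetical record of one calibration run on a reference system."""

    app_time_ref: float       # application execution time on the reference system (s)
    skeleton_time_ref: float  # skeleton execution time on the same system (s)

    def __post_init__(self) -> None:
        # Both times must be positive for the ratio to be meaningful.
        if self.app_time_ref <= 0 or self.skeleton_time_ref <= 0:
            raise ValueError("execution times must be positive")

    @property
    def scaling_factor(self) -> float:
        # Assumed constant across environments: application time / skeleton time.
        return self.app_time_ref / self.skeleton_time_ref

    def predict(self, skeleton_time_target: float) -> float:
        """Estimate application time from a skeleton run in the target environment."""
        if skeleton_time_target <= 0:
            raise ValueError("skeleton time must be positive")
        return self.scaling_factor * skeleton_time_target


def relative_error(predicted: float, measured: float) -> float:
    """Prediction error as a fraction of the measured application time."""
    return abs(predicted - measured) / measured


if __name__ == "__main__":
    # Illustrative values only: a 1000 s application with a 5 s skeleton.
    skeleton = CalibratedSkeleton(app_time_ref=1000.0, skeleton_time_ref=5.0)
    # Under CPU or network sharing, suppose the skeleton now takes 8 s.
    estimate = skeleton.predict(8.0)
    print(f"scaling factor: {skeleton.scaling_factor:.0f}")
    print(f"predicted application time: {estimate:.0f} s")
```

How well such an estimate holds depends on how closely the skeleton tracks the application under sharing, which is the question the paper evaluates.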

Author information

Authors and Affiliations

Department of Computer Science, University of Houston, Houston, TX, 77204, USA
Jaspal Subhlok & Qiang Xu
Microsoft Corp, Redmond, WA, 98052, USA
Sukhdeep Sodhi

Corresponding author

Correspondence to Jaspal Subhlok.

About this article

Cite this article

Sodhi, S., Subhlok, J. & Xu, Q. Performance prediction with skeletons. Cluster Comput 11, 151–165 (2008). https://doi.org/10.1007/s10586-007-0039-2

Received: 01 June 2007
Accepted: 08 August 2007
Published: 04 October 2007
Issue Date: June 2008
DOI: https://doi.org/10.1007/s10586-007-0039-2
