DOI: 10.1145/583810.583817

Advanced eager scheduling for Java-based adaptively parallel computing

Published: 03 November 2002

Abstract

Javelin 3 is a software system for developing large-scale, fault-tolerant, adaptively parallel applications. When all or part of an application can be cast as a master-worker or branch-and-bound computation, Javelin 3 frees its developers from concerns about inter-processor communication and fault tolerance among networked hosts, allowing them to focus on the underlying application. The paper describes a fault-tolerant task scheduler and its performance analysis. The task scheduler integrates work stealing with an advanced form of eager scheduling. It enables dynamic task decomposition, which improves host load balancing in the presence of tasks whose non-uniform computational load is evident only at execution time. We present speedup measurements of actual performance on up to 1,000 hosts. We also analyze the expected performance degradation due to unresponsive hosts and measure the actual degradation.
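
As a rough illustration of the eager scheduling idea named in the abstract, the sketch below keeps reassigning any task that has not yet reported a result, so the computation completes even when some hosts never respond. It is a minimal single-process model and not Javelin 3's scheduler or API: the class `EagerSchedulerSketch`, its methods, the round-robin reassignment order, and the simulated "every k-th request is lost" failure pattern are all hypothetical, chosen only for illustration. Work stealing and dynamic task decomposition are left out, and tasks are assumed to be idempotent so that duplicate executions are harmless.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of eager scheduling; not the Javelin 3 implementation.
class EagerSchedulerSketch {
    private final int taskCount;
    // Task ids that may still need work, in reassignment order.
    private final Deque<Integer> pending = new ArrayDeque<>();
    // First reported result per task id.
    private final Map<Integer, Integer> results = new HashMap<>();

    EagerSchedulerSketch(int taskCount) {
        this.taskCount = taskCount;
        for (int id = 0; id < taskCount; id++) {
            pending.addLast(id);
        }
    }

    // Returns an unfinished task id, or -1 once every task has a result.
    // A handed-out task goes to the back of the queue, so it is assigned
    // again later if no host has reported a result for it by then.
    synchronized int nextTask() {
        while (!pending.isEmpty() && results.containsKey(pending.peekFirst())) {
            pending.removeFirst(); // discard tasks that are already done
        }
        if (pending.isEmpty()) {
            return -1;
        }
        int id = pending.removeFirst();
        pending.addLast(id);
        return id;
    }

    // Records a result; duplicates from reassigned tasks are ignored.
    synchronized void report(int taskId, int value) {
        results.putIfAbsent(taskId, value);
    }

    synchronized boolean isDone() {
        return results.size() == taskCount;
    }

    synchronized long sum() {
        long total = 0;
        for (int value : results.values()) {
            total += value;
        }
        return total;
    }

    // Simulates hosts requesting work. Every unresponsiveEvery-th request
    // goes to a host that takes the task and never reports back
    // (0 means all hosts respond). Each task id computes id * id.
    static long simulate(int taskCount, int unresponsiveEvery) {
        if (unresponsiveEvery == 1) {
            throw new IllegalArgumentException("every host would be unresponsive");
        }
        EagerSchedulerSketch scheduler = new EagerSchedulerSketch(taskCount);
        long request = 0;
        int id;
        while ((id = scheduler.nextTask()) != -1) {
            request++;
            if (unresponsiveEvery > 0 && request % unresponsiveEvery == 0) {
                continue; // this host vanished; the task will be reassigned
            }
            scheduler.report(id, id * id);
        }
        return scheduler.sum();
    }
}
```

In this model, `EagerSchedulerSketch.simulate(10, 3)` yields the same total as `EagerSchedulerSketch.simulate(10, 0)`: lost assignments cost extra requests but do not change the result.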



Published In

JGI '02: Proceedings of the 2002 joint ACM-ISCOPE conference on Java Grande
November 2002
252 pages
ISBN: 1581135998
DOI: 10.1145/583810
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. branch-and-bound
  2. eager scheduling
  3. fault tolerance
  4. grid computing
  5. parallel computing

Qualifiers

  • Article

Conference

JGI02
Acceptance Rates

Overall Acceptance Rate 18 of 60 submissions, 30%

Cited By

  • (2013) How to be a successful thief. Proceedings of the 19th international conference on Parallel Processing, pp. 114-125. DOI: 10.1007/978-3-642-40047-6_14. Online publication date: 26-Aug-2013
  • (2012) Using load information in work-stealing on distributed systems with non-uniform communication latencies. Proceedings of the 18th international conference on Parallel Processing, pp. 155-166. DOI: 10.1007/978-3-642-32820-6_17. Online publication date: 27-Aug-2012
  • (2010) Satin. ACM Transactions on Programming Languages and Systems, 32:3, pp. 1-39. DOI: 10.1145/1709093.1709096. Online publication date: 16-Mar-2010
  • (2009) Optimal Spot-checking for Computation Time Minimization in Volunteer Computing. Journal of Grid Computing, 7:4, pp. 575-600. DOI: 10.1007/s10723-009-9125-4. Online publication date: 18-Aug-2009
  • (2009) WSPE: a peer-to-peer grid programming environment. Concurrency and Computation: Practice and Experience, 21:13, pp. 1709-1724. DOI: 10.1002/cpe.1392. Online publication date: 11-Feb-2009
  • (2008) Advanced Job Scheduler Based on Markov Availability Model and Resource Selection in Desktop Grid Computing Environment. Metaheuristics for Scheduling in Distributed Computing Environments, pp. 153-171. DOI: 10.1007/978-3-540-69277-5_6. Online publication date: 2008
  • (2008) Aspect-oriented component assembly: a case study in parallel software design. Software: Practice and Experience, 39:9, pp. 807-832. DOI: 10.1002/spe.912. Online publication date: 12-Dec-2008
  • (2007) WSPE. Proceedings of the 5th international workshop on Middleware for grid computing: held at the ACM/IFIP/USENIX 8th International Middleware Conference, pp. 1-6. DOI: 10.1145/1376849.1376855. Online publication date: 26-Nov-2007
  • (2007) MJSA. Future Generation Computer Systems, 23:4, pp. 616-622. DOI: 10.1016/j.future.2006.09.004. Online publication date: 1-May-2007
  • (2006) An economy-driven mapping heuristic for hierarchical master-slave applications in grid systems. Proceedings of the 20th international conference on Parallel and distributed processing, p. 162. DOI: 10.5555/1898953.1899090. Online publication date: 25-Apr-2006
