
Sharc: Managing CPU and Network Bandwidth in Shared Clusters

Published: 01 January 2004

Abstract

In this paper, we argue that commodity clusters need effective mechanisms for sharing resources among applications. To address this need, we present the design of Sharc, a system that enables resource sharing among applications in such clusters. Sharc depends on single-node resource management mechanisms such as reservations or shares, and extends the benefits of such mechanisms to clustered environments. We present techniques for managing two important resources, CPU and network interface bandwidth, on a cluster-wide basis. Our techniques allow Sharc to 1) support reservation of CPU and network interface bandwidth for distributed applications, 2) dynamically allocate resources based on past usage, and 3) provide performance isolation to applications. Our experimental evaluation has shown that Sharc can scale to 256-node clusters running 100,000 applications. These results demonstrate that Sharc can be an effective approach for sharing resources among competing applications in moderate-size clusters.
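
To make point 2) concrete, the following is a minimal sketch of one way allocation based on past usage could work on a single node. It is not taken from the paper: the exponential smoothing, the proportional lending rule, and the names `smooth`, `reallocate`, and `alpha` are all assumptions made for illustration.

```python
def smooth(prev_estimate, observed, alpha=0.5):
    """Exponentially weighted moving average of past usage.

    `alpha` is a hypothetical tuning knob: higher values weight
    the most recent observation more heavily.
    """
    return alpha * observed + (1 - alpha) * prev_estimate


def reallocate(reserved, usage_estimate):
    """Lend unused reserved capacity to components that need more.

    reserved:       dict name -> reserved fraction of one resource on one node
    usage_estimate: dict name -> smoothed demand for that resource
    Returns a new allocation with the same total as `reserved`.
    (Illustrative policy only; not the algorithm from the paper.)
    """
    slack = {n: max(reserved[n] - usage_estimate[n], 0.0) for n in reserved}
    deficit = {n: max(usage_estimate[n] - reserved[n], 0.0) for n in reserved}
    total_slack = sum(slack.values())
    total_deficit = sum(deficit.values())
    if total_slack == 0 or total_deficit == 0:
        return dict(reserved)  # nothing to lend, or nobody needs more
    moved = min(total_slack, total_deficit)
    # Lenders give in proportion to their slack; borrowers receive
    # in proportion to their deficit, so the total stays constant.
    return {
        n: reserved[n]
        - moved * slack[n] / total_slack
        + moved * deficit[n] / total_deficit
        for n in reserved
    }
```

In this sketch the total allocation on the node never changes, so capacity lent to one component always comes out of another component's unused reservation rather than out of thin air.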

Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 15, Issue 1
January 2004
96 pages

Publisher

IEEE Press

Author Tags

  1. CPU and network bandwidth
  2. Linux
  3. Sharc
  4. Shared clusters
  5. capsule
  6. control plane
  7. dedicated clusters
  8. hosting platforms
  9. nucleus

Qualifiers

  • Research-article

Cited By

  • (2024) Resource Management in Aurora Serverless. Proceedings of the VLDB Endowment 17:12 (4038-4050). DOI: 10.14778/3685800.3685825. Online publication date: 8-Nov-2024
  • (2014) Efficiently Compositing and Optimizing the Quality of Heterogeneous Services. International Journal of Web Services Research 11:3 (76-95). DOI: 10.4018/ijwsr.2014070104. Online publication date: 1-Jul-2014
  • (2014) Analysis on Cloud Classification using Accessibility. International Journal of Cloud Applications and Computing 4:3 (44-53). DOI: 10.4018/ijcac.2014070103. Online publication date: 1-Jul-2014
  • (2014) Synchronisation of data transfer in cloud. International Journal of Internet Protocol Technology 8:1 (1-24). DOI: 10.1504/IJIPT.2014.060856. Online publication date: 1-May-2014
  • (2014) Adaptive Resource Provisioning for Virtualized Servers Using Kalman Filters. ACM Transactions on Autonomous and Adaptive Systems 9:2 (1-35). DOI: 10.1145/2626290. Online publication date: 1-Jul-2014
  • (2011) On/off-line prediction applied to job scheduling on non-dedicated NOWs. Journal of Computer Science and Technology 26:1 (99-116). DOI: 10.5555/1991836.1991846. Online publication date: 1-Jan-2011
  • (2011) A game theoretic formulation of the service provisioning problem in cloud systems. Proceedings of the 20th international conference on World wide web (177-186). DOI: 10.1145/1963405.1963433. Online publication date: 28-Mar-2011
  • (2010) Joint admission control and resource allocation in virtualized servers. Journal of Parallel and Distributed Computing 70:4 (344-362). DOI: 10.1016/j.jpdc.2009.08.009. Online publication date: 1-Apr-2010
  • (2009) A multi-agent learning approach to online distributed resource allocation. Proceedings of the 21st International Joint Conference on Artificial Intelligence (361-366). DOI: 10.5555/1661445.1661503. Online publication date: 11-Jul-2009
  • (2009) Dynamic trade-off analysis of QoS and energy saving in admission control for web service systems. Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools (1-10). DOI: 10.4108/ICST.VALUETOOLS2009.7941. Online publication date: 20-Oct-2009
