QoS-Aware Scheduling in Heterogeneous Datacenters with Paragon

Published: 20 December 2013

Abstract

Large-scale datacenters (DCs) host tens of thousands of diverse applications each day. However, interference between colocated workloads and the difficulty of matching applications to one of the many hardware platforms available can degrade performance, violating the quality of service (QoS) guarantees that many cloud workloads require. While previous work has identified the impact of heterogeneity and interference, existing solutions are computationally intensive, cannot be applied online, and do not scale beyond a few applications.
We present Paragon, an online and scalable DC scheduler that is heterogeneity- and interference-aware. Paragon is derived from robust analytical methods, and instead of profiling each application in detail, it leverages information the system already has about applications it has previously seen. It uses collaborative filtering techniques to quickly and accurately classify an unknown incoming workload with respect to heterogeneity and interference in multiple shared resources. It does so by identifying similarities to previously scheduled applications. The classification allows Paragon to greedily schedule applications in a manner that minimizes interference and maximizes server utilization. After the initial application placement, Paragon monitors application behavior and adjusts the scheduling decisions at runtime to avoid performance degradation. Additionally, we design ARQ, a multiclass admission control protocol that constrains application waiting time. ARQ queues applications in separate classes based on the type of resources they need and avoids long queueing delays for easy-to-satisfy workloads in highly loaded scenarios. Paragon scales to tens of thousands of servers and applications with marginal scheduling overhead in both time and state.
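The classification and greedy placement steps described above can be illustrated with a small neighborhood-based collaborative filtering sketch. This is not Paragon's actual algorithm: the score table, the `predict_scores` and `place` helpers, and the four-platform setup are all hypothetical, and cosine similarity is used here only as a simple stand-in for "identifying similarities to previously scheduled applications". The sketch covers only the heterogeneity dimension; interference would need an analogous table per shared resource.

```python
from math import sqrt

# Hypothetical performance scores (higher is better) of previously
# scheduled applications on four server platforms. Values are made up.
KNOWN_APPS = {
    "web":   [0.9, 0.7, 0.4, 0.2],
    "batch": [0.3, 0.5, 0.8, 0.9],
    "cache": [0.8, 0.6, 0.5, 0.3],
}


def cosine(a, b):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_scores(partial, known=KNOWN_APPS):
    """Estimate scores on all platforms for a new application.

    `partial` maps platform index -> score measured in a short profiling
    run. Missing platforms are filled with a similarity-weighted average
    of the scores of previously seen applications.
    """
    observed = sorted(partial)
    target = [partial[i] for i in observed]
    weights = {
        name: cosine(target, [row[i] for i in observed])
        for name, row in known.items()
    }
    total = sum(weights.values())
    num_platforms = len(next(iter(known.values())))
    scores = []
    for p in range(num_platforms):
        if p in partial:
            scores.append(partial[p])  # keep the measured value
        elif total > 0:
            scores.append(
                sum(w * known[name][p] for name, w in weights.items()) / total
            )
        else:
            # No usable similarity signal: fall back to the plain mean.
            scores.append(sum(row[p] for row in known.values()) / len(known))
    return scores


def place(scores, free_servers):
    """Greedily pick the platform with the best predicted score
    among those that still have at least one free server."""
    candidates = [p for p, free in enumerate(free_servers) if free > 0]
    if not candidates:
        return None  # nothing free; the caller would have to queue the app
    return max(candidates, key=lambda p: scores[p])


# A new application profiled only on platforms 0 and 1.
scores = predict_scores({0: 0.85, 1: 0.65})
choice = place(scores, free_servers=[0, 0, 1, 1])
```

In this toy setup the new application resembles the "web" and "cache" entries, so its estimated scores on the unprofiled platforms lean toward theirs, and `place` then chooses among whatever platforms still have capacity.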
We evaluate Paragon with a wide range of workload scenarios, on both small and large-scale systems, including 1,000 servers on EC2. For a 2,500-workload scenario, Paragon enforces performance guarantees for 91% of applications, while significantly improving utilization. In comparison, heterogeneity-oblivious, interference-oblivious, and least-loaded schedulers provide similar guarantees for only 14%, 11%, and 3% of workloads, respectively. The differences are more striking in oversubscribed scenarios where resource efficiency is more critical.

Published In

ACM Transactions on Computer Systems, Volume 31, Issue 4
December 2013
90 pages
ISSN: 0734-2071
EISSN: 1557-7333
DOI: 10.1145/2542150

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 December 2013
Accepted: 01 September 2013
Revised: 01 September 2013
Received: 01 May 2013
Published in TOCS Volume 31, Issue 4

Author Tags

  1. Datacenter
  2. QoS
  3. cloud computing
  4. heterogeneity
  5. interference
  6. resource-efficiency
  7. scheduling

Qualifiers

  • Research-article
  • Research
  • Refereed
