Heracles: improving resource efficiency at scale

Authors:

Rama Govindaraju,

Parthasarathy Ranganathan,

Christos Kozyrakis

ISCA '15: Proceedings of the 42nd Annual International Symposium on Computer Architecture

Pages 450 - 462

https://doi.org/10.1145/2749469.2749475

Published: 13 June 2015

Abstract

User-facing, latency-sensitive services, such as websearch, underutilize their computing resources during daily periods of low traffic. Reusing those resources for other tasks is rarely done in production services since the contention for shared resources can cause latency spikes that violate the service-level objectives of latency-sensitive tasks. The resulting under-utilization hurts both the affordability and energy-efficiency of large-scale datacenters. With technology scaling slowing down, it becomes important to address this opportunity.

We present Heracles, a feedback-based controller that enables the safe colocation of best-effort tasks alongside a latency-critical service. Heracles dynamically manages multiple hardware and software isolation mechanisms, such as CPU, memory, and network isolation, to ensure that the latency-sensitive job meets latency targets while maximizing the resources given to best-effort tasks. We evaluate Heracles using production latency-critical and batch workloads from Google and demonstrate average server utilizations of 90% without latency violations across all the load and colocation scenarios that we evaluated.
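
To make the idea of a feedback-based colocation controller concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm from the paper: the `Allocation` type, the `control_step` function, the 10% slack threshold, and the use of core counts as the only managed resource are all hypothetical. The loop measures how far tail latency is from its target and grows or shrinks the share given to best-effort (BE) tasks accordingly.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    """Hypothetical split of a server's cores between the two task classes."""
    total_cores: int
    be_cores: int = 0  # cores currently lent to best-effort (BE) tasks

    @property
    def lc_cores(self) -> int:
        # Cores left for the latency-critical (LC) service.
        return self.total_cores - self.be_cores


def control_step(alloc: Allocation, tail_latency_ms: float, slo_ms: float,
                 grow_slack: float = 0.10) -> Allocation:
    """One iteration of an illustrative feedback loop (assumed policy).

    slack = (slo - latency) / slo; negative slack means the target is violated.
    """
    slack = (slo_ms - tail_latency_ms) / slo_ms
    if slack < 0:
        # Target violated: take every core back from best-effort tasks.
        be = 0
    elif slack < grow_slack:
        # Close to the target: shrink the best-effort share by one core.
        be = max(alloc.be_cores - 1, 0)
    else:
        # Comfortable slack: lend one more core, keeping at least one for LC.
        be = min(alloc.be_cores + 1, alloc.total_cores - 1)
    return Allocation(alloc.total_cores, be)
```

A caller would invoke `control_step` periodically with a fresh tail-latency measurement. A real controller of the kind the abstract describes would coordinate several isolation mechanisms (CPU, memory, network) rather than core counts alone.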

Index Terms

Heracles: improving resource efficiency at scale
1. Computer systems organization
  1. Embedded and cyber-physical systems
  2. Real-time systems
2. Hardware

Published In

ISCA '15: Proceedings of the 42nd Annual International Symposium on Computer Architecture

June 2015

768 pages

ISBN: 9781450334020

DOI: 10.1145/2749469

General Chair: Debbie Marr (Intel)
Program Chair: David Albonesi (Cornell)

ACM SIGARCH Computer Architecture News, Volume 43, Issue 3S (ISCA '15)
June 2015
745 pages
ISSN: 0163-5964
DOI: 10.1145/2872887
Editor: Doug DeGroot

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Sponsors

IEEE TCCA: IEEE Computer Society Technical Committee on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISCA '15: The 42nd Annual International Symposium on Computer Architecture

June 13 - 17, 2015

Portland, Oregon

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Bibliometrics

Total Citations: 382
Total Downloads: 5,348
Downloads (Last 12 months): 745
Downloads (Last 6 weeks): 68

Reflects downloads up to 27 Jul 2024
