Article

PerfIso: performance isolation for commercial latency-sensitive services

Authors:

Călin Iorgulescu,

Sameh Elnikety,

Vivek Narasayya,

Herodotos Herodotou,

Junhua Wang

USENIX ATC '18: Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference

Pages 519 - 531

Published: 11 July 2018

Abstract

Large commercial latency-sensitive services, such as web search, run on dedicated clusters provisioned for peak load to ensure responsiveness and tolerate data center outages. As a result, the average load is far lower than the peak load used for provisioning, leading to resource under-utilization. The idle resources can be used to run batch jobs, completing useful work and reducing overall data center provisioning costs. However, this is challenging in practice due to the complexity and stringent tail-latency requirements of latency-sensitive services. Left unmanaged, the competition for machine resources can lead to severe response-time degradation and unmet service-level objectives (SLOs).

This work describes PerfIso, a performance isolation framework which has been used for nearly three years in Microsoft Bing, a major search engine, to colocate batch jobs with production latency-sensitive services on over 90,000 servers. We discuss the design and implementation of PerfIso, and conduct an experimental evaluation in a production environment. We show that colocating CPU-intensive jobs with latency-sensitive services increases average CPU utilization from 21% to 66% for off-peak load without impacting tail latency.

Published In

USENIX ATC '18: Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference

July 2018, 1019 pages

ISBN: 9781931971447

Program Chairs: Haryadi Gunawi, Benjamin Reed (Facebook)

Sponsors: VMware, NetApp, NSF, Facebook, Oracle

Publisher: USENIX Association, United States
