NURA: A Framework for Supporting Non-Uniform Resource Accesses in GPUs

Authors:

Mohammad Sadrosadati,

Hamid Sarbazi-Azad

SIGMETRICS/PERFORMANCE '22: Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems

Pages 39 - 40

https://doi.org/10.1145/3489048.3522656

Published: 06 June 2022

Abstract

Multi-application execution in Graphics Processing Units (GPUs), a promising way to utilize GPU resources, is still challenging. Some prior approaches (e.g., spatial multitasking) have limited opportunity to improve resource utilization, while others (e.g., simultaneous multi-kernel) provide fine-grained resource sharing at the price of unfair execution. This paper proposes a new multi-application paradigm for GPUs, called NURA, that offers high potential to improve resource utilization while ensuring fairness and Quality of Service (QoS). The key idea is that each streaming multiprocessor (SM) executes Cooperative Thread Arrays (CTAs) that belong to only one application (as in spatial multitasking) and shares its unused resources with SMs that run other applications and demand more resources. NURA handles the resource-sharing process mainly in software, which provides simplicity, low hardware overhead, and flexibility. We also make some hardware modifications as architectural support for our software-based proposal. Our conservative analysis reveals that the hardware area overhead of our proposal is less than 1.07% of the whole GPU die. Our experimental results over various mixes of GPU workloads show that NURA improves throughput by 26%, on average, compared to state-of-the-art spatial multitasking, while meeting QoS targets. In terms of fairness, NURA achieves results close to those of spatial multitasking, while it outperforms simultaneous multi-kernel by 76%, on average.
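The key idea can be illustrated with a toy model. The sketch below is not NURA's actual mechanism, which the abstract does not detail; it only assumes that each SM has a fixed pool of abstract resource units, runs CTAs of a single application, and may lend spare units to SMs that run a different application. The names `SM`, `share_unused`, and the greedy lending policy are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SM:
    app: str           # the single application whose CTAs run on this SM
    capacity: int      # abstract resource units available on this SM (assumed)
    demand: int        # units the application's CTAs would like to use here
    borrowed: int = 0  # units obtained from other SMs


def share_unused(sms):
    """Greedy toy policy: SMs with spare units lend them to SMs that run a
    different application and demand more than their own capacity.

    Returns a list of (lender_index, borrower_index, units) loans.
    """
    # Spare units per under-utilized SM.
    spare = {i: sm.capacity - sm.demand
             for i, sm in enumerate(sms) if sm.demand < sm.capacity}
    loans = []
    for j, borrower in enumerate(sms):
        need = borrower.demand - borrower.capacity
        for i in list(spare):
            if need <= 0:
                break
            if sms[i].app == borrower.app:
                continue  # only model sharing across applications
            give = min(need, spare[i])
            spare[i] -= give
            if spare[i] == 0:
                del spare[i]
            borrower.borrowed += give
            need -= give
            loans.append((i, j, give))
    return loans


if __name__ == "__main__":
    sms = [SM("A", 100, 60), SM("B", 100, 130), SM("B", 100, 120)]
    for lender, borrower, units in share_unused(sms):
        print(f"SM{lender} lends {units} units to SM{borrower}")
```

In this toy setting every SM still executes CTAs of one application only, as in spatial multitasking; what changes is that idle capacity on one SM becomes usable by another. A real design would also have to account for access latency to remote resources, fairness, and QoS targets, which this sketch ignores.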

Cited By

Aslan F, Iosifidis G, Ayala-Romero J, Garcia-Saavedra A, Costa-Perez X. (2024). Fair Resource Allocation in Virtualized O-RAN Platforms. Proceedings of the ACM on Measurement and Analysis of Computing Systems 8:1 (1-34). Online publication date: 21-Feb-2024.
https://dl.acm.org/doi/10.1145/3639043

Darabi S, Sadrosadati M, Akbarzadeh N, Lindegger J, Hosseini M, Hosseini M, Park J, Gómez-Luna J, Mutlu O, Sarbazi-Azad H, Hardavellas N, Campanoni S, Grot B, Karpuzcu U. (2022). Morpheus: Extending the Last Level Cache Capacity in GPU Systems Using Idle GPU Core Resources. Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture (228-244). Online publication date: 1-Oct-2022.
https://dl.acm.org/doi/10.1109/MICRO56248.2022.00029

Index Terms

NURA: A Framework for Supporting Non-Uniform Resource Accesses in GPUs
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
      1. Single instruction, multiple data
2. Computing methodologies
  1. Computer graphics
    1. Graphics systems and interfaces
      1. Graphics processors

Published In

SIGMETRICS/PERFORMANCE '22: Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems

June 2022

132 pages

ISBN:9781450391412

DOI:10.1145/3489048

General Chairs: D Manjunath (IIT Bombay, India), Jayakrishnan Nair (IIT Bombay, India)

Program Chairs: Niklas Carlsson (Linköping University, Sweden), Edith Cohen (Google Research, USA), Philippe Robert (INRIA, France)

ACM SIGMETRICS Performance Evaluation Review Volume 50, Issue 1
SIGMETRICS '22
June 2022
118 pages
ISSN:0163-5999
DOI:10.1145/3547353
Editor:
Zhenhua Liu
Stony Brook University

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers: Abstract

Conference

SIGMETRICS/PERFORMANCE '22

Sponsor:

SIGMETRICS

SIGMETRICS/PERFORMANCE '22: ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems

June 6 - 10, 2022

Mumbai, India

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%
