research-article
Public Access

Thermal-Aware Scheduling for Integrated CPUs–GPU Platforms

Published: 08 October 2019

Abstract

Modern embedded systems such as cars need high-power integrated CPUs–GPU SoCs for real-time applications such as lane or pedestrian detection. As a result, they face greater thermal problems than before, which may in turn lead to higher failure rates and cooling costs. We demonstrate, via experimentation on a representative CPUs–GPU platform, the importance of accounting for two distinct thermal characteristics in real-time scheduling: the platform’s temperature imbalance and the different power dissipations of different tasks. Accounting for both makes it possible to avoid bursts of power dissipation while guaranteeing all timing constraints. To achieve this goal, we propose a new Real-Time Thermal-Aware Scheduling (RT-TAS) framework. We first capture the different CPU core temperatures caused by different GPU power dissipations (i.e., CPUs–GPU thermal coupling) with core-specific thermal coupling coefficients. We then develop thermally-balanced task-to-core assignment and CPUs–GPU co-scheduling. The former addresses the platform’s temperature imbalance by efficiently distributing the thermal load across cores while preserving scheduling feasibility. Building on the thermally-balanced task assignment, the latter cooperatively schedules CPU and GPU computations to avoid simultaneous peak power dissipation on both the CPUs and the GPU, thus mitigating excessive temperature rises while meeting task deadlines. We have implemented and evaluated RT-TAS on an automotive embedded platform; it reduces the maximum temperature by 6–12.2°C over existing approaches without violating any task deadline.
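
The idea of thermally-balanced task-to-core assignment can be illustrated with a small sketch. This is not the RT-TAS algorithm itself, which the abstract does not spell out; it is a hypothetical greedy heuristic under stated assumptions. The steady-state temperature model (ambient + thermal resistance × CPU power + coupling coefficient × GPU power), the numeric defaults, the names `Task`, `Core`, and `assign`, and the use of the EDF utilization bound as the feasibility check are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    wcet: float    # worst-case execution time (ms)
    period: float  # period, taken equal to the deadline (ms)
    power: float   # average power while running (W), hypothetical value

    @property
    def utilization(self) -> float:
        return self.wcet / self.period

    @property
    def thermal_load(self) -> float:
        # Long-run average power the task contributes to its core.
        return self.utilization * self.power


@dataclass
class Core:
    name: str
    coupling: float  # assumed temperature rise per watt of GPU power (°C/W)
    tasks: list = field(default_factory=list)

    def utilization(self) -> float:
        return sum(t.utilization for t in self.tasks)

    def temperature(self, gpu_power, ambient=40.0, resistance=10.0):
        # Toy steady-state model: ambient + self-heating + GPU coupling.
        cpu_power = sum(t.thermal_load for t in self.tasks)
        return ambient + resistance * cpu_power + self.coupling * gpu_power


def assign(tasks, cores, gpu_power):
    """Greedy thermally balanced assignment; returns None if infeasible."""
    # Place the hottest tasks first so later, cooler tasks can even things out.
    for task in sorted(tasks, key=lambda t: t.thermal_load, reverse=True):
        # Keep only cores that stay schedulable (assumed EDF bound: U <= 1).
        feasible = [
            c for c in cores
            if c.utilization() + task.utilization <= 1.0 + 1e-9
        ]
        if not feasible:
            return None
        # Put the task on the currently coolest feasible core.
        coolest = min(feasible, key=lambda c: c.temperature(gpu_power))
        coolest.tasks.append(task)
    return {c.name: [t.name for t in c.tasks] for c in cores}
```

Because each core carries its own coupling coefficient, a core that heats up more under GPU load starts from a higher estimated temperature and therefore tends to receive less CPU work, which is the balancing effect the abstract describes.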

Published In

ACM Transactions on Embedded Computing Systems, Volume 18, Issue 5s
Special Issue ESWEEK 2019, CASES 2019, CODES+ISSS 2019 and EMSOFT 2019
October 2019
1423 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3365919
Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 October 2019
Accepted: 01 July 2019
Revised: 01 June 2019
Received: 01 April 2019
Published in TECS Volume 18, Issue 5s

Author Tags

  1. GPU
  2. Thermal management
  3. embedded systems
  4. real-time systems

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (Last 12 months): 266
  • Downloads (Last 6 weeks): 47
Reflects downloads up to 04 Oct 2024

Cited By

  • (2024) TREAFET: Temperature-Aware Real-Time Task Scheduling for FinFET based Multicores. ACM Transactions on Embedded Computing Systems. DOI: 10.1145/3665276. Online publication date: 16-May-2024
  • (2023) An Interplay of Energy and Temperature Minimization Techniques for Heterogeneous Multiprocessor Systems. TENCON 2023 - 2023 IEEE Region 10 Conference (TENCON), 629-634. DOI: 10.1109/TENCON58879.2023.10322452. Online publication date: 31-Oct-2023
  • (2023) Fast-Accurate Full-Chip Dynamic Thermal Simulation With Fine Resolution Enabled by a Learning Method. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42:8, 2675-2688. DOI: 10.1109/TCAD.2022.3229598. Online publication date: 1-Aug-2023
  • (2023) Energy-Efficient Multiprocessor-Based Computation and Communication Resource Allocation in Two-Tier Federated Learning Networks. IEEE Internet of Things Journal 10:7, 5689-5703. DOI: 10.1109/JIOT.2022.3153996. Online publication date: 1-Apr-2023
  • (2022) HEAT. ACM SIGAPP Applied Computing Review 22:2, 34-43. DOI: 10.1145/3558053.3558056. Online publication date: 17-Aug-2022
  • (2022) Towards Energy-Efficient Real-Time Scheduling of Heterogeneous Multi-GPU Systems. 2022 IEEE Real-Time Systems Symposium (RTSS), 409-421. DOI: 10.1109/RTSS55097.2022.00042. Online publication date: Dec-2022
  • (2022) Future aware Dynamic Thermal Management in CPU-GPU Embedded Platforms. 2022 IEEE Real-Time Systems Symposium (RTSS), 396-408. DOI: 10.1109/RTSS55097.2022.00041. Online publication date: Dec-2022
  • (2022) Energy- and Temperature-aware Scheduling: From Theory to an Implementation on Intel Processor. 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), 1922-1930. DOI: 10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00288. Online publication date: Dec-2022
  • (2022) RT-SEAT: A hybrid approach based real-time scheduler for energy and temperature efficient heterogeneous multicore platforms. Results in Engineering 16, 100708. DOI: 10.1016/j.rineng.2022.100708. Online publication date: Dec-2022
  • (2022) ETA-HP: an energy and temperature-aware real-time scheduler for heterogeneous platforms. The Journal of Supercomputing 78:8, 1-25. DOI: 10.1007/s11227-021-04257-7. Online publication date: 1-May-2022
