DOI: 10.1145/2854038.2854052
research-article

A black-box approach to energy-aware scheduling on integrated CPU-GPU systems

Published: 29 February 2016

Abstract

Energy efficiency is now a top design goal for all computing systems, from fitness trackers and tablets, where it affects battery life, to cloud computing centers, where it directly impacts operational cost, maintainability, and environmental impact. Today's widespread integrated CPU-GPU processors combine a CPU and a GPU compute device with different power-performance characteristics. For these integrated processors, hardware vendors implement automatic power management policies that are typically not exposed to the end-user. Furthermore, these policies often vary between different processor generations and SKUs. As a result, it is challenging to design a generally-applicable energy-aware runtime to schedule work onto both the CPU and GPU of such integrated CPU-GPU processors to optimize energy consumption. We propose a new black-box scheduling technique to reduce energy use by effectively partitioning work across the CPU and GPU cores of integrated CPU-GPU processors. Our energy-aware scheduler combines a power model with information about the runtime behavior of a specific workload. This power model is computed once for each processor to characterize its power consumption for different kinds of workloads. On two widely different platforms, a high-end desktop system and a low-power tablet, our energy-aware runtime yields an energy-delay product that is 96% and 93%, respectively, of the near-ideal Oracle energy-delay product on a diverse set of workloads.
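
The energy-delay product (EDP) used as the metric above is energy consumed multiplied by execution time, so a lower value is better. The sketch below illustrates how a partitioning decision can be scored by EDP. It is not the scheduler or the power model described in the abstract: the `Device` fields, the linear timing model, the assumption that both devices run concurrently, and the exhaustive search in `best_offload_ratio` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical per-device characterization (illustrative only)."""
    throughput: float    # work items processed per second
    active_power: float  # watts drawn while the device is busy
    idle_power: float    # watts drawn while the device waits


def energy_delay_product(alpha: float, work: float, cpu: Device, gpu: Device) -> float:
    """EDP when a fraction `alpha` of the work runs on the GPU and the
    remainder runs on the CPU, with both devices starting together."""
    t_gpu = alpha * work / gpu.throughput
    t_cpu = (1.0 - alpha) * work / cpu.throughput
    delay = max(t_gpu, t_cpu)  # the slower device determines total time
    energy = (
        gpu.active_power * t_gpu + gpu.idle_power * (delay - t_gpu)
        + cpu.active_power * t_cpu + cpu.idle_power * (delay - t_cpu)
    )
    return energy * delay


def best_offload_ratio(work: float, cpu: Device, gpu: Device, steps: int = 100) -> float:
    """Try evenly spaced GPU offload ratios and return the one with the lowest EDP."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: energy_delay_product(a, work, cpu, gpu))


if __name__ == "__main__":
    # Made-up numbers, not measurements from any real processor.
    cpu = Device(throughput=10.0, active_power=20.0, idle_power=2.0)
    gpu = Device(throughput=30.0, active_power=15.0, idle_power=1.0)
    ratio = best_offload_ratio(work=1000.0, cpu=cpu, gpu=gpu)
    print(f"GPU offload ratio with lowest EDP: {ratio:.2f}")
```

In this toy model the best ratio depends only on the assumed throughput and power figures. On real integrated processors the vendor's power management is hidden and varies by generation, which is why the abstract proposes characterizing each processor once rather than relying on fixed constants like these.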

    Published In

    CGO '16: Proceedings of the 2016 International Symposium on Code Generation and Optimization
    February 2016
    283 pages
    ISBN:9781450337786
    DOI:10.1145/2854038

    Sponsors

    In-Cooperation

    • IEEE-CS: Computer Society

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Energy efficiency
    2. Heterogeneous CPU-GPU scheduling
    3. Power characterization

    Qualifiers

    • Research-article

    Conference

    CGO '16

    Acceptance Rates

    CGO '16 Paper Acceptance Rate: 25 of 108 submissions, 23%
    Overall Acceptance Rate: 312 of 1,061 submissions, 29%

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 39
    • Downloads (Last 6 weeks): 7
    Reflects downloads up to 03 Oct 2024

    Cited By

    • (2024) Energy-Aware Tile Size Selection for Affine Programs on GPUs. Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization (13-27). DOI: 10.1109/CGO57630.2024.10444795. Online publication date: 2-Mar-2024
    • (2023) Energy-Aware Scheduling for High-Performance Computing Systems: A Survey. Energies 16:2 (890). DOI: 10.3390/en16020890. Online publication date: 12-Jan-2023
    • (2021) Mapping Computations in Heterogeneous Multicore Systems with Statistical Regression on Program Inputs. ACM Transactions on Embedded Computing Systems 20:6 (1-35). DOI: 10.1145/3478288. Online publication date: 18-Oct-2021
    • (2021) PCCS: Processor-Centric Contention-aware Slowdown Model for Heterogeneous System-on-Chips. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (1282-1295). DOI: 10.1145/3466752.3480101. Online publication date: 18-Oct-2021
    • (2021) FPGA Resource Pooling in Cloud Computing. IEEE Transactions on Cloud Computing 9:2 (610-626). DOI: 10.1109/TCC.2018.2874011. Online publication date: 1-Apr-2021
    • (2021) AnghaBench. Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization (378-390). DOI: 10.1109/CGO51591.2021.9370322. Online publication date: 27-Feb-2021
    • (2021) Scheduling for heterogeneous systems in accelerator-rich environments. The Journal of Supercomputing. DOI: 10.1007/s11227-021-03883-5. Online publication date: 25-May-2021
    • (2020) YACOS. Proceedings of the 24th Brazilian Symposium on Context-Oriented Programming and Advanced Modularity (56-63). DOI: 10.1145/3427081.3427089. Online publication date: 19-Oct-2020
    • (2020) MEPHESTO. Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques (413-425). DOI: 10.1145/3410463.3414671. Online publication date: 30-Sep-2020
    • (2020) Minimizing Energy of Heterogeneous Computing Systems by Task Scheduling Approach. Journal of Circuits, Systems and Computers. DOI: 10.1142/S0218126620501947. Online publication date: 29-Jan-2020
