research-article
Open access

Maximizing Heterogeneous Processor Performance Under Power Constraints

Published: 17 September 2016

Abstract

Heterogeneous processors (e.g., ARM’s big.LITTLE) improve performance in power-constrained environments by executing applications on the ‘little’ low-power core and moving them to the ‘big’ high-performance core when power budget is available. The total time an application spends on the big core depends on the rate at which it dissipates the available power budget. When applications with different big-core power consumption characteristics execute concurrently on a heterogeneous processor, it is best to give a larger share of the power budget to applications that can run longer on the big core, and a smaller share to applications that can run only briefly on the big core.
This article investigates mechanisms to manage the available power budget on power-constrained heterogeneous processors. We show that existing proposals that schedule applications onto a big core based on various performance metrics perform poorly, because these strategies do not optimize over an entire power period and are unaware of the applications’ power/performance characteristics. We use linear programming to design the DPDP power-management technique, which guarantees optimal performance on heterogeneous processors. We mathematically derive a metric, Delta Performance by Delta Power (DPDP), that takes into account the power/performance characteristics of each running application and allows our power-management technique to decide, at minimal overhead, how best to distribute the available power budget among the co-running applications. Our evaluations with a 4-core heterogeneous processor consisting of big.LITTLE pairs show that DPDP improves performance by 16% on average, and by up to 40%, compared to a strategy that globally and greedily optimizes the power budget. We also show that DPDP outperforms existing heterogeneous scheduling policies that use performance metrics to decide how best to schedule applications on the big core.
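The abstract does not give the linear program itself, so the following is only an illustrative sketch of how a ratio of performance gain to extra power could drive budget distribution; it is not the article's formulation. It assumes each application is described by four hypothetical numbers (performance and power on each core type), that the big core is both faster and more power hungry, and that an application may spend any fraction of a power period on the big core. Under those assumptions the problem is a fractional knapsack, which is solved by serving applications in decreasing order of the ratio. The names `App`, `dpdp`, and `allocate_big_core_time` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical per-application profile (all fields are assumptions)."""
    name: str
    perf_little: float   # performance on the little core (arbitrary units)
    perf_big: float      # performance on the big core (same units)
    power_little: float  # power drawn on the little core (watts)
    power_big: float     # power drawn on the big core (watts)

    @property
    def delta_perf(self) -> float:
        return self.perf_big - self.perf_little

    @property
    def delta_power(self) -> float:
        return self.power_big - self.power_little

    @property
    def dpdp(self) -> float:
        # Performance gained per extra watt when moving to the big core.
        return self.delta_perf / self.delta_power


def allocate_big_core_time(apps, extra_budget):
    """Return {app name: fraction of the power period spent on the big core}.

    extra_budget is the average power available on top of running every
    application on its little core for the whole period.
    """
    for app in apps:
        if app.delta_perf <= 0 or app.delta_power <= 0:
            raise ValueError(f"{app.name}: big core must be faster and hungrier")

    fractions = {app.name: 0.0 for app in apps}
    remaining = extra_budget
    # Serve applications in decreasing order of gain per extra watt.
    for app in sorted(apps, key=lambda a: a.dpdp, reverse=True):
        if remaining <= 0:
            break
        fraction = min(1.0, remaining / app.delta_power)
        fractions[app.name] = fraction
        remaining -= fraction * app.delta_power
    return fractions
```

For instance, with two hypothetical applications, one gaining 1.0 performance units for 2.0 extra watts and one gaining 0.5 units for 0.5 extra watts, and 1.5 W of extra budget, the second application stays on the big core for the whole period and the first for half of it.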

Cited By

  • (2022) Adaptive Power Shifting for Power-Constrained Heterogeneous Systems. IEEE Transactions on Computers, 1-1. DOI: 10.1109/TC.2022.3174545. Online publication date: 2022
  • (2020) CuttleSys: Data-Driven Resource Management for Interactive Services on Reconfigurable Multicores. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 650-664. DOI: 10.1109/MICRO50266.2020.00060. Online publication date: Oct-2020
  • (2019) Tangram. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 384-398. DOI: 10.1145/3352460.3358285. Online publication date: 12-Oct-2019

Published In

ACM Transactions on Architecture and Code Optimization, Volume 13, Issue 3
September 2016
207 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2988523
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 September 2016
Accepted: 01 July 2016
Revised: 01 July 2016
Received: 01 May 2016
Published in TACO Volume 13, Issue 3

Author Tags

  1. DPDP
  2. Heterogeneous chip multiprocessors
  3. power management
  4. scheduling

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • European Research Council under the European Community’s Seventh Framework Programme (FP7/2007-2013)/ERC
