research-article
Open access

Compiler-Directed Power Management for Superscalars

Published: 09 January 2015

Abstract

Modern superscalar CPUs contain large, complex structures and diverse execution units, and their power consumption spans a wide dynamic range. Building a power delivery network for the worst-case power consumption is not energy efficient and is often impossible to fit in small systems. Instantaneous power excursions can cause voltage droops, and power management algorithms are too slow to respond to such instantaneous events. In this article, we propose a novel compiler-directed framework to address this problem. The framework is validated on a 4th Generation Intel® Core™ processor and with a simulator on the output trace. A performance speedup of up to 16% over the baseline is measured for the SPEC CPU2006 benchmarks.

    Reviews

    R. Clayton

    Modern processor architectures have complex, dynamic power demands that are difficult and expensive for the architecture's power distribution network (PDN) to meet. This paper describes a compiler-based analysis that delimits code regions having the potential to create exceptional power demands; the PDN reacts to these regions with actions that compensate for the exceptional demand. System simulations show that delimited regions help the PDN reduce power overloads by 20 percent and overall power demand by 11 percent.

    This work addresses long-term (~10^4 nsec) voltage drops caused by capacitor exhaustion during high current demand. These problems are met either by increasing supply, which spends power, or by reducing demand, which slows execution. A modified LLVM compiler performs static analysis over control-flow graphs to identify regions likely to cause problems. Individual instructions are assigned an empirical maximum energy use, normalized to the cheapest instruction. The analysis identifies and minimizes code regions with excessive power demands; the remaining code is considered safe. A processor emulated the region-delimiting instruction and generated traces for offline simulator analysis. Tests based on SPEC CPU2006 benchmarks identified only power emergencies (no false negatives) with around 94 percent accuracy (six percent false-positive rate). The more precise false-positive rate improved average performance by around 12 percent.

    This paper (section 2 in particular) requires a good grasp of central processing unit (CPU) power management. The static analysis in section 3 is basic and can be easily picked up by a reader who understands the rudiments of compiler-based analysis.

    Online Computing Reviews Service
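The kind of static analysis the review describes (per-instruction energy weights normalized to the cheapest instruction, then a search over the control-flow graph for regions of excessive demand) can be illustrated with a minimal sketch. This is not the paper's LLVM implementation: the weight table `RELATIVE_ENERGY`, the threshold value, the `BasicBlock` type, and the function names are all hypothetical, chosen only to make the idea concrete.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical relative energy weights (assumption, not measured data),
# normalized so that the cheapest instruction class has weight 1.0.
RELATIVE_ENERGY = {
    "nop": 1.0,
    "int_alu": 1.5,
    "load": 2.5,
    "store": 2.5,
    "fp_mul": 4.0,
    "vec_fma": 8.0,
}


@dataclass
class BasicBlock:
    """One node of a toy control-flow graph (illustrative type)."""
    opcodes: list[str]
    successors: list[str] = field(default_factory=list)


def energy_density(block: BasicBlock) -> float:
    """Mean relative energy per instruction in one basic block."""
    if not block.opcodes:
        return 0.0
    return sum(RELATIVE_ENERGY[op] for op in block.opcodes) / len(block.opcodes)


def flag_hot_blocks(cfg: dict[str, BasicBlock], threshold: float) -> set[str]:
    """Names of blocks whose energy density exceeds the (assumed) threshold."""
    return {name for name, block in cfg.items() if energy_density(block) > threshold}


def grow_regions(cfg: dict[str, BasicBlock], hot: set[str]) -> list[list[str]]:
    """Merge flagged blocks that are joined by a CFG edge into regions."""
    regions = []
    unvisited = set(hot)
    while unvisited:
        seed = min(unvisited)  # deterministic starting point
        unvisited.discard(seed)
        region, stack = [seed], [seed]
        while stack:
            cur = stack.pop()
            for other in list(unvisited):
                # Treat an edge in either direction as adjacency.
                if other in cfg[cur].successors or cur in cfg[other].successors:
                    unvisited.discard(other)
                    region.append(other)
                    stack.append(other)
        regions.append(sorted(region))
    return regions


# Toy CFG: a vector-heavy loop between a cheap entry block and a cheap exit.
EXAMPLE_CFG = {
    "entry": BasicBlock(["int_alu", "load"], ["loop"]),
    "loop": BasicBlock(["vec_fma", "vec_fma", "fp_mul"], ["loop_tail"]),
    "loop_tail": BasicBlock(["vec_fma", "load"], ["loop", "exit"]),
    "exit": BasicBlock(["store", "nop"], []),
}

if __name__ == "__main__":
    hot = flag_hot_blocks(EXAMPLE_CFG, threshold=4.0)
    print(grow_regions(EXAMPLE_CFG, hot))
```

In this sketch everything outside the returned regions is treated as safe, mirroring the review's description; a real compiler pass would additionally have to emit the region-delimiting instruction at region entry and exit, which is not modeled here.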

    Information & Contributors

    Information

    Published In

    ACM Transactions on Architecture and Code Optimization  Volume 11, Issue 4
    January 2015
    797 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/2695583
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 January 2015
    Accepted: 01 October 2014
    Revised: 01 October 2014
    Received: 01 June 2014
    Published in TACO Volume 11, Issue 4

    Author Tags

    1. Compiler assisted
    2. energy
    3. power management
    4. power modeling

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 84
    • Downloads (Last 6 weeks): 18
    Reflects downloads up to 12 Nov 2024

    Citations

    Cited By

    • (2022) AgileWatts: An Energy-Efficient CPU Core Idle-State Architecture for Latency-Sensitive Server Applications. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 835-850. DOI: 10.1109/MICRO56248.2022.00063. Online publication date: Oct-2022
    • (2022) DarkGates: A Hybrid Power-Gating Architecture to Mitigate the Performance Impact of Dark-Silicon in High Performance Processors. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 1170-1183. DOI: 10.1109/HPCA53966.2022.00089. Online publication date: Apr-2022
    • (2021) IChannels. Proceedings of the 48th Annual International Symposium on Computer Architecture, 985-998. DOI: 10.1109/ISCA52012.2021.00081. Online publication date: 14-Jun-2021
    • (2020) FlexWatts: A Power- and Workload-Aware Hybrid Power Delivery Network for Energy-Efficient Microprocessors. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 1051-1066. DOI: 10.1109/MICRO50266.2020.00088. Online publication date: Oct-2020
    • (2019) A Comprehensive Evaluation of Power Delivery Schemes for Modern Microprocessors. 20th International Symposium on Quality Electronic Design (ISQED), 123-130. DOI: 10.1109/ISQED.2019.8697544. Online publication date: Mar-2019
    • (2018) Software Static Energy Modeling for Modern Processors. International Journal of Parallel Programming 46:2, 284-312. DOI: 10.1007/s10766-017-0496-z. Online publication date: 1-Apr-2018
    • (2018) Static Power Modeling for Modern Processor. Energy Efficient High Performance Processors, 135-165. DOI: 10.1007/978-981-10-8554-3_5. Online publication date: 23-Mar-2018
    • (2018) Power Modeling at High-Performance Computing Processors. Energy Efficient High Performance Processors, 73-105. DOI: 10.1007/978-981-10-8554-3_3. Online publication date: 23-Mar-2018
    • (2016) Fine-Grain Power Breakdown of Modern Out-of-Order Cores and Its Implications on Skylake-Based Systems. ACM Transactions on Architecture and Code Optimization 13:4, 1-25. DOI: 10.1145/3018112. Online publication date: 16-Dec-2016
