DOI: 10.1145/3123939.3124537

Harnessing voltage margins for energy efficiency in multicore CPUs

Published: 14 October 2017

Abstract

In this paper, we present the first automated system-level analysis of multicore CPUs based on the ARMv8 64-bit architecture (an 8-core, 28nm X-Gene 2 micro-server by AppliedMicro) when pushed to operate in scaled voltage conditions. We report detailed system-level effects, including silent data corruptions (SDCs), corrected and uncorrected errors, and application and system crashes. Our study reveals large voltage margins (that can be harnessed for energy savings) and also large Vmin variation among the 8 cores of the CPU chip, among 3 different chips (a nominal rated and two sigma chips), and among different benchmarks.
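
The Vmin characterization described above can be pictured as a downward voltage sweep for each core and benchmark. The following is a minimal sketch, not the authors' framework: the `Outcome` categories mirror the effects listed in the abstract, while `find_vmin`, `toy_core`, the millivolt values, and the choice to treat any abnormal outcome (including a corrected error) as being below Vmin are illustrative assumptions.

```python
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    """Run outcomes, named after the effects listed in the abstract."""
    NORMAL = "normal"
    CORRECTED_ERROR = "corrected error"
    UNCORRECTED_ERROR = "uncorrected error"
    SDC = "silent data corruption"
    APP_CRASH = "application crash"
    SYSTEM_CRASH = "system crash"


def find_vmin(run_at: Callable[[int], Outcome],
              nominal_mv: int,
              floor_mv: int,
              step_mv: int = 5) -> Optional[int]:
    """Sweep downward from nominal_mv and return the lowest voltage (mV)
    reached before the first abnormal outcome.

    Returns None if even the nominal voltage is not NORMAL.
    Assumption: any outcome other than NORMAL ends the sweep.
    """
    vmin = None
    for mv in range(nominal_mv, floor_mv - 1, -step_mv):
        if run_at(mv) is not Outcome.NORMAL:
            break
        vmin = mv
    return vmin


def toy_core(mv: int) -> Outcome:
    """Hypothetical core model with made-up thresholds (not measured data)."""
    if mv >= 900:
        return Outcome.NORMAL
    if mv >= 890:
        return Outcome.CORRECTED_ERROR
    if mv >= 880:
        return Outcome.SDC
    return Outcome.SYSTEM_CRASH


if __name__ == "__main__":
    print("Vmin (mV):", find_vmin(toy_core, nominal_mv=980, floor_mv=800))
```

Repeating such a sweep per core, per chip, and per benchmark would yield the kind of Vmin variation table the abstract refers to. A real campaign would also need repeated runs at each voltage step, which this sketch omits.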
Apart from the Vmin analysis, we propose a new composite metric (severity) that aggregates the behavior of cores when undervolted and can support system operation and design protection decisions. Our undervolting characterization findings are the first reported analysis for an enterprise-class 64-bit ARMv8 platform, and we highlight key differences from previous studies on x86 platforms. We use the results of the system characterization, along with performance counter information, to measure the accuracy of prediction models for the behavior of benchmarks running on particular cores. Finally, we discuss how the detailed characterization and the prediction results can be used to support design and system software decisions that harness voltage margins for energy efficiency while preserving operation correctness. Our findings show that, on average, 19.4% energy savings can be achieved without compromising performance, while with a 25% performance reduction, the energy savings rise to 38.8%.
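
As a rough sanity check on the reported savings, one can ask what supply-voltage reduction they would imply under a textbook first-order model. The sketch below assumes that energy for a fixed amount of work is dominated by dynamic energy: power scales as C·V²·f and run time as 1/f, so energy scales as V² regardless of frequency. This model ignores leakage and is an assumption of this example, not the method used in the paper. The function name `voltage_ratio_for_energy_saving` is hypothetical.

```python
import math


def voltage_ratio_for_energy_saving(saving: float) -> float:
    """Return V / V_nominal implied by a fractional energy saving,
    assuming energy for fixed work scales with V**2 (dynamic energy only).
    """
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return math.sqrt(1.0 - saving)


if __name__ == "__main__":
    # Savings figures taken from the abstract; labels describe the scenario.
    scenarios = (
        ("no performance loss", 0.194),
        ("25% performance reduction", 0.388),
    )
    for label, saving in scenarios:
        ratio = voltage_ratio_for_energy_saving(saving)
        drop_pct = (1.0 - ratio) * 100.0
        print(f"{label}: V/V_nominal ~ {ratio:.3f} "
              f"(about {drop_pct:.1f}% lower voltage)")
```

Under this simplified model, the 19.4% figure corresponds to roughly a 10% lower supply voltage and the 38.8% figure to roughly 22%. Because leakage energy grows with longer run times, the actual voltages behind the reported numbers may differ.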

Published In

MICRO-50 '17: Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture
October 2017
850 pages
ISBN:9781450349529
DOI:10.1145/3123939

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. energy efficiency
  2. error detection and correction
  3. micro-servers
  4. multicore CPUs characterization
  5. power consumption
  6. voltage and frequency scaling

Qualifiers

  • Research-article

Funding Sources

  • European Union

Conference

MICRO-50
Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%
