DOI: 10.1145/3623278.3624756
research-article
Open access

Predict; Don't React for Enabling Efficient Fine-Grain DVFS in GPUs

Published: 07 February 2024

Abstract

With the continuous improvement of on-chip integrated voltage regulators (IVRs) and fast, adaptive frequency control, dynamic voltage-frequency scaling (DVFS) transition times have shrunk from the microsecond to the nanosecond regime, providing immense opportunity to improve energy efficiency. The key to capitalizing on this continued improvement in V/f circuit technology is the creation of new, smarter DVFS mechanisms that better adapt to rapid fluctuations in workload demand.
It is particularly important to optimize fine-grain DVFS mechanisms for graphics processing units (GPUs) as these chips become ever more important workhorses in the datacenter. However, the massive thread-level parallelism of GPUs makes it uniquely difficult to determine the optimal V/f state at run-time. Existing solutions, mostly designed for single-threaded CPUs and longer time scales, fail to consider the seemingly chaotic, highly varying nature of GPU workloads at short time scales.
This paper proposes a novel prediction mechanism, PCSTALL, that is tailored for emerging DVFS capabilities in GPUs and achieves near-optimal energy efficiency. Using the insights from our fine-grained workload analysis, we propose a wavefront-level program counter (PC) based DVFS mechanism that improves program behavior prediction accuracy by 32% on average compared to the best-performing prior predictor for a wide set of GPU applications at 1 μs DVFS time epochs. Compared to the current state of the art, our PC-based technique achieves a 19% average improvement when optimized for Energy-Delay² Product (ED²P) at 50 μs time epochs, reaching 32% when operated with 1 μs DVFS technologies.
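To make the ED²P objective and the idea of PC-keyed prediction concrete, the following is a minimal Python sketch. It is not the paper's PCSTALL design, whose details are not given in this abstract. The delay model (only non-stalled time scales with 1/f), the energy model (dynamic power = C·V²·f, static power ignored), the three V/f operating points, the PC values, and every name in the code (`VFState`, `PcIndexedTable`, `best_state`, and so on) are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VFState:
    """Hypothetical voltage/frequency operating point (not from the paper)."""
    voltage_v: float
    freq_ghz: float


def ed2p(energy_j: float, delay_s: float) -> float:
    """Energy-Delay^2 Product: energy times the square of the delay."""
    return energy_j * delay_s ** 2


def predicted_delay(stall_fraction, base_delay_s, base_freq_ghz, state):
    """Assumed toy model: only the non-stalled share of time scales with 1/f."""
    compute = (1.0 - stall_fraction) * base_delay_s * base_freq_ghz / state.freq_ghz
    stalled = stall_fraction * base_delay_s
    return compute + stalled


def predicted_energy(delay_s, state, c_eff_f=1e-9):
    """Assumed toy model: dynamic power = C * V^2 * f, static power ignored."""
    power_w = c_eff_f * state.voltage_v ** 2 * state.freq_ghz * 1e9
    return power_w * delay_s


class PcIndexedTable:
    """Last-value table keyed by program counter (illustrative only)."""

    def __init__(self, default=0.5):
        self._last_seen = {}
        self._default = default

    def predict(self, pc):
        # Fall back to a default for PCs that have not been observed yet.
        return self._last_seen.get(pc, self._default)

    def update(self, pc, observed_stall_fraction):
        self._last_seen[pc] = observed_stall_fraction


def best_state(states, stall_fraction, base_delay_s=1e-6, base_freq_ghz=1.0):
    """Pick the operating point with the lowest predicted ED^2P for one epoch."""
    def score(state):
        delay = predicted_delay(stall_fraction, base_delay_s, base_freq_ghz, state)
        return ed2p(predicted_energy(delay, state), delay)
    return min(states, key=score)


# Hypothetical operating points and PCs, for demonstration only.
STATES = [VFState(0.7, 1.0), VFState(0.9, 1.5), VFState(1.1, 2.0)]

table = PcIndexedTable()
table.update(0x1A0, 0.0)  # code at this PC was fully compute-bound last time
table.update(0x2B4, 1.0)  # code at this PC was fully stalled last time
compute_choice = best_state(STATES, table.predict(0x1A0))
memory_choice = best_state(STATES, table.predict(0x2B4))
```

Under these toy assumptions, the compute-bound PC selects the fastest state (1.1 V, 2.0 GHz), because delay enters ED²P squared, while the fully stalled PC selects the slowest state (0.7 V, 1.0 GHz), because extra frequency costs energy without shortening the epoch.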

Cited By

  • (2023) Towards Improved Power Management in Cloud GPUs. IEEE Computer Architecture Letters 22:2 (141-144). DOI: 10.1109/LCA.2023.3278652. Online publication date: 1-Jul-2023

Published In

ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4
March 2023
430 pages
ISBN: 9798400703942
DOI: 10.1145/3623278
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dynamic voltage frequency scaling
  2. graphics processing unit

Qualifiers

  • Research-article

Conference

ASPLOS '23

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%
