Joint frequency scaling of processor and DRAM

Published: 01 April 2016

Abstract

Energy efficiency and energy-proportional computing have become a central focus in modern supercomputers. Many previous energy-saving strategies have focused solely on the CPU, while the DRAM subsystem has not been addressed sufficiently, even though memory consumes about 20% of the total power in a typical server platform. This paper describes a novel runtime system that scales the frequency of both the processor and the DRAM based on performance and power models that are also proposed here. Specifically, a performance-loss constraint is first chosen for an application; then an optimal processor-DRAM frequency pair is modeled such that the pair minimizes the energy consumption in a given timeslice. Experiments performed on the SPEC CPU™ 2006, NAS NPB, and pARMS benchmarks demonstrate that the proposed runtime system may obtain total energy savings for both memory- and compute-intensive applications. In particular, as much as 22% of energy was saved with a low performance loss of about 4.8%.
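The selection step outlined in the abstract (fix a performance-loss bound, then pick the processor-DRAM frequency pair with the least energy for the timeslice) can be illustrated with a minimal sketch. The abstract does not give the paper's performance and power models, so `toy_time` and `toy_power` below are hypothetical placeholders, as are the frequency lists, the workload constants, and the function name `pick_frequency_pair`. The exhaustive search over all pairs is likewise an assumption, not necessarily how the authors' runtime works.

```python
from itertools import product

def pick_frequency_pair(cpu_freqs, dram_freqs, predict_time, predict_power, max_loss):
    """Return ((cpu_freq, dram_freq), energy) with the least predicted energy
    among pairs whose predicted slowdown stays within max_loss.

    The slowdown is measured against the pair of highest frequencies.
    """
    baseline = predict_time(max(cpu_freqs), max(dram_freqs))
    best_pair, best_energy = None, float("inf")
    for f_cpu, f_mem in product(cpu_freqs, dram_freqs):
        t = predict_time(f_cpu, f_mem)
        if t > baseline * (1.0 + max_loss):
            continue  # violates the performance-loss constraint
        energy = predict_power(f_cpu, f_mem) * t  # energy = power * time
        if energy < best_energy:
            best_pair, best_energy = (f_cpu, f_mem), energy
    return best_pair, best_energy

# Hypothetical models for a memory-bound timeslice (NOT the paper's models).
def toy_time(f_cpu, f_mem, cpu_work=0.1, mem_work=2.0):
    # Compute time shrinks with CPU frequency, memory time with DRAM frequency.
    return cpu_work / f_cpu + mem_work / f_mem

def toy_power(f_cpu, f_mem):
    # Static power plus a cubic CPU term and a linear DRAM term (illustrative).
    return 20.0 + 5.0 * f_cpu ** 3 + 10.0 * f_mem

CPU_FREQS = [1.2, 1.6, 2.0, 2.4]    # GHz, assumed available steps
DRAM_FREQS = [0.8, 1.066, 1.333]    # GHz, assumed available steps

pair, energy = pick_frequency_pair(CPU_FREQS, DRAM_FREQS, toy_time, toy_power, max_loss=0.05)
```

Under these toy models the memory-bound slice tolerates a low CPU frequency but not a lower DRAM frequency. A compute-bound slice (larger `cpu_work`, smaller `mem_work`) would be expected to favor the opposite trade-off.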

Published In

The Journal of Supercomputing  Volume 72, Issue 4
April 2016
408 pages

Publisher

Kluwer Academic Publishers

United States
