Power-aware multi-core simulation for early design stage hardware/software co-optimization

W Heirman, S Sarkar, TE Carlson, I Hur… - Proceedings of the 21st …, 2012 - dl.acm.org
Stringent performance targets and power constraints push designers towards building specialized workload-optimized systems across a broad spectrum of the computing arena, including supercomputing applications as exemplified by the IBM BlueGene and Intel MIC architectures. In this paper, we make the case for hardware/software co-design during early design stages of processors for scientific computing applications. Considering an important scientific kernel, namely stencil computation, we demonstrate that performance and energy efficiency can be improved by factors of 1.66X and 1.25X, respectively, by co-optimizing hardware and software.
To enable hardware/software co-design in early stages of the design cycle, we propose a novel simulation infrastructure that combines high-abstraction performance simulation using Sniper with power modeling using McPAT and custom DRAM power models. Sniper/McPAT is fast (simulation speed is around 2 MIPS on an 8-core host machine) because it uses analytical modeling to abstract away core performance during multi-core simulation. We demonstrate Sniper/McPAT's accuracy through validation against real hardware; we report average performance and power prediction errors of 22.1% and 8.3%, respectively, for a set of SPEComp benchmarks.
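As a rough illustration of how an average prediction error like the ones quoted above might be computed, the sketch below averages the absolute relative error between simulated and measured values across benchmarks. This is an assumption: the abstract does not state the exact error metric. The function names and all numbers are hypothetical, not taken from the paper or from Sniper/McPAT.

```python
from statistics import mean


def mean_abs_pct_error(predicted, measured):
    """Average of |predicted - measured| / measured over all benchmarks, in percent.

    Assumed metric: the abstract does not say how its errors are averaged.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return 100.0 * mean(abs(p - m) / m for p, m in zip(predicted, measured))


def energy_joules(avg_power_watts, runtime_seconds):
    """Energy is average power multiplied by runtime."""
    return avg_power_watts * runtime_seconds


# Hypothetical per-benchmark runtimes in seconds (not from the paper).
sim_runtime = [10.0, 22.0, 9.0]   # predicted by a simulator
hw_runtime = [8.0, 20.0, 10.0]    # measured on hardware

runtime_error = mean_abs_pct_error(sim_runtime, hw_runtime)
print(f"average runtime prediction error: {runtime_error:.1f}%")
```

The same helper applies to power by passing predicted and measured watts. `energy_joules` shows why both quantities matter for co-optimization: an energy estimate inherits error from the runtime prediction as well as the power prediction.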