Article

Bit-sliced datapath for energy-efficient high performance microprocessors

Authors:

Prateek Pujara,

Aneesh Aggarwal

PACS'04: Proceedings of the 4th international conference on Power-Aware Computer Systems

Pages 30 - 45

https://doi.org/10.1007/11574859_3

Published: 05 December 2004

Abstract

In recent years, both power and performance have become important in the design of microprocessors. In this paper, we investigate exploiting small data values to build energy-efficient, high-performance microprocessors. For this purpose, we bit-slice the execution core (which includes the functional units, register files, data caches, and forwarding logic), so that small portions of the data are operated upon in different bit-slices. The bit-slices operating upon the higher order bits are activated only if required, which saves significant energy. We also investigate further advantages facilitated by bit-slicing, such as energy savings obtained by reducing the number of ports provided in the higher order bit-slices and by “shutting off” bit-slices to reduce leakage energy consumption. We found that significant energy savings can be obtained in the register file (about 20%) and the Level-1 data cache (about 30%), with a negligible loss of only about 2% in instruction throughput. Our studies also showed almost 20% savings in register file leakage energy consumption when the unneeded higher order bit-slices are “turned off”. Bit-slicing also helps reduce the latency of the different hardware structures, which can facilitate clock speed improvements.
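
The idea of activating higher order bit-slices "only if required" can be illustrated with a small software model. This is a rough sketch under stated assumptions, not the mechanism used in the paper: the abstract does not give the word width, the slice width, or how the hardware detects narrow values. Here we assume a 32-bit two's-complement word split into 8-bit slices, and we treat a slice as unnecessary when it holds only sign-extension bits. The names `slices_needed` and `active_slice_fraction` are hypothetical.

```python
def slices_needed(value: int, word_bits: int = 32, slice_bits: int = 8) -> int:
    """Count the low-order slices that must be active to represent `value`.

    Assumption (not from the paper): a higher order slice can stay inactive
    when all of its bits are sign extension of the slices below it.
    """
    if word_bits % slice_bits != 0:
        raise ValueError("word_bits must be a multiple of slice_bits")

    # Reduce the value to its word_bits-wide two's-complement pattern,
    # then reinterpret that pattern as a signed integer.
    raw = value & ((1 << word_bits) - 1)
    signed = raw - (1 << word_bits) if raw >> (word_bits - 1) else raw

    total_slices = word_bits // slice_bits
    for n in range(1, total_slices + 1):
        bits = n * slice_bits
        lowest = -(1 << (bits - 1))
        highest = (1 << (bits - 1)) - 1
        if lowest <= signed <= highest:
            return n  # value fits in the n lowest slices
    return total_slices


def active_slice_fraction(values, word_bits: int = 32, slice_bits: int = 8) -> float:
    """Fraction of slice activations used, relative to always using every slice."""
    total_slices = word_bits // slice_bits
    used = sum(slices_needed(v, word_bits, slice_bits) for v in values)
    return used / (len(values) * total_slices)


if __name__ == "__main__":
    sample = [5, -1, 300, 70000]  # illustrative operands, not measured data
    for v in sample:
        print(v, slices_needed(v))
    print(active_slice_fraction(sample))
```

In this toy model, a stream dominated by small operands activates only a fraction of the slices. The fraction is a count of slice activations only; it says nothing about actual energy, which depends on circuit-level details the abstract does not describe.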

Cited By

Vicarte J, Shome P, Nayak N, Trippel C, Morrison A, Kohlbrenner D, Fletcher C, Martínez J, Duato J, John L (2021). Opening pandora's box. Proceedings of the 48th Annual International Symposium on Computer Architecture, pp. 347-360. doi:10.1109/ISCA52012.2021.00035. Online publication date: 14-Jun-2021.
https://dl.acm.org/doi/10.1109/ISCA52012.2021.00035
Riemens D, Gaydadjiev G, Zeeuw C, Strydis C (2014). Towards scalable arithmetic units with graceful degradation. ACM Transactions on Embedded Computing Systems, 13:4, pp. 1-26. doi:10.1145/2499367. Online publication date: 10-Mar-2014.
https://dl.acm.org/doi/10.1145/2499367
Pujara P, Aggarwal A (2005). Restrictive Compression Techniques to Increase Level 1 Cache Capacity. Proceedings of the 2005 International Conference on Computer Design, pp. 327-333. doi:10.1109/ICCD.2005.94. Online publication date: 2-Oct-2005.
https://dl.acm.org/doi/10.1109/ICCD.2005.94


Published In


PACS'04: Proceedings of the 4th international conference on Power-Aware Computer Systems

December 2004

181 pages

ISBN: 3540297901

Editors:
Babak Falsafi (Electrical and Computer Engineering, Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA)
T. N. VijayKumar (ECE, Purdue University, West Lafayette, IN)

Publisher

Springer-Verlag

Berlin, Heidelberg


