Article
DOI: 10.1145/997163.997175

Speculative software management of datapath-width for energy optimization

Published: 11 June 2004
  • Abstract

    This paper evaluates managing the processor's datapath-width at the compiler level by exploiting dynamic narrow-width operands. We capitalize on the frequent occurrence of these operands in multimedia programs to build static narrow-width regions that may be directly exposed to the compiler. We propose to augment the ISA with instructions that directly expose the datapath and register widths to the compiler. Simple exception management allows this exposure to be only speculative. In this way, we permit the software to speculatively accommodate the execution of a program on a narrower datapath-width in order to save energy. For this purpose, we introduce a novel register file organization, the byte-slice register file, which allows the width of the register file to be dynamically reconfigured, providing both static and dynamic energy savings. We show that by combining the advantages of the byte-slice register file with those of clock-gating the datapath on a per-region basis, up to 17% of the datapath dynamic energy can be saved, while a 22% reduction of the register file static energy is achieved.
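
To make the notion of a narrow-width operand concrete, here is a minimal sketch. It is not the paper's algorithm or ISA: it only illustrates how many byte slices of a word a region's operands would need. The 32-bit word, signed two's-complement values, and the names `significant_bytes`, `region_width`, and `gated_slice_fraction` are all assumptions made for illustration.

```python
def significant_bytes(value: int, word_bytes: int = 4) -> int:
    """Return the smallest byte count that holds `value` in two's complement.

    Assumes a signed machine word of `word_bytes` bytes (hypothetical default: 4).
    """
    word_bits = 8 * word_bytes
    if not -(1 << (word_bits - 1)) <= value < (1 << (word_bits - 1)):
        raise ValueError("value does not fit in the assumed machine word")
    for n in range(1, word_bytes + 1):
        bits = 8 * n
        # A value fits in n bytes if it lies in [-2^(bits-1), 2^(bits-1) - 1].
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return n
    return word_bytes


def region_width(operands, word_bytes: int = 4) -> int:
    """Widest operand in a region decides how many byte slices must stay on."""
    return max((significant_bytes(v, word_bytes) for v in operands), default=1)


def gated_slice_fraction(operands, word_bytes: int = 4) -> float:
    """Fraction of byte slices that would be idle for this region."""
    return (word_bytes - region_width(operands, word_bytes)) / word_bytes
```

For example, a region whose operands are `[3, -7, 200]` needs two byte slices under these assumptions, so half of a 4-byte datapath would be idle. The sketch says nothing about the speculation and exception handling described in the abstract, and the idle-slice fraction is not an energy estimate.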




      Published In

      LCTES '04: Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
      June 2004
      276 pages
      ISBN: 1581138067
      DOI: 10.1145/997163
      • ACM SIGPLAN Notices, Volume 39, Issue 7
        LCTES '04
        July 2004
        265 pages
        ISSN: 0362-1340
        EISSN: 1558-1160
        DOI: 10.1145/998300
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. clock-gating
      2. compiler
      3. energy management
      4. narrow-width regions
      5. reconfigurable computing
      6. speculative execution

      Qualifiers

      • Article

      Conference

      LCTES04

      Acceptance Rates

      Overall Acceptance Rate 116 of 438 submissions, 26%

      Cited By

      • (2019) Compiling Efficiently with Arithmetic Emulation for the Custom-Width Connex Vector Processor. Proceedings of the 5th Workshop on Programming Models for SIMD/Vector Processing, pp. 1-8. DOI: 10.1145/3303117.3306166. Online publication date: 16-Feb-2019
      • (2011) Global productiveness propagation. ACM SIGPLAN Notices 46:5, pp. 161-170. DOI: 10.1145/2016603.1967700. Online publication date: 11-Apr-2011
      • (2011) Global productiveness propagation. Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems, pp. 161-170. DOI: 10.1145/1967677.1967700. Online publication date: 11-Apr-2011
      • (2008) Memory-Link Compression Schemes. IEEE Transactions on Computers 57:7, pp. 916-927. DOI: 10.1109/TC.2008.28. Online publication date: 1-Jul-2008
      • (2007) Exploiting Narrow Accelerators with Data-Centric Subgraph Mapping. Proceedings of the International Symposium on Code Generation and Optimization, pp. 341-353. DOI: 10.1109/CGO.2007.11. Online publication date: 11-Mar-2007
      • (2007) Energy Management Techniques for SOC Design. Essential Issues in SOC Design, pp. 177-223. DOI: 10.1007/1-4020-5352-5_6. Online publication date: 2007
      • (2004) Bit-sliced datapath for energy-efficient high performance microprocessors. Proceedings of the 4th international conference on Power-Aware Computer Systems, pp. 30-45. DOI: 10.1007/11574859_3. Online publication date: 5-Dec-2004
      • (2009) Energy-Aware Compiler Optimizations. The Compiler Design Handbook, pp. 7-1 to 7-36. DOI: 10.1201/9781420043839.ch7. Online publication date: 7-Dec-2009
