DOI: 10.1145/2482767.2482772
research-article

A shared-FPU architecture for ultra-low power MPSoCs

Published: 14 May 2013

Abstract

In this work we propose a shared floating-point unit (FPU) architecture for ultra-low-power (ULP) systems-on-chip operating at near-threshold voltage (NTV). High-performance FPUs are large and complex, yet their utilization is relatively low, so giving every core in a ULP multicore its own FPU is costly and power-hungry. In our approach, a few FPUs are shared among all the cores in the system. This increases FPU utilization and leads to a more energy-efficient design. As part of our approach, we propose two FPU allocation techniques: optimal and random.
Experimental results demonstrate that, compared to a traditional private-FPU approach, our technique in a multicore system with 8 processors and 2 shared FPUs can increase performance/(area*power) by 5x for applications with 10% FP operations and by 2.5x for applications with 25% FP operations.
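
The utilization argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's simulator, nor its optimal or random allocation technique, whose details the abstract does not give. It assumes single-cycle FPUs, cores that issue an FP operation with a fixed probability each cycle, and a hypothetical arbiter that picks winners uniformly at random when more cores request an FPU than there are FPUs. The function name and all parameters are illustrative assumptions.

```python
import random


def simulate_shared_fpus(num_cores=8, num_fpus=2, fp_fraction=0.10,
                         cycles=10_000, seed=0):
    """Toy cycle-level model of cores contending for shared FPUs.

    Assumptions (not taken from the paper): every FPU completes an
    operation in one cycle, a core waiting for an FPU issues nothing
    else, and contention is resolved by a uniformly random pick.

    Returns (fpu_utilization, stall_fraction).
    """
    rng = random.Random(seed)
    waiting = [False] * num_cores   # True while a core waits for an FPU
    granted_total = 0               # FP operations served by any FPU
    stall_cycles = 0                # core-cycles lost waiting for an FPU

    for _ in range(cycles):
        # Cores that are not already waiting may issue a new FP operation.
        for core in range(num_cores):
            if not waiting[core] and rng.random() < fp_fraction:
                waiting[core] = True

        # Randomly choose which requesting cores get an FPU this cycle.
        requesters = [c for c in range(num_cores) if waiting[c]]
        rng.shuffle(requesters)
        winners = requesters[:num_fpus]
        for core in winners:
            waiting[core] = False

        granted_total += len(winners)
        stall_cycles += len(requesters) - len(winners)

    utilization = granted_total / (num_fpus * cycles)
    stall_fraction = stall_cycles / (num_cores * cycles)
    return utilization, stall_fraction


if __name__ == "__main__":
    # Compare 2 shared FPUs against one private FPU per core.
    for fpus in (2, 8):
        util, stalls = simulate_shared_fpus(num_fpus=fpus)
        print(f"{fpus} FPUs: utilization={util:.3f}, stalls={stalls:.4f}")
```

With `num_fpus` equal to `num_cores` the model reduces to the private-FPU case, in which no core ever stalls and each FPU is busy only about as often as its core issues FP operations. Reducing `num_fpus` raises per-FPU utilization at the cost of some stall cycles. Area and power are not modeled here, so the sketch says nothing about the performance/(area*power) figures reported above.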

      Published In

      CF '13: Proceedings of the ACM International Conference on Computing Frontiers
      May 2013
      302 pages
      ISBN:9781450320535
      DOI:10.1145/2482767

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. multicore
      2. multicore interconnect
      3. shared FPU

      Qualifiers

      • Research-article

      Conference

      CF'13: Computing Frontiers Conference
      May 14 - 16, 2013
      Ischia, Italy

      Acceptance Rates

      CF '13 Paper Acceptance Rate 26 of 49 submissions, 53%;
      Overall Acceptance Rate 273 of 785 submissions, 35%
