DOI: 10.1145/2333660.2333747

CHARM: a composable heterogeneous accelerator-rich microprocessor

Published: 30 July 2012

Abstract

This work presents CHARM, a Composable Heterogeneous Accelerator-Rich Microprocessor design that provides scalability, flexibility, and design reuse in the space of accelerator-rich CMPs. CHARM features a hardware structure called the accelerator block composer (ABC), which can dynamically compose a set of accelerator building blocks (ABBs) into a loosely coupled accelerator (LCA) to provide orders-of-magnitude improvements in performance and power efficiency. Our software infrastructure provides a data flow graph to describe the composition, and our hardware components dynamically map available resources to the data flow graph to compose the accelerator from components that may be physically distributed across the CMP. Our ABC can also balance load among available compute resources to increase accelerator utilization. On medical imaging benchmarks, our experimental results show an average speedup of 2.1X (best case 3.7X) compared to approaches that use LCAs together with a hardware resource manager. Energy consumption is also reduced (average 2.4X; best case 4.7X).
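
The composition step described in the abstract can be pictured with a small software model. The sketch below is only an illustration under stated assumptions, not the paper's hardware algorithm: the `ABB` class, the `compose` function, the block kinds ("mul", "add"), and the least-loaded selection policy are all hypothetical names and choices introduced here. It maps each node of a data flow graph to an available building block of the required kind and balances load by picking the block with the fewest nodes already assigned.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class ABB:
    """Hypothetical accelerator building block (illustrative model only)."""
    name: str
    kind: str      # assumed block type, e.g. "mul" or "add"
    load: int = 0  # number of data flow graph nodes mapped to this block


def compose(dfg, abbs):
    """Map each data flow graph node to the least-loaded ABB of its kind.

    dfg:  {node: (kind, [predecessor nodes])}
    abbs: list of ABB instances that are available for composition
    Returns {node: ABB name}; raises LookupError if no block of a kind exists.
    """
    # Visit nodes so that every producer is mapped before its consumers.
    order = TopologicalSorter(
        {node: deps for node, (_, deps) in dfg.items()}
    ).static_order()

    mapping = {}
    for node in order:
        kind = dfg[node][0]
        candidates = [a for a in abbs if a.kind == kind]
        if not candidates:
            raise LookupError(f"no ABB of kind {kind!r} for node {node!r}")
        # Assumed policy: simple load balancing by current assignment count.
        chosen = min(candidates, key=lambda a: a.load)
        chosen.load += 1
        mapping[node] = chosen.name
    return mapping


# Tiny example: two multiplies feeding one add.
dfg = {
    "a": ("mul", []),
    "b": ("mul", []),
    "c": ("add", ["a", "b"]),
}
abbs = [ABB("mul0", "mul"), ABB("mul1", "mul"), ABB("add0", "add")]
lca = compose(dfg, abbs)
```

In this toy run the two multiply nodes land on different multiplier blocks and the add node lands on the only adder, which is the kind of utilization spreading the abstract attributes to the ABC. A real composer would also have to account for physical placement across the CMP and for blocks that are busy serving other requests, neither of which is modeled here.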




    Published In

    ISLPED '12: Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
    July 2012
    438 pages
    ISBN:9781450312493
    DOI:10.1145/2333660
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. accelerator composition
    2. chip multiprocessor
    3. hardware accelerators

    Qualifiers

    • Research-article

    Conference

    ISLPED'12: International Symposium on Low Power Electronics and Design
    July 30 - August 1, 2012
    Redondo Beach, California, USA

    Acceptance Rates

    Overall Acceptance Rate 398 of 1,159 submissions, 34%


    Cited By

    • (2024) CHARM 2.0: Composing Heterogeneous Accelerators for Deep Learning on Versal ACAP Architecture. ACM Transactions on Reconfigurable Technology and Systems 17:3 (1-31). DOI: 10.1145/3686163. Online publication date: 5-Aug-2024
    • (2024) Mozart: Taming Taxes and Composing Accelerators with Shared-Memory. Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques (183-200). DOI: 10.1145/3656019.3676896. Online publication date: 14-Oct-2024
    • (2023) An Analysis of Accelerator Data-Transfer Modes in NoC-Based SoC Architectures. 2023 IEEE High Performance Extreme Computing Conference (HPEC) (1-7). DOI: 10.1109/HPEC58863.2023.10363546. Online publication date: 25-Sep-2023
    • (2023) DAG Processing Unit Version 2 (DPU-v2): Efficient Execution of Irregular Workloads on a Spatial Datapath. Efficient Execution of Irregular Dataflow Graphs (89-123). DOI: 10.1007/978-3-031-33136-7_5. Online publication date: 26-Apr-2023
    • (2022) Learning from the Past: Efficient High-level Synthesis Design Space Exploration for FPGAs. ACM Transactions on Design Automation of Electronic Systems 27:4 (1-23). DOI: 10.1145/3495531. Online publication date: 12-Feb-2022
    • (2022) DPU-v2: Energy-efficient execution of irregular directed acyclic graphs. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO) (1288-1307). DOI: 10.1109/MICRO56248.2022.00090. Online publication date: Oct-2022
    • (2021) HARDROID: Transparent Integration of Crypto Accelerators in Android. 2021 IEEE High Performance Extreme Computing Conference (HPEC) (1-8). DOI: 10.1109/HPEC49654.2021.9622875. Online publication date: 20-Sep-2021
    • (2020) Decentralized Offload-based Execution on Memory-centric Compute Cores. Proceedings of the International Symposium on Memory Systems (61-76). DOI: 10.1145/3422575.3422778. Online publication date: 28-Sep-2020
    • (2020) Domain-specific hardware accelerators. Communications of the ACM 63:7 (48-57). DOI: 10.1145/3361682. Online publication date: 18-Jun-2020
    • (2020) High-Level Synthesis Design Space Exploration: Past, Present, and Future. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 39:10 (2628-2639). DOI: 10.1109/TCAD.2019.2943570. Online publication date: Oct-2020
