RAMP: resource-aware mapping for CGRAs

Authors:

Mahesh Balasubramanian,

Aviral Shrivastava

DAC '18: Proceedings of the 55th Annual Design Automation Conference

Article No.: 127, Pages 1 - 6

https://doi.org/10.1145/3195970.3196101

Published: 24 June 2018

Abstract

A coarse-grained reconfigurable array (CGRA) is a promising accelerator that can speed up even non-parallel loops. The acceleration a CGRA achieves depends critically on the quality of the mapping of loop operations onto its processing elements (PEs), and in particular on the compiler's ability to route the dependencies among operations. Previous works have explored several mechanisms for routing data dependencies, including routing through other PEs, registers, and memory, and even re-computation. All of these routing options change the graph to be mapped onto the PEs (often by adding new operations), and without re-scheduling it may be impossible to map the new graph. However, existing techniques explore these routing options inside the place-and-route (P&R) phase of compilation, which is performed after the scheduling step. As a result, they may either fail to find a mapping or obtain poor results. Our method, RAMP, explicitly and intelligently explores the various routing options before the scheduling step, improving both mappability and mapping quality. Evaluating the top performance-critical loops of the MiBench benchmarks over 12 architectural configurations, we find that RAMP accelerates loops by 23× over sequential execution, achieving a geomean speedup of 2.13× over the state of the art.
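
The point that routing changes the graph, and can therefore invalidate an existing schedule, can be illustrated with a small sketch. This is not RAMP's algorithm. It is a toy model built on assumptions made here for illustration: PEs are homogeneous, a value can only be handed to a consumer in the next cycle, a dependence spanning more than one cycle is routed by inserting one pass-through operation per extra cycle, and the resource-constrained lower bound on the initiation interval is taken as ceil(operations / PEs). The names `res_mii` and `route_through_pes` are hypothetical.

```python
from math import ceil


def res_mii(num_ops: int, num_pes: int) -> int:
    """Resource-constrained lower bound on the initiation interval (II),
    assuming homogeneous PEs that each execute one operation per cycle."""
    return ceil(num_ops / num_pes)


def route_through_pes(edges, schedule):
    """Insert one routing operation for every extra cycle an edge spans.

    edges    -- list of (producer, consumer) pairs
    schedule -- dict mapping each operation to its scheduled cycle
    Returns the new edge list and the extended schedule.
    """
    new_edges = []
    new_schedule = dict(schedule)
    for u, v in edges:
        prev = u
        # Cycles strictly between producer and consumer each need a hop.
        for t in range(schedule[u] + 1, schedule[v]):
            r = f"route_{u}_{v}_{t}"  # hypothetical name for the routing op
            new_schedule[r] = t
            new_edges.append((prev, r))
            prev = r
        new_edges.append((prev, v))
    return new_edges, new_schedule


# Toy loop body: a chain a -> b -> c -> d plus a long dependence a -> d.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
schedule = {"a": 0, "b": 1, "c": 2, "d": 3}
num_pes = 4  # assumed 2x2 array

routed_edges, routed_schedule = route_through_pes(edges, schedule)

print(res_mii(len(schedule), num_pes))         # 1
print(res_mii(len(routed_schedule), num_pes))  # 2
```

In this toy case the two routing operations added for the a -> d dependence raise the operation count from 4 to 6, so a schedule built for an II of 1 no longer fits on 4 PEs. This is the kind of situation the abstract describes, where routing choices made after scheduling can leave the new graph unmappable.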

Index Terms

RAMP: resource-aware mapping for CGRAs
1. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Hardware accelerators
2. Software and its engineering
  1. Software notations and tools
    1. Compilers

Published In

DAC '18: Proceedings of the 55th Annual Design Automation Conference

June 2018

1089 pages

ISBN:9781450357005

DOI:10.1145/3195970

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

EDAC: Electronic Design Automation Consortium
SIGDA: ACM Special Interest Group on Design Automation
IEEE Council on Electronic Design Automation (CEDA)

In-Cooperation

SIGBED: ACM Special Interest Group on Embedded Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

DAC '18

DAC '18: The 55th Annual Design Automation Conference 2018

June 24 - 29, 2018

San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
