MPack: global memory optimization for stream applications in high-level synthesis

Authors:

Jasmina Vasiljevic,

Paul Chow

FPGA '14: Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays

Pages 233 - 236

https://doi.org/10.1145/2554688.2554761

Published: 26 February 2014

Abstract

One of the challenges in designing high-performance FPGA applications is fine-tuning how limited on-chip memory is shared among the many buffers in an application. To achieve the desired performance, the designer faces the burden of packaging such buffers into on-chip memories and manually optimizing the utilization of each memory and the throughput of each buffer. In addition, the application memories may not match the word width or depth of the physical on-chip memories available on the FPGA. This process is time-consuming and non-trivial, particularly with a large number of buffers of various depths and bit widths. We propose a tool, MPack, which globally optimizes on-chip memory use across all buffers for stream applications. The goal is to shorten development time by providing rapid design space exploration and relieving the designer of lengthy low-level iterations. We introduce new high-level pragmas that allow the user to specify global memory requirements, such as an application's on-chip memory budget and data throughput. The user can quickly generate a large number of memory solutions and explore the trade-off between memory usage and achievable throughput. To demonstrate the effectiveness of our tool, we apply the new high-level pragmas to an image processing benchmark. MPack effectively explores the design space and produces a large number of memory solutions ranging from 10% to 100% in throughput and from 12% to 100% in on-chip memory usage.
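The mismatch between a buffer's shape and the shape of a physical on-chip memory can be made concrete with a small calculation. The sketch below is not MPack's algorithm; it only counts how many block RAM primitives a single buffer would occupy if mapped on its own, and how much of that capacity it actually uses. The aspect ratios in `BRAM_CONFIGS`, the 36 Kb capacity, and the function names are assumptions chosen for illustration and should be checked against the target device's memory guide.

```python
from math import ceil

# Assumed (depth, width) aspect ratios of one block RAM primitive.
# Illustrative values only; consult the device documentation.
BRAM_CONFIGS = [
    (32768, 1), (16384, 2), (8192, 4), (4096, 9),
    (2048, 18), (1024, 36), (512, 72),
]

def brams_needed(depth, width, configs=BRAM_CONFIGS):
    """Return (count, config) for the aspect ratio that needs the
    fewest primitives when the buffer is mapped by itself."""
    best = None
    for cfg_depth, cfg_width in configs:
        # Primitives are tiled in depth and in width to cover the buffer.
        count = ceil(depth / cfg_depth) * ceil(width / cfg_width)
        if best is None or count < best[0]:
            best = (count, (cfg_depth, cfg_width))
    return best

def utilization(depth, width, count, bits_per_bram=36 * 1024):
    """Fraction of the allocated block RAM bits the buffer really uses."""
    return (depth * width) / (count * bits_per_bram)

# Hypothetical image line buffer: 640 words of 24 bits.
count, cfg = brams_needed(640, 24)
print(count, cfg)                               # 1 (1024, 36)
print(round(utilization(640, 24, count), 3))    # 0.417
```

In this hypothetical case the buffer fills well under half of the primitive it occupies, which is the kind of waste that sharing one physical memory among several buffers aims to recover, at the cost of the buffers competing for the memory's ports and therefore for throughput.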

Index Terms

MPack: global memory optimization for stream applications in high-level synthesis
1. Hardware
  1. Hardware validation

Published In

FPGA '14: Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays

February 2014

272 pages

ISBN:9781450326711

DOI:10.1145/2554688

General Chair: Vaughn Betz, University of Toronto, Canada

Program Chair: George A. Constantinides, Imperial College London, UK

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

FPGA'14

Sponsor:

SIGDA

FPGA'14: The 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 26 - 28, 2014

Monterey, California, USA

Acceptance Rates

FPGA '14 Paper Acceptance Rate 30 of 110 submissions, 27%;

Overall Acceptance Rate 125 of 627 submissions, 20%
