DOI: 10.1145/2847263.2847264
research-article

Efficient Memory Partitioning for Parallel Data Access via Data Reuse

Published: 21 February 2016

Abstract

In this paper, we propose an efficient memory partitioning algorithm for parallel data access via data reuse. We observe that in most image and video processing applications, a large amount of data can be reused across iterations of a loop nest. Motivated by this observation, we propose caching the reusable data in on-chip registers. The registers that hold data which would otherwise be re-fetched can be organized as register chains. The non-reusable data are then partitioned into several memory banks by a memory partitioning algorithm. We revise the existing padding method to cover a case that occurs frequently in our method, in which some components of the partition vector are zero. Experimental results demonstrate that, compared with state-of-the-art algorithms, the proposed method reduces the required number of memory banks by 59.8% on average. The resources required for bank mapping are also significantly reduced: the number of LUTs by 78.6%, the number of Flip-Flops by 66.8%, and the number of DSP48Es by 41.7%. Moreover, the storage overhead of the proposed method is zero for most of the widely used access patterns in image filtering.
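
The idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a 3x3 sliding window, a linear bank mapping of the form bank(x) = (alpha . x) mod N, and hypothetical helper names (`conflict_free`, `sweep_with_reuse`). It shows that once the previously fetched columns are held in register chains, only one new column has to come from memory per iteration, so fewer banks are enough for conflict-free parallel access.

```python
from collections import deque


def conflict_free(offsets, alpha, n_banks):
    """Return True if every access offset lands in a distinct bank.

    Assumed mapping: bank(x) = (alpha . x) mod n_banks. A common base
    address shifts all banks by the same amount, so offsets suffice.
    """
    banks = [sum(a * o for a, o in zip(alpha, off)) % n_banks for off in offsets]
    return len(set(banks)) == len(banks)


def sweep_with_reuse(image, top):
    """Slide a 3x3 window along rows top..top+2 of `image`.

    Each row of the window is modelled as a 3-element register chain
    (a deque with maxlen=3). Per step, only the new column is read from
    memory; the other six values are reused from the chains.
    """
    chains = [deque(maxlen=3) for _ in range(3)]
    reads = 0
    windows = []
    for col in range(len(image[0])):
        for r in range(3):
            chains[r].append(image[top + r][col])  # one memory read
            reads += 1
        if col >= 2:  # chains are full: a complete window is available
            windows.append([list(c) for c in chains])
    return windows, reads


# Access patterns as (row, column) offsets.
full_window = [(di, dj) for di in range(3) for dj in range(3)]  # no reuse
new_column = [(di, 2) for di in range(3)]                       # with reuse

if __name__ == "__main__":
    image = [[10 * r + c for c in range(5)] for r in range(3)]
    windows, reads = sweep_with_reuse(image, 0)
    print("windows:", len(windows), "memory reads with reuse:", reads)
    print("9 banks, alpha=(3,1), full window:", conflict_free(full_window, (3, 1), 9))
    print("3 banks, alpha=(1,0), new column :", conflict_free(new_column, (1, 0), 3))
```

In this toy setting the full window needs nine banks, while the new column alone is served by three banks with alpha = (1, 0). That vector has a zero component, which may be the kind of case the revised padding method in the abstract is meant to handle.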

    Published In

    FPGA '16: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
    February 2016
    298 pages
    ISBN:9781450338561
    DOI:10.1145/2847263
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data reuse
    2. high-level synthesis
    3. memory partition

    Qualifiers

    • Research-article

    Funding Sources

    • Chen Guang project supported by Shanghai Municipal Education Commission and Shanghai Education Development Foundation
    • Recruitment Program of Global Experts (the Thousand Talents Plan)
    • NSF
    • National Natural Science Foundation of China

    Conference

    FPGA'16
    Acceptance Rates

    FPGA '16 Paper Acceptance Rate 20 of 111 submissions, 18%;
    Overall Acceptance Rate 125 of 627 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 16
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 10 Nov 2024.

    Cited By

    • (2023) Efficient FPGA-Based Sparse Matrix–Vector Multiplication With Data Reuse-Aware Compression. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42:12 (4606-4617). DOI: 10.1109/TCAD.2023.3281715. Online publication date: Dec-2023
    • (2021) Combining Memory Partitioning and Subtask Generation for Parallel Data Access on CGRAs. Proceedings of the 26th Asia and South Pacific Design Automation Conference (204-209). DOI: 10.1145/3394885.3431414. Online publication date: 18-Jan-2021
    • (2021) Optimized Data Reuse via Reordering for Sparse Matrix-Vector Multiplication on FPGAs. 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (1-9). DOI: 10.1109/ICCAD51958.2021.9643453. Online publication date: 1-Nov-2021
    • (2020) An Efficient Memory Partitioning Approach for Multi-Pattern Data Access in STT-RAM. 2020 IEEE International Symposium on Circuits and Systems (ISCAS) (1-4). DOI: 10.1109/ISCAS45731.2020.9181278. Online publication date: Oct-2020
    • (2019) An Efficient Memory Partitioning Approach for Multi-Pattern Data Access via Data Reuse. ACM Transactions on Reconfigurable Technology and Systems 12:1 (1-22). DOI: 10.1145/3301296. Online publication date: 5-Feb-2019
    • (2018) Bit-Level Disturbance-Aware Memory Partitioning for Parallel Data Access for MLC STT-RAM. IEEE Transactions on Very Large Scale Integration (VLSI) Systems (1-13). DOI: 10.1109/TVLSI.2018.2862388. Online publication date: 2018
    • (2018) Multi-Bank Memory Aware Force Directed Scheduling for High-Level Synthesis. IEEE Access 6 (7526-7540). DOI: 10.1109/ACCESS.2018.2798586. Online publication date: 2018
    • (2017) Disturbance Aware Memory Partitioning for Parallel Data Access in STT-RAM. Proceedings of the 54th Annual Design Automation Conference 2017 (1-6). DOI: 10.1145/3061639.3062232. Online publication date: 18-Jun-2017
    • (2017) A New Approach to Automatic Memory Banking using Trace-Based Address Mining. Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (179-188). DOI: 10.1145/3020078.3021734. Online publication date: 22-Feb-2017
    • (2017) Efficient Memory Partitioning for Parallel Data Access in FPGA via Data Reuse. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 36:10 (1674-1687). DOI: 10.1109/TCAD.2017.2648838. Online publication date: Oct-2017
