DOI: 10.1145/3287624.3288755
research-article

Runtime reconfigurable memory hierarchy in embedded scalable platforms

Published: 21 January 2019

Abstract

In heterogeneous systems-on-chip, the optimal choice of the cache-coherence model for a loosely-coupled accelerator may vary at each invocation, depending on workload and system status. We propose a runtime adaptive algorithm to manage the coherence of accelerators. The algorithm's choices are based on the combination of static and dynamic features of the active accelerators and their workloads. We evaluate the algorithm by leveraging our FPGA-based platform for rapid SoC prototyping. Experimental results, obtained through the deployment of a multi-core and multi-accelerator system that runs Linux SMP, show the benefits of our approach in terms of execution time and memory accesses.
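As a rough illustration of what a per-invocation choice driven by static and dynamic features might look like, the sketch below selects a coherence mode from a handful of inputs. It is not the algorithm proposed in the paper: the mode names, the feature fields (`has_private_cache`, `footprint_bytes`, `active_accelerators`), the thresholds, and the function `choose_mode` are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class CoherenceMode(Enum):
    # Hypothetical mode names; the abstract does not enumerate the models.
    NON_COHERENT = "non-coherent"
    LLC_COHERENT = "llc-coherent"
    FULLY_COHERENT = "fully-coherent"


@dataclass(frozen=True)
class Invocation:
    has_private_cache: bool   # assumed static feature of the accelerator
    footprint_bytes: int      # assumed dynamic feature: workload size
    active_accelerators: int  # assumed dynamic feature: system status


def choose_mode(inv, private_cache_bytes=32 * 1024, llc_bytes=1024 * 1024):
    """Pick a mode for one invocation using illustrative, made-up thresholds."""
    # Small workloads that fit in a private cache can stay fully coherent.
    if inv.has_private_cache and inv.footprint_bytes <= private_cache_bytes:
        return CoherenceMode.FULLY_COHERENT
    # Assume the last-level cache is split evenly among active accelerators.
    llc_share = llc_bytes // max(1, inv.active_accelerators)
    if inv.footprint_bytes <= llc_share:
        return CoherenceMode.LLC_COHERENT
    # Otherwise bypass the cache hierarchy for this invocation.
    return CoherenceMode.NON_COHERENT


if __name__ == "__main__":
    for inv in (
        Invocation(True, 16 * 1024, 1),
        Invocation(True, 512 * 1024, 1),
        Invocation(True, 512 * 1024, 4),
    ):
        print(inv, "->", choose_mode(inv).value)
```

The point of the sketch is only that the same accelerator can receive a different mode on each call as the workload size and the number of active accelerators change; a real policy would be tuned against measured execution time and memory accesses.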


Cited By

  • (2022) An FPGA Overlay for Efficient Real-Time Localization in 1/10th Scale Autonomous Vehicles. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 915-920. DOI: 10.23919/DATE54114.2022.9774517. Online publication date: 14-Mar-2022
  • (2021) Cohmeleon: Learning-Based Orchestration of Accelerator Coherence in Heterogeneous SoCs. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 350-365. DOI: 10.1145/3466752.3480065. Online publication date: 18-Oct-2021

Published In

ASPDAC '19: Proceedings of the 24th Asia and South Pacific Design Automation Conference
January 2019
794 pages
ISBN: 9781450360074
DOI: 10.1145/3287624
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • IEICE ESS: Institute of Electronics, Information and Communication Engineers, Engineering Sciences Society
  • IEEE CAS
  • IEEE CEDA
  • IPSJ SIG-SLDM: Information Processing Society of Japan, SIG System LSI Design Methodology

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. FPGA prototyping
  2. cache coherence
  3. hardware accelerators
  4. heterogeneous system-on-chip

Qualifiers

  • Research-article

Conference

ASPDAC '19

Acceptance Rates

Overall Acceptance Rate 466 of 1,454 submissions, 32%

Article Metrics

  • Downloads (last 12 months): 17
  • Downloads (last 6 weeks): 0
Reflects downloads up to 15 Oct 2024
