An Analysis of Accelerator Coupling in Heterogeneous Architectures

Research article
Published: 07 June 2015
DOI: 10.1145/2744769.2744794

Abstract

Existing research on accelerators has emphasized the performance and energy-efficiency improvements they can provide, devoting little attention to practical issues such as accelerator invocation and interaction with other on-chip components (e.g., cores and caches). In this paper, we present a quantitative study that considers these aspects by implementing seven high-throughput accelerators following three design models: tight coupling behind a CPU, loose out-of-core coupling with Direct Memory Access (DMA) to the last-level cache (LLC), and loose out-of-core coupling with DMA to DRAM. A salient conclusion of our study is that working sets of non-trivial size are best served by loosely coupled accelerators that integrate private memory blocks tailored to their needs.

Published In

DAC '15: Proceedings of the 52nd Annual Design Automation Conference
June 2015
1204 pages
ISBN: 9781450335201
DOI: 10.1145/2744769
Publisher

Association for Computing Machinery

New York, NY, United States

Conference

DAC '15: The 52nd Annual Design Automation Conference 2015
June 7 - 11, 2015
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
