DOI:10.1145/3173162.3173197

Minnow: Lightweight Offload Engines for Worklist Management and Worklist-Directed Prefetching

Published: 19 March 2018

Abstract

The importance of irregular applications such as graph analytics is rapidly growing with the rise of Big Data. However, parallel graph workloads tend to perform poorly on general-purpose chip multiprocessors (CMPs) due to poor cache locality, low compute intensity, frequent synchronization, uneven task sizes, and dynamic task generation. At high thread counts, execution time is dominated by worklist synchronization overhead and cache misses. Researchers have proposed hardware worklist accelerators to address scheduling costs, but these proposals often harden a specific scheduling policy and do not address high cache miss rates. We address this with Minnow, a technique that augments each core in a CMP with a lightweight Minnow accelerator. Minnow engines offload worklist scheduling from worker threads to improve scalability. The engines also perform worklist-directed prefetching, a technique that exploits knowledge of upcoming tasks to issue nearly perfectly accurate and timely prefetch operations. On a simulated 64-core CMP running a parallel graph benchmark suite, Minnow improves scalability and reduces L2 cache misses from 29 to 1.2 MPKI on average, resulting in 6.01x average speedup over an optimized software baseline for only 1% area overhead.
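
The idea behind worklist-directed prefetching can be sketched in software: the worklist already names the tasks that will run next, so whatever manages the worklist can fetch their data before the worker asks for it. The toy model below is only an illustration under stated assumptions, not the Minnow hardware design. The names `LRUCache`, `sssp`, and `lookahead` are hypothetical, the cache tracks node IDs rather than real cache lines, and prefetches complete instantly, so the model shows accuracy but says nothing about timeliness.

```python
import heapq
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache that tracks only which node IDs are resident."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.demand_misses = 0

    def _touch(self, key):
        """Bring key into the cache; return True if it was already resident."""
        if key in self.lines:
            self.lines.move_to_end(key)
            return True
        self.lines[key] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        return False

    def demand_access(self, key):
        """Access by the worker thread; a miss here would stall the core."""
        if not self._touch(key):
            self.demand_misses += 1

    def prefetch(self, key):
        """Access by the modeled engine; fills the cache off the critical path."""
        self._touch(key)


def sssp(graph, source, cache, lookahead=0):
    """Worklist-driven shortest paths. graph: node -> [(neighbor, weight)]."""
    dist = {source: 0}
    worklist = [(0, source)]  # priority worklist ordered by tentative distance
    while worklist:
        # Engine model (assumption): peek at the next `lookahead` tasks
        # in priority order and prefetch the nodes they will touch.
        for _, node in heapq.nsmallest(lookahead, worklist):
            cache.prefetch(node)
        d, node = heapq.heappop(worklist)
        cache.demand_access(node)  # worker touches the node's data
        if d > dist[node]:
            continue  # stale task: a shorter path was already found
        for nbr, w in graph[node]:
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(worklist, (nd, nbr))
    return dist


if __name__ == "__main__":
    graph = {0: [(1, 1), (2, 4)], 1: [(2, 1), (3, 5)], 2: [(3, 1)], 3: []}
    for k in (0, 2):
        cache = LRUCache(capacity=2)
        sssp(graph, 0, cache, lookahead=k)
        print(f"lookahead={k}: demand misses = {cache.demand_misses}")
```

With `lookahead=0` every first touch of a node is a demand miss. With a lookahead no larger than the cache capacity, the next task's node is always resident when the worker reaches it. A real engine would also have to issue prefetches early enough to hide memory latency, which this sketch does not model.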

Published In

ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
March 2018
827 pages
ISBN:9781450349116
DOI:10.1145/3173162
  • ACM SIGPLAN Notices, Volume 53, Issue 2
    ASPLOS '18
    February 2018
    809 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/3296957
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accelerators
  2. graph analytics
  3. helper threads
  4. parallel architectures
  5. prefetching
  6. scheduling

Qualifiers

  • Research-article

Conference

ASPLOS '18

Acceptance Rates

ASPLOS '18 Paper Acceptance Rate: 56 of 319 submissions, 18%
Overall Acceptance Rate: 535 of 2,713 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 268
  • Downloads (Last 6 weeks): 42
Reflects downloads up to 27 Jan 2025

Cited By

  • (2024) UpDown: A Novel Architecture for Unlimited Memory Parallelism. Proceedings of the International Symposium on Memory Systems, 61-77. DOI:10.1145/3695794.3695801. Online publication date: 30-Sep-2024
  • (2024) PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs. ACM Transactions on Architecture and Code Optimization 21:4, 1-25. DOI:10.1145/3689337. Online publication date: 20-Nov-2024
  • (2024) Tyche: An Efficient and General Prefetcher for Indirect Memory Accesses. ACM Transactions on Architecture and Code Optimization. DOI:10.1145/3641853. Online publication date: 22-Jan-2024
  • (2024) PDG: A Prefetcher for Dynamic Graph Updating. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:4, 1246-1259. DOI:10.1109/TCAD.2023.3335880. Online publication date: Apr-2024
  • (2024) SSE: Security Service Engines to Scale Enclave Parallelism for System Interactive Applications. 2024 International Symposium on Secure and Private Execution Environment Design (SEED), 84-95. DOI:10.1109/SEED61283.2024.00019. Online publication date: 16-May-2024
  • (2024) Leviathan: A Unified System for General-Purpose Near-Data Computing. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO), 1278-1294. DOI:10.1109/MICRO61859.2024.00095. Online publication date: 2-Nov-2024
  • (2023) A gem5 Implementation of the Sequential Codelet Model: Reducing Overhead and Expanding the Software Memory Interface. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 839-846. DOI:10.1145/3624062.3624152. Online publication date: 12-Nov-2023
  • (2023) Decoupled Vector Runahead. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 17-31. DOI:10.1145/3613424.3614255. Online publication date: 28-Oct-2023
  • (2022) MetaSys: A Practical Open-source Metadata Management System to Implement and Evaluate Cross-layer Optimizations. ACM Transactions on Architecture and Code Optimization 19:2, 1-29. DOI:10.1145/3505250. Online publication date: 24-Mar-2022
  • (2022) TaskStream: accelerating task-parallel workloads by recovering program structure. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 1-13. DOI:10.1145/3503222.3507706. Online publication date: 28-Feb-2022
