Keyword: accelerators : Search

178 Results for: Keyword: accelerators

Searched The ACM Guide to Computing Literature (3,740,394 records)

Showing 1 - 20 of 178 Results

research-article
Open Access
August 2024
Massively Parallel Inverse Block-sorting Transforms for bzip2 Decompression on GPUs
- André Weißenberger,
- Bertil Schmidt
ICPP '24: Proceedings of the 53rd International Conference on Parallel Processing, August 2024, Pages 856–865, https://doi.org/10.1145/3673038.3673067

Lossless data compression has evolved into an indispensable tool for reducing data transfer times in heterogeneous systems. However, performing decompression on host systems can create performance bottlenecks. Accelerator libraries, such as nvCOMP, ...
Total Citations: 0
research-article
Free
July 2024
JUST ACCEPTED
A Computation of the Ninth Dedekind Number using FPGA Supercomputing
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Just Accepted https://doi.org/10.1145/3674147
This manuscript makes the claim of having computed the \(9^{th}\) Dedekind number, D(9). This was done by accelerating the core operation of the process with an efficient FPGA design that outperforms an optimized 64-core CPU reference by 95\(\times\). ...
Total Citations: 0
Total Downloads: 49
Last 12 Months: 49
Last 6 weeks: 49
research-article
Free
April 2024
JUST ACCEPTED
gem5-NVDLA: A Simulation Framework for Compiling, Scheduling and Architecture Evaluation on AI System-on-Chips
- Chengtao Lai,
- Wei Zhang
ACM Transactions on Design Automation of Electronic Systems (TODAES), Just Accepted https://doi.org/10.1145/3661997
Recent years have seen an increasing trend in designing AI accelerators together with the rest of the system, including CPUs and memory hierarchy. This trend calls for high-quality simulators or analytical models that enable such kind of co-exploration. ...
Total Citations: 0
Total Downloads: 453
Last 12 Months: 453
Last 6 weeks: 127
research-article
Open Access
April 2024
GSCore: Efficient Radiance Field Rendering via Architectural Support for 3D Gaussian Splatting
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, April 2024, Pages 497–511, https://doi.org/10.1145/3620666.3651385

This paper presents GSCore, a hardware acceleration unit that efficiently executes the rendering pipeline of 3D Gaussian Splatting with algorithmic optimizations. GSCore builds on the observations from an in-depth analysis of Gaussian-based radiance ...
Total Citations: 0
Total Downloads: 502
Last 12 Months: 502
Last 6 weeks: 181
research-article
Open Access
April 2024
IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, April 2024, Pages 545–560, https://doi.org/10.1145/3620666.3651324

Accelerating end-to-end inference of transformer-based large language models (LLMs) is a critical component of AI services in datacenters. However, the diverse compute characteristics of LLMs' end-to-end inference present challenges as previously ...
Total Citations: 0
Total Downloads: 989
Last 12 Months: 989
Last 6 weeks: 284
keynote
April 2024
My Fifteen Year Journey of Deploying FPGA Accelerated Solutions
- Prabhat K. Gupta
FPGA '24: Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, April 2024, Page 142, https://doi.org/10.1145/3626202.3644813

While FPGAs have been investigated for accelerating computing workloads in academia for many decades, industry started adopting FPGAs as an accelerator only in the last decade, but even those deployments have been fairly limited. This talk describes my ...
Total Citations: 0
Total Downloads: 69
Last 12 Months: 69
Last 6 weeks: 10
research-article
Open Access
February 2024
Dedicated Hardware Accelerators for Processing of Sparse Matrices and Vectors: A Survey
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 2, Article No.: 27, Pages 1–26, https://doi.org/10.1145/3640542
Performance in scientific and engineering applications such as computational physics, algebraic graph problems or Convolutional Neural Networks (CNN), is dominated by the manipulation of large sparse matrices—matrices with a large number of zero elements. ...
Total Citations: 0
Total Downloads: 2,159
Last 12 Months: 2,159
Last 6 weeks: 436
research-article
Open Access
January 2024
SparGD: A Sparse GEMM Accelerator with Dynamic Dataflow
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 29, Issue 2, Article No.: 26, Pages 1–32, https://doi.org/10.1145/3634703
Deep learning has become a highly popular research field, and previously deep learning algorithms ran primarily on CPUs and GPUs. However, with the rapid development of deep learning, it was discovered that existing processors could not meet the specific ...
Total Citations: 0
Total Downloads: 1,351
Last 12 Months: 1,351
Last 6 weeks: 198
research-article
Open Access
December 2023
Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing
ACM Transactions on Computer Systems (TOCS), Volume 41, Issue 1-4, Article No.: 4, Pages 1–30, https://doi.org/10.1145/3630007
Sparse tensor algorithms are becoming widespread, particularly in the domains of deep learning, graph and data analytics, and scientific computing. Current high-performance broad-domain architectures, such as GPUs, often suffer memory system ...
Total Citations: 2
Total Downloads: 1,479
Last 12 Months: 1,479
Last 6 weeks: 215
research-article
Open Access
August 2024
Bang for the Buck: Evaluating the cost-effectiveness of Heterogeneous Edge Platforms for Neural Network Workloads
SEC '23: Proceedings of the Eighth ACM/IEEE Symposium on Edge Computing, December 2023, Pages 94–107, https://doi.org/10.1145/3583740.3628437

Machine learning (ML) applications have experienced remarkable growth and integration into various domains. However, challenges with cloud-based deployments, such as latency, privacy, reliability, bandwidth and connectivity, have driven the popularity of ...
Total Citations: 0
Total Downloads: 2
Last 12 Months: 2
Last 6 weeks: 2
research-article
November 2023
Optimizing High-Performance Linpack for Exascale Accelerated Architectures
SC '23: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2023, Article No.: 49, Pages 1–12, https://doi.org/10.1145/3581784.3607066

We detail the performance optimizations made in rocHPL, AMD's open-source implementation of the High-Performance Linpack (HPL) benchmark targeting accelerated node architectures designed for exascale systems such as the Frontier supercomputer. The ...
Total Citations: 0
Total Downloads: 187
Last 12 Months: 187
Last 6 weeks: 16
research-article
December 2023
Improving Data Reuse in NPU On-chip Memory with Interleaved Gradient Order for DNN Training
MICRO '23: Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, October 2023, Pages 438–451, https://doi.org/10.1145/3613424.3614299

During training tasks for machine learning models with neural processing units (NPUs), the most time-consuming part is the backward pass, which incurs significant overheads due to off-chip memory accesses. For NPUs, to mitigate the long latency and ...
Total Citations: 0
Total Downloads: 478
Last 12 Months: 478
Last 6 weeks: 56
research-article
Open Access
October 2023
FPGA-based Deep Learning Inference Accelerators: Where Are We Standing?
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 16, Issue 4, Article No.: 60, Pages 1–32, https://doi.org/10.1145/3613963
Recently, artificial intelligence applications have become part of almost all emerging technologies around us. Neural networks, in particular, have shown significant advantages and have been widely adopted over other approaches in machine learning. In ...
Total Citations: 6
Total Downloads: 12,890
Last 12 Months: 12,890
Last 6 weeks: 1,158
research-article
Open Access
June 2023
CPU-free Computing: A Vision with a Blueprint
- Animesh Trivedi,
- Marco Spaziani Brunella
HOTOS '23: Proceedings of the 19th Workshop on Hot Topics in Operating Systems, June 2023, Pages 1–14, https://doi.org/10.1145/3593856.3595906

Since the inception of computing, we have been reliant on CPU-powered architectures. However, today this reliance is challenged by manufacturing limitations (CMOS scaling), performance expectations (stalled clocks, Turing tax), and security concerns (...
Total Citations: 0
Total Downloads: 1,106
Last 12 Months: 907
Last 6 weeks: 116
research-article
June 2023
MTIA: First Generation Silicon Targeting Meta's Recommendation Systems
ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, June 2023, Article No.: 80, Pages 1–13, https://doi.org/10.1145/3579371.3589348

Meta has traditionally relied on using CPU-based servers for running inference workloads, specifically Deep Learning Recommendation Models (DLRM), but the increasing compute and memory requirements of these models have pushed the company towards using ...
Total Citations: 5
Total Downloads: 4,623
Last 12 Months: 1,153
Last 6 weeks: 93
research-article
Open Access
June 2023
Artifacts Evaluated & Functional / v1.1
Artifacts Available / v1.1
Results Reproduced / v1.1
Profiling Hyperscale Big Data Processing
ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, June 2023, Article No.: 47, Pages 1–16, https://doi.org/10.1145/3579371.3589082

Computing demand continues to grow exponentially, largely driven by "big data" processing on hyperscale data stores. At the same time, the slowdown in Moore's law is leading the industry to embrace custom computing in large-scale systems. Taken together, ...
Total Citations: 3
Total Downloads: 2,563
Last 12 Months: 2,192
Last 6 weeks: 163
research-article
June 2023
NeuRex: A Case for Neural Rendering Acceleration
ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, June 2023, Article No.: 21, Pages 1–13, https://doi.org/10.1145/3579371.3589056

This paper presents NeuRex, an accelerator architecture that efficiently performs the modern neural rendering pipeline with an algorithmic enhancement and supporting hardware. NeuRex leverages the insights from an in-depth analysis of the state-of-the-...
Total Citations: 3
Total Downloads: 832
Last 12 Months: 637
Last 6 weeks: 29
research-article
Open Access
March 2023
Cohort: Software-Oriented Acceleration for Heterogeneous SoCs
ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, March 2023, Pages 105–117, https://doi.org/10.1145/3582016.3582059

Philosophically, our approaches to acceleration focus on the extreme. We must optimise accelerators to the maximum, leaving software to fix any hardware-software mismatches. Today’s software abstractions for programming accelerators leak hardware ...
Total Citations: 1
Total Downloads: 2,188
Last 12 Months: 1,420
Last 6 weeks: 97
short-paper
Open Access
February 2023
Results Reproduced / v1.1
Artifacts Evaluated & Functional / v1.1
Artifacts Available / v1.1
BOBBER: A Prototyping Platform for Batteryless Intermittent Accelerators
FPGA '23: Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, February 2023, Pages 221–228, https://doi.org/10.1145/3543622.3573046

Batteryless systems offer promising platforms to support pervasive, near-sensor intelligence in a sustainable manner. These systems solely rely on ambient energy sources that often provide limited power. One common approach to designing batteryless ...
Total Citations: 0
Total Downloads: 520
Last 12 Months: 282
Last 6 weeks: 22
research-article
Open Access
January 2023
Results Reproduced / v1.1
Artifacts Evaluated & Functional / v1.1
Artifacts Available / v1.1
Towards a Machine Learning-Assisted Kernel with LAKE
ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, January 2023, Pages 846–861, https://doi.org/10.1145/3575693.3575697

The complexity of modern operating systems (OSes), rapid diversification of hardware, and steady evolution of machine learning (ML) motivate us to explore the potential of ML to improve decision-making in OS kernels. We conjecture that ML can better ...
Total Citations: 0
Total Downloads: 2,173
Last 12 Months: 1,324
Last 6 weeks: 113
