Keyword: hls : Search

33 Results for: Keyword: hls

Searched The ACM Guide to Computing Literature (3,740,394 records)

Showing 1 - 20 of 33 Results

poster
April 2024
A Comprehensive Evaluation of FPGA-Based Spatial Acceleration of LLMs
FPGA '24: Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, April 2024, Page 185. https://doi.org/10.1145/3626202.3637600

Recent advancements in large language models (LLMs) have generated significant demands for efficient deployment in inference workloads. Most existing approaches rely on temporal architectures that reuse hardware units for different network layers and ...
Total Citations: 0
poster
April 2024
Automatic Hardware Pragma Insertion in High-Level Synthesis: A Non-Linear Programming Approach
FPGA '24: Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, April 2024, Page 184. https://doi.org/10.1145/3626202.3637593

High-Level Synthesis enables the rapid prototyping of hardware accelerators, by combining a high-level description of the functional behavior of a kernel with a set of micro-architecture optimizations as inputs. Such pragmas may describe the pipelining ...
Total Citations: 0
research-article
Public Access
March 2023
High-level Synthesis for Domain Specific Computing
ISPD '23: Proceedings of the 2023 International Symposium on Physical Design, March 2023, Pages 211–219. https://doi.org/10.1145/3569052.3580027

This paper proposes a High-Level Synthesis (HLS) framework for domain-specific computing. The framework contains three key components: 1) ScaleHLS, a multi-level HLS compilation flow. Aimed to address the lack of expressiveness and hardware-dedicated ...
Total Citations: 2
Total Downloads: 179
Last 12 Months: 132
Last 6 weeks: 30
short-paper
March 2022
Marrying WebRTC and DASH for interactive streaming
MHV '22: Proceedings of the 1st Mile-High Video Conference, March 2022, Page 98. https://doi.org/10.1145/3510450.3517296

WebRTC is a set of W3C and IETF standards that allows the delivery of real-time content to users, with an end-to-end latency of under half a second. Support for WebRTC is built into all modern browsers across desktop and mobile devices, and it allows ...
Total Citations: 0
Total Downloads: 150
Last 12 Months: 23
Last 6 weeks: 1
research-article
February 2022
High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS: A Case Study on SpMV
FPGA '22: Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2022, Pages 54–64. https://doi.org/10.1145/3490422.3502368

Sparse linear algebra operators are memory bound due to low compute to memory access ratio and irregular data access patterns. The exceptional bandwidth improvement provided by the emerging high-bandwidth memory (HBM) technologies, coupled with the ...
Total Citations: 28
Total Downloads: 947
Last 12 Months: 355
Last 6 weeks: 34
Supplementary Material: FPGA22-fp193.mp4
research-article
Open Access
February 2022
Best Paper
RapidStream: Parallel Physical Implementation of FPGA HLS Designs
FPGA '22: Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2022, Pages 1–12. https://doi.org/10.1145/3490422.3502361

FPGAs require a much longer compilation cycle than conventional computing platforms like CPUs. In this paper, we shorten the overall compilation time by co-optimizing the HLS compilation (C-to-RTL) and the back-end physical implementation (RTL-to-...
Total Citations: 18
Total Downloads: 2,269
Last 12 Months: 763
Last 6 weeks: 66
Supplementary Material: rapidstream-record-1080p.mp4
research-article
Open Access
February 2022
Accelerating SSSP for Power-Law Graphs
FPGA '22: Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2022, Pages 190–200. https://doi.org/10.1145/3490422.3502358

The single-source shortest path (SSSP) problem is one of the most important and well-studied graph problems widely used in many application domains, such as road navigation, neural image reconstruction, and social network analysis. Although we have ...
Total Citations: 11
Total Downloads: 840
Last 12 Months: 277
Last 6 weeks: 25
Supplementary Material: FPGA22-fpgafp047a.mp4
poster
February 2022
Synthesized Garbage Collection for FPGA Accelerators
FPGA '22: Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2022, Page 53. https://doi.org/10.1145/3490422.3502341

Speed and ease of accelerator design is a growing need. High level programming languages have provided significant gains in the software world, but lag in the hardware realm. We present a hardware implementation of a garbage collector, which automates ...
Total Citations: 0
short-paper
Open Access
October 2021
A Complete End to End Open Source Toolchain for the Versatile Video Coding (VVC) Standard
MM '21: Proceedings of the 29th ACM International Conference on Multimedia, October 2021, Pages 3795–3798. https://doi.org/10.1145/3474085.3478320

Versatile Video Coding (VVC) is the most recent international video coding standard jointly developed by ITU-T and ISO/IEC, which has been finalized in July 2020. VVC allows for significant bit-rate reductions around 50% for the same subjective video ...
Total Citations: 5
Total Downloads: 831
Last 12 Months: 176
Last 6 weeks: 18
Supplementary Material: MM21-osc3258.mp4
poster
February 2021
Clockwork: Resource-Efficient Static Scheduling for Multi-Rate Image Processing Applications on FPGAs
FPGA '21: The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2021, Pages 145–146. https://doi.org/10.1145/3431920.3439457

Image processing algorithms can benefit tremendously from hardware acceleration. However, hardware accelerators for image processing algorithms look very different from the programs that image processing algorithm designers are accustomed to writing. ...
Total Citations: 0
research-article
Open Access
February 2021
Best Paper
AutoBridge: Coupling Coarse-Grained Floorplanning and Pipelining for High-Frequency HLS Design on Multi-Die FPGAs
FPGA '21: The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2021, Pages 81–92. https://doi.org/10.1145/3431920.3439289

Despite an increasing adoption of high-level synthesis (HLS) for its design productivity advantages, there remains a significant gap in the achievable clock frequency between an HLS-generated design and a handcrafted RTL one. A key factor that limits ...
Total Citations: 50
Total Downloads: 2,622
Last 12 Months: 682
Last 6 weeks: 70
Supplementary Material: 3431920.3439289.mp4
research-article
February 2021
Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers through Microbenchmarking
FPGA '21: The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2021, Pages 105–115. https://doi.org/10.1145/3431920.3439284

With the public availability of FPGAs from major cloud service providers like AWS, Alibaba, and Nimbix, hardware and software developers can now easily access FPGA platforms. However, it is nontrivial to develop efficient FPGA accelerators, especially ...
Total Citations: 11
Total Downloads: 537
Last 12 Months: 79
Last 6 weeks: 12
Supplementary Material: 3431920.3439284.mp4
short-paper
June 2020
Pedal to the Bare Metal: Road Traffic Simulation on FPGAs Using High-Level Synthesis
SIGSIM-PADS '20: Proceedings of the 2020 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, June 2020, Pages 117–121. https://doi.org/10.1145/3384441.3395979

The performance of Agent-based Traffic Simulations (ABTS) has been shown to benefit tremendously from offloading to accelerators such as GPUs. In the search for the most suitable hardware platform, reconfigurable hardware is a natural choice. Some ...
Total Citations: 1
Total Downloads: 107
Last 12 Months: 11
Last 6 weeks: 2
poster
February 2020
DBHI: A Tool for Decoupled Functional Hardware-Software Co-Design on SoCs
FPGA '20: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2020, Page 326. https://doi.org/10.1145/3373087.3375386

This paper presents a system-level co-simulation and co-verification workflow to ease the transition from a software-only procedure, executed in a General Purpose processor, to the integration of a custom hardware accelerator developed in a Hardware ...
Total Citations: 0
poster
February 2020
Studying the Potential of Automatic Optimizations in the Intel FPGA SDK for OpenCL
FPGA '20: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2020, Page 318. https://doi.org/10.1145/3373087.3375355

High Level Synthesis (HLS) tools, like the Intel FPGA SDK for OpenCL, improve hardware design productivity and enable efficient design space exploration, by providing simple program directives (pragmas) and/or API calls that allow hardware programmers ...
Total Citations: 2
poster
February 2020
Advanced Dataflow Programming using Actor Machines for High-Level Synthesis
FPGA '20: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2020, Page 310. https://doi.org/10.1145/3373087.3375330

The use of parallelism has increased drastically in recent years. Parallel platforms come in many forms: multi-core processors, embedded hybrid solutions such as multi-processor system-on-chip with reconfigurable logic, and cloud datacenters with multi-...
Total Citations: 1
poster
Public Access
February 2020
DOMIS: Dual-Bank Optimal Micro-Architecture for Iterative Stencils
- Juan Escobedo,
- Mingjie Lin
FPGA '20: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2020, Page 315. https://doi.org/10.1145/3373087.3375329

High-Level Synthesis (HLS) can achieve significant performance improvements through effective memory partitioning and meticulous data reuse. Many modern applications, such as medical imaging and convolutional layers in a CNN, mostly contain kernels ...
Total Citations: 0
research-article
February 2020
Artifacts Available
Results Reproduced / v1.1
Artifacts Evaluated & Reusable
Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis
FPGA '20: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2020, Pages 244–254. https://doi.org/10.1145/3373087.3375296

Data movement is the dominating factor affecting performance and energy in modern computing systems. Consequently, many algorithms have been developed to minimize the number of I/O operations for common computing patterns. Matrix multiplication is no ...
Total Citations: 30
Total Downloads: 747
Last 12 Months: 131
Last 6 weeks: 10
poster
Public Access
February 2019
Optimizing Order-Associative Kernel Computation with Joint Memory Banking and Data Reuse
- Juan Escobedo,
- Mingjie Lin
FPGA '19: Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2019, Pages 189–190. https://doi.org/10.1145/3289602.3293980

In this paper, we develop a joint strategy of memory banking and data reuse to specifically optimize the memory performance of any given order-associative and stencil-based computing kernel i.e., its iteration order can be reordered freely without ...
Total Citations: 0
poster
February 2019
Building FPGA State Machines from Sequential Code
- Carl-Johannes Johnsen,
- Kenneth Skovhede
FPGA '19: Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2019, Page 186. https://doi.org/10.1145/3289602.3293965

State machines are commonly used and well understood for hardware. However, in some cases they can introduce complexity as the program can no longer be read sequentially. We propose an extension to the SME model, which retains the sequential program ...
Total Citations: 0
