Keyword: acceleration : Search

research-article

Free

JUST ACCEPTED

Acceleration by Stepsize Hedging: Multi-Step Descent and the Silver Stepsize Schedule

Journal of the ACM (JACM), Just Accepted, https://doi.org/10.1145/3708502

Can we accelerate the convergence of gradient descent without changing the algorithm, just by judiciously choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in ...

survey

Open Access

Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey

ACM Computing Surveys (CSUR), Volume 57, Issue 4, Article No.: 91, Pages 1–35, https://doi.org/10.1145/3703453

Deep reinforcement learning has led to dramatic breakthroughs in the field of artificial intelligence for the past few years. As the amount of rollout experience data and the size of neural networks for deep reinforcement learning have grown continuously, ...

research-article

PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

SC '24: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, Article No.: 40, Pages 1–19, https://doi.org/10.1109/SC41406.2024.00046

Inference of Large Language Models (LLMs) across computer clusters has become a focal point of research in recent times, with many acceleration techniques taking inspiration from CPU speculative execution. These techniques reduce bottlenecks associated ...

Article

Simultaneous Image Quality Improvement and Artefacts Correction in Accelerated MRI

Machine Learning in Medical Imaging, Pages 228–237, https://doi.org/10.1007/978-3-031-73284-3_23

Abstract

MR data are acquired in the frequency domain, known as k-space. Acquiring high-quality and high-resolution MR images can be time-consuming, posing a significant challenge when multiple sequences providing complementary contrast information are ...

research-article

Open Access

Invited: New Solutions on LLM Acceleration, Optimization, and Application

DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference, Article No.: 369, Pages 1–4, https://doi.org/10.1145/3649329.3663517

Large Language Models (LLMs) have revolutionized a wide range of applications with their strong human-like understanding and creativity. Due to the continuously growing model size and complexity, LLM training and deployment have shown significant ...

short-paper

Open Access

Accelerating Boolean Constraint Propagation for Efficient SAT-Solving on FPGAs

GLSVLSI '24: Proceedings of the Great Lakes Symposium on VLSI 2024, Pages 305–309, https://doi.org/10.1145/3649476.3658808

We present a hardware-accelerated SAT solver targeting processor/Field Programmable Gate Arrays (FPGA) SoCs. Our solution accelerates the most expensive subroutine of the Davis-Putnam-Logemann-Loveland (DPLL) algorithm, Boolean Constraint Propagation (...

short-paper

Open Access

Acceleration of Ultrasound Neurostimulation Using Mixed-Precision Arithmetic

HPDC '24: Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, Pages 370–372, https://doi.org/10.1145/3625549.3658823

Ultrasound neurostimulation, a technique that modulates the brain's electrical activity, has emerged as a significant secondary treatment option for cases resistant to pharmacological interventions. The therapy is achievable through the application of a ...

research-article

XVDPU: A High-Performance CNN Accelerator on the Versal Platform Powered by the AI Engine

ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 17, Issue 2, Article No.: 20, Pages 1–24, https://doi.org/10.1145/3617836

Today, convolutional neural networks (CNNs) are widely used in computer vision applications. However, the trends of higher accuracy and higher resolution generate larger networks. The requirements of computation or I/O are the key bottlenecks. In this ...

research-article

Analyzing Operation Efficiency of a City Transportation System by the U-Statistics Methods. II. Optimization of the Interactive Evaluation Methods

Cybernetics and Systems Analysis (KLU-CASA), Volume 60, Issue 2, Pages 268–275, https://doi.org/10.1007/s10559-024-00667-6

Abstract

The authors formalize the technique of interactive evaluation of the operation efficiency of the motor vehicle system of a large city based on U-statistics methods. To optimize this technique, the authors propose efficient algorithmic ...

research-article

Computing Acceleration to Genome-Wide Association Study Based on CPU/FPGA Heterogeneous System

ACM SIGAPP Applied Computing Review (SIGAPP), Volume 23, Issue 4, Pages 16–26, https://doi.org/10.1145/3642964.3642966

Genome Wide Association Study (GWAS) reveals the influence of single nucleotide polymorphisms (SNP) and other genetic markers on the complex genetic disease traits, making a significant contribution to the prevention and treatment of genetic diseases. ...

research-article

Tennis players' hitting action recognition method based on multimodal data

Song Liu

International Journal of Biometrics (IJOB), Volume 16, Issue 3-4, Pages 317–336, https://doi.org/10.1504/ijbm.2024.138223

In order to improve the recognition accuracy of hitting movements, a tennis player hitting movement recognition method based on multimodal data is proposed. First, we collect acceleration modal data of hitting movements and extract acceleration ...

research-article

Monotone Inclusions, Acceleration, and Closed-Loop Control

Mathematics of Operations Research (MOOR), Volume 48, Issue 4, Pages 2353–2382, https://doi.org/10.1287/moor.2022.1343

We propose and analyze a new dynamical system with a closed-loop control law in a Hilbert space H, aiming to shed light on the acceleration phenomenon for monotone inclusion problems, which unifies a broad class of optimization, saddle point, and ...

Article

Real Acceleration of Communication Process in Distributed Algorithms with Compression

Optimization and Applications, Pages 99–109, https://doi.org/10.1007/978-3-031-47859-8_8

Abstract

Modern applied optimization problems become more and more complex every day. Due to this fact, distributed algorithms that can speed up the process of solving an optimization problem through parallelization are of great importance. The main ...

research-article

Simplicity done right for SIMDified query processing on CPU and FPGA

SiMoD '23: Proceedings of the 1st Workshop on Simplicity in Management of Data, Article No.: 3, Pages 1–5, https://doi.org/10.1145/3596225.3596229

We present a simple but effective solution idea to port SIMDified query processing code to Intel® FPGA cards for acceleration. The main advantage of our approach is the seamless integration with existing SIMD abstraction libraries originally developed ...

research-article

Efficient and Effective Algorithms for Generalized Densest Subgraph Discovery

Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 2, Article No.: 169, Pages 1–27, https://doi.org/10.1145/3589314

The densest subgraph problem (DSP) is of great significance due to its wide applications in different domains. Meanwhile, diverse requirements in various applications lead to different density variants for DSP. Unfortunately, existing DSP algorithms ...

research-article

Scalable High-Performance Architecture for Evolving Recommender System

EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems, Pages 154–162, https://doi.org/10.1145/3578356.3592594

Recommender systems are expected to scale to the requirement of the large number of recommendations made to the customers and to keep the latency of recommendations within a stringent limit. Such requirements make architecting a recommender system a ...

research-article

Exploiting Data Parallelism in Graph-Based Simultaneous Localization and Mapping: A Case Study with GPU Accelerations

HPCAsia '23: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, Pages 126–139, https://doi.org/10.1145/3578178.3578237

Graph-based simultaneous localization and mapping (G-SLAM) is an intuitive SLAM implementation where graphs are used to represent poses, landmarks and sensor measurements when a mobile robot builds a map of the environment and locates itself in it. ...

research-article

Open Access

ENCORE: Efficient Architecture Verification Framework with FPGA Acceleration

FPGA '23: Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Pages 209–219, https://doi.org/10.1145/3543622.3573187

Verification typically consumes the majority of the time in the hardware development cycle. Primarily this is because multiple iterations to debug hardware using software simulation is extremely time-consuming. While FPGAs can be utilised to accelerate ...

research-article

Accelerating Convolutional Neural Networks in Frequency Domain via Kernel-Sharing Approach

ASPDAC '23: Proceedings of the 28th Asia and South Pacific Design Automation Conference, Pages 733–738, https://doi.org/10.1145/3566097.3567862

Convolutional neural networks (CNNs) are typically computationally heavy. Fast algorithms such as fast Fourier transforms (FFTs), are promising in significantly reducing computation complexity by replacing convolutions with frequency-domain element-wise ...

research-article

Evaluation Methods and Differences between Three Dimensional And Two Dimensional Movies by Physiological Measurements Using A Commercial 6-Axis Sensor

Procedia Computer Science (PROCS), Volume 225, Issue C, Pages 4631–4639, https://doi.org/10.1016/j.procs.2023.10.461

Abstract

The term "metaverse" has become a common word in recent years. This term, a virtual space on the Internet, is gaining importance in various fields in today's era of remarkable progress toward the fusion of virtual space and real space using ...
