- research-article, February 2024
Explainable-DSE: An Agile and Explainable Exploration of Efficient HW/SW Codesigns of Deep Learning Accelerators Using Bottleneck Analysis
ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, March 2023, Pages 87–107
https://doi.org/10.1145/3623278.3624772
Effective design space exploration (DSE) is paramount for hardware/software codesigns of deep learning accelerators that must meet strict execution constraints. For their vast search space, existing DSE techniques can require excessive trials to obtain a ...
- research-article, February 2024
Fast Instruction Selection for Fast Digital Signal Processing
Alexander J Root, Maaz Bin Safeer Ahmad, Dillon Sharlet, Andrew Adams, Shoaib Kamil, Jonathan Ragan-Kelley
ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, March 2023, Pages 125–137
https://doi.org/10.1145/3623278.3624768
Modern vector processors support a wide variety of instructions for fixed-point digital signal processing. These instructions support a proliferation of rounding, saturating, and type conversion modes, and are often fused combinations of more primitive ...
- research-article, February 2024
LightRidge: An End-to-end Agile Design Framework for Diffractive Optical Neural Networks
ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, March 2023, Pages 202–218
https://doi.org/10.1145/3623278.3624757
To lower the barrier to diffractive optical neural networks (DONNs) design, exploration, and deployment, we propose LightRidge, the first end-to-end optical ML compilation framework, which consists of (1) precise and differentiable optical physics ...
- research-article, February 2024
Predict; Don't React for Enabling Efficient Fine-Grain DVFS in GPUs
ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, March 2023, Pages 253–267
https://doi.org/10.1145/3623278.3624756
With the continuous improvement of on-chip integrated voltage regulators (IVRs) and fast, adaptive frequency control, dynamic voltage-frequency scaling (DVFS) transition times have shrunk from the microsecond to the nanosecond regime, providing immense ...
- research-article, February 2024
MiniMalloc: A Lightweight Memory Allocator for Hardware-Accelerated Machine Learning
ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, March 2023, Pages 238–252
https://doi.org/10.1145/3623278.3624752
We present a new approach to static memory allocation, a key problem that arises in the compilation of machine learning models onto the resources of a specialized hardware accelerator. Our methodology involves a recursive depth-first search that limits ...
- research-article, February 2024
Manticore: Hardware-Accelerated RTL Simulation with Static Bulk-Synchronous Parallelism
ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, March 2023, Pages 219–237
https://doi.org/10.1145/3623278.3624750
The demise of Moore's Law and Dennard Scaling has revived interest in specialized computer architectures and accelerators. Verification and testing of this hardware depend heavily upon cycle-accurate simulation of register-transfer-level (RTL) designs. ...