DOI: 10.1145/3626202.3637576
Research article | Open access

Table-Lookup MAC: Scalable Processing of Quantised Neural Networks in FPGA Soft Logic

Published: 02 April 2024
Abstract

Recent advancements in neural network quantisation have yielded remarkable outcomes, with three-bit networks reaching state-of-the-art full-precision accuracy on complex tasks. These achievements present valuable opportunities to accelerate neural networks by computing in reduced precision. Implementing such reduced-precision computation on FPGAs can take advantage of bit-level reconfigurability, which is not available on conventional CPUs and GPUs. Simultaneously, the high data intensity of neural network processing has inspired computing-in-memory paradigms, including on FPGA platforms. By programming the effects of trained model weights as lookup operations in soft logic, the transfer of weight data from memory units can be avoided, alleviating the memory bottleneck. However, previous methods scale poorly: their high logic utilisation restricts them to small networks or sub-networks of binary models with low accuracy. In this paper, we introduce Table-Lookup Multiply-Accumulate (TLMAC), a framework that compiles and optimises quantised neural networks for scalable lookup-based processing. TLMAC clusters unique groups of weights and maps them to lookup-based processing elements, enabling highly parallel computation while taking advantage of parameter redundancy. We further propose place-and-route algorithms to reduce LUT utilisation and routing congestion. We demonstrate that TLMAC significantly improves the scalability of previous related work: our efficient logic mapping and high degree of reuse enable entire ImageNet-scale quantised models with full-precision accuracy to be implemented via lookup-based computing on a single commercially available FPGA.
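To make the lookup-based MAC idea concrete, the following Python sketch illustrates the concept described in the abstract; it is not the paper's implementation. It assumes a group size of 6 weights (matching a 6-input FPGA LUT), 3-bit activations processed bit-serially, and a toy set of weight groups; all names (GROUP, ACT_BITS, build_lut, tlmac_dot) and values are illustrative. Each group's dot product with every possible pattern of activation bits is precomputed into a table, and identical weight groups share one table, in the spirit of the paper's clustering step.

```python
GROUP = 6     # weights per table, matching a 6-input FPGA LUT (assumed for illustration)
ACT_BITS = 3  # activation precision for the bit-serial loop (assumed)

def build_lut(weights):
    """Precompute, for every pattern of GROUP activation bits, the sum of the
    weights whose bit is set; one table read then replaces GROUP multiply-adds
    for a single activation bit-plane."""
    return tuple(
        sum(w for k, w in enumerate(weights) if (pattern >> k) & 1)
        for pattern in range(1 << len(weights))
    )

def tlmac_dot(activations, weight_groups, luts):
    """Bit-serial dot product: one table read per (group, bit-plane), then a
    shift-add that weights each plane by its binary significance."""
    total = 0
    for plane in range(ACT_BITS):
        plane_sum = 0
        for g, weights in enumerate(weight_groups):
            addr = 0  # pack this plane's activation bits into a table address
            for k in range(len(weights)):
                addr |= ((activations[g * GROUP + k] >> plane) & 1) << k
            plane_sum += luts[g][addr]
        total += plane_sum << plane
    return total

# Clustering in miniature: identical weight groups share one table, so
# repeated parameters cost no additional LUTs.
weight_groups = [(1, -2, 3, 0, 1, -1),
                 (1, -2, 3, 0, 1, -1),  # duplicate of the first group
                 (2, 2, -1, 0, 0, 1)]
tables = {}
luts = []
for grp in weight_groups:
    if grp not in tables:
        tables[grp] = build_lut(grp)
    luts.append(tables[grp])

acts = [5, 3, 0, 7, 2, 1, 4, 6, 1, 0, 3, 5, 7, 2, 0, 1, 6, 4]  # 3-bit activations
flat_w = [w for grp in weight_groups for w in grp]
assert tlmac_dot(acts, weight_groups, luts) == sum(a * w for a, w in zip(acts, flat_w))
print(f"{len(tables)} tables serve {len(weight_groups)} weight groups")
```

Run as written, the assertion confirms the table-lookup result matches a direct multiply-accumulate, and the final line reports two tables serving three groups; that reuse of identical weight groups is what the abstract credits for the approach's scalability.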



          Published In

          FPGA '24: Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays
          April 2024
          300 pages
          ISBN:9798400704185
          DOI:10.1145/3626202

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Author Tags

          1. clustering
          2. field-programmable gate array
          3. lut-based computing
          4. place & route
          5. quantised neural networks
          6. simulated annealing

          Qualifiers

          • Research-article

          Funding Sources

          • Agency for Science, Technology and Research
          • National Research Foundation Singapore, Quantum Engineering Programme 2.0 (National Quantum Computing Hub)
• Singapore Government's Research, Innovation and Enterprise 2020 Plan (Advanced Manufacturing and Engineering domain)

          Conference

          FPGA '24

          Acceptance Rates

          Overall Acceptance Rate 125 of 627 submissions, 20%
