Weightless Neural Networks for Efficient Edge Inference

Susskind, Zachary; Arora, Aman; Miranda, Igor Dantas Dos Santos; Villon, Luis Armando Quintanilla; Katopodis, Rafael Fontella; de Araujo, Leandro Santiago; Dutra, Diego Leonel Cadette; Lima, Priscila Machado Vieira; Franca, Felipe Maia Galvao; Breternitz Jr., Mauricio; John, Lizy K.

arXiv:2203.01479 (cs)

[Submitted on 3 Mar 2022]

Authors: Zachary Susskind, Aman Arora, Igor Dantas Dos Santos Miranda, Luis Armando Quintanilla Villon, Rafael Fontella Katopodis, Leandro Santiago de Araujo, Diego Leonel Cadette Dutra, Priscila Machado Vieira Lima, Felipe Maia Galvao Franca, Mauricio Breternitz Jr., Lizy K. John

Abstract: Weightless Neural Networks (WNNs) are a class of machine learning models that use table lookups to perform inference. This is in contrast with Deep Neural Networks (DNNs), which use multiply-accumulate operations. State-of-the-art WNN architectures have a fraction of the implementation cost of DNNs, but still lag behind them on accuracy for common image recognition tasks. Additionally, many existing WNN architectures suffer from high memory requirements. In this paper, we propose a novel WNN architecture, BTHOWeN, with key algorithmic and architectural improvements over prior work, namely counting Bloom filters, hardware-friendly hashing, and Gaussian-based nonlinear thermometer encodings to improve model accuracy and reduce area and energy consumption. BTHOWeN targets the large and growing edge computing sector by providing superior latency and energy efficiency to comparable quantized DNNs. Compared to state-of-the-art WNNs across nine classification datasets, BTHOWeN on average reduces error by more than 40% and model size by more than 50%. We then demonstrate the viability of the BTHOWeN architecture by presenting an FPGA-based accelerator, and compare its latency and resource usage against similarly accurate quantized DNN accelerators, including Multi-Layer Perceptron (MLP) and convolutional models. The proposed BTHOWeN models consume almost 80% less energy than the MLP models, with nearly 85% reduction in latency. In our quest for efficient ML on the edge, WNNs are clearly deserving of additional attention.
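
The three ingredients named in the abstract (Gaussian-based thermometer encoding, hardware-friendly hashing, and counting Bloom filters) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every name here (`gaussian_thresholds`, `thermometer_encode`, `CountingBloomFilter`) is hypothetical, the XOR-based hash is one assumed example of a hardware-friendly hash since the abstract does not name a specific one, and the increment-every-counter training rule is a simplification chosen for brevity.

```python
import random
from statistics import NormalDist


def gaussian_thresholds(mean, std, bits):
    """Thresholds that split a normal distribution into bits + 1
    equal-probability regions (assumed form of a Gaussian encoding)."""
    dist = NormalDist(mean, std)
    return [dist.inv_cdf(i / (bits + 1)) for i in range(1, bits + 1)]


def thermometer_encode(x, thresholds):
    """Unary code: bit i is 1 when x exceeds the i-th threshold."""
    return [1 if x > t else 0 for t in thresholds]


class CountingBloomFilter:
    """Lookup table indexed by several hashes of a binary input pattern."""

    def __init__(self, input_bits, entries_log2, num_hashes, rng):
        self.table = [0] * (1 << entries_log2)
        # Assumption: XOR-based hashing. Each hash function holds one random
        # mask per input bit; hashing needs only AND/XOR, no arithmetic.
        self.masks = [
            [rng.getrandbits(entries_log2) for _ in range(input_bits)]
            for _ in range(num_hashes)
        ]

    def _indices(self, bits):
        indices = []
        for masks in self.masks:
            h = 0
            for bit, mask in zip(bits, masks):
                if bit:
                    h ^= mask  # XOR in the mask of every set input bit
            indices.append(h)
        return indices

    def add(self, bits):
        # Simplified training rule: bump every distinct indexed counter once.
        for i in set(self._indices(bits)):
            self.table[i] += 1

    def contains(self, bits, min_count=1):
        # A pattern counts as seen when its smallest counter reaches min_count.
        return min(self.table[i] for i in self._indices(bits)) >= min_count
```

For example, `thermometer_encode(0.1, gaussian_thresholds(0.0, 1.0, 3))` gives `[1, 1, 0]`, because 0.1 lies above the lower two quartile boundaries of the standard normal distribution but below the upper one. Inference then needs only table lookups and comparisons: the encoded bits are hashed into the table, and the minimum counter is compared against a threshold.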

Subjects:	Hardware Architecture (cs.AR); Machine Learning (cs.LG)
Cite as:	arXiv:2203.01479 [cs.AR]
	(or arXiv:2203.01479v1 [cs.AR] for this version)
	https://doi.org/10.48550/arXiv.2203.01479

Submission history

From: Zachary Susskind
[v1] Thu, 3 Mar 2022 01:46:05 UTC (526 KB)
