An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning

Watanabe, Hirohisa; Tsukada, Mineto; Matsutani, Hiroki

arXiv:2005.04646 (cs)

[Submitted on 10 May 2020 (v1), last revised 12 Mar 2023 (this version, v4)]

Abstract: DQN (Deep Q-Network) is a method for performing Q-learning in reinforcement learning with deep neural networks. DQNs require a large buffer and batch processing for experience replay and rely on backpropagation-based iterative optimization, which makes them difficult to implement on resource-limited edge devices. In this paper, we propose a lightweight on-device reinforcement learning approach for low-cost FPGA devices. It exploits a recently proposed neural-network-based on-device learning approach that does not rely on backpropagation but instead uses a training algorithm based on OS-ELM (Online Sequential Extreme Learning Machine). In addition, we propose a combination of L2 regularization and spectral normalization for on-device reinforcement learning so that the output values of the neural network stay within a certain range and the reinforcement learning becomes stable. The proposed reinforcement learning approach is designed for the PYNQ-Z1 board as a low-cost FPGA platform. Evaluation results using OpenAI Gym demonstrate that the proposed algorithm and its FPGA implementation complete a CartPole-v0 task 29.77x and 89.40x faster, respectively, than a conventional DQN-based approach when the number of hidden-layer nodes is 64.
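To make the backpropagation-free training idea concrete, the following is a minimal NumPy sketch of a generic OS-ELM update. It is not the authors' implementation. The class name `OSELM`, the ReLU activation, the layer sizes, the value of `delta`, and the choice to apply spectral normalization to the fixed random input weights while using `delta` as an L2 term in the initial solve are all assumptions made for illustration. The reinforcement learning parts (Q-value targets, exploration, episode handling) are left out.

```python
import numpy as np


class OSELM:
    """Hypothetical single-hidden-layer OS-ELM; all names are illustrative."""

    def __init__(self, n_in, n_hidden, n_out, delta=1.0, seed=0):
        rng = np.random.default_rng(seed)
        alpha = rng.uniform(-1.0, 1.0, size=(n_in, n_hidden))
        # Spectral normalization (assumed placement): divide the fixed random
        # input weights by their largest singular value.
        self.alpha = alpha / np.linalg.norm(alpha, 2)
        self.bias = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.delta = delta  # L2 regularization strength (assumed value)
        self.P = None       # inverse of the regularized Gram matrix
        self.beta = None    # trainable output weights

    def hidden(self, x):
        # ReLU hidden layer; the input weights and bias are never trained.
        return np.maximum(0.0, np.atleast_2d(x) @ self.alpha + self.bias)

    def init_train(self, x0, t0):
        # Initial chunk: closed-form ridge solution, no gradient steps.
        h = self.hidden(x0)
        self.P = np.linalg.inv(h.T @ h + self.delta * np.eye(h.shape[1]))
        self.beta = self.P @ h.T @ t0

    def seq_train(self, x, t):
        # One-sample update: the matrix inverse collapses to a scalar division.
        h = self.hidden(x)                 # shape (1, n_hidden)
        t = np.atleast_2d(t)               # shape (1, n_out)
        Ph = self.P @ h.T                  # shape (n_hidden, 1)
        denom = 1.0 + (h @ Ph).item()
        self.P -= (Ph @ Ph.T) / denom      # uses the symmetry of P
        self.beta += self.P @ h.T @ (t - h @ self.beta)

    def predict(self, x):
        return self.hidden(x) @ self.beta


# Toy data with arbitrary sizes (4 inputs, 2 outputs), for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
T = rng.normal(size=(50, 2))

model = OSELM(n_in=4, n_hidden=16, n_out=2)
model.init_train(X[:20], T[:20])
for x, t in zip(X[20:], T[20:]):
    model.seq_train(x, t)
```

In this sketch, each sequential step needs only matrix-vector products and one scalar division, with no replay buffer and no iterative gradient descent. That property is what makes this family of updates attractive for small devices. After all samples are processed, `model.beta` matches the ridge-regression solution over the full data set up to floating-point error.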

Comments:	RAW'21
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2005.04646 [cs.LG]
	(or arXiv:2005.04646v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2005.04646

Submission history

From: Hiroki Matsutani
[v1] Sun, 10 May 2020 12:37:26 UTC (1,351 KB)
[v2] Tue, 25 Aug 2020 08:35:11 UTC (1,299 KB)
[v3] Tue, 23 Mar 2021 07:09:38 UTC (2,680 KB)
[v4] Sun, 12 Mar 2023 23:04:10 UTC (2,676 KB)
