
Neural Network Based Reinforcement Learning Acceleration on FPGA Platforms

Published: 11 January 2017

Abstract

Deep Q-learning (DQN) is a recently proposed reinforcement learning algorithm in which a neural network (NN) serves as a non-linear approximator of the value function. Its exploitation-exploration mechanism allows NN training and prediction to execute simultaneously in an agent as it interacts with the environment. Agents often act independently on battery power, so training and prediction must occur within the agent under a limited power budget. In this work, we propose an FPGA acceleration system design for Neural Network Q-learning (NNQL). The proposed system is highly flexible due to its support for run-time network parameterization, which allows neuroevolution algorithms to dynamically restructure the network to achieve better learning results. Additionally, its power consumption adapts to the network size thanks to a new processing element design. In test cases on networks with hidden layer sizes ranging from 32 to 16384, the proposed system achieves 7x to 346x speedup over a GPU implementation and 22x to 77x speedup over a hand-coded CPU counterpart.
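The core loop the abstract describes — a neural network approximating the Q-value function, trained while the agent explores — can be sketched in a few lines. This is a minimal NumPy illustration with a single hidden layer and epsilon-greedy action selection; the layer sizes, learning rate, and sigmoid activation are illustrative assumptions, not the paper's FPGA design or network parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the paper's tests use hidden layers of 32 to 16384 units.
STATE_DIM, HIDDEN, N_ACTIONS = 4, 32, 2
GAMMA, LR, EPSILON = 0.99, 0.01, 0.1

# One-hidden-layer Q-network: Q(s, .) = W2 @ sigmoid(W1 @ s)
W1 = rng.normal(0.0, 0.1, (HIDDEN, STATE_DIM))
W2 = rng.normal(0.0, 0.1, (N_ACTIONS, HIDDEN))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def q_values(s):
    """Prediction pass: estimated Q-value for every action in state s."""
    return W2 @ sigmoid(W1 @ s)

def select_action(s):
    # Exploitation-exploration: act greedily most of the time,
    # but explore a random action with probability EPSILON.
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(s)))

def train_step(s, a, r, s_next, done):
    """One temporal-difference update, backpropagated through the network."""
    global W1, W2
    h = sigmoid(W1 @ s)
    q = W2 @ h
    target = r if done else r + GAMMA * np.max(q_values(s_next))
    td_error = target - q[a]
    grad_q = np.zeros(N_ACTIONS)
    grad_q[a] = -td_error                 # dLoss/dq for loss = 0.5 * td_error^2
    grad_h = W2.T @ grad_q                # backprop through output layer (pre-update W2)
    W2 -= LR * np.outer(grad_q, h)
    W1 -= LR * np.outer(grad_h * h * (1.0 - h), s)
    return td_error
```

Because `select_action` (prediction) and `train_step` (training) run in the same loop, both phases must fit the agent's compute and power budget, which is the motivation for accelerating them together on an FPGA.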



Published In

ACM SIGARCH Computer Architecture News  Volume 44, Issue 4
HEART '16
September 2016
96 pages
ISSN:0163-5964
DOI:10.1145/3039902

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article


Cited By

  • (2024) Dielectric Elastomer-Based Actuators: A Modeling and Control Review for Non-Experts. Actuators 13(4):151. DOI: 10.3390/act13040151. Online publication date: 17-Apr-2024.
  • (2024) FPGA-Accelerated Sim-to-Real Control Policy Learning for Robotic Arms. IEEE Transactions on Circuits and Systems II: Express Briefs 71(3):1690-1694. DOI: 10.1109/TCSII.2024.3353690. Online publication date: Mar-2024.
  • (2024) A FPGA Accelerator of Distributed A3C Algorithm with Optimal Resource Deployment. IET Computers & Digital Techniques 2024. DOI: 10.1049/2024/7855250. Online publication date: 1-Jan-2024.
  • (2023) A Deep Q Network Hardware Accelerator Based on Heterogeneous Computing. 2023 IEEE 15th International Conference on ASIC (ASICON), 1-4. DOI: 10.1109/ASICON58565.2023.10396321. Online publication date: 24-Oct-2023.
  • (2023) DQN Algorithm Design for Fast Efficient Shortest Path System. 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 254-260. DOI: 10.1109/APSIPAASC58517.2023.10317113. Online publication date: 31-Oct-2023.
  • (2022) DARL. Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 1-9. DOI: 10.1145/3508352.3549437. Online publication date: 30-Oct-2022.
  • (2022) Associative Memory Based Experience Replay for Deep Reinforcement Learning. Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 1-9. DOI: 10.1145/3508352.3549387. Online publication date: 30-Oct-2022.
  • (2022) E2HRL: An Energy-efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning. ACM Transactions on Design Automation of Electronic Systems 27(5):1-19. DOI: 10.1145/3498327. Online publication date: 21-Sep-2022.
  • (2022) Hardware Accelerator for Capsule Network based Reinforcement Learning. 2022 35th International Conference on VLSI Design and 2022 21st International Conference on Embedded Systems (VLSID), 162-167. DOI: 10.1109/VLSID2022.2022.00041. Online publication date: Feb-2022.
  • (2022) PPOAccel: A High-Throughput Acceleration Framework for Proximal Policy Optimization. IEEE Transactions on Parallel and Distributed Systems 33(9):2066-2078. DOI: 10.1109/TPDS.2021.3134709. Online publication date: 1-Sep-2022.
