Performance improvement of reinforcement learning algorithms for online 3D bin packing using FPGA

Authors:

Ashwin Krishnan,

Harshad Khadilkar,

Ansuma Basumatary,

Arijit Mukherjee

AIMLSystems '22: Proceedings of the Second International Conference on AI-ML Systems

Article No.: 18, Pages 1 - 7

https://doi.org/10.1145/3564121.3564795

Published: 16 May 2023

Abstract

Online 3D bin packing is a challenging real-time combinatorial optimisation problem that involves packing parcels (typically rigid cuboids) arriving on a conveyor into a larger bin for further shipment. Recent automation methods have introduced manipulator robots for packing, which need a processing algorithm to specify the location and orientation in which each parcel must be loaded. Value-based reinforcement learning (RL) algorithms such as DQN are capable of producing good solutions in the available computation time. However, their deployment on CPU-based systems relies on rule-based heuristics to reduce the search space, which may lead to a sub-optimal solution. In this paper, we use an FPGA as a hardware accelerator to reduce the inference time of DQN as well as its pre- and post-processing steps. This allows the optimised algorithm to cover the entire search space within the given time constraints. We present various optimisations, such as accelerating DQN model inference and fast checking of constraints. Further, we show that our proposed architecture achieves almost 15x computational speed-up compared to an equivalent CPU implementation. Additionally, we show that as a result of evaluating the entire search space, the DQN rewards generated for complex data sets have improved by 1%, which can lead to a significant reduction in enterprise operating costs.
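The idea of covering the entire search space can be illustrated with a minimal sketch. It is not the authors' implementation: the heightmap representation of the bin, the `feasible` constraint check, the two upright orientations, and the `stand_in_q` scoring function (a placeholder where a trained DQN would be queried) are all assumptions made for illustration.

```python
import numpy as np


def feasible(heightmap, x, y, l, w, h, bin_height):
    """Hypothetical constraint check: inside the bin, flat support, height limit."""
    L, W = heightmap.shape
    if x + l > L or y + w > W:
        return False
    patch = heightmap[x:x + l, y:y + w]
    base = patch.max()
    # Require a flat supporting surface and respect the bin height.
    return bool(np.all(patch == base) and base + h <= bin_height)


def stand_in_q(heightmap, x, y, l, w, h):
    """Placeholder for a trained DQN: prefers low placements near the origin."""
    base = heightmap[x:x + l, y:y + w].max()
    return -float(base + h) - 0.01 * (x + y)


def best_placement(heightmap, dims, bin_height, q_fn=stand_in_q):
    """Score every feasible (position, orientation) pair and return the argmax."""
    l0, w0, h0 = dims
    # Assumption: only rotations about the vertical axis are allowed.
    orientations = [(l0, w0, h0), (w0, l0, h0)]
    best, best_q = None, -np.inf
    L, W = heightmap.shape
    for o, (l, w, h) in enumerate(orientations):
        for x in range(L):
            for y in range(W):
                if not feasible(heightmap, x, y, l, w, h, bin_height):
                    continue
                q = q_fn(heightmap, x, y, l, w, h)
                if q > best_q:
                    best, best_q = (x, y, o), q
    return best, best_q
```

The nested loops make the cost of exhaustive evaluation explicit: the number of constraint checks and value-function queries grows with the bin footprint times the number of orientations, which is the kind of workload the abstract describes offloading to the FPGA instead of pruning with heuristics.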

Index Terms

Performance improvement of reinforcement learning algorithms for online 3D bin packing using FPGA
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Reinforcement learning
2. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs

Published In

AIMLSystems '22: Proceedings of the Second International Conference on AI-ML Systems

October 2022

209 pages

ISBN:9781450398473

DOI:10.1145/3564121

Copyright © 2022 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

AIMLSystems 2022

AIMLSystems 2022: The Second International Conference on AI-ML Systems

October 12 - 15, 2022

Bangalore, India
