DOI: 10.1145/3564121.3564795
research-article

Performance improvement of reinforcement learning algorithms for online 3D bin packing using FPGA

Published: 16 May 2023

Abstract

Online 3D bin packing is a challenging real-time combinatorial optimisation problem in which parcels (typically rigid cuboids) arriving on a conveyor must be packed into a larger bin for further shipment. Recent automation methods have introduced manipulator robots for packing, which need a processing algorithm to specify the location and orientation in which each parcel must be loaded. Value-based reinforcement learning (RL) algorithms such as DQN are capable of producing good solutions within the available computation time. However, their deployment on CPU-based systems relies on rule-based heuristics to reduce the search space, which may lead to sub-optimal solutions. In this paper, we use an FPGA as a hardware accelerator to reduce the inference time of DQN as well as its pre-/post-processing steps. This allows the optimised algorithm to cover the entire search space within the given time constraints. We present various optimisations, such as accelerating DQN model inference and fast checking of constraints. Further, we show that our proposed architecture achieves an almost 15x computational speed-up compared to an equivalent CPU implementation. Additionally, we show that, as a result of evaluating the entire search space, the DQN rewards generated for complex data sets improve by 1%, which can yield a significant reduction in enterprise operating costs.
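To make the idea of covering the entire search space concrete, the following is a minimal Python sketch of an exhaustive placement search: every grid position and every axis-aligned orientation is checked against simple constraints, and each feasible candidate is scored. It is not the paper's implementation. The heightmap representation, the flat-support rule, the function names, and the `toy_q` scoring function (a hand-written stand-in for a trained DQN) are all assumptions made for illustration.

```python
from itertools import permutations, product


def feasible(heightmap, x, y, l, w, h, bin_height):
    """Return the resting height of an l x w x h box at (x, y), or None if infeasible."""
    rows, cols = len(heightmap), len(heightmap[0])
    if x + l > rows or y + w > cols:
        return None  # footprint would stick out of the bin
    cells = [heightmap[i][j] for i in range(x, x + l) for j in range(y, y + w)]
    base = cells[0]
    if any(c != base for c in cells):
        return None  # assumed rule: the box must rest on a flat surface
    if base + h > bin_height:
        return None  # box would exceed the bin height
    return base


def orientations(dims):
    """All distinct axis-aligned orientations of a cuboid, in a fixed order."""
    return sorted(set(permutations(dims)))


def toy_q(x, y, base, l, w, h):
    """Placeholder score (NOT a DQN): prefer low stacks, then the origin corner."""
    return -(base + h) - 0.01 * (x + y)


def best_placement(heightmap, dims, bin_height, q_value):
    """Score every feasible (position, orientation) pair and return the best one."""
    best = None
    for l, w, h in orientations(dims):
        for x, y in product(range(len(heightmap)), range(len(heightmap[0]))):
            base = feasible(heightmap, x, y, l, w, h, bin_height)
            if base is None:
                continue  # constraint check failed, skip this candidate
            score = q_value(x, y, base, l, w, h)
            if best is None or score > best[0]:
                best = (score, (x, y, base), (l, w, h))
    return best


# Usage: an empty 3 x 3 bin of height 3 and a 1 x 2 x 3 parcel.
empty_bin = [[0] * 3 for _ in range(3)]
print(best_placement(empty_bin, (1, 2, 3), 3, toy_q))
```

The nested loops visit every candidate, which is the part a CPU deployment would typically prune with heuristics. In this sketch the constraint check and the scoring call are independent per candidate, which is the kind of structure that lends itself to parallel evaluation in hardware.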

Cited By

  • (2024) Volumetric Techniques for Product Routing and Loading Optimisation in Industry 4.0: A Review. Future Internet 16:2 (39). DOI: 10.3390/fi16020039. Online publication date: 24-Jan-2024

      Published In

      AIMLSystems '22: Proceedings of the Second International Conference on AI-ML Systems
      October 2022
      209 pages
      ISBN: 9781450398473
      DOI: 10.1145/3564121

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. FPGA
      2. hardware acceleration
      3. reinforcement learning

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      AIMLSystems 2022

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 57
      • Downloads (Last 6 weeks): 2
      Reflects downloads up to 01 Sep 2024
