DOI: 10.23919/ICCAS52745.2021.9649975

Decomposed Q-Learning for Non-Prehensile Rearrangement Problem

Published: 12 October 2021

Abstract

In this paper, we address a planar non-prehensile rearrangement task in which objects must be pushed to desired target points. We model the task as a multi-objective Markov decision process (MOMDP) and propose a method for finding policies that solves it efficiently. The proposed method learns an object-wise Q-value function for each object, capturing how that object moves in response to the robot arm's pushing actions. This increases sample efficiency and improves learning speed compared to learning a policy for multiple objects with a single Q-value function. To this end, we use the deep Q-learning framework and, since the input is visual, obtain a Q-value for each pixel with a fully convolutional network. Based on the learned object-wise Q-value functions, we determine the action of the robot arm, and we confirm that the maximum-selection strategy achieves the highest performance.
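
To illustrate the idea, the sketch below (Python/PyTorch, not the authors' code) shows one way object-wise, per-pixel Q-value maps could be produced by a fully convolutional network and combined with a maximum strategy to pick a push location. The class and function names (ObjectWiseQNet, select_push), the network layout, and the tensor shapes are assumptions made for the example.

```python
# Illustrative sketch only: object-wise per-pixel Q-maps from a fully
# convolutional network, combined with a max strategy over all objects.
import torch
import torch.nn as nn

class ObjectWiseQNet(nn.Module):
    """Fully convolutional net: image in, one Q-value map per object out."""
    def __init__(self, num_objects: int, in_channels: int = 3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # One output channel per object: a per-pixel Q-value for pushing there.
        self.head = nn.Conv2d(64, num_objects, 1)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # image: (B, C, H, W) -> q_maps: (B, num_objects, H, W)
        return self.head(self.backbone(image))

def select_push(q_maps: torch.Tensor):
    """Max strategy: pick the (object, pixel) pair with the highest Q-value."""
    b, n, h, w = q_maps.shape
    flat = q_maps.view(b, -1)          # (B, n*h*w)
    idx = flat.argmax(dim=1)           # best flattened index per batch element
    obj = idx // (h * w)               # which object's Q-map won
    pix = idx % (h * w)
    return obj, pix // w, pix % w      # object index, row, column of the push
```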



Information

Published In
2021 21st International Conference on Control, Automation and Systems (ICCAS)
October 2021, 1691 pages

Publisher
IEEE Press

Publication History
Published: 12 October 2021

Qualifiers
• Research-article
