Pose-Aware Placement of Objects with Semantic Labels - Brandname-based Affordance Prediction and Cooperative Dual-Arm Active Manipulation
Pages 4760–4767
Abstract
The Amazon Picking Challenge and the Amazon Robotics Challenge have driven significant progress in picking objects from cluttered scenes, yet object placement remains challenging. Pose-aware placement that respects the human- and machine-readable labels on an object is useful in practice; for example, the brandname of an object placed on a shelf should face the human customers. The object placement task poses two robotic vision challenges: a) the semantics and the geometry of the object to be placed need to be analysed jointly, and b) occlusions among objects in a cluttered scene can hinder proper understanding and manipulation. To overcome these challenges, we develop a pose-aware placement approach that spots the semantic labels (e.g., brandnames) of objects in a cluttered tote and then carries out a sequence of actions to place the objects on a shelf or a conveyor with the desired poses. Our major contributions are 1) an open benchmark dataset of objects and brandnames with multi-view segmentation for training and evaluation; 2) a comprehensive evaluation of our brandname-based fully convolutional network (FCN), which predicts the affordance and grasp needed for pose-aware placement and whose success rates decrease as clutter increases; and 3) a demonstration that active manipulation with two cooperative manipulators and grippers can effectively handle occlusion of brandnames. We analyze the success rates and discuss the failure cases to provide insights for future applications. All data and benchmarks are available at https://text-pick-n-place.github.io/TextPNP/
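The pipeline summarized above (segment the brandname with an FCN, read the per-pixel affordance map, and pick the highest-scoring pixel as the grasp point) can be illustrated with a minimal sketch. The code below is not the authors' implementation; the model choice (torchvision's fcn_resnet50), the two-class setup, and the pick-the-best-pixel heuristic are assumptions made purely for illustration.

    # Minimal sketch of brandname-based affordance prediction (not the authors' code).
    # Assumptions: a 2-class FCN (background vs. brandname) and a simple
    # "grasp at the highest-scoring pixel" rule; weights here are untrained placeholders.
    import torch
    import torchvision

    NUM_CLASSES = 2        # assumed: background vs. brandname region
    BRANDNAME_CLASS = 1

    model = torchvision.models.segmentation.fcn_resnet50(num_classes=NUM_CLASSES)
    model.eval()

    def predict_grasp_pixel(rgb: torch.Tensor):
        """rgb: (3, H, W) float tensor in [0, 1]; returns (row, col) of the best grasp pixel."""
        with torch.no_grad():
            logits = model(rgb.unsqueeze(0))["out"]                  # (1, C, H, W)
            affordance = logits.softmax(dim=1)[0, BRANDNAME_CLASS]   # (H, W) affordance map
        flat_idx = int(affordance.argmax())                          # index of best-scoring pixel
        return divmod(flat_idx, affordance.shape[1])                 # (row, col)

    # Example: a dummy 480x640 frame
    row, col = predict_grasp_pixel(torch.rand(3, 480, 640))

In the paper's setting, the returned pixel would be back-projected with depth to a 3D grasp pose, and an occluded brandname would trigger the cooperative dual-arm re-orientation described in the abstract; those steps are beyond this sketch.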
Information
Published In
6597 pages
Copyright © 2019.
Publisher
IEEE Press
Publication History
Published: 01 November 2019
Qualifiers
- Research-article