Abstract
Machine learning has become one of the core technologies for solving computer vision tasks such as feature detection, image segmentation, object recognition and tracking. In many applications, complex systems such as robots are equipped with visual sensors from which they infer the state of the surrounding environment by solving the corresponding computer vision tasks; the solutions of these tasks are then used to decide on future actions. Reinforcement learning is a modern machine learning technology in which learning is carried out through interaction with the environment. In recent years, reinforcement learning has been applied both to robotic computer vision problems such as object detection, visual tracking and action recognition, and to robot navigation. This paper briefly reviews reinforcement learning and its use in computer vision and robot navigation problems.
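To make the "learning through interaction with the environment" loop concrete, below is a minimal sketch of tabular Q-learning, one of the classical reinforcement learning algorithms, run on a toy corridor environment. The environment, its size, and its reward are invented here purely for illustration and are not taken from the paper.

```python
import random

# Toy 1-D corridor: states 0..N-1, actions 0 = left, 1 = right.
# Reaching the rightmost state yields reward 1 and ends the episode.
# This environment is invented solely to illustrate the RL interaction loop.
N = 6
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = [[0.0, 0.0] for _ in range(N)]  # Q[state][action]

def step(state, action):
    """Environment dynamics: move left or right, reward 1 on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else min(N - 1, state + 1)
    done = (nxt == N - 1)
    return nxt, (1.0 if done else 0.0), done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection: explore occasionally, otherwise exploit.
        if random.random() < EPSILON:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        nxt, reward, done = step(state, action)
        # Q-learning update: bootstrap from the best action value in the next state.
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = nxt

# After training, the greedy policy moves right toward the goal state.
print([("left" if q[0] > q[1] else "right") for q in Q])
```

The same loop of observing a state, choosing an action, receiving a reward and updating a value estimate or policy underlies the deep reinforcement learning methods applied to the vision and navigation tasks surveyed in the paper, with the table replaced by a neural network.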
Notes
1. \(IoU(\hat{y}, y) = \frac{|\hat{y} \cap y|}{|\hat{y} \cup y|}\).
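For readers who want this overlap measure in computational form, the following is a minimal sketch (not taken from the paper) that evaluates IoU for two axis-aligned bounding boxes given as (x1, y1, x2, y2) tuples; the function name and the box representation are assumptions made only for this example, whereas the definition above applies to arbitrary regions \(\hat{y}\) and \(y\).

```python
def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes (x1, y1, x2, y2).

    Illustrative sketch only: the footnote defines IoU set-theoretically for a
    predicted region and a ground-truth region; here both are rectangles.
    """
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Intersection area is zero when the boxes do not overlap.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Example: two partially overlapping unit-offset boxes, IoU = 1/7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```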
Acknowledgement
The work was supported by the Skoltech NGP Program No. 1-NGP-1567 “Simulation and Transfer Learning for Deep 3D Geometric Data Analysis” (a Skoltech-MIT joint project).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Bernstein, A.V., Burnaev, E.V., Kachan, O.N. (2018). Reinforcement Learning for Computer Vision and Robot Navigation. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2018. Lecture Notes in Computer Science, vol. 10935. Springer, Cham. https://doi.org/10.1007/978-3-319-96133-0_20
Print ISBN: 978-3-319-96132-3
Online ISBN: 978-3-319-96133-0