Abstract
Reinforcement learning (RL) is gaining a firm foothold in artificial intelligence research, and novel, successful applications of RL continue to emerge across diverse fields. Among these, machine vision stands out as an area of considerable industrial and research significance. This survey first covers the basics of RL to give the reader an overview of the technology, and then discusses important, novel and upcoming state-of-the-art applications of RL in machine vision. The research areas covered include image segmentation, object detection, object tracking, robotic vision, autonomous driving and image classification/retrieval. Throughout, state-of-the-art works are reviewed to convey the potential and impact of RL in the field of machine vision.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Hafiz, A.M., Parah, S.A. & Bhat, R.A. Reinforcement learning applied to machine vision: state of the art. Int J Multimed Info Retr 10, 71–82 (2021). https://doi.org/10.1007/s13735-021-00209-2