Abstract
Computer vision, together with Bayesian estimation algorithms, sensors, and actuators, is used in robotics to solve a variety of critical tasks such as localization, obstacle avoidance, and navigation. Classical approaches to visual servoing relied on extracting features from images to control robot movements. Today, state-of-the-art computer vision systems use deep neural networks for tasks such as object recognition, detection, segmentation, and tracking. These networks and specialized controllers play a predominant role in the design and implementation of modern visual servoing systems due to their accuracy, flexibility, and adaptability. Recent research on direct visual servoing has produced robotic systems that rely only on the information contained in the whole image. Furthermore, end-to-end systems learn the control laws during training, eliminating the explicit controller entirely. This paper presents a comprehensive survey of the state of the art in visual servoing systems, discussing the latest classical methods not covered in previous surveys while emphasizing the new approaches based on deep neural networks and their use in a broad variety of robotic applications.
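For context, the control law at the core of classical image-based visual servoing can be stated compactly; the formulation below is the standard one from the visual servo control tutorials of Chaumette and Hutchinson, given here as background rather than as material from this abstract. The error between measured and desired image features, $\mathbf{e}(t) = \mathbf{s}(t) - \mathbf{s}^{*}$, is driven to zero by commanding the camera velocity:

$$
\dot{\mathbf{s}} = \mathbf{L}_{\mathbf{s}}\,\mathbf{v}_c, \qquad \mathbf{v}_c = -\lambda\,\widehat{\mathbf{L}_{\mathbf{s}}}^{+}\,\mathbf{e},
$$

where $\mathbf{L}_{\mathbf{s}}$ is the interaction matrix relating feature motion to the camera velocity $\mathbf{v}_c$, $\widehat{\mathbf{L}_{\mathbf{s}}}^{+}$ is the Moore–Penrose pseudoinverse of its estimate, and $\lambda > 0$ is a control gain. The end-to-end deep learning approaches emphasized in this survey replace this hand-derived mapping from image measurements to velocity commands with one learned during training.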
Funding
The authors did not receive support from any organization for the submitted work.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Machkour, Z., Ortiz-Arroyo, D. & Durdevic, P. Classical and Deep Learning based Visual Servoing Systems: a Survey on State of the Art. J Intell Robot Syst 104, 11 (2022). https://doi.org/10.1007/s10846-021-01540-w