A versatile interaction framework for robot programming based on hand gestures and poses

Published: 01 December 2023
    Abstract

    This paper proposes a framework for industrial and collaborative robot programming based on the integration of hand gestures and poses. The framework allows operators to control the robot via both End-Effector (EE) and joint movements and to transfer compound shapes accurately to the robot. Seventeen hand gestures, which cover the position and orientation controls of the robotic EE and other auxiliary operations, are designed according to cognitive psychology principles. Gestures are classified by a deep neural network, which is pre-trained for two-hand pose estimation and fine-tuned on a custom dataset, achieving a test accuracy of 99%. The index finger’s pointing direction and the hand’s orientation are extracted via 3D hand pose estimation to indicate the robotic EE’s moving direction and orientation, respectively. The number of stretched fingers is detected via two-hand pose estimation to represent decimal digits for selecting robot joints and inputting numbers. Finally, we integrate these three interaction modes seamlessly to form a programming framework.
    We conducted two interaction experiments. Operators’ reaction time when indicating randomly given instructions with the proposed gestures is significantly shorter than with other gesture sets, such as American Sign Language (ASL). In compound shape reconstruction, our method is considerably more accurate than methods based on hand-movement trajectories, and its operating time is comparable to that of teach pendants.
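    To make the hand-pose cues in the abstract concrete, the sketch below shows how a pointing direction, a palm orientation, and a stretched-finger count could be derived from estimated 3D hand keypoints. This is a minimal sketch, not the authors’ implementation: it assumes a 21-keypoint hand layout (wrist first, then four joints per finger, as popularized by MediaPipe Hands), and the landmark indices and the collinearity threshold are illustrative assumptions.

    import numpy as np

    # (MCP, PIP, TIP) landmark indices per finger in the assumed 21-keypoint
    # layout; the thumb is approximated with its CMC/MCP/TIP joints.
    FINGERS = {
        "thumb":  (1, 2, 4),
        "index":  (5, 6, 8),
        "middle": (9, 10, 12),
        "ring":   (13, 14, 16),
        "pinky":  (17, 18, 20),
    }

    def pointing_direction(kps: np.ndarray) -> np.ndarray:
        """Unit vector from the index-finger MCP joint (5) to its tip (8),
        standing in for the EE moving direction."""
        v = kps[8] - kps[5]
        return v / np.linalg.norm(v)

    def palm_normal(kps: np.ndarray) -> np.ndarray:
        """Approximate palm orientation as the normal of the plane spanned
        by wrist (0) -> index MCP (5) and wrist (0) -> pinky MCP (17)."""
        n = np.cross(kps[5] - kps[0], kps[17] - kps[0])
        return n / np.linalg.norm(n)

    def count_stretched_fingers(kps: np.ndarray, thresh: float = 0.9) -> int:
        """Count fingers whose MCP->PIP and PIP->TIP segments are nearly
        collinear (cosine above `thresh`), i.e. the finger is extended."""
        count = 0
        for mcp, pip, tip in FINGERS.values():
            a = kps[pip] - kps[mcp]
            b = kps[tip] - kps[pip]
            cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
            if cos > thresh:
                count += 1
        return count

    if __name__ == "__main__":
        kps = np.random.rand(21, 3)  # stand-in for a pose estimator's output
        print(pointing_direction(kps), count_stretched_fingers(kps))

    With a count per hand, a two-hand reading such as (left, right) = (2, 7) could then be mapped to the decimal number 27 for joint selection or numeric input, in the spirit of the digit-entry scheme the abstract describes.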

    Highlights

    Cognitive psychology-based hand gesture design reduces operators’ reaction time.
    Knowledge transferred from two-hand pose estimation improves two-hand gesture classification.
    Hand poses indicate arbitrary directions, orientations, and numbers more flexibly than discrete gestures.
    Discrete hand gestures and continuous hand poses complement each other.


    Cited By

    • (2023) Calibration of the Industrial Robot End-effector Pose Measurement Error, Proceedings of the 2023 12th International Conference on Computing and Pattern Recognition, pp. 510–515, DOI: 10.1145/3633637.3633717. Online publication date: 27-Oct-2023.

    Published In

    Robotics and Computer-Integrated Manufacturing, Volume 84, Issue C
    Dec 2023
    313 pages

    Publisher

    Pergamon Press, Inc.

    United States

    Author Tags

    1. Robot programming
    2. Human–robot interaction
    3. Hand gesture dataset
    4. Hand gesture recognition
    5. Hand pose estimation

    Qualifiers

    • Research-article
