
Design and Implementation of the Voice Command Recognition and the Sound Source Localization System for Human–Robot Interaction

Published online by Cambridge University Press: 15 March 2021

M. H. Korayem*
Affiliation:
Robotic Research Laboratory, Center of Excellence in Experimental Solid Mechanics and Dynamics, School of Mechanical Engineering, Iran University of Science and Technology, Tehran 1684613114, Iran
S. Azargoshasb
Affiliation:
Robotic Research Laboratory, Center of Excellence in Experimental Solid Mechanics and Dynamics, School of Mechanical Engineering, Iran University of Science and Technology, Tehran 1684613114, Iran E-mail: azargoshasb71@gmail.com
A. H. Korayem
Affiliation:
Mechanical and Mechatronics Engineering Department, University of Waterloo, ON N2L 3G1, Canada E-mail: amin.korayem@uwaterloo.ca
Sh. Tabibian
Affiliation:
Cyberspace Research Institute, Shahid Beheshti University, Tehran, Iran E-mail: sh_tabibian@sbu.ac.ir
*Corresponding author. E-mail: hkorayem@iust.ac.ir

Summary

Human–robot interaction (HRI) is becoming increasingly important. In this paper, a low-cost communication system for HRI is designed and implemented on the Scout robot and a robotic face. A hidden Markov model (HMM)-based voice command detection system is proposed, and a non-native database of 10 target English commands spoken by Persian speakers has been collected. The experimental results confirm that the proposed system recognizes the voice commands and correctly performs the requested task or gives the appropriate answer. Compared with a system trained on the native Julius database, the proposed system achieves a higher true-detection rate (by about 10%).
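
To make the recognition pipeline concrete, the sketch below shows one common way a per-command HMM recognizer of this kind can be assembled: one Gaussian HMM is trained per command on MFCC features, and an incoming utterance is assigned to the command whose model yields the highest log-likelihood. The hmmlearn and python_speech_features libraries, the five-state models, and all names and parameters here are illustrative assumptions, not the authors' implementation.

# A minimal sketch of per-command HMM voice command recognition, assuming
# the hmmlearn and python_speech_features libraries (not the paper's toolkit).
import numpy as np
from scipy.io import wavfile
from python_speech_features import mfcc
from hmmlearn import hmm

def extract_features(wav_path):
    """Compute 13-dimensional MFCC frames from a mono WAV file."""
    rate, signal = wavfile.read(wav_path)
    return mfcc(signal, samplerate=rate, numcep=13)

def train_command_models(training_data):
    """Fit one Gaussian HMM per command word.

    training_data maps a command label (e.g. "forward") to a list of
    WAV paths containing recordings of that spoken command.
    """
    models = {}
    for command, paths in training_data.items():
        feats = [extract_features(p) for p in paths]
        X = np.vstack(feats)                   # all MFCC frames, stacked
        lengths = [f.shape[0] for f in feats]  # frames per utterance
        model = hmm.GaussianHMM(n_components=5, covariance_type="diag",
                                n_iter=25, random_state=0)
        model.fit(X, lengths)
        models[command] = model
    return models

def recognize(models, wav_path):
    """Label an utterance with the command whose HMM scores it highest."""
    feats = extract_features(wav_path)
    return max(models, key=lambda c: models[c].score(feats))

For example, after models = train_command_models({"forward": [...], "stop": [...]}), calling recognize(models, "test.wav") returns the best-scoring command label, which can then be mapped to a robot action or a spoken response.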

Type
Article
Copyright
© Iran University of Science and Technology, 2021. Published by Cambridge University Press

