
Design and Implementation of the Voice Command Recognition and the Sound Source Localization System for Human–Robot Interaction

Published online by Cambridge University Press: 15 March 2021

M. H. Korayem*
Affiliation:
Robotic Research Laboratory, Center of Excellence in Experimental Solid Mechanics and Dynamics, School of Mechanical Engineering, Iran University of Science and Technology, Tehran 1684613114, Iran
S. Azargoshasb
Affiliation:
Robotic Research Laboratory, Center of Excellence in Experimental Solid Mechanics and Dynamics, School of Mechanical Engineering, Iran University of Science and Technology, Tehran 1684613114, Iran E-mail: azargoshasb71@gmail.com
A. H. Korayem
Affiliation:
Mechanical and Mechatronics Engineering Department, University of Waterloo, ON N2L 3G1, Canada E-mail: amin.korayem@uwaterloo.ca
Sh. Tabibian
Affiliation:
Cyberspace Research Institute, Shahid Beheshti University, Tehran, Iran E-mail: sh_tabibian@sbu.ac.ir
*Corresponding author. E-mail: hkorayem@iust.ac.ir

Summary

Human–robot interaction (HRI) is becoming increasingly important. In this paper, a low-cost communication system for HRI is designed and implemented on the Scout robot and a robotic face. A hidden Markov model (HMM)-based voice command detection system is proposed, and a non-native database of 10 target English commands spoken by Persian speakers has been collected. The experimental results confirm that the proposed system recognizes the voice commands and correctly performs the requested task or gives the appropriate answer. Compared with a system trained on the native Julius database, the proposed system achieves a higher true-detection rate (by about 10%).
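
To make the recognition pipeline concrete, the sketch below shows one common way a per-command HMM recognizer of this kind can be assembled: one Gaussian HMM is trained per command on MFCC features, and an incoming utterance is assigned to the command whose model yields the highest log-likelihood. The hmmlearn and python_speech_features libraries, the five-state models, and all names and parameters here are illustrative assumptions, not the authors' implementation.

# A minimal sketch of per-command HMM voice command recognition, assuming
# the hmmlearn and python_speech_features libraries (not the paper's toolkit).
import numpy as np
from scipy.io import wavfile
from python_speech_features import mfcc
from hmmlearn import hmm

def extract_features(wav_path):
    """Compute 13-dimensional MFCC frames from a mono WAV file."""
    rate, signal = wavfile.read(wav_path)
    return mfcc(signal, samplerate=rate, numcep=13)

def train_command_models(training_data):
    """Fit one Gaussian HMM per command word.

    training_data maps a command label (e.g. "forward") to a list of
    WAV paths containing recordings of that spoken command.
    """
    models = {}
    for command, paths in training_data.items():
        feats = [extract_features(p) for p in paths]
        X = np.vstack(feats)                   # all MFCC frames, stacked
        lengths = [f.shape[0] for f in feats]  # frames per utterance
        model = hmm.GaussianHMM(n_components=5, covariance_type="diag",
                                n_iter=25, random_state=0)
        model.fit(X, lengths)
        models[command] = model
    return models

def recognize(models, wav_path):
    """Label an utterance with the command whose HMM scores it highest."""
    feats = extract_features(wav_path)
    return max(models, key=lambda c: models[c].score(feats))

For example, after models = train_command_models({"forward": [...], "stop": [...]}), calling recognize(models, "test.wav") returns the best-scoring command label, which can then be mapped to a robot action or a spoken response.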

Type
Article
Copyright
© Iran University of Science and Technology, 2021. Published by Cambridge University Press

