English mispronunciation detection module using a Transformer network integrated into a chatbot

Marcos E. Martinez-Quezada; J. Patricia Sánchez-Solís; Gilberto Rivera; Rogelio Florencia; Francisco López-Orozco

Authors

Marcos E. Martinez-Quezada, Universidad Autónoma de Ciudad Juárez
J. Patricia Sánchez-Solís, Universidad Autónoma de Ciudad Juárez
Gilberto Rivera, Universidad Autónoma de Ciudad Juárez
Rogelio Florencia, Universidad Autónoma de Ciudad Juárez
Francisco López-Orozco, Universidad Autónoma de Ciudad Juárez

Keywords:

Mispronunciation detection, Automatic speech recognition, Transformer network

Abstract

Companies need up-to-date information to remain competitive. Some applications based on speech recognition allow access to data stored in databases. However, these applications work properly only when the user's pronunciation is good, a skill that most people do not have. This paper proposes the architecture of an English mispronunciation detection module integrated into a chatbot. Users submit audio recordings of the phrases whose pronunciation they want to evaluate, and the module returns the mispronounced words, helping them practice their English pronunciation. The proposed architecture consists of an Automatic Speech Recognition (ASR) model based on a Transformer network, which converts the audio signal to text, and a string alignment algorithm that identifies mispronounced words using the Levenshtein distance. The Transformer network was trained on the LibriSpeech and L2-ARCTIC datasets. The module reached 90% on the Accuracy metric and 9.5% on the Character Error Rate (CER) metric. Its performance was also evaluated with a group of real users, showing promising results.
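
The alignment step and the CER metric described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the alignment granularity or the scoring details, so the sketch assumes that the target phrase is known as text, that alignment is done at the word level, and that CER is the character-level Levenshtein distance divided by the reference length. The function names `levenshtein`, `character_error_rate`, and `mispronounced_words` are hypothetical.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Assumed CER definition: character edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


def mispronounced_words(reference, hypothesis):
    """Return reference words that the ASR transcript substituted or dropped.

    Assumption: a reference word counts as mispronounced when the
    word-level Levenshtein alignment does not match it exactly.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    n, m = len(ref), len(hyp)

    # Full dynamic-programming table, kept so the alignment can be traced back.
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)

    # Trace back from the bottom-right corner, collecting unmatched words.
    flagged = []
    i, j = n, m
    while i > 0:
        if j > 0 and ref[i - 1] == hyp[j - 1] and d[i][j] == d[i - 1][j - 1]:
            i, j = i - 1, j - 1                  # exact match
        elif j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            flagged.append(ref[i - 1])           # substituted word
            i, j = i - 1, j - 1
        elif d[i][j] == d[i - 1][j] + 1:
            flagged.append(ref[i - 1])           # word missing from transcript
            i -= 1
        else:
            j -= 1                               # extra word in transcript
    flagged.reverse()
    return flagged


if __name__ == "__main__":
    target = "the cat sat on the mat"
    transcript = "the cat sad on mat"  # made-up ASR output for illustration
    print(mispronounced_words(target, transcript))
    print(character_error_rate(target, transcript))
```

In a pipeline like the one described, `transcript` would come from the Transformer-based ASR model, and the flagged words would be returned to the user through the chatbot. Extra words in the transcript are ignored here because they do not correspond to any word of the target phrase.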
