Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

Wang, Peng; Yang, Yifan; Liang, Zheng; Tan, Tian; Zhang, Shiliang; Chen, Xie

arXiv:2309.07648 (eess)

[Submitted on 14 Sep 2023 (v1), last revised 8 Jun 2024 (this version, v2)]

Abstract: Despite advancements of end-to-end (E2E) models in speech recognition, named entity recognition (NER) is still challenging but critical for semantic understanding. Previous studies mainly focus on various rule-based or attention-based contextual biasing algorithms. However, their performance might be sensitive to the biasing weight or degraded by excessive attention to the named entity list, along with a risk of false triggering. Inspired by the success of the class-based language model (LM) in NER in conventional hybrid systems and the effective decoupling of acoustic and linguistic information in the factorized neural Transducer (FNT), we propose C-FNT, a novel E2E model that incorporates class-based LMs into FNT. In C-FNT, the LM score of named entities can be associated with the name class instead of its surface form. The experimental results show that our proposed C-FNT significantly reduces error in named entities without hurting performance in general word recognition.
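To illustrate what associating an LM score with a name class rather than a surface form can look like, here is a minimal sketch of the textbook class-based factorization, P(word | history) = P(class | history) × P(word | class). It is not the C-FNT implementation: the `<NAME>` token, the toy probabilities, the name list, and the function `class_based_log_prob` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy numbers, for illustration only (not taken from the paper).
# P(token | history) from a word-level LM in which "<NAME>" is a class token.
lm_probs = {"call": 0.2, "<NAME>": 0.05}

# P(surface form | class) from a separate in-class model.
# Assumption: a uniform distribution over a small name list.
name_list = ["alice", "bob", "charlie", "dana"]
class_probs = {name: 1.0 / len(name_list) for name in name_list}


def class_based_log_prob(word: str) -> float:
    """Return log P(word | history) under a class-based factorization."""
    if word in class_probs:
        # Named entity: score the class token, then the member within the class.
        return math.log(lm_probs["<NAME>"]) + math.log(class_probs[word])
    # Ordinary word: use the word-level LM probability directly.
    return math.log(lm_probs[word])
```

In this sketch every name on the list shares the same class-level LM score, so a rare name is penalized no more than a frequent one, and the list can be swapped without retraining the word-level LM.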

Comments:	Accepted in INTERSPEECH 2024
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
Cite as:	arXiv:2309.07648 [eess.AS]
	(or arXiv:2309.07648v2 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2309.07648

Submission history

From: Yifan Yang
[v1] Thu, 14 Sep 2023 12:14:49 UTC (1,063 KB)
[v2] Sat, 8 Jun 2024 13:08:39 UTC (570 KB)
