Dual Language Models for Code Switched Speech Recognition

Garg, Saurabh; Parekh, Tanmay; Jyothi, Preethi

arXiv:1711.01048 (cs)

[Submitted on 3 Nov 2017 (v1), last revised 3 Aug 2018 (this version, v2)]

Abstract: In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text. Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models. We propose a novel technique called dual language models, which involves building two complementary monolingual language models and combining them using a probabilistic model for switching between the two. We evaluate the efficacy of our approach using a conversational Mandarin-English speech corpus. We prove the robustness of our model by showing significant improvements in perplexity measures over the standard bilingual language model without the use of any external information. Similar consistent improvements are also reflected in automatic speech recognition error rates.
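
The idea of combining two monolingual models through a switching step can be illustrated with a toy sketch. This is not the authors' implementation, only one plausible reading of the abstract. It assumes bigram models with crude add-one smoothing, a hypothetical `<sw>` token that each monolingual model uses to represent "the other language", word-level language tags, and a fixed start language. The training sentences are made-up romanized examples, and all function and variable names are illustrative.

```python
import math
from collections import Counter

SW, BOS, EOS = "<sw>", "<s>", "</s>"  # hypothetical special tokens

def project(sentence, lang, start_lang):
    """View a tagged sentence from one language's side: keep words of `lang`
    and collapse each run of other-language words into a single SW token."""
    out = [BOS] if lang == start_lang else [SW]
    for word, tag in sentence:
        if tag == lang:
            out.append(word)
        elif out[-1] != SW:
            out.append(SW)
    if out[-1] != SW:  # the sentence ends in `lang`, so this model emits EOS
        out.append(EOS)
    return out

class BigramLM:
    """Toy bigram model with add-one smoothing (an assumption of this sketch)."""
    def __init__(self, sequences):
        self.bigrams, self.contexts, self.vocab = Counter(), Counter(), set()
        for seq in sequences:
            self.vocab.update(seq[1:])  # tokens that can be predicted
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[prev, cur] += 1
                self.contexts[prev] += 1

    def prob(self, cur, prev):
        return (self.bigrams[prev, cur] + 1) / (self.contexts[prev] + len(self.vocab))

def dual_log_prob(sentence, lms, start_lang):
    """Score a tagged sentence with the two monolingual models.
    A language change costs P(SW | prev) in the model being left,
    then P(word | SW) in the model being entered."""
    logp, prev_word, prev_lang = 0.0, BOS, start_lang
    for word, lang in sentence:
        if lang == prev_lang:
            logp += math.log(lms[lang].prob(word, prev_word))
        else:
            logp += math.log(lms[prev_lang].prob(SW, prev_word))
            logp += math.log(lms[lang].prob(word, SW))
        prev_word, prev_lang = word, lang
    return logp + math.log(lms[prev_lang].prob(EOS, prev_word))

def perplexity(sentences, lms, start_lang):
    """Per-token perplexity, counting each word plus one EOS per sentence."""
    total = sum(dual_log_prob(s, lms, start_lang) for s in sentences)
    tokens = sum(len(s) + 1 for s in sentences)
    return math.exp(-total / tokens)

# Made-up toy data: (word, language tag) pairs.
train = [
    [("wo", "zh"), ("xiang", "zh"), ("order", "en"), ("pizza", "en")],
    [("wo", "zh"), ("xihuan", "zh"), ("pizza", "en")],
    [("order", "en"), ("pizza", "en"), ("ba", "zh")],
]
lms = {lang: BigramLM([project(s, lang, "zh") for s in train])
       for lang in ("zh", "en")}

held_out = [("wo", "zh"), ("xiang", "zh"), ("pizza", "en")]
print(perplexity([held_out], lms, "zh"))
```

In this sketch each monolingual model is trained only on its own words plus the switch token, so neither model needs to estimate cross-language bigrams directly. A real system would replace the add-one smoothing with a stronger estimator and would need a principled treatment of sentence starts and unseen words.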

Comments:	Accepted at Interspeech 2018
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1711.01048 [cs.CL]
	(or arXiv:1711.01048v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1711.01048

Submission history

From: Saurabh Garg
[v1] Fri, 3 Nov 2017 07:56:31 UTC (150 KB)
[v2] Fri, 3 Aug 2018 13:46:46 UTC (265 KB)
