Large Scale Language Modeling in Automatic Speech Recognition

Chelba, Ciprian; Bikel, Dan; Shugrina, Maria; Nguyen, Patrick; Kumar, Shankar

[Submitted on 31 Oct 2012]

Abstract: Large language models have proven quite beneficial for a variety of automatic speech recognition tasks at Google. We summarize results on Voice Search and a few YouTube speech transcription tasks to highlight the impact one can expect from increasing both the amount of training data and the size of the language model estimated from such data. Depending on the task, the availability and amount of training data, the language model size, and the care put into integrating the models in the lattice rescoring step, we observe relative reductions in word error rate between 6% and 10%, for systems spanning a wide range of operating points between 17% and 52% word error rate.
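
The abstract reports relative reductions, which are easy to misread as absolute percentage points. As a rough sketch of the arithmetic only, the snippet below converts a relative reduction into an absolute word error rate and back. The function names are hypothetical, and the pairings of baseline and reduction in the loop are illustrative combinations of the range endpoints quoted above, not results reported for any specific task.

```python
def wer_after_relative_reduction(baseline_wer: float, relative_reduction: float) -> float:
    """Return the WER left after applying a relative reduction to a baseline WER.

    Both arguments are fractions, e.g. 0.52 for 52% WER and 0.10 for 10% relative.
    """
    return baseline_wer * (1.0 - relative_reduction)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction going from baseline_wer to new_wer."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Illustrative pairings of the range endpoints from the abstract (an assumption:
    # the abstract does not say which reduction belongs to which operating point).
    for baseline in (0.17, 0.52):
        for reduction in (0.06, 0.10):
            new_wer = wer_after_relative_reduction(baseline, reduction)
            absolute_gain = baseline - new_wer
            print(
                f"baseline {baseline:.1%}, {reduction:.0%} relative -> "
                f"{new_wer:.2%} WER ({absolute_gain * 100:.2f} points absolute)"
            )
```

The point of the sketch is that the same relative reduction corresponds to a much larger absolute change at a high baseline error rate than at a low one.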

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1210.8440 [cs.CL]
	(or arXiv:1210.8440v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1210.8440

Submission history

From: Ciprian Chelba
[v1] Wed, 31 Oct 2012 18:57:14 UTC (75 KB)
