Large-Scale Domain Adaptation via Teacher-Student Learning

Li, Jinyu; Seltzer, Michael L.; Wang, Xi; Zhao, Rui; Gong, Yifan

arXiv:1708.05466 (cs)

[Submitted on 17 Aug 2017]

Abstract: High-accuracy speech recognition requires a large amount of transcribed data for supervised training. In the absence of such data, domain adaptation of a well-trained acoustic model can be performed, but even then, high accuracy usually requires significant labeled data from the target domain. In this work, we propose an approach to domain adaptation that does not require transcriptions but instead uses a corpus of unlabeled parallel data, consisting of pairs of samples from the source domain of the well-trained model and the desired target domain. To perform adaptation, we employ teacher/student (T/S) learning, in which the posterior probabilities generated by the source-domain model are used in lieu of labels to train the target-domain model. We evaluate the proposed approach in two scenarios: adapting a clean acoustic model to noisy speech, and adapting an adult-speech acoustic model to children's speech. Significant improvements in accuracy are obtained, with reductions in word error rate of up to 44% over the original source model, without the need for transcribed data in the target domain. Moreover, we show that increasing the amount of unlabeled data results in additional model robustness, which is particularly beneficial when using simulated training data in the target domain.
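
The T/S idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the "acoustic models" are toy linear softmax classifiers, the parallel data is synthetic (each target sample is a shifted and scaled copy of its source sample), the student is initialized as a copy of the teacher, and the names `posteriors`, `ts_loss`, and `train_student` are hypothetical. The loss is the cross-entropy between the teacher's posteriors on a source sample and the student's posteriors on the paired target sample, so no transcription is used anywhere.

```python
import math
import random

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def posteriors(weights, features):
    """Toy linear model: one weight row per class, last weight acts as a bias."""
    x = list(features) + [1.0]
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def ts_loss(teacher_post, student_post):
    """Cross-entropy of the student posteriors against the teacher's soft targets."""
    return -sum(p * math.log(max(q, 1e-12))
                for p, q in zip(teacher_post, student_post))

def mean_ts_loss(teacher_w, student_w, pairs):
    """Average T/S loss over all (source, target) pairs."""
    total = sum(ts_loss(posteriors(teacher_w, src), posteriors(student_w, tgt))
                for src, tgt in pairs)
    return total / len(pairs)

def train_student(teacher_w, student_w, pairs, lr=0.1, epochs=200):
    """Update the student in place with plain SGD; the teacher stays frozen."""
    for _ in range(epochs):
        for src, tgt in pairs:
            p = posteriors(teacher_w, src)   # teacher sees the source-domain sample
            q = posteriors(student_w, tgt)   # student sees the paired target sample
            x = list(tgt) + [1.0]
            for k, row in enumerate(student_w):
                grad_logit = q[k] - p[k]     # d(cross-entropy)/d(student logit k)
                for j in range(len(row)):
                    row[j] -= lr * grad_logit * x[j]
    return student_w

random.seed(0)
# Hypothetical teacher: 3 classes, 2 features plus a bias term.
teacher_w = [[2.0, -1.0, 0.0], [-1.0, 2.0, 0.0], [0.5, 0.5, 0.5]]
# Synthetic unlabeled parallel data: the "target domain" distorts each source sample.
sources = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(20)]
pairs = [(s, [0.5 * s[0] + 1.0, 0.5 * s[1] - 0.5]) for s in sources]

student_w = [row[:] for row in teacher_w]  # assumption: student starts as a teacher copy
before = mean_ts_loss(teacher_w, student_w, pairs)
train_student(teacher_w, student_w, pairs)
after = mean_ts_loss(teacher_w, student_w, pairs)
print(f"mean T/S loss before: {before:.4f}, after: {after:.4f}")
```

In this sketch the only supervision signal is the teacher's output distribution, which is why the parallel pairing of source and target samples matters: it lets the student be trained on target-domain inputs while the targets come from the domain the teacher was trained on.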

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1708.05466 [cs.CL]
	(or arXiv:1708.05466v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1708.05466

Submission history

From: Jinyu Li
[v1] Thu, 17 Aug 2017 23:37:18 UTC (725 KB)
