Active Learning for Domain Classification in a Commercial Spoken Personal Assistant

Chen, Xi C.; Sagar, Adithya; Kao, Justine T.; Li, Tony Y.; Klein, Christopher; Pulman, Stephen; Garg, Ashish; Williams, Jason D.

Computer Science > Machine Learning

arXiv:1908.11404 (cs)

[Submitted on 29 Aug 2019]

Abstract: We describe a method for selecting relevant new training data for the LSTM-based domain selection component of our personal assistant system. Adding more annotated training data for any ML system typically improves accuracy, but only if it provides examples not already adequately covered in the existing data. However, obtaining, selecting, and labeling relevant data is expensive. This work presents a simple technique that automatically identifies new helpful examples suitable for human annotation. Our experimental results show that the proposed method, compared with random-selection and entropy-based methods, leads to higher accuracy improvements given a fixed annotation budget. Although developed and tested in the setting of a commercial intelligent assistant, the technique is of wider applicability.
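
The abstract names two baselines, random selection and entropy-based selection, without describing the proposed technique itself. As a rough illustration of those baselines only (not the authors' method), the sketch below picks utterances for annotation under a fixed budget. The function names, the `pool` of predicted domain distributions, and the budget value are all hypothetical and chosen for illustration.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of one predicted distribution over domains."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_by_entropy(pool, budget):
    """Indices of the `budget` pool items with the highest predictive entropy."""
    ranked = sorted(range(len(pool)), key=lambda i: entropy(pool[i]), reverse=True)
    return ranked[:budget]


def select_at_random(pool, budget, seed=0):
    """Indices of `budget` pool items drawn uniformly without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(pool)), min(budget, len(pool))))


# Hypothetical classifier outputs: one probability vector per unlabeled utterance.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform, highest entropy
]

chosen = select_by_entropy(pool, budget=2)
baseline = select_at_random(pool, budget=2)
```

Here the entropy baseline sends the two least confident utterances to annotators, while the random baseline ignores model confidence entirely. The abstract reports that the proposed method outperforms both under the same annotation budget.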

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1908.11404 [cs.LG]
	(or arXiv:1908.11404v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1908.11404

Submission history

From: Xi Chen
[v1] Thu, 29 Aug 2019 18:14:46 UTC (3,423 KB)
