Character-Level Language Modeling with Hierarchical Recurrent Neural Networks

Hwang, Kyuyeon; Sung, Wonyong

arXiv:1609.03777 (cs)

[Submitted on 13 Sep 2016 (v1), last revised 2 Feb 2017 (this version, v2)]

Abstract: Recurrent neural network (RNN) based character-level language models (CLMs) are by nature extremely useful for modeling out-of-vocabulary words. However, their performance is generally much worse than that of word-level language models (WLMs), since CLMs need to consider a longer history of tokens to properly predict the next one. We address this problem by proposing hierarchical RNN architectures, which consist of multiple modules with different timescales. Despite the multi-timescale structures, the input and output layers operate with the character-level clock, which allows the existing RNN CLM training approaches to be applied directly without any modifications. Our CLMs show better perplexity than Kneser-Ney (KN) 5-gram WLMs on the One Billion Word Benchmark with only 2% of the parameters. We also present real-time character-level end-to-end speech recognition examples on the Wall Street Journal (WSJ) corpus, where replacing traditional mono-clock RNN CLMs with the proposed models results in better recognition accuracy even though the number of parameters is reduced to 30%.
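
The multi-timescale idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract does not say what drives the slower clock, so the code assumes a word boundary (a space character) triggers the slow module, and it uses plain tanh recurrences with random, untrained weights. All names (`TwoClockRNN`, `w_fast`, `w_slow`, `w_out`) are hypothetical. What it does show is the property the abstract highlights: the model reads one character and emits one next-character distribution at every step, even though part of its state updates less often.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def tanh_vec(vector):
    return [math.tanh(x) for x in vector]


def softmax(vector):
    peak = max(vector)
    exps = [math.exp(x - peak) for x in vector]
    total = sum(exps)
    return [e / total for e in exps]


class TwoClockRNN:
    """Toy two-timescale RNN: a fast (character) and a slow module."""

    def __init__(self, vocab, hidden=8, seed=0):
        rng = random.Random(seed)
        self.vocab = list(vocab)
        self.index = {ch: i for i, ch in enumerate(self.vocab)}
        self.hidden = hidden
        n = len(self.vocab)
        # Fast module sees the one-hot character, its own state, and the slow state.
        self.w_fast = make_matrix(hidden, n + 2 * hidden, rng)
        # Slow module sees the fast state and its own state.
        self.w_slow = make_matrix(hidden, 2 * hidden, rng)
        # Output layer runs on the character clock, from the fast state only.
        self.w_out = make_matrix(n, hidden, rng)

    def step(self, char, fast, slow):
        one_hot = [0.0] * len(self.vocab)
        one_hot[self.index[char]] = 1.0
        fast = tanh_vec(matvec(self.w_fast, one_hot + fast + slow))
        updated = False
        if char == " ":  # ASSUMPTION: the slow clock ticks at word boundaries
            slow = tanh_vec(matvec(self.w_slow, fast + slow))
            updated = True
        probs = softmax(matvec(self.w_out, fast))
        return probs, fast, slow, updated

    def run(self, text):
        fast = [0.0] * self.hidden
        slow = [0.0] * self.hidden
        outputs, slow_updates = [], 0
        for ch in text:
            probs, fast, slow, updated = self.step(ch, fast, slow)
            outputs.append(probs)  # one distribution per input character
            slow_updates += int(updated)
        return outputs, slow_updates
```

For example, `TwoClockRNN(sorted(set("the cat sat "))).run("the cat sat ")` yields one distribution per character while the slow state changes only after each space. Because the interface stays character-in, character-out, a standard per-character cross-entropy loss could be attached without change, which is the compatibility the abstract points to.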

Comments:	Submitted to NIPS 2016 on May 20, 2016 (v1), accepted to ICASSP 2017 (v2)
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1609.03777 [cs.LG]
	(or arXiv:1609.03777v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1609.03777

Submission history

From: Kyuyeon Hwang
[v1] Tue, 13 Sep 2016 11:41:48 UTC (174 KB)
[v2] Thu, 2 Feb 2017 13:49:41 UTC (178 KB)
