Deep pyramid convolutional neural networks for text categorization

R Johnson, T Zhang - Proceedings of the 55th Annual Meeting of …, 2017 - aclanthology.org
Abstract
This paper proposes a low-complexity word-level deep convolutional neural network (CNN) architecture for text categorization that can efficiently represent long-range associations in text. In the literature, several deep and complex neural networks have been proposed for this task, assuming availability of relatively large amounts of training data. However, the associated computational complexity increases as the networks go deeper, which poses serious challenges in practical applications. Moreover, it was shown recently that shallow word-level CNNs are more accurate and much faster than the state-of-the-art very deep nets such as character-level CNNs even in the setting of large training data. Motivated by these findings, we carefully studied deepening of word-level CNNs to capture global representations of text, and found a simple network architecture with which the best accuracy can be obtained by increasing the network depth without increasing computational cost by much. We call it deep pyramid CNN. The proposed model with 15 weight layers outperforms the previous best models on six benchmark datasets for sentiment classification and topic categorization.
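
The claim that depth can grow "without increasing computational cost by much" can be illustrated with a little arithmetic. The sketch below rests on an assumption that the abstract does not state: each block of layers works on a text representation half as long as the previous block's, while the number of feature maps stays fixed, so each block costs about half as much as the one before it. The function name `relative_cost` and the `shrink` parameter are hypothetical and chosen only for illustration.

```python
def relative_cost(num_blocks: int, shrink: float = 0.5) -> float:
    """Total cost of a stack of blocks, in units of the first block's cost.

    Assumption (not from the abstract): block k costs shrink**k times
    as much as the first block, because it processes a shorter sequence
    with the same number of feature maps.
    """
    return sum(shrink ** k for k in range(num_blocks))


if __name__ == "__main__":
    # Under this assumption the total is a geometric series, so with
    # shrink=0.5 it stays below 2x the first block no matter how deep.
    for depth in (1, 2, 4, 8, 16):
        print(depth, relative_cost(depth))
```

Under this assumption the cost is a geometric series bounded by twice the cost of the first block, which is one way a "pyramid" of shrinking layers could add depth cheaply.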