Effective Use of Word Order for Text Categorization with Convolutional Neural Networks

Johnson, Rie; Zhang, Tong

arXiv:1412.1058 (cs)

[Submitted on 1 Dec 2014 (v1), last revised 26 Mar 2015 (this version, v2)]

Abstract: A convolutional neural network (CNN) is a neural network that can make use of the internal structure of data, such as the 2D structure of image data. This paper studies CNNs for text categorization, exploiting the 1D structure (namely, word order) of text data for accurate prediction. Instead of using low-dimensional word vectors as input, as is often done, we directly apply a CNN to high-dimensional text data, which leads to directly learning embeddings of small text regions for use in classification. In addition to a straightforward adaptation of CNN from image to text, we propose a simple but new variation that employs bag-of-word conversion in the convolution layer. An extension that combines multiple convolution layers is also explored for higher accuracy. The experiments demonstrate the effectiveness of our approach in comparison with state-of-the-art methods.
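
The abstract contrasts two ways of feeding high-dimensional text to a convolution layer: a direct adaptation from images that preserves word order inside each small region, and a variation that applies bag-of-word conversion to each region. The abstract gives no implementation details, so the following is only a minimal sketch of how such region inputs might be built. The toy vocabulary, the region size, and the function names `seq_region_vectors` and `bow_region_vectors` are all assumptions made for illustration, not the authors' code.

```python
# Illustrative sketch only: the vocabulary, region size, and function names
# are assumptions for this example, not taken from the paper.

def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def seq_region_vectors(tokens, vocab, region_size):
    """Order-preserving input: concatenate the one-hot vectors of each region."""
    ids = [vocab[t] for t in tokens]
    regions = []
    for start in range(len(ids) - region_size + 1):
        vec = []
        for i in ids[start:start + region_size]:
            vec.extend(one_hot(i, len(vocab)))
        regions.append(vec)  # length: region_size * len(vocab)
    return regions

def bow_region_vectors(tokens, vocab, region_size):
    """Bag-of-word input: one indicator vector of length len(vocab) per region."""
    ids = [vocab[t] for t in tokens]
    regions = []
    for start in range(len(ids) - region_size + 1):
        vec = [0] * len(vocab)
        for i in ids[start:start + region_size]:
            vec[i] = 1  # word order inside the region is discarded
        regions.append(vec)
    return regions

vocab = {"i": 0, "love": 1, "it": 2, "hate": 3}
tokens = ["i", "love", "it"]
seq = seq_region_vectors(tokens, vocab, region_size=2)
bow = bow_region_vectors(tokens, vocab, region_size=2)
```

In this sketch the order-preserving vectors grow with the region size (here 2 x 4 = 8 entries per region), while the bag-of-word vectors stay at the vocabulary size (4 entries) and lose only the word order within a region. Either list of region vectors would then be passed through a learned linear map and nonlinearity shared across regions, which is what makes the layer convolutional.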

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1412.1058 [cs.CL]
	(or arXiv:1412.1058v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1412.1058

Submission history

From: Rie Johnson
[v1] Mon, 1 Dec 2014 16:19:51 UTC (235 KB)
[v2] Thu, 26 Mar 2015 12:59:35 UTC (270 KB)
