Research On Text Classification Based On CNN and LSTM: Yuandong Luan Shaofu Lin
Table I. Confusion matrix
                   Predicted negative   Predicted positive
Actual negative    TN                   FP
Actual positive    FN                   TP
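The four counts of the confusion matrix are enough to compute the indicators used to evaluate a binary classifier. The following is a minimal Python sketch, not the authors' code: the function names `confusion_counts` and `precision_recall_f1` and the toy labels are hypothetical, precision and recall follow the standard definitions TP / (TP + FP) and TP / (TP + FN), and the F1-score is computed as 2TP / (2TP + FN + FP).

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TN, FP, FN, TP for a binary task (same layout as the confusion matrix)."""
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1      # actual positive, predicted positive
        elif a == positive:
            fn += 1      # actual positive, predicted negative
        elif p == positive:
            fp += 1      # actual negative, predicted positive
        else:
            tn += 1      # actual negative, predicted negative
    return tn, fp, fn, tp


def precision_recall_f1(tn, fp, fn, tp):
    """Return precision, recall and F1-score as percentages."""
    precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    f1 = 100.0 * 2 * tp / (2 * tp + fn + fp) if tp else 0.0
    return precision, recall, f1


# Toy labels for illustration only (not taken from the paper's data set).
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(actual, predicted)
precision, recall, f1 = precision_recall_f1(tn, fp, fn, tp)
print(tn, fp, fn, tp)
print(precision, recall, f1)
```

With these toy labels the counts are TN = 3, FP = 1, FN = 1, TP = 3, so precision, recall and F1-score all equal 75%.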
A. Data Set
The experimental data in this paper are derived from the subjective and objective text data used in [1]. The data set version is subjectivity dataset v1.0, which includes 5000 subjective texts and 5000 objective texts.

B. Experimental settings
In this experiment, for the convolutional network layer, we use a word embedding dimension of 256, filter sizes of 3, 4 and 5, 128 filters, a sliding step of 1, and valid padding. For the LSTM layer, we use a two-layer stacked LSTM and set the number of hidden units to 128.

C. Evaluating indicator
In order to evaluate the performance of our model, we use precision, recall and F1-score as the evaluation criteria of this experiment. To illustrate the meanings of these indicators, the confusion matrix is introduced first [12], shown in Table I. In terms of the entries of the confusion matrix, the F1-score is

$$F_1 = \frac{2\,TP}{2\,TP + FN + FP} \times 100\% \tag{11}$$

D. Experimental comparison
In this paper, a single CNN model and a single LSTM model are used as the contrast models, and four subjective and objective text classification models are produced by combining CNN and LSTM and their variants. They are: standard CNN combined with standard LSTM, called the CNN-LSTM model; CNN without activation function combined with standard LSTM, called the NA-CNN-LSTM model; standard CNN combined with variant LSTM, called the CNN-COIF-LSTM model; and CNN without activation function combined with variant LSTM, called the NA-CNN-COIF-LSTM model.

The above models are tested on the given data set. The final experimental results are shown in Table II.

Table II. Comparison result
Model              Precision   Recall     F1 score
CNN                98.9353%    98.5197%   98.7270%
LSTM               98.9816%    99.1598%   99.0706%
CNN-LSTM           99.4769%    98.9197%   99.1975%
NA-CNN-LSTM        99.2201%    99.2598%   99.2400%
CNN-COIF-LSTM      98.9816%    99.1598%   99.0706%
NA-CNN-COIF-LSTM   99.1415%    99.3398%   99.2406%

From the above results, we can draw some interesting findings. Through comparative experiments, it is proved that the

combination of CNN without activation function and LSTM or its variant has better performance. Ref. [9] proposed eight variant models of LSTM. The next step of this paper is to explore the performance of the combination of CNN and other variants of LSTM.

REFERENCES