
Experiments of Supervised Learning and Semi-Supervised Learning in Thai Financial News Sentiment: A Comparative Study

Published: 20 July 2023

Abstract

Sentiment classification is a natural language processing task used in text analysis to measure customer feedback from documents such as product reviews, news, and other texts. This research experiments with Thai financial news sentiment classification and evaluates sentiment classification performance, comparing supervised and semi-supervised methods. In the research methodology, we use PyThaiNLP to tokenize texts and remove stopwords, and we split the dataset into a training set (85%) and a testing set (15%). Next, we classify sentiment using machine learning and deep learning approaches with feature extraction methods such as bag-of-words, term frequency–inverse document frequency, and word embedding (Word2Vec and Bidirectional Encoder Representations from Transformers (BERT)). The results show that the support vector machine with the BERT model yields the best performance at 83.38%, whereas the random forest classifier with bag-of-words yields the worst performance at 54.10% in the machine learning approach. Another experiment reveals that long short-term memory with the BERT model yields the best performance at 84.07%, whereas the convolutional neural network with bag-of-words yields the worst performance at 69.80% in the deep learning approach. The results imply that the support vector machine, convolutional neural network, and long short-term memory are suitable for classifying sentiment in a complex-structure language such as Thai. From this study, we observe the importance of sentiment classification tools under supervised and semi-supervised learning, and we look forward to furthering this work.

1 Introduction

Sentiment classification is the automated process of identifying opinions in text, and it is related to Natural Language Processing (NLP), data mining, and computational linguistics [1]. Many organizations realize the importance of sentiment classification for controlling or improving their product or service quality. In business, sentiment classification measures feedback or customer reviews of products or services. Many sentiment classification techniques have been used in different contexts. Medhat et al. [2] surveyed sentiment analysis algorithms and applications, then classified sentiment analysis techniques into two categories: machine learning approaches and lexicon-based approaches.
Social media has become a powerful medium, as the growth of the Web 2.0 era has contributed a large amount of content. Most people use social media such as Facebook, Twitter, and YouTube. Therefore, sentiment classification drawn from NLP is applied to examine users’ opinions automatically. Nevertheless, public sentiment classification depends on keywords or trending topics that occur at varying frequencies [3]. Many research studies on sentiment classification are reported for languages such as English, Chinese, and Spanish. These studies are categorized by content type, including opinions, emoticons, voices, and video gesture images. However, text classification in Asian languages is rarely published because of the variety of language families, complex structures, and low resources [4]. From this problem, we found an opportunity to study Thai sentiment classification of social media content that expresses emotions, feelings, and opinions as positive, neutral, and negative sentiment.
As mentioned earlier, this research uses machine learning and deep learning techniques to classify Thai language news sentiment from social media. Machine learning and deep learning approaches will be helpful for social media marketers. Additionally, this work aims to compare sentiment classification performance for complex structure language.
The rest of the article is organized as follows. In Section 2, we describe the structure of Thai word sentiment, which expresses emotions and feelings in sentiment polarity; identify the problem and motivation to develop our models; and describe and highlight sentiment classification techniques. In Section 3, we describe the research methodology, focusing on our experiment. In Section 4, we present the results of our experiments. In Section 5, the obtained results are discussed. Finally, Section 6 provides our conclusion and contributions to this work.

2 Literature Review

This section describes the literature and theory, including characteristics of the Thai language, Thai word sentiment, NLP, sentiment classification techniques, and related works.

2.1 Characteristics of the Thai Language

The Thai language has a unique writing system whose syllabic alphabet consists of four parts: consonant, vowel, diphthong, and tone [5]. The Thai language has 87 characters consisting of 44 consonant characters, 18 vowel symbols, 4 tone marks, 5 diacritics, 10 numerals, and 6 other symbols [6], as shown in Figure 1.
Fig. 1.
Fig. 1. Thai characters.
The Thai language is written from left to right, and one of its unique features is that vowel symbols can be placed in front of, behind, above, or below a consonant character. Additionally, tone marks are placed above a consonant, diacritics are written above or below a consonant, Thai numerals are often used in government documents, and there is no separation between words [7]. Unlike English and other Roman-alphabet writing systems, the Thai language has no capital letters. A space can separate sentences and independent phrases, but there is no absolute rule for its use. Moreover, Thai verbs do not change their forms according to tense or concord, and tense is optionally expressed with auxiliary verbs or time adverbials [8]. An example of the Thai writing system is shown in Figure 2.
Fig. 2.
Fig. 2. Example of a Thai writing system.
The Thai language is spoken differently in daily life and social activities across the various regions of Thailand. In Northern Thailand, people speak “Lanna” or “Kam Mueang,” which came from the Southwestern Tai language widely used in Northwestern Laos, Northeastern Myanmar, and the Southern Yunnan province of China [9]. In Northeastern Thailand, or Isan, the “Isan” language is spoken, which derives from the Lao language [10]. However, in the south of the Isan region, especially the Surin, Sisaket, and Buriram provinces, the Northern Khmer language is widely spoken, and the area is closely related to Khmer culture [11]. In Southern Thailand, “Pak Thai” is spoken in Nakhon Si Thammarat province, with some words borrowed from the Peranakan dialect. However, the Songkhla, Pattani, Yala, and Narathiwat provinces close to Malaysia widely speak “Yawi,” which is related to the Malay language [12].
As mentioned earlier, the Thai language has a complex structure in both its writing and speaking systems. Therefore, one challenge for NLP research in the Thai language is improving algorithms to raise classification performance, fill research gaps, and adapt to many interdisciplinary studies.

2.2 Thai Word Sentiment

The Thai language has a complex structure because there are no explicit word, phrase, or sentence boundaries [13]. Many words can express emotions, feelings, and opinions as positive, neutral, and negative sentiment. Examples of Thai sentiment words include /Dee-Jai (happy), /Sia-Jai (sad), /Ning-Choi-Choi (calm), and /Tuen-Ten (excited). In the business sector, businesses can measure customer feedback with sentiment analysis instruments to improve their products or services: positive sentiment words indicate customer satisfaction, whereas negative sentiment words indicate dissatisfaction.
In recent works, many Thai researchers have focused on Thai sentences because of the complex structure of words and sentence forms, along with limitations of ambiguity and a lack of resources [14]. Related works in Thai sentiment analysis cover various contexts, such as politics. Prasertdum and Wichadakul [15] used a text mining method to mine keywords in the 2019 Thailand General Election event. The results showed that positive sentiments correlated with the number of votes, especially those of new voters. In a social media marketing context, Chumwatana [16] used a lexicon-based approach to classify product and service reviews in social media and derive sentiment polarity scores in the Thai language. Porntrakoon [17] developed SenseComp 2, which used a lexicon-based approach to measure customer reviews [18]. The results showed that sentiment analysis is useful for product and service improvement in the business sector.
As mentioned earlier, one challenge in this field is that Thai is an unsegmented language whose words can have unclear meanings, carrying either positive or negative sentiment [19]. For example, /Paeng can be translated into English with a positive sense (elegant) or a negative sense (expensive). Another example is a positive word that combines two negative words: /Gong-Kuam-Tai (comeback) is a positive sentiment composed of /Gong (cheat) and /Kuam-Tai (death), both of which represent negative sentiment. Hence, our research tries to develop sentiment classification models for Thai word sentiment using supervised and semi-supervised methods. All sentiment data are divided into three classes: positive, neutral, and negative.

2.3 Natural Language Processing

NLP is a subfield of artificial intelligence and linguistics that enables machines to understand human languages [20]. An objective of NLP is to reduce the gap in communication between machines and humans. NLP methods are divided into two groups: grammar-based and non-grammar-based. Grammar-based methods depend on grammar rules and linguistic principles. Non-grammar-based methods use statistical techniques instead of linguistic principles or grammar rules. NLP is an important technique that involves text classification and text segmentation to interpret content sentiment. Aroonmanakun et al. [21] stated that NLP contributes to training on large amounts of data and to exploring directions in real-time text classification.
Many works use NLP related to sentiment classification in Thai languages. For example, Sriphaew et al. [22] developed NLP for Thai sentiment analysis, such as word segmentation, a part-of-speech tagger, and a sentence boundary detector. Haruechaiyasak and Kongthon [23] developed NLP from a limitation of a dictionary-based approach. They found a solution for tokenizing and normalizing texts with intentional insertion errors—for example, a repeated alphabet at the end of words. Sanguansat [24] developed Paragraph2Vec-based sentiment analysis to measure sentiment from business sectors such as retail, financial banking, and telecommunication services. In recent works, many researchers have focused on NLP usage for text mining in terms of business. For example, Polpinij [25] used multilingual sentiment to measure product review feedback. Deewattananon and Sammapun [26] applied NLP based on the lexicon approach to measure Thai sentiment from customer reviews. Esichaikul and Phumdontree [27] used NLP to apply text extraction in financial news to develop SentiFine, a web-based Thai sentiment analysis. Chansanam and Tuamsuk [28] used NLP to tokenize and segment Thai keywords to perform persistent observation and assessment of individuals in Thailand.
As mentioned earlier, the challenges for NLP in Thai sentiment classification are language-related issues such as short-length messages, word usage variation, and unbalanced data [29]. Additionally, the complexity of Thai words and sentences may require a high-level classification technique to measure sentiment.

2.4 Sentiment Classification Techniques

Sentiment classification is an automated process that uses NLP and text analysis to identify opinions in a text and label them as positive, negative, or neutral, based on the emotions that customers express within them. Many entrepreneurs use sentiment classification to measure customer opinions for product or service improvement processes in the business sector. Medhat et al. [2] categorized sentiment classification techniques into machine learning and lexicon-based approaches. The machine learning approach applies algorithms that build a classification model from linguistic features with supervised and unsupervised learning, often called traditional models. In a recent study, Dang et al. [30] explored an extension of the machine learning approach to semi-supervised learning with deep learning classifiers to improve traditional classifiers’ performance on classification problems at the document, sentence, or aspect level.
In comparison, lexicon-based analysis is a data analysis task that employs sentiment words and phrases without prior knowledge, such as dictionary-based and corpus-based approaches. In recent works, Pandey et al. [31] and Zainuddin et al. [32] combined lexicon-based and machine learning based approaches to improve sentiment polarity and accuracy from existing baseline sentiment classification methods. A taxonomy of sentiment classification techniques is shown in Figure 3.
Fig. 3.
Fig. 3. Sentiment classification techniques.
Unlike traditional rule-based systems, machine learning provides a mathematical model learned from sample data to make predictions without being explicitly programmed, which is why it is widely used [33]. Hence, the machine learning approach is utilized to find an improved prediction model in this research.

2.4.1 Machine Learning Approach.

The machine learning approach relies on algorithms that build a classification model from labeled linguistic features. The aim of this approach in sentiment classification work is to detect sentiment that reflects customer feedback automatically. The machine learning approach consists of three types: supervised learning, semi-supervised learning, and unsupervised learning.
Supervised Learning. The supervised learning method is a machine learning approach trained on a dataset with labeled classes [34]. The training data involves a set of labeled training documents. In text classification, supervised learning has many algorithms to build a classification model, such as decision tree classifiers, linear classifiers, rule-based classifiers, and probabilistic classifiers.
Semi-Supervised Learning. The semi-supervised learning method is a machine learning approach trained on a dataset with a small number of labeled classes and a large amount of unlabeled data. In text classification, semi-supervised learning can use deep learning methods to improve classification performance over traditional models [30]. Examples of semi-supervised learning include deep neural networks, Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM), among others.
Unsupervised Learning. The unsupervised learning method is a machine learning approach that trains on a dataset with unlabeled classes and depends on semantic orientation, pointwise mutual information, semantic spaces, and distributional similarity to measure the relation between words and polarity [35]. Examples of unsupervised learning include cluster analysis, anomaly detection, and neural networks.

2.4.2 Lexicon-Based Approach.

The lexicon-based approach relies on a study conducted by language experts. The outcome of this study is a set of rules according to word classification, including positive or negative, along with their corresponding intensity measure [36]. The lexicon-based approach is divided into two types of learning: a dictionary-based approach and a corpus-based approach.
Dictionary-Based Approach. The dictionary-based approach starts from a small set of opinion words collected from sentences. The dataset is then grown by searching a lexical resource (e.g., WordNet) for their synonyms and antonyms [37]. The advantages of the dictionary-based approach are that no labeled data and no learning procedure are required. However, its limitation is that the powerful linguistic resources it needs are difficult to find.
Corpus-Based Approach. The corpus-based approach relies on solving a problem of opinion words with a specific context. This methodology depends on syntactic patterns and a list of opinion words to find sentiment polarity in a large corpus, such as coordinate construction sentences. The corpus-based approach was introduced by Hatzivassiloglou and McKeown [38] to represent sentiment polarity words by using a conjunction. For example, two words with the same sentiment polarity will use “AND”; however, they use “BUT” to connect opposite sentiment polarity words. This method could be done by using statistical or semantic methods.

2.5 Related Works

In this section, we introduce related works on Thai sentiment classification. For traditional methodology research, many researchers started with lexicon-based approaches like WordNet [42] and machine learning approaches like decision tree [39], naive Bayes [43, 44, 46], maximum entropy [33], and Support Vector Machine (SVM) [39, 40, 43, 44] to classify texts from product/service reviews, message boards, and news. For deep learning methodology research, many researchers aimed to improve sentiment classification performance and find an appropriate deep learning model like CNN [27, 41, 45–47], LSTM [27, 41, 45–47], or the recurrent neural network [27, 46]. Our research aims to compare sentiment classification models between supervised and semi-supervised learning and find an appropriate sentiment classification model for Thai financial news. A summary of related works is shown in Table 1.
Table 1.
No. | Author (Year) [Ref.] | Research Objective | Sentiment Classification Technique | Source of Content
1 | Netisopakul & Chattupan (2015) [39] | To classify stock market news sentiment by using the word-pair feature | DT, SVM | 1,381 news items
2 | Netisopakul et al. (2016) [40] | To explain the cause of misclassification by using 10-fold cross validation | SVM | 1,964 sentences (positive = 573, neutral = 940, negative = 451)
3 | Vateekul & Koomsubha (2016) [41] | To classify sentiment on Twitter data in Thai by using well-known deep learning approaches | LSTM, DCNN | 28,890 documents (positive = 14,445, negative = 14,445)
4 | Deewattananon & Sammapun (2017) [42] | To analyze user review aspects and perform sentiment analysis to reveal which features users like or do not like | Dictionary-based approach | 1,970 documents (positive = 985, negative = 985)
5 | Kadmateekarun et al. (2017) [43] | To study automatic sentiment analysis in cosmetic product reviews | NB, SVM | 200 videos (positive = 100, negative = 100)
6 | Lertsiwaporn & Senivongse (2017) [44] | To develop a Twitter analysis tool that can collect, analyze, and visualize a set of tweets in Thai | ME, NB, SVM | 1,245 documents (positive = 663, negative = 662)
7 | Esichaikul & Phumdontree (2018) [27] | To analyze Thai daily financial news by integrating the fine-grained sentiment analysis technique with a deep neural network | BGRU, CNN, LSTM | 9,352 news items
8 | Pasupa & Seneewong Na Ayutthaya (2019) [45] | To compare sentiment analysis based on word embedding, POS-tag, and sentic features | CNN, LSTM | 1,115 sentences
9 | Piyaphakdeesakun et al. (2019) [46] | To classify sentiment of Thai documents by comparing several approaches and finding the appropriate deep learning model | BGRU, CNN, LSTM, NB | 41,073 documents (positive = 21,490, negative = 19,583)
10 | Pugsee & Ongsirmongkol (2019) [47] | To develop classification models with two basic types of deep learning for Thai sentiment analysis | CNN, LSTM | 12,596 reviews (positive = 5,701, neutral = 3,640, negative = 3,255)
11 | Our research | To compare sentiment classification techniques and find an appropriate sentiment classification model for Thai financial news | NB, RF, SVM, CNN, LSTM | 41,150 financial news items
Table 1. Summary of Related Works in the Thai Sentiment Classification Topic
BGRU, bidirectional gated recurrent unit; DT, decision tree; DCNN, dynamic convolutional neural network; ME, maximum entropy; NB, naive Bayes; RF, random forest.

3 Research Methodology

The overview of this research methodology consists of six main steps: data collection, data preprocessing, feature extraction, feature vector creation, sentiment classification, and sentiment classifier evaluation, as shown in Figure 4.
Fig. 4.
Fig. 4. Research methodology overview.
The details of each step are explained next.

3.1 Data Collection

In the data collection phase, our experiment was conducted on datasets related to financial news. We retrieved 50,000 financial news tweets between January 1, 2019 and June 30, 2022 from financial news agency Twitter accounts such as Prachachat (@prachachat), Bangkok Biz News (@ktnewsonline), and Thansettakij (@thansettakij) by using the Python programming language modules “Twint” and “Tweepy.” The output of this methodology is a comma-separated values (.csv) file.
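As an illustrative sketch of this collection step (not the authors’ exact script), the following Python snippet pulls tweets from one of the listed accounts with Twint; the output file name is a placeholder, and the same configuration would be repeated for the other accounts.

```python
import twint

# Collect tweets from one financial news account into a CSV file.
# Repeat for @ktnewsonline and @thansettakij; the date range is from the text.
config = twint.Config()
config.Username = "prachachat"
config.Since = "2019-01-01"
config.Until = "2022-06-30"
config.Lang = "th"                      # restrict to Thai-language tweets
config.Store_csv = True
config.Output = "financial_news.csv"    # placeholder file name

twint.run.Search(config)
```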

3.2 Data Preprocessing

In the data preprocessing phase, we divided our method into two main tasks: data cleansing and tokenization.

3.2.1 Data Cleansing.

We cleaned the datasets by removing symbols (e.g., “/”, “|”, “•”), line breaks, and duplicate documents, leaving 41,150 documents. Next, we labeled each document with one of three categories: positive, neutral, and negative [48]. Finally, we split the documents into two datasets: 85% for training and 15% for testing.
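A minimal sketch of this cleansing and splitting step is shown below; the column names ("tweet", "label") and the random seed are assumptions, not the authors’ code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("financial_news.csv")  # file from the collection step

# Remove symbols and line breaks, then drop duplicate documents.
df["text"] = (df["tweet"]
              .str.replace(r"[/|•]", " ", regex=True)
              .str.replace(r"\s+", " ", regex=True)
              .str.strip())
df = df.drop_duplicates(subset="text").reset_index(drop=True)

# 'label' holds the annotated sentiment: positive, neutral, or negative.
train_df, test_df = train_test_split(df, test_size=0.15, random_state=42)
```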

3.2.2 Tokenization.

We used the Python programming language module “PyThaiNLP” with the “deepcut” engine to break each paragraph into tokens and remove stopwords that do not affect the sentiment classification process. Tokenization refers to breaking raw text into meaningful units while preserving the information in the text. Raw text may contain unnecessary words, so stopword removal discards unwanted words before the feature extraction process.
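A sketch of this step with PyThaiNLP follows; the deepcut engine must be installed separately, and treating every word in PyThaiNLP’s stopword list as removable is our simplifying assumption.

```python
from pythainlp.tokenize import word_tokenize
from pythainlp.corpus import thai_stopwords

STOPWORDS = thai_stopwords()  # frozenset of common Thai stopwords

def tokenize(text):
    # Break the text into tokens with the deepcut engine,
    # then drop stopwords and whitespace-only tokens.
    tokens = word_tokenize(text, engine="deepcut")
    return [t for t in tokens if t.strip() and t not in STOPWORDS]
```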

3.3 Feature Extraction

In the feature extraction phase, we divided the method into four main tasks: bag-of-words, Term Frequency–Inverse Document Frequency (TF-IDF), word embedding, and n-gram.

3.3.1 Bag-of-Words.

We used bag-of-words as a feature extraction method for the machine learning classifiers in this research, such as naive Bayes, random forest, and SVM. A bag-of-words converts each word into a unique ID, and the word frequency is used as a feature for training a classifier [49].

3.3.2 Term Frequency–Inverse Document Frequency.

TF-IDF identifies the weight of keywords in a document. Term Frequency (TF) describes the word count in a document and is calculated as the number of times term t appears in document d divided by the total number of terms in d, as in Equation (1).
\begin{equation} TF\left( {t,d} \right) = \frac{{Number\ of\ times\ term\ t\ appears\ in\ document\ d}}{{Total\ number\ of\ terms\ in\ document\ d}} \end{equation}
(1)
Inverse Document Frequency (IDF) assists in estimating the importance of words [50] and is calculated as the natural logarithm of the total number of documents divided by the number of documents containing term t, as in Equation (2).
\begin{equation} IDF\left( t \right) = lo{g}_e\ \frac{{Total\ number\ of\ documents}}{{Total\ number\ of\ documents\ with\ term\ t\ in\ it}} \end{equation}
(2)
The final weight for term t in document d is calculated in Equation (3).
\begin{equation} TF - IDF\left( {t,d} \right) = TF\left( {t,d} \right)\ \times IDF\left( t \right) \end{equation}
(3)
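A direct, minimal implementation of Equations (1) through (3) might look as follows; documents are token lists, and each queried term is assumed to occur in at least one document so that the IDF denominator is nonzero.

```python
import math

def tf(term, doc):
    # Equation (1): occurrences of the term over total terms in the document.
    return doc.count(term) / len(doc)

def idf(term, docs):
    # Equation (2): natural log of N over the documents containing the term.
    n_containing = sum(1 for d in docs if term in d)
    return math.log(len(docs) / n_containing)

def tf_idf(term, doc, docs):
    # Equation (3): TF multiplied by IDF.
    return tf(term, doc) * idf(term, docs)
```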

3.3.3 Word Embedding.

In this research, the deep learning models cannot learn directly from text data; a word or sentence must first be transformed into a numeric vector. This kind of transformation is called word embedding, and it can be done by models such as Word2Vec, GloVe, and ULMFiT, among others.
Word2Vec. Word2Vec was developed by Mikolov et al. [51]. This technique uses a shallow neural network model, which employs a two-layer neural network to learn word embeddings and to predict words occurring in similar contexts. In the Thai language context, Thai2Vec is one of the most commonly used Thai word embedding techniques. It transforms each word in a sentence into a vector by the ULMFiT method [52]; its corpus contains 60,000 words, and a 300D vector represents each word.
BERT. Bidirectional Encoder Representations from Transformers (BERT) is a transformer-based machine learning technique for NLP pre-training developed by Devlin et al. [53]. In low-resource languages such as Thai, the choices of models are limited to training a BERT model based on a much smaller dataset, such as BERT-th [54]. The BERT model converts a Thai character into a numeric vector and utilizes the pre-train weights of the BERT-base multilingual cased version to fine-tune the language model with the corpus.
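As a sketch of producing sentence vectors from the multilingual cased BERT checkpoint named above (using the Hugging Face transformers library), consider the following; mean pooling over the last hidden state is our illustrative choice, as the text does not specify a pooling strategy.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(sentences):
    # Tokenize a batch of sentences and mean-pool the last hidden state
    # to obtain one fixed-size vector per sentence.
    enc = tokenizer(sentences, padding=True, truncation=True,
                    return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    return out.last_hidden_state.mean(dim=1)
```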
In summary, bag-of-words, TF-IDF, and word embedding are used in this study to weight keyword frequency related to Thai financial news topics from social media.

3.3.4 N-gram.

In this research, n-grams serve as document features in the supervised machine learning approach. An n-gram is a sequence of n tokens from the given document [55]. The value of n can be 1, 2, …, n. If n = 1, it is called a unigram; for n = 2, a bigram; for n = 3, a trigram; and so on. We used unigrams and bigrams to compare and investigate how combining two words in financial news affects each sentiment. The sentences in Table 2 exemplify different sizes of n-gram models, and a short code sketch follows the table.
Table 2.
Table 2. Examples of Different Sizes of N-gram Models
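A minimal illustration of n-gram generation over a token list (English placeholder tokens are used here for readability):

```python
def ngrams(tokens, n):
    # Slide a window of size n over the token sequence.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = ["thai", "stocks", "rise", "today"]
print(ngrams(tokens, 1))  # unigrams: [('thai',), ('stocks',), ...]
print(ngrams(tokens, 2))  # bigrams: [('thai', 'stocks'), ('stocks', 'rise'), ...]
```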

3.4 Feature Vector

In this phase, we converted texts into vector form by using “CountVectorizer” to create feature vectors from both bag-of-words and TF-IDF with n-grams for the machine learning models.
Before the sentiment classification process, we used k-fold cross validation, in which the training set is split into k subsets. The purpose of k-fold cross validation is to assess the reliability of the classification model by dividing the training set (e.g., 5-fold cross validation splits the dataset into five parts), as shown in Figure 5. We compared 5-fold and 10-fold cross validation experiments.
Fig. 5.
Fig. 5. An example of fivefold cross validation.
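A sketch combining these two steps (vectorization and 5-fold cross validation) is shown below; it assumes the tokenized text has been re-joined with spaces, and the naive Bayes estimator is only a placeholder for the classifiers evaluated later.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB

texts = train_df["text"]    # tokenized documents re-joined with spaces
labels = train_df["label"]

# Bag-of-words over unigrams and bigrams; use TfidfVectorizer for TF-IDF.
vectorizer = CountVectorizer(ngram_range=(1, 2), tokenizer=str.split)
X = vectorizer.fit_transform(texts)

scores = cross_val_score(MultinomialNB(), X, labels,
                         cv=KFold(n_splits=5, shuffle=True, random_state=42))
print(scores.mean())
```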

3.5 Sentiment Classification

In the sentiment classification phase, we applied text classification as sentiment classification to analyze news from social media accounts. The research methodology of sentiment classifier modeling can be expressed as follows.

3.5.1 Naive Bayes Classifier.

The naive Bayes classifier calculates probabilities by assuming that the probability of each attribute belonging to a given class value is independent of all other attributes, based on Bayes’ theorem [56] with strong independence assumptions between features, as shown in Figure 6.
Fig. 6.
Fig. 6. Naive Bayes classifier.
The equation of Bayesian rule is as follows:
\begin{equation} P(A|B) = \ \frac{{P\left( A \right)P(B|A)}}{{P\left( B \right)}}, \end{equation}
(4)
where A is a specific class and B is the document to be classified. P(A) is the prior probability of the class, and P(B) is the prior probability of the document, which cannot be equal to zero. P(A|B) is the posterior probability of the class given the document. In contrast, P(B|A) is the likelihood that document B appears in a specific class A. Hence, the value of class A can be positive, negative, or neutral, whereas B is a document only. The goal is to choose the value of A that maximizes P(A|B). The advantages of the naive Bayes classifier are that it is simple, fast, and highly accurate. However, its limitation is that the word-independence assumption rarely holds in practice.

3.5.2 Random Forest.

Random forest is a supervised learning model based on ensemble learning for classification and regression. It operates by constructing multiple decision trees at training time and then aggregating their outputs to find the final class for a classification model or the mean prediction for a regression model. Random forest was introduced by Ho [57] using the random subspace method; an extension was developed by Breiman [58] using the bagging idea (also known as bootstrapping) and random feature selection.
The random forest methodology starts by creating decision trees 1, …, n. Each decision tree is built from a random sample of the data and a random subset of features. The classifications from the individual decision trees are combined by majority vote to select the final class for a classification model, or averaged for a regression model, as shown in Figure 7.
Fig. 7.
Fig. 7. Random forest.

3.5.3 Support Vector Machine.

SVM is a supervised learning model associated with a machine learning algorithm that can be used for classification or regression model challenges based on statistics [59]. In the classification process, SVM performs classification by finding the hyperplane that differentiates the classes we plotted in n-dimensional space. The hyperplane boundary is chosen from the maximum distance between the training samples, as shown in Figure 8.
Fig. 8.
Fig. 8. Support vector machine.
Assuming an input space X, an output space Y, and a training dataset T in Equation (5):
\begin{equation} T = \left\{ {\left( {{x}_i,{y}_i} \right),\ i = 1, \ldots ,n} \right\} \in {\left( {X \times Y} \right)}^n, \end{equation}
(5)
where xi ∈ Rn, yi ∈ Y = {−1, 1}. A separating hyperplane can be written as follows:
\begin{equation} w \cdot x + b = 0, \end{equation}
(6)
where w = {w1, …, wn} is the weight vector of n attributes, x is an object to be classified, and b is a bias, a constant value. The margin between the two supporting hyperplanes is 2/||w||, and SVM aims to maximize this margin. The two hyperplanes can be written as follows:
\begin{equation} \left( {w \cdot x} \right) - b = 1\ \hbox{and}\ \left( {w \cdot x} \right) - b = - 1. \end{equation}
(7)
The SVM classifier is trained using N word-category pairs. The goal of SVM is to separate negative and positive sentiment datasets. In sentiment classification work, any data point above the boundary is assigned +1 for the positive label, whereas any data point below the boundary is assigned −1 for the negative label. SVM offers good classification performance with little dependency on the dimensionality of the feature space, but its results are less transparent, and the model may overfit.
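A sketch of training the three supervised classifiers of Sections 3.5.1 through 3.5.3 with scikit-learn is shown below; the hyperparameters are illustrative defaults rather than the paper’s settings, and MultinomialNB requires non-negative features (e.g., bag-of-words counts rather than BERT vectors).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

# X and labels come from the vectorization sketch in Section 3.4.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.15, random_state=42)

classifiers = {
    "naive Bayes": MultinomialNB(),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": SVC(kernel="linear"),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))
```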

3.5.4 Convolutional Neural Network.

CNN is a class of artificial neural networks that was introduced by LeCun et al. [60]. CNN consists of three components:
(1)
The convolutional layer is a core component that automatically generates feature maps by sliding a filter over an image.
(2)
The pooling layer is employed to reduce the size of the feature map by combining the outputs of neuron clusters at one layer into a single neuron in the next layer.
(3)
The fully connected layer is the last layer after the convolutional and pooling layers. This layer consists of a perceptron connected between the previous and subsequent layers. This can be calculated into forward or backward diffusion.
An input feature vector is first fed into the convolutional layer, allowing the model to learn information from groups of words through a striding filter. A striding filter has a dimension of w × h, where w is the length of the feature vector and h is the number of words the filter covers at a time. This leads to an output with a size of s × n, where n is the number of nodes in the convolutional model, s is the number of strides, equal to l − (h − 1), and l is the number of words in a sentence. Then, the output from the convolutional layer passes through a rectified linear unit activation function. Because the vector entering the output layer must be a 1D vector, a 1D dynamic max pooling with a size of s × 1 is required; it strides n times and gives a 1D output vector that goes to the dropout layer and then to the output layer. An overview of the CNN architecture is illustrated in Figure 9.
Fig. 9.
Fig. 9. Convolutional neural network.
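A compact Keras sketch of this architecture for three-class sentiment follows; the vocabulary size, sequence length, filter settings, and dropout rate are illustrative assumptions rather than the paper’s configuration, and labels are assumed to be integer-encoded (0, 1, 2).

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, max_len = 20000, 100   # assumed values

model = keras.Sequential([
    keras.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, 300),            # word-embedding input
    layers.Conv1D(128, 3, activation="relu"),     # filter covering 3 words
    layers.GlobalMaxPooling1D(),                  # max pooling to one vector
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),        # positive/neutral/negative
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```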

3.5.5 Long Short-Term Memory.

LSTM is a class of artificial recurrent neural network architecture that is used in the field of deep learning. LSTM was introduced by Hochreiter and Schmidhuber [62] to fix the vanishing gradient problem. LSTM is composed of five components:
(1)
The forget gate is the gate that decides what information should be thrown away or kept. Information from the previous hidden state and information from the current input is passed through the sigmoid function (σ). The equation can be written as follows:
\begin{equation}{f}_t = \sigma \ \left( {{x}_t\ \times \ {U}_f + {H}_{t - 1}\ \times \ {W}_f} \right),\end{equation}
(8)
where xt is an input to the current timestamp t, Uf is the weight matrix associated with the input, Ht−1 is the hidden state of the previous timestamp, and Wf is the weight matrix associated with the hidden state. The output values lie between 0 and 1: closer to 0 means forget, and closer to 1 means keep.
(2)
The input gate is the gate that decides which information will be updated, again considered together with the previous hidden state. The sigmoid function will decide how much new information should be updated based on values of 0 to 1, in which 0 means not important and 1 means important. The equation can be written as follows:
\begin{equation}{i}_t = \sigma \ \left( {{x}_t\ \times \ {U}_i + {H}_{t - 1}\ \times \ {W}_i} \right),\end{equation}
(9)
where xt is an input at the current timestamp t, Ui is the weight matrix of the input, Ht−1 is the hidden state at the previous timestamp, and Wi is the weight matrix associated with the hidden state.
(3)
The cell state combines the old information retained by the forget gate with the new information admitted by the input gate. A candidate cell value is computed as follows:
\begin{equation}{c}_t = tanh\ \left( {{x}_t\ \times \ {U}_c + {H}_{t - 1}\ \times \ {W}_c} \right),\end{equation}
(10)
where xt is an input at the current timestamp t and tanh is the activation function, so the candidate value ct lies between −1 and 1. If ct is negative, information is subtracted from the cell state at the current timestamp; if ct is positive, information is added to it. Uc is the weight matrix of the input, Ht−1 is the hidden state at the previous timestamp, and Wc is the weight matrix associated with the hidden state. The updated cell state is then formed by adding the previous cell state, scaled by the forget gate, to this candidate value, scaled by the input gate.
(4)
The output gate is the gate that has a role in deciding what the next hidden state should be. The gate sends information to the hidden state restricted to an interval of 0 and 1 by a sigmoid function. The equation can be written as follows:
\begin{equation}{o}_t = \sigma \ \left( {{x}_t\ \times \ {U}_o + {H}_{t - 1}\ \times \ {W}_o} \right),\end{equation}
(11)
where xt is an input at the current timestamp t, Uo is the weight matrix of the input, Ht−1 is the hidden state at the previous timestamp, and Wo is the weight matrix associated with the hidden state.
(5)
The hidden state is the output of LSTM. This state carries the information on what the LSTM has seen. The hidden state applies the output gate ot and the tanh function to the updated cell state. The equation can be written as follows:
\begin{equation}{h}_t = {o}_t\ \times \ tanh\left( {{c}_t} \right),\end{equation}
(12)
where ht is the current hidden state, ot is the output gate, and ct is the current cell state. The hidden state is thus a function of the current cell state and the output gate. To obtain a prediction at the current timestamp, the softmax activation is applied to the hidden state Ht as follows:
\begin{equation} Output = Softmax\left( {{H}_t} \right). \end{equation}
(13)
The overview of LSTM in the text classification model is shown in Figure 10.
Fig. 10.
Fig. 10. Long short-term memory.
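A matching Keras sketch with an LSTM layer in place of the convolutional block follows; the gate equations (8) through (13) are computed inside layers.LSTM, and again all hyperparameters are illustrative assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, max_len = 20000, 100   # assumed values

model = keras.Sequential([
    keras.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, 300),
    layers.LSTM(128),                       # forget/input/output gates inside
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),  # positive/neutral/negative
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```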

3.6 Classifier Evaluation

In this phase, we used various instruments to measure sentiment classifier performance.

3.6.1 Inter-Annotator Agreement.

We used inter-annotator agreement based on Cohen's kappa coefficient [61] to evaluate sentiment reliability. The coefficient is defined as follows:
\begin{equation} k = \frac{{ Pr ( a )\ - \ Pr( e )}}{{1\ - \ Pr( e )}}, \end{equation}
(14)
where Pr(a) is the proportion of cases in which both annotators agree and Pr(e) is the expected proportion of cases in which the two annotators agree by chance. Interpretation of the k parameter is shown in Table 3.
Table 3.
k | Agreement Level
< 0.00 | Poor
0.01–0.20 | Slight
0.21–0.40 | Fair
0.41–0.60 | Moderate
0.61–0.80 | Substantial
0.81–1.00 | Perfect
Table 3. Interpretation of the k Parameter
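In practice, Equation (14) does not need to be computed by hand; a minimal sketch with scikit-learn over per-document labels (illustrative values only) is:

```python
from sklearn.metrics import cohen_kappa_score

# Per-document labels from two annotators (illustrative values only).
annotator_a = ["positive", "neutral", "negative", "positive", "neutral"]
annotator_b = ["positive", "neutral", "neutral", "positive", "neutral"]

print(cohen_kappa_score(annotator_a, annotator_b))
```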

3.6.2 Confusion Matrix.

We used a confusion matrix to measure sentiment classification performance including the false positive rate (FPR) and the true positive rate (TPR).
\begin{equation} False\ Positive\ Rate\ \left( {FPR} \right) = \frac{{FP}}{{FP + TN}}, \end{equation}
(15)
\begin{equation} True\ Positive\ Rate\ {\rm{\ }}\left( {TPR} \right) = \frac{{TP}}{{TP + FN}}, \end{equation}
(16)
where true positive (TP) is the number of instances correctly assigned to a sentiment class by the algorithm, true negative (TN) is the number of instances correctly identified as not belonging to the class, false positive (FP) is the number of instances incorrectly assigned to the class, and false negative (FN) is the number of instances that belong to the class but were not assigned to it. Regarding the confusion matrix concept, in Table 4, the columns give the predicted classes, whereas the rows give the actual classes.
Table 4.
  | Positive | Negative
True | True positive | True negative
False | False positive | False negative
Table 4. Confusion Matrix of Sentiment Classification

3.6.3 Measurement Factors.

For performance assessment, we used the F-measure, precision, recall, and accuracy to measure the sentiment classification performance of the machine learning approach [62–64]. These indices are computed from four measurement factors as follows.
F-measure refers to the weighted average of precision and recall of the computation score. The equation can be written as follows:
\begin{equation} F - measure = 2\ \times \frac{{Precision\ \times \ Recall}}{{Precision + Recall}}. \end{equation}
(17)
Precision refers to the ratio of true positive predictions over the total of true positive and false positive predictions. The equation can be written as follows:
\begin{equation} Precision = \ \frac{{TP}}{{TP + FP}}. \end{equation}
(18)
Recall refers to the ratio of true positive predictions over the total of true positive and false negative predictions. The equation can be written as follows:
\begin{equation} Recall = \ \frac{{TP}}{{TP + FN}} \end{equation}
(19)
Accuracy refers to the ratio of the total of true positive predictions and true negative predictions over the total of true positive predictions, true negative predictions, false positive predictions, and false negative predictions. The equation can be written as follows:
\begin{equation} Accuracy = \ \frac{{\left( {TP + TN} \right)}}{{\left( {TP + TN + FP + FN} \right)}} \end{equation}
(20)
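All four measures can be obtained at once with scikit-learn; a minimal sketch, assuming clf is any fitted classifier from the sketch in Section 3.5, is:

```python
from sklearn.metrics import accuracy_score, classification_report

# The report prints per-class precision, recall, and F-measure
# (Equations (17)-(19)); accuracy_score implements Equation (20).
pred = clf.predict(X_test)
print(classification_report(y_test, pred, digits=4))
print(accuracy_score(y_test, pred))
```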

3.6.4 Experiment Settings.

In experiment settings, we prepared experimentation to determine which sentiment classification model would be proper for Thai financial news:
(1)
Model: We classified raw text into three classes, defined as the triplet M = {T, R, I}:
T = {News_class, Positive, Negative, Neutral}
R = {News_class ::= Positive | Negative | Neutral}
I = {Positive: “Subjective with positive sentiment”, Negative: “Subjective with negative sentiment”, Neutral: “Out of topic or without sentiment (objective)”}
(2)
Annotate: We had two linguistics researchers annotate each comment as positive, negative, or neutral with regard to the article topic and used Cohen's kappa coefficient to measure agreement between their annotations. The result is shown in Table 5.
Table 5.
Annotator B | Annotator A: Positive | Annotator A: Neutral | Annotator A: Negative | Total
Positive | 16,074 | 2,080 | 491 | 18,645
Neutral | 421 | 8,644 | 231 | 9,296
Negative | 157 | 409 | 12,643 | 13,209
Total | 16,652 | 11,133 | 13,365 | 41,150
Table 5. Annotated Confusion Matrix
The proportion of the cases where both annotators agree is as follows:
\begin{equation} \Pr \left( a \right) = \frac{\hbox{37,361}}{\hbox{41,150}} = 0.9079. \end{equation}
(21)
We first calculated the probability of rating a positive sentiment by chance, then added totals for the rows and columns.
For positive, Annotator A has 16,652 of 41,150 positive sentiment ratings, whereas Annotator B has 18,645 of 41,150 positive sentiment ratings. The probabilities of positive for Annotator A and B are thus as follows:
\begin{equation} ProbA\left( {Positive} \right) = \frac{\hbox{16,652}}{\hbox{41,150}} = 0.4047, \end{equation}
(22)
\begin{equation} ProbB\left( {Positive} \right) = \frac{\hbox{18,645}}{\hbox{41,150}} = 0.4531. \end{equation}
(23)
The probability that both annotators agree on a positive sentiment by chance is equal to the product of ProbA and ProbB as follows:
\begin{equation} ChanceAgree\left( {Positive} \right) = 0.4047\ \times \ 0.4531 = 0.1833. \end{equation}
(24)
Similarly, we calculated the agreement probabilities for neutral sentiment and negative sentiment as follows:
\begin{equation} ProbA\left( {Neutral} \right) = \frac{\hbox{11,133}}{\hbox{41,150}} = 0.2705, \end{equation}
(25)
\begin{equation} ProbB\left( {Neutral} \right) = \frac{\hbox{9,296}}{\hbox{41,150}} = 0.2259, \end{equation}
(26)
\begin{equation} ChanceAgree\left( {Neutral} \right) = 0.2705\ \times \ 0.2259 = 0.0611, \end{equation}
(27)
\begin{equation} ProbA\left( {Negative} \right) = \frac{\hbox{13,365}}{\hbox{41,150}} = 0.3247, \end{equation}
(28)
\begin{equation} ProbB\left( {Negative} \right) = \frac{\hbox{13,209}}{\hbox{41,150}} = 0.3210, \end{equation}
(29)
\begin{equation} ChanceAgree\left( {Negative} \right) = 0.3247\ \times \ 0.3210 = 0.1042, \end{equation}
(30)
Summing up the preceding three probabilities, we get the probability of agreement on any of the positive, neutral, and negative sentiments by chance:
\begin{equation} ChanceAgree = 0.1833 + 0.0611 + 0.1042 = 0.3486. \end{equation}
(31)
So, the Kappa score for classification is equal to the following:
\begin{equation} k = \ \frac{{0.9079\ - \ 0.3486}}{{1\ - \ 0.3486}}. \end{equation}
(32)
So, k = 0.8586, which is considered perfect agreement (this computation is verified in the code sketch after this list).
(3)
Adjudication: The two linguistics researchers held a discussion to reach a consensus on annotation in the adjudication process. If a consensus was not reached, they labeled the news item as neutral. The results of sentiment polarity of Thai financial news are shown in Table 6.
Table 6.
Positive | Neutral | Negative | Total
18,910 | 8,662 | 13,578 | 41,150
Table 6. Results of Sentiment Polarity
(4)
Processing: We performed UTF-8 encoding, tokenization, stemming, stopword removal, n-gram word generation, and word vector creation in the processing step.
(5)
Train and test: We used 5-fold and 10-fold cross validation with both a classic machine learning approach and a deep learning approach.
(6)
Evaluate: We evaluated our models by calculating the F-measure, precision, and recall of positive, neutral, and negative sentiment classes, and accuracy of classification models.
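As a check on the kappa computation in item (2), the following sketch reproduces Pr(a), Pr(e), and k directly from the Table 5 confusion matrix:

```python
# Rows are Annotator B (positive, neutral, negative);
# columns are Annotator A, taken from Table 5.
matrix = [
    [16074, 2080, 491],
    [421, 8644, 231],
    [157, 409, 12643],
]
total = sum(sum(row) for row in matrix)                  # 41,150

observed = sum(matrix[i][i] for i in range(3)) / total   # Pr(a) ≈ 0.9079

row_totals = [sum(row) for row in matrix]                # Annotator B totals
col_totals = [sum(col) for col in zip(*matrix)]          # Annotator A totals
chance = sum((r / total) * (c / total)
             for r, c in zip(row_totals, col_totals))    # Pr(e) ≈ 0.3487

kappa = (observed - chance) / (1 - chance)
print(round(kappa, 4))                                   # 0.8586
```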

4 Results

We collected 41,150 financial news documents from January 1, 2020 to December 31, 2021, divided into three categories: 18,909 positive documents, 8,663 neutral documents, and 13,578 negative documents. The data were split into training and testing sets at a ratio of 85:15. Results of the experiments are presented next.

4.1 Wordcloud

In this phase, a wordcloud was created to monitor significant features easily. Here, we visualize the data from our research by presenting the text data in a wordcloud. The frequency of each sentiment class in Thai financial news is shown in Figure 11.
Fig. 11.
Fig. 11. Frequency of each sentiment in Thai financial news.
The most frequent keywords are related to the COVID-19 pandemic, followed by economic issues. An overview of Thai financial news keywords is illustrated in the wordcloud in Figure 12.
Fig. 12.
Fig. 12. An example of a wordcloud in Thai financial news.

4.2 Classifier Model Evaluation

In the sentiment classifier evaluation phase, we obtained the results of our experiments on Thai financial news sentiment classification using machine learning techniques such as naive Bayes, random forest, and SVM, and deep learning techniques such as CNN and LSTM. Our experiments used both 5-fold and 10-fold cross validation with each feature extraction method. Sentiment classifier evaluation then measured sentiment performance.

4.2.1 First Experiment.

In the first experiment, we used fivefold cross validation for each classification model, and the results are as follows.
For the naive Bayes classifier, the first unigram experiment combined with TF-IDF provides accuracy at 63.37%, whereas positive sentiment gives an F1-score of 0.45, a precision score of 0.36, and a recall score of 0.58. Neutral sentiment gives an F1-score of 0.65, a precision score of 0.66, and a recall score of 0.64. Negative sentiment gives an F1-score of 0.64, a precision score of 0.69, and a recall score of 0.60. The second experiment using bag-of-words provides accuracy at 62.14%, whereas positive sentiment gives an F1-score of 0.67, a precision score of 0.62, and a recall score of 0.73. Neutral sentiment gives an F1-score of 0.53, a precision score of 0.55, and a recall score of 0.51. Negative sentiment gives an F1-score of 0.68, a precision score of 0.71, and a recall score of 0.61. The third experiment using Word2Vec provides accuracy at 66.99%, whereas positive sentiment gives an F1-score of 0.74, a precision score of 0.70, and a recall score of 0.80. Neutral sentiment gives an F1-score of 0.65, a precision score of 0.62, and a recall score of 0.69. Negative sentiment gives an F1-score of 0.56, a precision score of 0.60, and a recall score of 0.52. Finally, the BERT model provides accuracy at 69.72%, whereas positive sentiment gives an F1-score of 0.73, a precision score of 0.70, and a recall score of 0.75. Neutral sentiment gives an F1-score of 0.64, a precision score of 0.61, and a recall score of 0.68. Negative sentiment gives an F1-score of 0.79, a precision score of 0.78, and a recall score of 0.81.
The first bigram experiment combined with TF-IDF provides accuracy at 68.29%, whereas positive sentiment gives an F1-score of 0.62, a precision score of 0.59, and a recall score of 0.66. Neutral sentiment gives an F1-score of 0.62, a precision score of 0.59, and a recall score of 0.65. Negative sentiment gives an F1-score of 0.62, a precision score of 0.60, and a recall score of 0.65. The second experiment using bag-of-words provides accuracy at 66.32%, whereas positive sentiment gives an F1-score of 0.56, a precision score of 0.59, and a recall score of 0.53. Neutral sentiment gives an F1-score of 0.63, a precision score of 0.65, and a recall score of 0.62. Negative sentiment gives an F1-score of 0.60, a precision score of 0.66, and a recall score of 0.55. The third experiment using Word2Vec provides accuracy at 67.06%, whereas positive sentiment gives an F1-score of 0.65, a precision score of 0.65, and a recall score of 0.64. Neutral sentiment gives an F1-score of 0.63, a precision score of 0.57, and a recall score of 0.71. Negative sentiment gives an F1-score of 0.73, a precision score of 0.69, and a recall score of 0.78. Finally, the BERT model provides accuracy at 70.11%, whereas positive sentiment gives an F1-score of 0.77, a precision score of 0.82, and a recall score of 0.73. Neutral sentiment gives an F1-score of 0.67, a precision score of 0.69, and a recall score of 0.65. Negative sentiment gives an F1-score of 0.74, a precision score of 0.70, and a recall score of 0.77. The results are shown in Table 7.
Table 7.
Feature | Measure | Unigram (Positive / Neutral / Negative) | Bigram (Positive / Neutral / Negative)
TF-IDF | F1-score | 0.4459 / 0.6526 / 0.6434 | 0.6230 / 0.6205 / 0.6231
TF-IDF | Precision | 0.3607 / 0.6620 / 0.6932 | 0.5867 / 0.5946 / 0.6001
TF-IDF | Recall | 0.5839 / 0.6434 / 0.6003 | 0.6640 / 0.6488 / 0.6480
TF-IDF | Accuracy | 0.6337 | 0.6829
Bag-of-words | F1-score | 0.6667 / 0.5263 / 0.6816 | 0.5599 / 0.6348 / 0.6034
Bag-of-words | Precision | 0.6163 / 0.5456 / 0.7103 | 0.5928 / 0.6493 / 0.6611
Bag-of-words | Recall | 0.7260 / 0.5083 / 0.6080 | 0.5305 / 0.6210 / 0.5549
Bag-of-words | Accuracy | 0.6214 | 0.6632
Word2Vec | F1-score | 0.7427 / 0.6525 / 0.5572 | 0.6466 / 0.6308 / 0.7313
Word2Vec | Precision | 0.6954 / 0.6154 / 0.5965 | 0.6543 / 0.5673 / 0.6924
Word2Vec | Recall | 0.7969 / 0.6944 / 0.5227 | 0.6391 / 0.7102 / 0.7749
Word2Vec | Accuracy | 0.6699 | 0.6706
BERT | F1-score | 0.7253 / 0.6426 / 0.7919 | 0.7694 / 0.6685 / 0.7357
BERT | Precision | 0.7000 / 0.6115 / 0.7783 | 0.8193 / 0.6927 / 0.7043
BERT | Recall | 0.7524 / 0.6770 / 0.8056 | 0.7253 / 0.6460 / 0.7700
BERT | Accuracy | 0.6972 | 0.7011
Table 7. Experimental Results of Thai Financial News Sentiment Classification Using the Naive Bayes Classifier with Fivefold Cross Validation
For the random forest classifier, the first unigram experiment combined with TF-IDF provides accuracy at 55.55%, whereas positive sentiment gives an F1-score of 0.57, a precision score of 0.66, and a recall score of 0.46. Neutral sentiment gives an F1-score of 0.52, a precision score of 0.73, and a recall score of 0.40. Negative sentiment gives an F1-score of 0.44, a precision score of 0.50, and a recall score of 0.39. The second experiment using bag-of-words provides accuracy at 54.10%, whereas positive sentiment gives an F1-score of 0.58, a precision score of 0.48, and a recall score of 0.73. Neutral sentiment gives an F1-score of 0.47, a precision score of 0.60, and a recall score of 0.38. Negative sentiment gives an F1-score of 0.52, a precision score of 0.47, and a recall score of 0.58. The third experiment using Word2Vec provides accuracy at 60.27%, whereas positive sentiment gives an F1-score of 0.58, a precision score of 0.57, and a recall score of 0.59. Neutral sentiment gives an F1-score of 0.59, a precision score of 0.52, and a recall score of 0.69. Negative sentiment gives an F1-score of 0.56, a precision score of 0.52, and a recall score of 0.60. Finally, the BERT model provides accuracy at 60.69%, whereas positive sentiment gives an F1-score of 0.57, a precision score of 0.50, and a recall score of 0.67. Neutral sentiment gives an F1-score of 0.65, a precision score of 0.70, and a recall score of 0.61. Negative sentiment gives an F1-score of 0.51, a precision score of 0.49, and a recall score of 0.54.
The first bigram experiment combined with TF-IDF provides accuracy at 57.56%, whereas positive sentiment gives an F1-score of 0.56, a precision score of 0.58, and a recall score of 0.54. Neutral sentiment gives an F1-score of 0.57, a precision score of 0.53, and a recall score of 0.60. Negative sentiment gives an F1-score of 0.55, a precision score of 0.61, and a recall score of 0.51. The second experiment using bag-of-words provides accuracy at 56.07%, whereas positive sentiment gives an F1-score of 0.55, a precision score of 0.63, and a recall score of 0.49. Neutral sentiment gives an F1-score of 0.61, a precision score of 0.68, and a recall score of 0.55. Negative sentiment gives an F1-score of 0.56, a precision score of 0.57, and a recall score of 0.55. The third experiment using Word2Vec provides accuracy at 60.98%, whereas positive sentiment gives an F1-score of 0.59, a precision score of 0.61, and a recall score of 0.56. Neutral sentiment gives an F1-score of 0.61, a precision score of 0.67, and a recall score of 0.56. Negative sentiment gives an F1-score of 0.57, a precision score of 0.70, and a recall score of 0.54. Finally, the BERT model provides accuracy at 61.18%, whereas positive sentiment gives an F1-score of 0.63, a precision score of 0.65, and a recall score of 0.60. Neutral sentiment gives an F1-score of 0.57, a precision score of 0.55, and a recall score of 0.59. Negative sentiment gives an F1-score of 0.61, a precision score of 0.72, and a recall score of 0.53. The results are shown in Table 8.
Table 8.
Feature | Measure | Unigram (Positive / Neutral / Negative) | Bigram (Positive / Neutral / Negative)
TF-IDF | F1-score | 0.5734 / 0.5176 / 0.4365 | 0.5577 / 0.5660 / 0.5540
TF-IDF | Precision | 0.6550 / 0.7299 / 0.4963 | 0.5809 / 0.5340 / 0.6062
TF-IDF | Recall | 0.4556 / 0.4010 / 0.3896 | 0.5362 / 0.6021 / 0.5100
TF-IDF | Accuracy | 0.5555 | 0.5756
Bag-of-words | F1-score | 0.5829 / 0.4666 / 0.5167 | 0.5514 / 0.6116 / 0.5578
Bag-of-words | Precision | 0.4847 / 0.5991 / 0.4682 | 0.6342 / 0.6843 / 0.5698
Bag-of-words | Recall | 0.7309 / 0.3821 / 0.5763 | 0.4877 / 0.5529 / 0.5462
Bag-of-words | Accuracy | 0.5410 | 0.5607
Word2Vec | F1-score | 0.5776 / 0.5939 / 0.5571 | 0.5853 / 0.6132 / 0.5691
Word2Vec | Precision | 0.5679 / 0.5201 / 0.5176 | 0.6090 / 0.6727 / 0.6980
Word2Vec | Recall | 0.5877 / 0.6922 / 0.6032 | 0.5633 / 0.5633 / 0.5428
Word2Vec | Accuracy | 0.6027 | 0.6098
BERT | F1-score | 0.5720 / 0.6543 / 0.5123 | 0.6264 / 0.5713 / 0.6114
BERT | Precision | 0.4989 / 0.7024 / 0.4869 | 0.6538 / 0.5500 / 0.7241
BERT | Recall | 0.6701 / 0.6123 / 0.5406 | 0.6012 / 0.5944 / 0.5290
BERT | Accuracy | 0.6069 | 0.6118
Table 8. Experimental Results of Thai Financial News Sentiment Classification Using the Random Forest Classifier with Fivefold Cross Validation
For the SVM classifier, the first unigram experiment combined with TF-IDF provides accuracy at 69.85%, whereas positive sentiment gives an F1-score of 0.74, a precision score of 0.70, and a recall score of 0.78. Neutral sentiment gives an F1-score of 0.66, a precision score of 0.64, and a recall score of 0.68. Negative sentiment gives an F1-score of 0.67, a precision score of 0.76, and a recall score of 0.60. The second experiment using bag-of-words provides accuracy at 67.75%, whereas positive sentiment gives an F1-score of 0.65, a precision score of 0.64, and a recall score of 0.66. Neutral sentiment gives an F1-score of 0.64, a precision score of 0.63, and a recall score of 0.65. Negative sentiment gives an F1-score of 0.67, a precision score of 0.64, and a recall score of 0.70. The third experiment using Word2Vec provides accuracy at 74.25%, whereas positive sentiment gives an F1-score of 0.71, a precision score of 0.77, and a recall score of 0.66. Neutral sentiment gives an F1-score of 0.64, a precision score of 0.63, and a recall score of 0.65. Negative sentiment gives an F1-score of 0.66, a precision score of 0.67, and a recall score of 0.63. Finally, the BERT model provides accuracy at 78.76%, whereas positive sentiment gives an F1-score of 0.75, a precision score of 0.74, and a recall score of 0.77. Neutral sentiment gives an F1-score of 0.68, a precision score of 0.75, and a recall score of 0.63. Negative sentiment gives an F1-score of 0.68, a precision score of 0.68, and a recall score of 0.68.
The first bigram experiment combined with TF-IDF provides accuracy at 70.58%, whereas positive sentiment gives an F1-score of 0.78, a precision score of 0.66, and a recall score of 0.94. Neutral sentiment gives an F1-score of 0.52, a precision score of 0.72, and a recall score of 0.41. Negative sentiment gives an F1-score of 0.65, a precision score of 1.00, and a recall score of 0.48. The second experiment using bag-of-words provides accuracy at 70.03%, whereas positive sentiment gives an F1-score of 0.74, a precision score of 0.70, and a recall score of 0.80. Neutral sentiment gives an F1-score of 0.69, a precision score of 0.69, and a recall score of 0.70. Negative sentiment gives an F1-score of 0.69, a precision score of 0.69, and a recall score of 0.69. The third experiment using Word2Vec provides accuracy at 77.98%, whereas positive sentiment gives an F1-score of 0.74, a precision score of 0.75, and a recall score of 0.72. Neutral sentiment gives an F1-score of 0.71, a precision score of 0.65, and a recall score of 0.78. Negative sentiment gives an F1-score of 0.67, a precision score of 0.64, and a recall score of 0.70. Finally, the BERT model provides accuracy at 78.91%, whereas positive sentiment gives an F1-score of 0.79, a precision score of 0.89, and a recall score of 0.71. Neutral sentiment gives an F1-score of 0.84, a precision score of 0.84, and a recall score of 0.84. Negative sentiment gives an F1-score of 0.80, a precision score of 0.79, and a recall score of 0.81. The results are shown in Table 9.
Table 9.
Feature | Measure | Unigram (Positive / Neutral / Negative) | Bigram (Positive / Neutral / Negative)
TF-IDF | F1-score | 0.7355 / 0.6613 / 0.6696 | 0.7752 / 0.5230 / 0.6497
TF-IDF | Precision | 0.6951 / 0.6406 / 0.7576 | 0.6580 / 0.7236 / 0.9997
TF-IDF | Recall | 0.7808 / 0.6833 / 0.6000 | 0.9431 / 0.4095 / 0.4812
TF-IDF | Accuracy | 0.6985 | 0.7058
Bag-of-words | F1-score | 0.6461 / 0.6439 / 0.6664 | 0.7447 / 0.6946 / 0.6902
Bag-of-words | Precision | 0.6369 / 0.6339 / 0.6368 | 0.6978 / 0.6881 / 0.6888
Bag-of-words | Recall | 0.6556 / 0.6542 / 0.6989 | 0.7984 / 0.7013 / 0.6917
Bag-of-words | Accuracy | 0.6775 | 0.7003
Word2Vec | F1-score | 0.7092 / 0.6434 / 0.6616 | 0.7365 / 0.7102 / 0.6677
Word2Vec | Precision | 0.7679 / 0.6343 / 0.6710 | 0.7500 / 0.6499 / 0.6408
Word2Vec | Recall | 0.6588 / 0.6528 / 0.6265 | 0.7234 / 0.7828 / 0.6969
Word2Vec | Accuracy | 0.7425 | 0.7798
BERT | F1-score | 0.7535 / 0.6831 / 0.6774 | 0.7918 / 0.8420 / 0.7988
BERT | Precision | 0.7358 / 0.7511 / 0.6748 | 0.8927 / 0.8433 / 0.7921
BERT | Recall | 0.7720 / 0.6264 / 0.6801 | 0.7114 / 0.8407 / 0.8056
BERT | Accuracy | 0.7876 | 0.7891
Table 9. Experimental Results of Thai Financial News Sentiment Classification Using SVM with Fivefold Cross Validation
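The SVM-with-TF-IDF run just described can be reproduced with standard tooling. The following is a minimal sketch, assuming scikit-learn and PyThaiNLP; the tiny repeated set of headlines, the labels, and the hyperparameters (linear kernel, default regularization) are illustrative assumptions rather than the authors' exact settings.

```python
# A minimal sketch of the SVM-with-TF-IDF configuration (assumed setup).
from pythainlp.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Illustrative Thai headlines and labels (positive/neutral/negative),
# repeated only so that stratified 5-fold splits are possible.
texts = ["หุ้นขึ้นแรงวันนี้", "กำไรโตเกินคาด", "ตลาดทรงตัว",
         "ดัชนีไม่เปลี่ยนแปลง", "หุ้นร่วงหนัก", "ขาดทุนเพิ่มขึ้น"] * 5
labels = ["pos", "pos", "neu", "neu", "neg", "neg"] * 5

# Thai has no spaces between words, so PyThaiNLP supplies the tokenizer;
# ngram_range=(1, 1) gives the unigram run, (2, 2) the bigram run.
vectorizer = TfidfVectorizer(tokenizer=word_tokenize, ngram_range=(1, 1))
X = vectorizer.fit_transform(texts)

clf = SVC(kernel="linear")
scores = cross_val_score(clf, X, labels, cv=5, scoring="accuracy")
print("mean 5-fold accuracy:", scores.mean())
```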
For the CNN classifier, the first unigram experiment combined with TF-IDF provides accuracy at 71.90%, whereas positive sentiment gives an f1-score of 0.75, a precision score of 0.77, and a recall score of 0.73. Neutral sentiment gives an f1-score of 0.59, a precision score of 0.62, and a recall score of 0.56. Negative sentiment gives an f1-score of 0.73, a precision score of 0.78, and a recall score of 0.69. The second experiment using bag-of-words provides an accuracy of 69.80%, whereas positive sentiment gives an f1-score of 0.72, a precision score of 0.72, and a recall score of 0.73. Neutral sentiment gives an f1-score of 0.51, a precision score of 0.44, and a recall score of 0.59. Negative sentiment gives an f1-score of 0.60, a precision score of 0.74, and a recall score of 0.51. The third experiment using Word2Vec provides accuracy at 77.66%, whereas positive sentiment gives an f1-score of 0.70, a precision score of 0.67, and a recall score of 0.73. Neutral sentiment gives an f1-score of 0.62, a precision score of 0.60, and a recall score of 0.65. Negative sentiment gives an f1-score of 0.71, a precision score of 0.78, and a recall score of 0.65. Finally, the BERT model provides accuracy at 78.27%, whereas positive sentiment gives an f1-score of 0.74, a precision score of 0.72, and a recall score of 0.77. Neutral sentiment gives an f1-score of 0.66, a precision score of 0.64, and a recall score of 0.68. Negative sentiment gives an f1-score of 0.75, a precision score of 0.80, and a recall score of 0.70.
The first bigram experiment combined with TF-IDF provides accuracy at 72.72%, whereas positive sentiment gives an f1-score of 0.74, a precision score of 0.72, and a recall score of 0.77. Neutral sentiment gives an f1-score of 0.63, a precision score of 0.64, and a recall score of 0.62. Negative sentiment gives an f1-score of 0.59, a precision score of 0.60, and a recall score of 0.57. The second experiment using bag-of-words provides an accuracy of 71.47%, whereas positive sentiment gives an f1-score of 0.75, precision score of 0.75, and recall score of 0.76. Neutral sentiment gives an f1-score of 0.53, a precision score of 0.55, and a recall score of 0.51. Negative sentiment gives an f1-score of 0.63, a precision score of 0.82, and a recall score of 0.52. The third experiment using Word2Vec provides accuracy at 79.89%, whereas positive sentiment gives an f1-score of 0.70, a precision score of 0.77, and a recall score of 0.64. Neutral sentiment gives an f1-score of 0.66, a precision score of 0.64, and a recall score of 0.67. Negative sentiment gives an f1-score of 0.65, a precision score of 0.70, and a recall score of 0.60. Finally, the BERT model provides accuracy at 80.64%, whereas positive sentiment gives an f1-score of 0.75, a precision score of 0.70, and a recall score of 0.81. Neutral sentiment gives an f1-score of 0.66, a precision score of 0.68, and a recall score of 0.63. Negative sentiment gives an f1-score of 0.64, a precision score of 0.62, and a recall score of 0.66. The results are shown in Table 10.
Table 10.
                            CNN (Unigram)                    CNN (Bigram)
Feature        Measure      Positive  Neutral  Negative     Positive  Neutral  Negative
TF-IDF         F1-score     0.7471    0.5882   0.7295       0.7448    0.6315   0.5873
               Precision    0.7656    0.6190   0.7774       0.7226    0.6398   0.6038
               Recall       0.7294    0.5603   0.6872       0.7685    0.6234   0.5717
               Accuracy     0.7190                          0.7272
Bag-of-words   F1-score     0.7248    0.5054   0.6040       0.7538    0.5281   0.6332
               Precision    0.7200    0.4393   0.7393       0.7453    0.5481   0.8202
               Recall       0.7297    0.5949   0.5106       0.7625    0.5095   0.5157
               Accuracy     0.6980                          0.7147
Word2Vec       F1-score     0.6999    0.6231   0.7109       0.7000    0.6582   0.6472
               Precision    0.6721    0.5993   0.7804       0.7690    0.6440   0.7009
               Recall       0.7302    0.6489   0.6528       0.6423    0.6731   0.6011
               Accuracy     0.7766                          0.7989
BERT           F1-score     0.7440    0.6588   0.7483       0.7507    0.6564   0.6398
               Precision    0.7159    0.6434   0.8042       0.6965    0.6842   0.6238
               Recall       0.7744    0.6750   0.6997       0.8141    0.6308   0.6566
               Accuracy     0.7827                          0.8064
Table 10. Experimental Results of Thai Financial News Sentiment Classification Using CNN with Fivefold Cross Validation
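The network details behind the CNN rows are not reproduced in this section, so the Keras sketch below should be read as one plausible text CNN of this family: the embedding size, filter count, kernel width, and toy batch are assumptions, not the authors' reported architecture.

```python
# A hedged sketch of a text CNN for three-way sentiment (assumed sizes).
import numpy as np
from tensorflow.keras import layers, models

VOCAB_SIZE, SEQ_LEN, NUM_CLASSES = 10_000, 50, 3

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    layers.Embedding(VOCAB_SIZE, 100),                      # token ids -> vectors
    layers.Conv1D(128, kernel_size=3, activation="relu"),   # n-gram-like filters
    layers.GlobalMaxPooling1D(),                            # strongest response
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),        # pos/neu/neg
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# One toy training step on random token ids, just to show the shapes.
X = np.random.randint(0, VOCAB_SIZE, size=(8, SEQ_LEN))
y = np.random.randint(0, NUM_CLASSES, size=(8,))
model.fit(X, y, epochs=1, verbose=0)
```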
For the LSTM classifier, the first unigram experiment combined with TF-IDF provides accuracy at 73.40%, whereas positive sentiment gives an f1-score of 0.76, a precision score of 0.75, and a recall score of 0.76. Neutral sentiment gives an f1-score of 0.72, a precision score of 0.80, and a recall score of 0.66. Negative sentiment gives an f1-score of 0.73, a precision score of 0.66, and a recall score of 0.81. The second experiment using bag-of-words provides accuracy at 70.66%, whereas positive sentiment gives an f1-score of 0.69, a precision score of 0.68, and a recall score of 0.77. Neutral sentiment gives an f1-score of 0.70, a precision score of 0.71, and a recall score of 0.69. Negative sentiment gives an f1-score of 0.75, a precision score of 0.76, and a recall score of 0.73. The third experiment using Word2Vec provides accuracy at 76.18%, whereas positive sentiment gives an f1-score of 0.74, a precision score of 0.69, and a recall score of 0.79. Neutral sentiment gives an f1-score of 0.69, a precision score of 0.70, and a recall score of 0.68. Negative sentiment gives an f1-score of 0.77, a precision score of 0.79, and a recall score of 0.74. Finally, the BERT model provides accuracy at 79.62%, whereas positive sentiment gives an f1-score of 0.78, a precision score of 0.81, and a recall score of 0.76. Neutral sentiment gives an f1-score of 0.65, a precision score of 0.61, and a recall score of 0.69. Negative sentiment gives an f1-score of 0.79, a precision score of 0.77, and a recall score of 0.81.
The first bigram experiment combined with TF-IDF provides accuracy at 75.75%, whereas positive sentiment gives an f1-score of 0.74, a precision score of 0.80, and a recall score of 0.69. Neutral sentiment gives an f1-score of 0.76, a precision score of 0.84, and a recall score of 0.70. Negative sentiment gives an f1-score of 0.73, a precision score of 0.74, and a recall score of 0.72. The second experiment using bag-of-words provides accuracy at 74.04%, whereas positive sentiment gives an f1-score of 0.75, a precision score of 0.76, and a recall score of 0.74. Neutral sentiment gives an f1-score of 0.75, a precision score of 0.77, and a recall score of 0.73. Negative sentiment gives an f1-score of 0.68, a precision score of 0.71, and a recall score of 0.65. The third experiment using Word2Vec provides accuracy at 76.20%, whereas positive sentiment gives an f1-score of 0.76, a precision score of 0.82, and a recall score of 0.71. Neutral sentiment gives an f1-score of 0.72, a precision score of 0.74, and a recall score of 0.71. Negative sentiment gives an f1-score of 0.74, a precision score of 0.81, and a recall score of 0.68. Finally, the BERT model provides accuracy at 80.75%, whereas positive sentiment gives an f1-score of 0.82, a precision score of 0.83, and a recall score of 0.80. Neutral sentiment gives an f1-score of 0.72, a precision score of 0.72, and a recall score of 0.72. Negative sentiment gives an f1-score of 0.84, a precision score of 0.83, and a recall score of 0.85. The results are shown in Table 11.
Table 11.
                            LSTM (Unigram)                   LSTM (Bigram)
Feature        Measure      Positive  Neutral  Negative     Positive  Neutral  Negative
TF-IDF         F1-score     0.7579    0.7231   0.7257       0.7445    0.7648   0.7269
               Precision    0.7528    0.7953   0.6574       0.8047    0.8402   0.7369
               Recall       0.7631    0.6630   0.8099       0.6926    0.7019   0.7171
               Accuracy     0.7340                          0.7575
Bag-of-words   F1-score     0.6920    0.6978   0.7481       0.7525    0.7458   0.6766
               Precision    0.6789    0.7056   0.7624       0.7612    0.7676   0.7095
               Recall       0.7723    0.6902   0.7343       0.7439    0.7252   0.6467
               Accuracy     0.7066                          0.7404
Word2Vec       F1-score     0.7404    0.6882   0.7657       0.7563    0.7222   0.7395
               Precision    0.6934    0.6967   0.7924       0.8149    0.7390   0.8121
               Recall       0.7942    0.6799   0.7407       0.7055    0.7062   0.6788
               Accuracy     0.7618                          0.7620
BERT           F1-score     0.7810    0.6499   0.7894       0.8165    0.7200   0.8403
               Precision    0.8094    0.6127   0.7677       0.8326    0.7244   0.8275
               Recall       0.7545    0.6920   0.8124       0.8011    0.7156   0.8534
               Accuracy     0.7962                          0.8075
Table 11. Experimental Results of Thai Financial News Sentiment Classification Using LSTM with Fivefold Cross Validation
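The LSTM configuration can be sketched the same way; only the sequence encoder changes, with the recurrent layer's final hidden state standing in for the pooled convolution features. Again, the layer sizes are assumptions rather than the paper's exact network.

```python
# A hedged LSTM counterpart to the CNN sketch above (assumed sizes).
from tensorflow.keras import layers, models

lstm_model = models.Sequential([
    layers.Input(shape=(50,)),
    layers.Embedding(10_000, 100),
    layers.LSTM(128),                      # final state summarizes the headline
    layers.Dense(3, activation="softmax"), # positive/neutral/negative
])
lstm_model.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```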

4.2.2 Second Experiment.

For the second experiment, we used 10-fold cross validation for each classification model, and the results are presented next.
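Since the tables report accuracy plus per-class precision, recall, and f1 under 10-fold cross validation, the evaluation loop itself may be worth sketching. The snippet below assumes scikit-learn; the synthetic features from make_classification merely stand in for any of the feature matrices above.

```python
# A sketch of the 10-fold protocol with the per-class measures reported here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_classes=3, n_informative=5,
                           random_state=0)   # stand-in for the news features

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
accs, per_class = [], []
for train_idx, test_idx in skf.split(X, y):
    clf = SVC(kernel="linear").fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    accs.append(accuracy_score(y[test_idx], pred))
    # precision, recall, f1 for each of the three classes in this fold
    p, r, f, _ = precision_recall_fscore_support(
        y[test_idx], pred, labels=[0, 1, 2], zero_division=0)
    per_class.append((p, r, f))

print("mean accuracy:", np.mean(accs))
print("mean per-class precision/recall/f1:\n", np.mean(per_class, axis=0))
```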
For the naive Bayes classifier, the first unigram experiment combined with TF-IDF provides accuracy at 66.79%, whereas positive sentiment gives an f1-score of 0.71, a precision score of 0.66, and a recall score of 0.77. Neutral sentiment gives an f1-score of 0.64, a precision score of 0.65, and a recall score of 0.62. Negative sentiment gives an f1-score of 0.63, a precision score of 0.69, and a recall score of 0.58. The second experiment using bag-of-words provides an accuracy of 65.56%, whereas positive sentiment gives an f1-score of 0.69, a precision score of 0.64, and a recall score of 0.74. Neutral sentiment gives an f1-score of 0.62, a precision score of 0.64, and a recall score of 0.60. Negative sentiment gives an f1-score of 0.64, a precision score of 0.70, and a recall score of 0.58. The third experiment using Word2Vec provides accuracy at 68.70%, whereas positive sentiment gives an f1-score of 0.68, a precision score of 0.69, and a recall score of 0.67. Neutral sentiment gives an f1-score of 0.57, a precision score of 0.58, and a recall score of 0.56. Negative sentiment gives an f1-score of 0.60, a precision score of 0.62, and a recall score of 0.58. Finally, the BERT model provides accuracy at 70.72%, whereas positive sentiment gives an f1-score of 0.71, a precision score of 0.71, and a recall score of 0.71. Neutral sentiment gives an f1-score of 0.64, a precision score of 0.64, and a recall score of 0.65. Negative sentiment gives an f1-score of 0.70, a precision score of 0.71, and a recall score of 0.69.
The first bigram experiment combined with TF-IDF provides accuracy at 69.85%, whereas positive sentiment gives an f1-score of 0.77, a precision score of 0.67, and a recall score of 0.92. Neutral sentiment gives an f1-score of 0.50, a precision score of 0.64, and a recall score of 0.41. Negative sentiment gives an f1-score of 0.65, a precision score of 0.89, and a recall score of 0.52. The second experiment using bag-of-words provides an accuracy of 67.21%, whereas positive sentiment gives an f1-score of 0.72, a precision score of 0.68, and a recall score of 0.76. Neutral sentiment gives an f1-score of 0.65, a precision score of 0.66, and a recall score of 0.63. Negative sentiment gives an f1-score of 0.64, a precision score of 0.68, and a recall score of 0.60. The third experiment using Word2Vec provides accuracy at 70.01%, whereas positive sentiment gives an f1-score of 0.68, a precision score of 0.70, and a recall score of 0.65. Neutral sentiment gives an f1-score of 0.68, a precision score of 0.67, and a recall score of 0.70. Negative sentiment gives an f1-score of 0.68, a precision score of 0.66, and a recall score of 0.70. Finally, the BERT model provides accuracy at 71.29%, whereas positive sentiment gives an f1-score of 0.69, a precision score of 0.67, and a recall score of 0.71. Neutral sentiment gives an f1-score of 0.63, a precision score of 0.62, and a recall score of 0.64. Negative sentiment gives an f1-score of 0.72, a precision score of 0.74, and a recall score of 0.70. The results are shown in Table 12.
Table 12.
                            Naive Bayes (Unigram)            Naive Bayes (Bigram)
Feature        Measure      Positive  Neutral  Negative     Positive  Neutral  Negative
TF-IDF         F1-score     0.7142    0.6351   0.6278       0.7745    0.4999   0.6549
               Precision    0.6639    0.6534   0.6890       0.6667    0.6446   0.8903
               Recall       0.7727    0.6178   0.5766       0.9240    0.4083   0.5180
               Accuracy     0.6679                          0.6985
Bag-of-words   F1-score     0.6867    0.6199   0.6366       0.7167    0.6460   0.6377
               Precision    0.6399    0.6417   0.7045       0.6793    0.6600   0.6819
               Recall       0.7408    0.5995   0.5807       0.7584    0.6325   0.5988
               Accuracy     0.6556                          0.6721
Word2Vec       F1-score     0.6779    0.5736   0.5997       0.6771    0.6827   0.6765
               Precision    0.6853    0.5833   0.6215       0.7032    0.6710   0.6583
               Recall       0.6706    0.5643   0.5794       0.6528    0.6949   0.6958
               Accuracy     0.6870                          0.7001
BERT           F1-score     0.7105    0.6414   0.7034       0.6910    0.6323   0.7174
               Precision    0.7084    0.6357   0.7140       0.6731    0.6212   0.7358
               Recall       0.7126    0.6472   0.6932       0.7098    0.6438   0.6999
               Accuracy     0.7072                          0.7129
Table 12. Experimental Results of Thai Financial News Sentiment Classification Using the Naive Bayes Classifier with 10-Fold Cross Validation
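The comparison between naive Bayes, random forest, and the other classifiers keeps the features fixed and swaps only the estimator. A sketch under the same scikit-learn assumption follows; the scaler is included only because MultinomialNB requires non-negative inputs, and the synthetic matrix again stands in for the real TF-IDF/bag-of-words/embedding features.

```python
# A sketch of the swap-the-classifier comparative design (assumed setup).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=300, n_classes=3, n_informative=5,
                           random_state=0)
X = MinMaxScaler().fit_transform(X)   # MultinomialNB needs non-negative values

for clf in (MultinomialNB(), RandomForestClassifier(n_estimators=100)):
    scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    print(type(clf).__name__, round(scores.mean(), 4))
```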
For the random forest classifier, the first unigram experiment combined with TF-IDF provides accuracy at 61.08%, whereas positive sentiment gives an f1-score of 0.62, a precision score of 0.68, and a recall score of 0.57. Neutral sentiment gives an f1-score of 0.56, a precision score of 0.49, and a recall score of 0.66. Negative sentiment gives an f1-score of 0.48, a precision score of 0.40, and a recall score of 0.59. The second experiment using bag-of-words provides accuracy at 59.53%, whereas positive sentiment gives an f1-score of 0.45, a precision score of 0.51, and a recall score of 0.40. Neutral sentiment gives an f1-score of 0.44, a precision score of 0.34, and a recall score of 0.64. Negative sentiment gives an f1-score of 0.66, a precision score of 0.63, and a recall score of 0.70. The third experiment using Word2Vec provides accuracy at 62.98%, whereas positive sentiment gives an f1-score of 0.60, a precision score of 0.60, and a recall score of 0.60. Neutral sentiment gives an f1-score of 0.51, a precision score of 0.59, and a recall score of 0.45. Negative sentiment gives an f1-score of 0.65, a precision score of 0.62, and a recall score of 0.69. Finally, the BERT model provides accuracy at 64.17%, whereas positive sentiment gives an f1-score of 0.62, a precision score of 0.67, and a recall score of 0.57. Neutral sentiment gives an f1-score of 0.55, a precision score of 0.55, and a recall score of 0.56. Negative sentiment gives an f1-score of 0.59, a precision score of 0.57, and a recall score of 0.68.
The first bigram experiment combined with TF-IDF provides accuracy at 62.69%, whereas positive sentiment gives an f1-score of 0.52, a precision score of 0.41, and a recall score of 0.70. Neutral sentiment gives an f1-score of 0.50, a precision score of 0.42, and a recall score of 0.63. Negative sentiment gives an f1-score of 0.51, a precision score of 0.64, and a recall score of 0.43. The second experiment using bag-of-words provides accuracy at 61.12%, whereas positive sentiment gives an f1-score of 0.66, a precision score of 0.61, and a recall score of 0.73. Neutral sentiment gives an f1-score of 0.57, a precision score of 0.52, and a recall score of 0.62. Negative sentiment gives an f1-score of 0.59, a precision score of 0.78, and a recall score of 0.47. The third experiment using Word2Vec provides accuracy at 63.10%, whereas positive sentiment gives an f1-score of 0.60, a precision score of 0.61, and a recall score of 0.59. Neutral sentiment gives an f1-score of 0.55, a precision score of 0.58, and a recall score of 0.53. Negative sentiment gives an f1-score of 0.59, a precision score of 0.53, and a recall score of 0.67. Finally, the BERT model provides accuracy at 66.26%, whereas positive sentiment gives an f1-score of 0.70, a precision score of 0.69, and a recall score of 0.70. Neutral sentiment gives an f1-score of 0.55, a precision score of 0.50, and a recall score of 0.69. Negative sentiment gives an f1-score of 0.70, a precision score of 0.70, and a recall score of 0.70. The results are shown in Table 13.
Table 13.
                            Random Forest (Unigram)          Random Forest (Bigram)
Feature        Measure      Positive  Neutral  Negative     Positive  Neutral  Negative
TF-IDF         F1-score     0.6168    0.5646   0.4759       0.5166    0.5035   0.5144
               Precision    0.6785    0.4932   0.3978       0.4102    0.4190   0.6377
               Recall       0.5654    0.6601   0.5923       0.6976    0.6306   0.4310
               Accuracy     0.6108                          0.6269
Bag-of-words   F1-score     0.4452    0.4444   0.6628       0.6625    0.5670   0.5871
               Precision    0.5060    0.3410   0.6269       0.6092    0.5248   0.7763
               Recall       0.3975    0.6377   0.7030       0.7260    0.6167   0.4720
               Accuracy     0.5953                          0.6112
Word2Vec       F1-score     0.6001    0.5125   0.6529       0.6018    0.5520   0.5936
               Precision    0.5984    0.5897   0.6201       0.6144    0.5798   0.5309
               Recall       0.6019    0.4531   0.6893       0.5897    0.5268   0.6731
               Accuracy     0.6298                          0.6310
BERT           F1-score     0.6193    0.5539   0.5928       0.6947    0.5465   0.6995
               Precision    0.6734    0.5480   0.5726       0.6906    0.4988   0.6968
               Recall       0.5733    0.5600   0.6764       0.6989    0.6942   0.7023
               Accuracy     0.6417                          0.6626
Table 13. Experimental Results of Thai Financial News Sentiment Classification Using the Random Forest Classifier with 10-Fold Cross Validation
For the SVM classifier, the first unigram experiment combined with TF-IDF provides accuracy at 72.60%, whereas positive sentiment gives an f1-score of 0.79, a precision score of 0.93, and a recall score of 0.68. Neutral sentiment gives an f1-score of 0.67, a precision score of 0.69, and a recall score of 0.64. Negative sentiment gives an f1-score of 0.79, a precision score of 0.77, and a recall score of 0.80. The second experiment using bag-of-words provides accuracy at 69.09%, whereas positive sentiment gives an f1-score of 0.50, a precision score of 0.63, and a recall score of 0.42. Neutral sentiment gives an f1-score of 0.33, a precision score of 0.48, and a recall score of 0.25. Negative sentiment gives an f1-score of 0.81, a precision score of 0.73, and a recall score of 0.91. The third experiment using Word2Vec provides accuracy at 77.09%, whereas positive sentiment gives an f1-score of 0.78, a precision score of 0.77, and a recall score of 0.78. Neutral sentiment gives an f1-score of 0.69, a precision score of 0.73, and a recall score of 0.65. Negative sentiment gives an f1-score of 0.70, a precision score of 0.69, and a recall score of 0.71. Finally, the BERT model provides accuracy at 82.97%, whereas positive sentiment gives an f1-score of 0.83, a precision score of 0.83, and a recall score of 0.83. Neutral sentiment gives an f1-score of 0.80, a precision score of 0.80, and a recall score of 0.80. Negative sentiment gives an f1-score of 0.84, a precision score of 0.84, and a recall score of 0.84.
The first bigram experiment combined with TF-IDF provides accuracy at 76.39%, whereas positive sentiment gives an f1-score of 0.83, a precision score of 0.90, and a recall score of 0.77. Neutral sentiment gives an f1-score of 0.86, a precision score of 0.91, and a recall score of 0.80. Negative sentiment gives an f1-score of 0.60, a precision score of 0.61, and a recall score of 0.59. The second experiment using bag-of-words provides an accuracy of 74.26%, whereas positive sentiment gives an f1-score of 0.69, a precision score of 0.61, and a recall score of 0.80. Neutral sentiment gives an f1-score of 0.86, a precision score of 0.93, and a recall score of 0.79. Negative sentiment gives an f1-score of 0.59, a precision score of 0.58, and a recall score of 0.60. The third experiment using Word2Vec provides accuracy at 78.13%, whereas positive sentiment gives an f1-score of 0.78, a precision score of 0.77, and a recall score of 0.78. Neutral sentiment gives an f1-score of 0.71, a precision score of 0.72, and a recall score of 0.70. Negative sentiment gives an f1-score of 0.76, a precision score of 0.76, and a recall score of 0.76. Finally, the BERT model provides accuracy at 83.38%, whereas positive sentiment gives an f1-score of 0.84, a precision score of 0.84, and a recall score of 0.83. Neutral sentiment gives an f1-score of 0.84, a precision score of 0.83, and a recall score of 0.85. Negative sentiment gives an f1-score of 0.84, a precision score of 0.83, and a recall score of 0.84. The results are shown in Table 14.
Table 14.
                            SVM (Unigram)                    SVM (Bigram)
Feature        Measure      Positive  Neutral  Negative     Positive  Neutral  Negative
TF-IDF         F1-score     0.7857    0.6657   0.7881       0.8311    0.8546   0.5986
               Precision    0.9280    0.6925   0.7741       0.9027    0.9142   0.6096
               Recall       0.6813    0.6409   0.8026       0.7700    0.8023   0.5879
               Accuracy     0.7260                          0.7639
Bag-of-words   F1-score     0.5035    0.3281   0.8101       0.6947    0.8559   0.5935
               Precision    0.6306    0.4795   0.7300       0.6123    0.9303   0.5844
               Recall       0.4190    0.2494   0.9100       0.8027    0.7926   0.6028
               Accuracy     0.6909                          0.7426
Word2Vec       F1-score     0.7751    0.6877   0.7000       0.7768    0.7074   0.7575
               Precision    0.7714    0.7260   0.6899       0.7724    0.7171   0.7577
               Recall       0.7789    0.6533   0.7105       0.7813    0.6980   0.7573
               Accuracy     0.7709                          0.7813
BERT           F1-score     0.8302    0.7999   0.8439       0.8366    0.8390   0.8349
               Precision    0.8308    0.7982   0.8435       0.8394    0.8291   0.8309
               Recall       0.8297    0.8017   0.8443       0.8339    0.8491   0.8390
               Accuracy     0.8297                          0.8338
Table 14. Experimental Results of Thai Financial News Sentiment Classification Using SVM with 10-Fold Cross Validation
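The strongest machine learning result here pairs BERT features with SVM. One common way to realize that pairing is to mean-pool the encoder's hidden states into fixed document vectors and train a conventional classifier on top; the sketch below assumes the Hugging Face transformers library and a multilingual checkpoint, since the exact Thai BERT used in the paper may differ, and the tiny labeled set is illustrative only.

```python
# A hedged sketch of BERT sentence features feeding an SVM (assumed setup).
import torch
from sklearn.svm import SVC
from transformers import AutoModel, AutoTokenizer

NAME = "bert-base-multilingual-cased"     # assumed stand-in for a Thai BERT
tok = AutoTokenizer.from_pretrained(NAME)
bert = AutoModel.from_pretrained(NAME)

def embed(texts):
    enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state          # (batch, seq, 768)
    mask = enc["attention_mask"].unsqueeze(-1)          # ignore padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

texts = ["หุ้นขึ้นแรง", "ตลาดทรงตัว", "ขาดทุนหนัก"] * 4   # toy labeled headlines
labels = ["pos", "neu", "neg"] * 4
clf = SVC(kernel="linear").fit(embed(texts), labels)
print(clf.predict(embed(["กำไรโตเกินคาด"])))
```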
For the CNN classifier, the first unigram experiment combined with TF-IDF provides accuracy at 72.64%, whereas positive sentiment gives an f1-score of 0.74, a precision score of 0.75, and a recall score of 0.73. Neutral sentiment gives an f1-score of 0.78, a precision score of 0.77, and a recall score of 0.78. Negative sentiment gives an f1-score of 0.65, a precision score of 0.68, and a recall score of 0.62. The second experiment using bag-of-words provides an accuracy of 70.88%, whereas positive sentiment gives an f1-score of 0.71, a precision score of 0.70, and a recall score of 0.71. Neutral sentiment gives an f1-score of 0.69, a precision score of 0.71, and a recall score of 0.68. Negative sentiment gives an f1-score of 0.73, a precision score of 0.75, and a recall score of 0.71. The third experiment using Word2Vec provides accuracy at 75.73%, whereas positive sentiment gives an f1-score of 0.76, a precision score of 0.76, and a recall score of 0.76. Neutral sentiment gives an f1-score of 0.75, a precision score of 0.79, and a recall score of 0.71. Negative sentiment gives an f1-score of 0.81, a precision score of 0.82, and a recall score of 0.80. Finally, the BERT model provides accuracy at 80.09%, whereas positive sentiment gives an f1-score of 0.81, a precision score of 0.80, and a recall score of 0.82. Neutral sentiment gives an f1-score of 0.80, a precision score of 0.79, and a recall score of 0.80. Negative sentiment gives an f1-score of 0.80, a precision score of 0.80, and a recall score of 0.79.
The first bigram experiment combined with TF-IDF provides accuracy at 77.67%, whereas positive sentiment gives an f1-score of 0.75, a precision score of 0.74, and a recall score of 0.76. Neutral sentiment gives an f1-score of 0.72, a precision score of 0.72, and a recall score of 0.72. Negative sentiment gives an f1-score of 0.73, a precision score of 0.69, and a recall score of 0.77. The second experiment using bag-of-words provides accuracy at 75.19%, whereas positive sentiment gives an f1-score of 0.70, a precision score of 0.69, and a recall score of 0.71. Neutral sentiment gives an f1-score of 0.78, a precision score of 0.77, and a recall score of 0.79. Negative sentiment gives an f1-score of 0.74, a precision score of 0.79, and a recall score of 0.70. The third experiment using Word2Vec provides accuracy at 77.20%, whereas positive sentiment gives an f1-score of 0.77, a precision score of 0.77, and a recall score of 0.77. Neutral sentiment gives an f1-score of 0.74, a precision score of 0.77, and a recall score of 0.72. Negative sentiment gives an f1-score of 0.71, a precision score of 0.65, and a recall score of 0.78. Finally, the BERT model provides accuracy at 83.86%, whereas positive sentiment gives an f1-score of 0.84, a precision score of 0.84, and a recall score of 0.84. Neutral sentiment gives an f1-score of 0.82, a precision score of 0.83, and a recall score of 0.82. Negative sentiment gives an f1-score of 0.84, a precision score of 0.79, and a recall score of 0.89. The results are shown in Table 15.
Table 15.
                            CNN (Unigram)                    CNN (Bigram)
Feature        Measure      Positive  Neutral  Negative     Positive  Neutral  Negative
TF-IDF         F1-score     0.7384    0.7768   0.6488       0.7529    0.7190   0.7275
               Precision    0.7511    0.7723   0.6768       0.7422    0.7199   0.6914
               Recall       0.7262    0.7813   0.6230       0.7639    0.7182   0.7676
               Accuracy     0.7264                          0.7767
Bag-of-words   F1-score     0.7053    0.6938   0.7306       0.6954    0.7817   0.7442
               Precision    0.7030    0.7082   0.7494       0.6913    0.7741   0.7915
               Recall       0.7077    0.6800   0.7128       0.7102    0.7894   0.7022
               Accuracy     0.7088                          0.7519
Word2Vec       F1-score     0.7575    0.7474   0.8078       0.7695    0.7425   0.7111
               Precision    0.7577    0.7877   0.8198       0.7670    0.7645   0.6533
               Recall       0.7574    0.7110   0.7962       0.7721    0.7218   0.7801
               Accuracy     0.7573                          0.7720
BERT           F1-score     0.8082    0.7961   0.7951       0.8401    0.8240   0.8351
               Precision    0.8008    0.7923   0.8010       0.8415    0.8305   0.7890
               Recall       0.8158    0.7999   0.7893       0.8387    0.8176   0.8869
               Accuracy     0.8009                          0.8386
Table 15. Experimental Results of Thai Financial News Sentiment Classification Using CNN with 10-Fold Cross Validation
For the LSTM classifier, the first unigram experiment combined with TF-IDF provides accuracy at 76.33%, whereas positive sentiment gives an f1-score of 0.75, a precision score of 0.73, and a recall score of 0.77. Neutral sentiment gives an f1-score of 0.73, a precision score of 0.77, and a recall score of 0.69. Negative sentiment gives an f1-score of 0.71, a precision score of 0.70, and a recall score of 0.72. The second experiment using bag-of-words provides an accuracy of 74.20%, whereas positive sentiment gives an f1-score of 0.72, a precision score of 0.74, and a recall score of 0.71. Neutral sentiment gives an f1-score of 0.73, a precision score of 0.74, and a recall score of 0.72. Negative sentiment gives an f1-score of 0.73, a precision score of 0.76, and a recall score of 0.70. The third experiment using Word2Vec provides accuracy at 79.77%, whereas positive sentiment gives an f1-score of 0.78, a precision score of 0.78, and a recall score of 0.78. Neutral sentiment gives an f1-score of 0.77, a precision score of 0.79, and a recall score of 0.75. Negative sentiment gives an f1-score of 0.80, a precision score of 0.80, and a recall score of 0.79. Finally, the BERT model provides accuracy at 80.35%, whereas positive sentiment gives an f1-score of 0.81, a precision score of 0.81, and a recall score of 0.82. Neutral sentiment gives an f1-score of 0.80, a precision score of 0.77, and a recall score of 0.82. Negative sentiment gives an f1-score of 0.77, a precision score of 0.70, and a recall score of 0.84.
The first bigram experiment combined with TF-IDF provides accuracy at 78.88%, whereas positive sentiment gives an f1-score of 0.76, a precision score of 0.77, and a recall score of 0.75. Neutral sentiment gives an f1-score of 0.75, a precision score of 0.79, and a recall score of 0.70. Negative sentiment gives an f1-score of 0.77, a precision score of 0.80, and a recall score of 0.74. The second experiment using bag-of-words provides an accuracy of 77.28%, whereas positive sentiment gives an f1-score of 0.77, a precision score of 0.79, and a recall score of 0.76. Neutral sentiment gives an f1-score of 0.76, a precision score of 0.77, and a recall score of 0.74. Negative sentiment gives an f1-score of 0.73, a precision score of 0.75, and a recall score of 0.71. The third experiment using Word2Vec provides accuracy at 80.71%, whereas positive sentiment gives an f1-score of 0.83, a precision score of 0.85, and a recall score of 0.81. Neutral sentiment gives an f1-score of 0.80, a precision score of 0.77, and a recall score of 0.82. Negative sentiment gives an f1-score of 0.80, a precision score of 0.80, and a recall score of 0.81. Finally, the BERT model provides accuracy at 84.07%, whereas positive sentiment gives an f1-score of 0.83, a precision score of 0.82, and a recall score of 0.84. Neutral sentiment gives an f1-score of 0.80, a precision score of 0.76, and a recall score of 0.86. Negative sentiment gives an f1-score of 0.84, a precision score of 0.85, and a recall score of 0.83. The results are shown in Table 16.
Table 16.
                            LSTM (Unigram)                   LSTM (Bigram)
Feature        Measure      Positive  Neutral  Negative     Positive  Neutral  Negative
TF-IDF         F1-score     0.7464    0.7255   0.7093       0.7607    0.7450   0.7713
               Precision    0.7257    0.7712   0.6969       0.7743    0.7904   0.8019
               Recall       0.7683    0.6850   0.7222       0.7476    0.7045   0.7430
               Accuracy     0.7633                          0.7888
Bag-of-words   F1-score     0.7231    0.7301   0.7275       0.7715    0.7576   0.7313
               Precision    0.7404    0.7393   0.7605       0.7857    0.7731   0.7528
               Recall       0.7066    0.7211   0.6972       0.7579    0.7428   0.7109
               Accuracy     0.7420                          0.7728
Word2Vec       F1-score     0.7786    0.7684   0.7949       0.8268    0.7956   0.8036
               Precision    0.7767    0.7915   0.7980       0.8479    0.7698   0.8014
               Recall       0.7806    0.7467   0.7918       0.8067    0.8232   0.8058
               Accuracy     0.7977                          0.8071
BERT           F1-score     0.8125    0.7971   0.7645       0.8293    0.8029   0.8385
               Precision    0.8071    0.7715   0.6996       0.8200    0.7567   0.8472
               Recall       0.8180    0.8244   0.8427       0.8389    0.8551   0.8299
               Accuracy     0.8035                          0.8407
Table 16. Experimental Results of Thai Financial News Sentiment Classification Using LSTM with 10-Fold Cross Validation

4.2.3 Comparison of Results.

After both experiments, we compared all of the obtained results to assess the performance of these algorithms; Table 17 summarizes the accuracy from all experiments. The findings are as follows:
(1)
The naive Bayes classifier had higher overall accuracy than random forest. Its best results were obtained with the BERT model rather than with TF-IDF, bag-of-words, or Word2Vec. In addition, bigram features improved naive Bayes sentiment classification performance over unigram features.
(2)
The random forest classifier had the lowest accuracy of all algorithms. Its results differed little between experiments, and random forest was the worst classifier for Thai language sentiment.
(3)
SVM had the highest accuracy among the machine learning algorithms. Its results from the second experiment were outstanding compared with those from the first experiment. For the machine learning approach, we recommend using SVM to classify Thai language sentiment.
(4)
CNN improved accuracy over the machine learning methods. Its results from the second experiment differed only slightly from those of the first experiment. However, CNN was less effective than LSTM.
(5)
LSTM had the highest accuracy overall, and its results in both experiments were outstanding compared with the other methods. For the deep learning approach, we recommend using LSTM to classify Thai language sentiment.
Table 17.
                                         Naive Bayes         Random Forest       SVM                 CNN                 LSTM
Feature        Measure                   Unigram   Bigram    Unigram   Bigram    Unigram   Bigram    Unigram   Bigram    Unigram   Bigram
TF-IDF         5-fold cross validation   0.6337    0.6829    0.5555    0.5756    0.6985    0.7058    0.7190    0.7272    0.7340    0.7575
               10-fold cross validation  0.6679    0.6985    0.6108    0.6269    0.7260    0.7639    0.7264    0.7767    0.7633    0.7888
Bag-of-words   5-fold cross validation   0.6337    0.6829    0.5410    0.5607    0.6775    0.7003    0.6980    0.7147    0.7066    0.7404
               10-fold cross validation  0.6556    0.6721    0.5953    0.6112    0.6909    0.7426    0.7088    0.7519    0.7420    0.7728
Word2Vec       5-fold cross validation   0.6699    0.6706    0.6027    0.6098    0.7425    0.7798    0.7766    0.7989    0.7618    0.7620
               10-fold cross validation  0.6870    0.7001    0.6298    0.6310    0.7709    0.7813    0.7573    0.7720    0.7977    0.8071
BERT           5-fold cross validation   0.6972    0.7011    0.6069    0.6118    0.7876    0.7891    0.7827    0.8064    0.7962    0.8075
               10-fold cross validation  0.7072    0.7129    0.6417    0.6626    0.8297    0.8338    0.8009    0.8386    0.8035    0.8407
Table 17. Summarized Results of Accuracy from All Experiments
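Once each (feature, n-gram, fold-count, classifier) run yields an accuracy, a summary like Table 17 is straightforward to assemble programmatically. A small pandas sketch follows; the three seed rows are values copied from the table above, and the rest of the grid would be filled the same way.

```python
# A small sketch of assembling the Table 17 comparison (pandas assumed).
import pandas as pd

rows = [
    ("BERT", "bigram", "10-fold", "SVM", 0.8338),
    ("BERT", "bigram", "10-fold", "LSTM", 0.8407),
    ("Bag-of-words", "unigram", "5-fold", "Random Forest", 0.5410),
]
df = pd.DataFrame(rows, columns=["feature", "ngram", "cv", "model", "accuracy"])
print(df.pivot_table(index=["feature", "ngram", "cv"], columns="model",
                     values="accuracy"))
```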

5 Discussion

We applied five algorithms to the same dataset. The results, compared across algorithms, can be summarized as follows:
(1)
The naive Bayes classifier performed better than random forest in terms of accuracy, f1-score, precision, and recall. However, it was still the second-worst classifier when compared to SVM, CNN, and LSTM.
(2)
The random forest classifier performed the worst, with the lowest accuracy, f1-score, precision, and recall among all algorithms.
(3)
SVM was the best sentiment classification tool among the machine learning techniques, with the highest accuracy, f1-score, precision, and recall. However, SVM was still outperformed by the deep learning techniques.
(4)
CNN performed better than the machine learning techniques, with high accuracy, f1-score, precision, and recall. However, CNN was only second best when compared to LSTM.
(5)
LSTM had the best performance compared to others, with the highest accuracy, f1-score, precision, and recall.
In summary, across all paragraphs of the Thai financial news, we categorized and applied sentiment classification algorithms using supervised machine learning and deep learning approaches. The results showed that all algorithms performed better with bigram features, the BERT model, and 10-fold cross validation than with unigram features, bag-of-words, and 5-fold cross validation.
Additionally, we wanted to find the best sentiment classification tool for Thai language sentiment. The results showed that SVM has the best classifier performance for the machine learning approach, whereas LSTM has the best classifier performance for the deep learning approach.

6 Conclusion

In this work, we conducted an analysis and classified Thai financial news. We categorized and applied sentiment classification algorithms using supervised and semi-supervised approaches. The results showed that SVM has the best classifier performance in the supervised machine learning approach. In contrast, LSTM has the best classifier performance with regard to the semi-supervised deep learning approach for Thai sentiment.
This research has several limitations. First, the datasets consisted of financial news in the Thai language. Second, we retrieved headline news from Twitter only. Third, we used only machine learning and deep learning techniques to classify financial news sentiment. Finally, we found that the complex structure of the Thai language may decrease sentiment classification performance.
Our study could contribute to commercial instruments such as product/service feedback and investor sentiment analysis. For future work, we hope to further improve sentiment classification performance by accounting for the complexity of the Thai language.

Acknowledgments

We would like to acknowledge the research and editing support provided by Chulalongkorn University, and we thank the Python developers whose packages made it convenient for us to classify financial news sentiment.

References

[1]
Xiaodong Li, Haoran Xie, Li Chen, Jianping Wang, and Xiaotie Deng. 2014. News impact on stock price return via sentiment analysis. Knowledge-Based Systems 49, 1 (Oct. 2014), 14–23.
[2]
Walaa Medhat, Ahmed Hassan, and Hoda Korashy. 2014. Sentiment analysis algorithms and applications: A survey. Ain Shams Engineering Journal 5, 4 (Dec. 2014), 1093–1113.
[3]
Linhao Zhang. 2013. Sentiment Analysis on Twitter with Stock Price and Significant Keyword Correlation. M.S. Thesis. Department of Computer Science, College of Natural Sciences, University of Texas at Austin.
[4]
Nick J. Enfield. 2002. How to define ‘Lao,’ ‘Thai,’ and ‘Isan’ language? A view from linguistic science. Tai Culture 7, 1 (Jan. 2002), 62–67.
[5]
Suwilai Premsrirat. 2006. Thailand: Language situation. In Encyclopedia of Language & Linguistics (2nd ed.), Keith Brown (Ed.). Elsevier, Amsterdam, Netherlands, 642–644.
[6]
Hugh Thaweesak Koanantakool, Theppitak Karoonboonyanan, and Chai Wutiwiwatchai. 2009. Computers and the Thai language. IEEE Annals of the History of Computing 31, 1 (Jan. 2009), 46–61.
[7]
Karnchana Nacaskul. 1986. Works in Thailand commemorating the seven hundred years of Thai writing. Crossroads: An Interdisciplinary Journal of Southeast Asian Studies 3, 1, 21–39.
[8]
Thanaruk Theeramunkong and Sasiporn Usanavasin. 2001. Non-dictionary-based Thai word segmentation using decision trees. In Proceedings of the 1st International Conference on Human Language Technology Research (HLT’01). ACM, New York, NY, 1–5.
[9]
David Filbeck. 1973. Pronouns in Northern Thai. Anthropological Linguistics 15, 8 (1973), 345–361.
[10]
Charles F. Keyes. 1966. Ethnic identity and loyalty of villagers in Northeastern Thailand. Asian Survey 6, 7 (July 1966), 362–369.
[11]
David D. Thomas and Wanna Tienmee. 1983. An acoustic study of Northern Khmer vowels. Work Papers of the Summer Institute of Linguistics, University of North Dakota 27, 8 (1983), 147–159.
[12]
Sirikun Nookua. 2011. The patterns of language use in the southernmost provinces of Thailand. Journal of Cultural Approach 12, 22 (2011), 26–35.
[13]
Nattapong Tongtep and Thanaruk Theeramunkong. 2009. A feature-based approach for relation extraction from Thai news documents. In Proceedings of the Pacific Asia Workshop on Intelligence and Security Informatics (PAISI’09). 149–154.
[14]
Kanlaya Thong-iad and Ponrudee Netisopakul. 2019. Comparison of Thai sentence sentiment tagging methods using Thai sentiment resource. In Proceedings of the 15th International Conference on Computing and Information Technology (IC2IT’19). 89–98.
[15]
Chamemee Prasertdum and Duangdao Wichadakul. 2019. 2019 Thai general election: A Twitter analysis. In Proceedings of the 5th International Conference on Soft Computing in Data Science (SCDS’19). 336–350.
[16]
Todsanai Chumwatana. 2018. Comment analysis for product and service satisfaction from Thai customers’ review in social network. Journal of Information and Communication Technology 17, 2 (April 2018), 271–289.
[17]
Paitoon Porntrakoon. 2019. Improve the accuracy of SenseComp in Thai consumer's review using syntactic analysis. In Proceedings of the 16th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications, and Information Technology (ECTI-CON’19). IEEE, Los Alamitos, CA, 369–372.
[18]
Paitoon Porntrakoon and Chayapol Moemeng. 2018. Thai sentiment analysis for consumer's review in multiple dimensions using sentiment compensation technique (SenseComp). In Proceedings of the 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications, and Information Technology (ECTI-CON’18). IEEE, Los Alamitos, CA, 25–28.
[19]
Kanokorn Trakultaweekoon and Supon Klaithin. 2016. SenseTag: A tagging tool for constructing Thai sentiment lexicon. In Proceedings of the 13th International Joint Conference on Computer Science and Software Engineering (JCSSE’16). IEEE, Los Alamitos, CA, 1–4.
[20]
Madeleine Bates. 1995. Models of natural language understanding. Proceedings of the National Academy of Sciences of the United States of America 92, 22 (Oct. 1995), 9977–9982.
[21]
Wirote Aroonmanakun, Natawut Nupairoj, Veera Muangsin, and Songphan Choemprayong. 2018. Thai monitor corpus: Challenges and contribution to Thai NLP. VACANA Journal of Language & Linguistics 6, 2 (2018), 1–14.
[22]
Kritsada Sriphaew, Hiroya Takamura, and Manabu Okumura. 2009. Sentiment analysis for Thai natural language processing. In Proceedings of the 2nd Thailand-Japan International Academic Conference (TJIA’09). 123–124.
[23]
Choochart Haruechaiyasak and Alisa Kongthon. 2013. LexToPlus: A Thai lexeme tokenization and normalization tool. In Proceedings of the 4th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP’13). 9–16.
[24]
Parinya Sanguansat. 2016. Paragraph2Vec-based sentiment analysis on social media for business in Thailand. In Proceedings of the 8th International Conference on Knowledge and Smart Technology (KST’16). IEEE, Los Alamitos, CA, 175–178.
[25]
Jantima Polpinij. 2014. Multilingual sentiment classification on large textual data. In Proceedings of the IEEE 4th International Conference on Big Data and Cloud Computing (ICBICC’14). IEEE, Los Alamitos, CA, 183–188.
[26]
Boonyarit Deewattananon and Usa Sammapun. 2017. Analyzing user reviews in Thai language toward aspects in mobile applications. In Proceedings of the 14th International Joint Conference on Computer Science and Software Engineering (JCSSE’17). IEEE, Los Alamitos, CA, 1–6.
[27]
Vatcharaporn Esichaikul and Chawisa Phumdontree. 2018. Sentiment analysis of Thai financial news. In Proceedings of the 2nd International Conference on Software and e-Business (ICSEB’18). ACM, New York, NY, 39–43.
[28]
Wirapong Chansanam and Kulthida Tuamsuk. 2020. Thai Twitter sentiment analysis: Performance monitoring of politics in Thailand using text mining techniques. International Journal of Innovation, Creativity and Change 11, 12, 436–452.
[29]
Warunya Wunnasri, Thanaruk Theeramunkong, and Choochart Haruechaiyasak. 2013. Solving unbalanced data for Thai sentiment analysis. In Proceedings of the 10th International Joint Conference on Computer Science and Software Engineering (JCSSE’13). IEEE, Los Alamitos, CA, 200–205.
[30]
Nhan Cach Dang, María N. Moreno-García, and Fernando De la Prieta. 2020. Sentiment analysis based on deep learning: A comparative study. Electronics 9, 3, 483–512.
[31]
Avinash Chandra Pandey, Dharmveer Singh Rajpoot, and Mukesh Saraswat. 2017. Twitter sentiment analysis using hybrid cuckoo search method. Information Processing & Management 53, 4 (July 2017), 764–779.
[32]
Nurulhuda Zainuddin, Ali Selamat, and Roliana Ibrahim. 2018. Hybrid sentiment classification on Twitter aspect-based sentiment analysis. Applied Intelligence 48 (May 2018), 1218–1232.
[33]
Tom Mitchell. 1997. Machine Learning. McGraw Hill, New York, NY.
[34]
Nipuna Upeka Pannala, Chamira Priyamanthi Nawarathna, J. T. K. Jayakody, Lakmal Rupasinghe, and Kesavan Krishnadeva. 2016. Supervised learning based approach to aspect based sentiment analysis. In Proceedings of the 2016 IEEE International Conference on Computer and Information Technology (CIT’16). IEEE, Los Alamitos, CA, 662–666.
[35]
Youngjoong Ko and Jungyun Seo. 2000. Automatic text categorization by unsupervised learning. In Proceedings of the 18th Conference on Computational Linguistics (COLING’00). ACM, New York, NY, 453–459.
[36]
Anna Jurek, Maurice D. Mulvenna, and Yaxin Bi. 2015. Improved lexicon-based sentiment analysis for social media analytics. Security Informatics 4, 9 (Dec. 2015), 1–13.
[37]
George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine J. Miller. 1990. Introduction to WordNet: An on-line lexical database. International Journal of Lexicography 3, 4 (Dec. 1990), 235–244.
[38]
Vasileios Hatzivassiloglou and Kathleen R. McKeown. 1997. Predicting the semantic orientation of adjectives. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and the 8th Conference of the European Chapter of the Association for Computational Linguistics (ACL/EACL’97). ACM, New York, NY, 174–181.
[39]
Ponrudee Netisopakul and Apinan Chattupan. 2015. Thai stock news sentiment classification using wordpair features. In Proceedings of the 29th Pacific Asia Conference on Language, Information, and Computation (PACLIC’15). 188–195.
[40]
Ponrudee Netisopakul, Kitsuchart Pasupa, and Rathawut Lertsuksakda. 2016. Hypothesis testing based on observation from Thai sentiment classification. Artificial Life and Robotics 22 (July 2017), 184–190.
[41]
Peerapon Vateekul and Thanabhat Koomsubha. 2016. A study of sentiment analysis using deep learning techniques on Thai Twitter data. In Proceedings of the 13th International Joint Conference on Computer Science and Software Engineering (JCSSE’16). IEEE, Los Alamitos, CA, 1–6.
[42]
Boonyarit Deewattananon and Usa Sammapun. 2017. Analyzing user reviews in Thai language toward aspects in mobile applications. In Proceedings of the 14th International Joint Conference on Computer Science and Software Engineering (JCSSE’17). IEEE, Los Alamitos, CA, 1–6.
[43]
Preedawon Kadmateekarun, Phayung Meesad, and Sumitra Nuanmeesri. 2017. Comparing techniques for sentiment analysis in cosmetic industry from Thai reviews videos. Journal of Engineering and Applied Sciences 12, 2 (2017), 397–403.
[44]
Jaraspong Lertsiwaporn and Twittie Senivongse. 2017. Time-based visualization tool for topic modeling and sentiment analysis of Twitter messages. In Proceedings of the 25th International MultiConference of Engineers and Computer Scientists (IMECS’17). 1–6.
[45]
Kitsuchart Pasupa and Thititorn Seneewong Na Ayutthaya. 2019. Thai sentiment analysis with deep learning techniques: A comparative study based on word embedding, POS-tag, and sentic features. Sustainable Cities and Society 50, 1 (Oct. 2019), 1–14.
[46]
Chayapol Piyaphakdeesakun, Nuttanart Facundes, and Jumpol Polvichai. 2019. Thai comments sentiment analysis on social networks with deep learning approach. In Proceedings of the 34th International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC’19). IEEE, Los Alamitos, CA, 1–4.
[47]
Pakawan Pugsee and Nitikorn Ongsirimongkol. 2019. A classification model for Thai statement sentiments by deep learning techniques. In Proceedings of the 2nd International Conference on Computational Intelligence and Intelligent Systems (CIIS’19). ACM, New York, NY, 22–27.
[48]
Trithep Thumrongluck. 2010. An Automated System for Summarizing Structured Product Reviews. M.S. Thesis. Department of Statistics, Chulalongkorn Business School, Chulalongkorn University.
[49]
Michael McTear, Zoraida Callejas, and David Griol. 2016. The Conversational Interface: Talking to Smart Devices. Springer, Berlin, Germany.
[50]
Sungjick Lee and Han-Joon Kim. 2008. News keyword extraction for topic tracking. In Proceedings of the 4th International Conference on Networked Computing and Advanced Information Management (NCM’08). IEEE, Los Alamitos, CA, 554–559.
[51]
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv:1301.3781v3 (2013). https://arxiv.org/abs/1301.3781.
[52]
PyThaiNLP. 2019. Thai2Vec Embeddings Examples. Retrieved September 20, 2022 from https://pythainlp.github.io/tutorials/notebooks/word2vec_examples.html.
[53]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional Transformers for language understanding. arXiv:1810.04805v2 (2018). https://arxiv.org/abs/1810.04805.
[54]
ThAIKeras. 2018. ThAIKeras/bert. Retrieved September 21, 2022 from https://github.com/ThAIKeras/bert.
[55]
Abinash Tripathy, Ankit Agrawal, and Santanu Kumar Rath. 2016. Classification of sentiment reviews using n-gram machine learning approach. Expert Systems with Applications 57, 15 (Sept. 2016), 117–126.
[56]
Stuart Russell and Peter Norvig. 2010. Artificial Intelligence: A Modern Approach (3rd ed.). Prentice Hall, Hoboken, NJ.
[57]
Tin Kam Ho. 1995. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition (ICDAR’95). 278–282.
[58]
Leo Breiman. 2001. Random forests. Machine Learning 45, 1 (Oct. 2001), 5–32.
[59]
Vladimir N. Vapnik. 2000. The Nature of Statistical Learning Theory. Springer, Berlin, Germany.
[60]
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521, 7553 (May 2015), 436–444.
[61]
Jacob Cohen. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20, 1 (April 1960), 37–46.
[62]
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (Nov. 1997), 1735–1780.
[63]
David M. W. Powers. 2011. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. Journal of Machine Learning Technologies 2, 1 (Oct. 2011), 37–63.
[64]
Mohammed J. Zaki and Wagner Meira Jr. 2014. Data Mining and Machine Learning: Fundamental Concepts and Algorithms. Cambridge University Press, Cambridge, UK.
