IAES International Journal of Artificial Intelligence (IJ-AI)

Vol. 13, No. 1, March 2024, pp. 500~508

ISSN: 2252-8938, DOI: 10.11591/ijai.v13.i1.pp500-508

Word embedding for detecting cyberbullying based on recurrent neural networks

Noor Haydar Shaker, Ban N. Dhannoon

Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq

Article Info

Article history:
Received Jan 27, 2023
Revised Mar 19, 2023
Accepted Mar 27, 2023

Keywords:
Deep learning classifiers
Gated recurrent unit
GloVe word embedding
Long short-term memory
Recurrent neural networks

ABSTRACT

Cyberbullying has spread widely and has become one of the biggest problems facing users of social media
sites, with significant adverse effects on society and on victims in particular. Finding appropriate
solutions to detect and reduce cyberbullying has therefore become necessary. Twitter comments from two
datasets are used to detect cyberbullying: the first is an Arabic cyberbullying dataset and the second is an
English cyberbullying dataset. Three different pre-trained global vectors (GloVe) corpora with different
dimensions were used on the original and preprocessed datasets to represent the words. Recurrent neural
network (RNN), long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and
bidirectional GRU (BiGRU) classifiers were utilized, evaluated, and compared. The GRU outperforms the other
classifiers on both datasets: its accuracy on the Arabic cyberbullying dataset using the 256 D Arabic GloVe
corpus is 87.83%, while its accuracy on the English dataset using a 100 D pre-trained GloVe corpus is 93.38%.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Noor Haydar Shaker
Department of Computer Science, College of Science, Al-Nahrain University
Baghdad, Iraq
Email: noor.haidar21@ced.nahrainuniv.edu.iq

1. INTRODUCTION
The development of technology and the growing number of social media users, including users who try to
harm others, have led to the spread of cyberbullying. Cyberbullying is a type of bullying in which one or more
persons (the bully) purposefully and repeatedly cause harm to another person (the victim) through the use of
technology. Cyberbullies use mobile phones, computers, or other electronic devices to send emails and instant
text messages, post comments on social media or in chat rooms, or otherwise harass their victims [1], [2].
Cyberbullying may have serious and long-term physical, mental, and emotional consequences for its victims,
leaving them feeling scared, furious, humiliated, or exhausted, or causing symptoms such as headaches or
stomach pains. Victims of cyberbullying might start to feel ashamed, nervous, anxious, and insecure about
what people say or think about them. This can lead to withdrawal from friends and family, and in some cases
to the victim's suicide [3], [4]. It has therefore become necessary to find solutions to detect cyberbullying
messages. Many attempts have been made in the field of artificial intelligence to detect cyberbullying using
machine learning and deep learning techniques, and work continues to improve detection in order to reduce its
negative effects on society, especially on teenagers, who are more exposed to cyberbullying than other
groups.

Journal homepage: http://ijai.iaescore.com

In this research, we used deep learning classifiers with two labelled datasets (Arabic and English) to
detect cyberbullying. For word representation, we used pre-trained global vectors for word representation
(pre-trained GloVe) to obtain a vector for each word. Because computers operate only on numerical values,
converting words into vectors is the most important step: each vector contains a fixed number of values that
represent the word and make it possible to process it. The five deep learning classifiers used in this
research are the standard recurrent neural network (RNN), long short-term memory (LSTM), bidirectional long
short-term memory (BiLSTM), gated recurrent unit (GRU), and bidirectional gated recurrent unit (BiGRU). All
are based on a chain of repeating neural network modules with internal memory, a form well suited to
sequential data. The rest of this paper is organized as follows. Section 2 presents related work, including
the datasets, word representations, classifiers, and corresponding results. Section 3 explains the basic
concepts used in the practical part of this research. Section 4 describes the methodology followed to achieve
the results. Section 5 presents the experimental results and discussion.

2. RELATED WORK
Since cyberbullying is one of the major problems we are facing, many researchers have contributed
to developing models based on machine learning and deep learning to detect this type of bullying. A review of
previous work shows that little research has addressed Arabic cyberbullying in particular. This
can be attributed to challenges related to the Arabic language itself, such as 1) the lack of
a large dataset on which to build prediction models, 2) the use of colloquial language, and 3) the fact that
not all libraries support the Arabic language [5].
Tyagi et al. [6] employed a convolutional neural network (CNN) combined with LSTM as a deep learning model
(CNN-LSTM) on 1.6 million English tweets categorized into two classes (negative and positive).
The CNN-LSTM model reached an accuracy of 81.20% with a GloVe word embedding of dimension
300 D. Al-Bayati et al. [7] used an Arabic dataset taken from the internet, called large scale Arabic
book reviews (LABR), which contains over 16,448 rows with positive labels (1) and negative labels (0).
The dataset was preprocessed by removing non-Arabic words, normalization,
stemming, removing stopwords, and other steps. The dataset was split into 67% for training, 17% for testing,
and 16% for validation. It was trained and tested with LSTM as the deep learning classifier and a pre-trained
embedding layer for word representation. The best result in that study was an accuracy of 82% with the LSTM
classifier, a batch size of 256, and 10 epochs.
The study in [8] used an English cyberbullying dataset from Kaggle, collected from social media
sites such as Twitter, Instagram, and Facebook. The dataset includes 100,000 comments and was
preprocessed with text cleaning, tokenization, stemming, lemmatization, and stopword
removal. LSTM, BiLSTM, GRU, and RNN were used as deep learning classifiers. The accuracy was
80.86% with LSTM, 82.18% with BiLSTM, 81.46% with GRU, and 81.01% with RNN, so the highest accuracy,
82.18%, was achieved with BiLSTM.
Janardhana et al. [9] used the movie review (MR) dataset, which included 12,500 positive and 12,500
negative reviews. The dataset was preprocessed by eliminating stopwords and
removing punctuation. That paper used a GloVe word embedding of dimension 200 with three deep learning
classifiers: LSTM, CNN, and CRNN (a generalized CNN combined with BiLSTM). The accuracy was
79.47% with LSTM, 72.32% with CNN, and 84% with CRNN, which was the best result.
In [10], LSTM was used with a sentiment-specific word embedding (SSWE) layer for word representation. The
dataset was compiled from three sources, Twitter, Formspring, and Wikipedia, with each platform contributing
3,000 examples for a total of 9,000. The dataset was preprocessed by removing numbers, punctuation
marks, symbols, blank spaces, and other steps. With the LSTM classifier, the accuracy was 79.1% on
Twitter, 72% on Formspring, 75.5% on Wikipedia, and 77.9% on the combined examples. The model proposed in
[10] has some limitations, such as the small size of the dataset and the use of a single deep learning
classifier. Venkatesh et al. [11] used an English Twitter dataset of 10,007 comments on tweets, preprocessed
by converting all characters to lowercase, removing links, punctuation, and whitespace, and other steps.
The authors tried deep learning and machine learning models to achieve the
best result. The best accuracy achieved was 85% with CNN-LSTM and GloVe word embedding.
Almutiry et al. [12] used an Arabic Twitter dataset of 17,748 tweets, which included
14,178 cyberbullying tweets and 3,570 non-cyberbullying tweets. They
achieved 84.03% accuracy with a support vector machine (SVM) classifier and term frequency-inverse document
frequency (TF-IDF) feature extraction for word representation. Table 1 summarizes these recent
methods, with their datasets, feature extraction, and highest accuracy.

Table 1. The highest related work accuracy for each classifier on the used dataset
Research | Dataset | Feature extraction/Word embedding | Classifier | Accuracy
[6] | 1.6 million English tweets | GloVe | CNN-LSTM | 81.20%
[7] | 16,448 rows from LABR | Pre-trained embedding layer | LSTM | 82%
[8] | 100,000 English comments, cyberbullying dataset from Kaggle | Embedding layer | LSTM | 80.86%
 | | | BiLSTM | 82.18%
 | | | GRU | 81.46%
 | | | RNN | 81.01%
[9] | 25,000 reviews from Movie Review | GloVe | LSTM | 79.47%
 | | | CNN | 72.32%
 | | | CRNN | 84%
[10] | 9,000 examples compiled from Twitter, Formspring, and Wikipedia | SSWE layer | LSTM | 77.9%
[11] | 10,007 comments on English tweets | GloVe | CNN-LSTM | 85%
[12] | 17,748 Arabic comments tweet | TF-IDF | SVM | 84.03%

3. PRELIMINARIES
Global vectors (GloVe) is an unsupervised algorithm trained on a very large text corpus to obtain an
embedding matrix for the words, in which the distance between vectors reflects how closely the words are
related. GloVe relies on word co-occurrence statistics and ratios of co-occurrence probabilities to generate
the embedding matrix. Because the computer processes only numerical data, words must be converted into
numerical values. In the embedding matrix, each word corresponds to a vector of numerical values, and these
vectors are used as the input layer of the deep learning classifiers [13], [14].
Recurrent neural network (RNN) is a type of deep learning classifier that keeps the output of a layer and
feeds it back as input at the next step, but it suffers from vanishing and exploding gradients. RNN has been
developed into several variants that achieve better results and address these problems [15].
Long short-term memory (LSTM) is a variant of the RNN that mitigates the vanishing and exploding gradient
problems, especially for long sentences. The LSTM contains a memory (the cell state) that keeps the most
important information and discards the less important information through three gates: the forget gate, the
input gate, and the output gate. Figure 1 shows the LSTM structure [4], [16].
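
For reference, the gate computations of an LSTM cell are commonly written as below. This is the usual textbook formulation, given as a sketch: the weight matrices W and U, the biases b, the cell state c_t, the candidate state \tilde{c}_t, and the element-wise product \odot are notation assumed here for illustration, and the paper does not state which LSTM variant its implementation uses.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state / output)}
\end{aligned}
```

The additive update of c_t is what lets gradients flow across many time steps more easily than in a standard RNN.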

Figure 1. The LSTM structure [16]

In Figure 1, A denotes the LSTM cells, Xt-1, Xt, and Xt+1 are the inputs to the cells, and ht-1, ht, and
ht+1 are their outputs. The two connections from each cell to the next carry the cell state and the hidden
state. σ is the sigmoid activation function [17], [18]. Bidirectional long short-term memory (BiLSTM) is
also a type of RNN. It consists of two LSTMs: the first processes the input in the forward direction and the
other in the backward direction. This effectively increases the information available to the network and
improves the context available to the algorithm. Figure 2 shows how the BiLSTM works, where Xt-1, Xt, and
Xt+1 are the inputs and yt-1, yt, and yt+1 are the outputs. σ is the sigmoid activation function [19], [20].

Figure 2. The BiLSTM structure [20]

Gated recurrent unit (GRU) is a type of RNN that mitigates the vanishing and exploding gradient problems
of the standard RNN. The GRU is similar to the LSTM but has fewer parameters and is generally faster and
easier to train [21]–[23]. Figure 3 shows the structure of the GRU classifier [24].
A typical RNN learns sequential information in one direction only, i.e., the dependence of time step t
on the previous time steps, so information available from later time steps is lost. BiGRU addresses this by
adding a second GRU layer that processes the sequence backward, so that the output yt at time t is based on
both the information of the previous time steps (Ht−1) and the information of the next time steps (Ht+1) [25].

Figure 3. The structure of GRU deep learning classifier [24]
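
To make the GRU and BiGRU descriptions above concrete, the following is a minimal pure-Python sketch of one GRU step and of a bidirectional pass. It is not the implementation used in this research: the function names (`gru_step`, `run_gru`, `run_bigru`, `constant_params`), the constant toy weights, and the update convention h_t = (1 - z) * h_{t-1} + z * candidate (one of two common conventions) are all assumptions made for illustration.

```python
import math


def sigmoid(v):
    """Logistic function applied to a single float."""
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, U, h, b):
    """Return W @ x + U @ h + b for plain nested lists."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def gru_step(x, h_prev, p):
    """One GRU time step; p holds the weights of the z, r and h parts."""
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h_prev, p["bz"])]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h_prev, p["br"])]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(v) for v in affine(p["Wh"], x, p["Uh"], gated, p["bh"])]
    # Interpolate between the previous state and the candidate state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def run_gru(xs, p, hidden_size):
    """Run a GRU over a sequence and return the hidden state at every step."""
    h = [0.0] * hidden_size
    states = []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def run_bigru(xs, p_fwd, p_bwd, hidden_size):
    """Concatenate forward and backward GRU states for each time step."""
    fwd = run_gru(xs, p_fwd, hidden_size)
    bwd = run_gru(xs[::-1], p_bwd, hidden_size)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


def constant_params(input_size, hidden_size, value):
    """Hypothetical helper: fill every weight with the same value, biases with zero."""
    def mat(cols):
        return [[value] * cols for _ in range(hidden_size)]
    p = {}
    for g in ("z", "r", "h"):
        p["W" + g] = mat(input_size)
        p["U" + g] = mat(hidden_size)
        p["b" + g] = [0.0] * hidden_size
    return p


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy 2-D "word vectors"
params = constant_params(2, 3, 0.1)
outputs = run_bigru(sequence, params, params, 3)
print(len(outputs), len(outputs[0]))  # 3 time steps, 6 values each (3 forward + 3 backward)
```

Each output vector is twice the hidden size because it joins the state of the forward pass, which has seen the earlier words, with the state of the backward pass, which has seen the later ones.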

4. METHODOLOGY
In this research, two cyberbullying datasets are used, the first in Arabic and the second in English,
each of which was processed with several operations in the preprocessing step. Then, three pre-trained
corpora with different dimensions were used to represent the words as numerical vectors, since the computer
processes only numerical values. In the classification step, several deep learning
classifiers were used to achieve the best results in detecting cyberbullying.
Figure 4 shows these steps, each of which is explained in detail below.


Figure 4. The methodology of deep learning used to detect cyberbullying in this research

4.1. The input dataset

Every practical research project starts by finding a dataset suited to its subject and then studying its
size, labels, and other relevant details. In this research, two Kaggle datasets of tweets related to
cyberbullying were used. Tweets are posts or messages that individuals
publish on the Twitter platform to exchange information with each other all over the world [26]. The first
dataset is the Arabic cyberbullying dataset, which contains 17,748 Arabic tweets:
14,178 cyberbullying tweets and 3,570 non-cyberbullying tweets [12]. The second dataset is the English
cyberbullying dataset, which contains about 47k English tweets: 7,631 non-cyberbullying
tweets, and the rest cyberbullying tweets containing harassing comments about religion, age, and other
topics [27]. The two datasets are used in this research to detect and classify cyberbullying comments on
Twitter, with the aim of reducing this phenomenon.
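
Both datasets are imbalanced, which matters when reading accuracy figures. The short sketch below computes the accuracy of a trivial model that always predicts the majority class, as a point of comparison. The function name `majority_baseline` is illustrative, and the English total of 47,000 is an assumed round figure, since the text gives only "47k".

```python
def majority_baseline(class_counts):
    """Accuracy of always predicting the most frequent class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total


# Counts reported for the Arabic dataset.
arabic = {"cyberbullying": 14_178, "non_cyberbullying": 3_570}

# The English total is given only as "47k"; 47_000 is an assumed round figure.
english_total = 47_000
english = {"cyberbullying": english_total - 7_631, "non_cyberbullying": 7_631}

for name, counts in (("Arabic", arabic), ("English", english)):
    print(f"{name}: majority-class baseline = {majority_baseline(counts):.1%}")
```

Under these assumptions the baseline is roughly 80% for the Arabic dataset and roughly 84% for the English one, so reported accuracies are best judged against those values rather than against 50%.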

4.2. The preprocessing dataset

After the input dataset has been selected and studied, it must be preprocessed, since it may contain noise
and preprocessing generally improves results. Preprocessing reduces the number of words by
eliminating unnecessary words from the tweets and by mapping words with the same or similar meaning
to a common form, among other techniques. Preprocessing in this research is
divided into two groups of operations, one for the Arabic cyberbullying dataset
and one for the English cyberbullying dataset. The Arabic
cyberbullying dataset is preprocessed in two main steps, normalization and stemming. Normalization
includes several operations, such as tokenization and the removal of Arabic stopwords, extra spaces, numbers,
and repeated characters. Stemming includes light stemming, root stemming, and lemmatization.
The English cyberbullying dataset is preprocessed using normalization (tokenization and the removal of
English stopwords, extra spaces, punctuation, numbers, repeated characters, and others). After
preprocessing, each dataset was split into training and testing data at a ratio of 8:2.
The training and testing data are used to classify whether each Twitter comment is cyberbullying or not.
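
The English normalization steps and the 8:2 split described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in this research: the tiny stopword list, the lowercasing step, the rule that collapses runs of three or more identical characters to one, and the names `normalize` and `train_test_split` are all hypothetical choices, and the regular expression keeps ASCII letters only, so it would not suit the Arabic dataset.

```python
import random
import re

# Tiny illustrative stopword list (an assumption; a real run would use a full list).
STOPWORDS = {"a", "an", "the", "is", "are", "you", "and", "to", "of"}


def normalize(tweet):
    """Lowercase, strip digits/punctuation, squeeze repeats, drop stopwords."""
    text = tweet.lower()                        # assumed step, not listed in the paper
    text = re.sub(r"[^a-z\s]", " ", text)       # remove numbers and punctuation
    text = re.sub(r"(.)\1{2,}", r"\1", text)    # collapse runs of 3+ characters: "sooooo" -> "so"
    tokens = text.split()                       # whitespace tokenization, drops extra spaces
    return [tok for tok in tokens if tok not in STOPWORDS]


def train_test_split(items, test_ratio=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_ratio)))
    return shuffled[:cut], shuffled[cut:]


# Made-up example tweets with labels (1 = cyberbullying, 0 = not).
tweets = [
    ("You are sooooo   stupid!!! 123", 1),
    ("Have a great day", 0),
    ("Nobody likes you, loser", 1),
    ("See you at the game tonight", 0),
    ("What a lovely photo", 0),
]
processed = [(normalize(text), label) for text, label in tweets]
train, test = train_test_split(processed)
print(processed[0][0])
print(len(train), len(test))
```

Fixing the random seed keeps the split reproducible, which makes accuracy figures comparable across classifiers trained on the same data.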

4.3. GloVe word embedding

The computer processes only numerical data. For this reason, each word is represented by a vector of
numbers that the computer can work with. In this research, we used pre-trained GloVe word embeddings to
represent the words of each tweet in the Arabic and English cyberbullying datasets. The commonly used
English GloVe release was trained on a corpus of about 6 billion tokens and provides dense vectors for the
words in its vocabulary, along with general-use characters such as commas, braces, and semicolons. It is
available in four dimensions: 50 D, 100 D, 200 D, and 300 D, where D stands for dimension; 100 D means that
each word has a vector of size 100. GloVe files are simple text files that can be read as a dictionary: each
line holds a word (the key) followed by the values of its dense vector (the value).
Three pre-trained GloVe corpora from Kaggle are utilized. The first is an Arabic corpus with 1,538,616
Arabic words and 256 D vectors. The second is an English corpus, which contains over a million English words
with 100 D vectors. The third is a multilingual corpus that includes Arabic and English; it contains
1,193,514 words with 50 D, 100 D, and 200 D vectors.
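
The text-file format described above can be read with a few lines of Python. The sketch below parses GloVe-style lines into a dictionary and builds an embedding matrix for a small vocabulary. The function names, the made-up 4-D vectors, and the choice to give out-of-vocabulary words a zero vector are assumptions for illustration; the paper does not say how unknown words were handled.

```python
import io


def load_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ... vD") into a dict of float lists."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, vectors, dim):
    """Row i holds the vector for vocab[i]; unknown words get a zero row."""
    return [vectors.get(word, [0.0] * dim) for word in vocab]


# A made-up 4-D file standing in for a real pre-trained corpus.
fake_file = io.StringIO(
    "bully 0.1 0.2 0.3 0.4\n"
    "friend -0.5 0.0 0.5 1.0\n"
)
glove = load_glove(fake_file)
matrix = build_embedding_matrix(["friend", "unknownword", "bully"], glove, dim=4)
print(matrix)
```

With a real corpus, `load_glove` would be given an open file handle (for example one opened with UTF-8 encoding) instead of the in-memory stand-in, and the resulting matrix would initialize the embedding layer of the classifier.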

4.4. The classifiers

A classifier is an algorithm trained on a dataset, and its accuracy on the test data depends on the
weights found during training. Five deep learning classifiers are used to detect
cyberbullying on the two datasets (Arabic and English) with pre-trained GloVe. These
classifiers are the standard recurrent neural network (RNN), long short-term memory (LSTM),
bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU (BiGRU), evaluated in
different experiments.

5. EXPERIMENT RESULTS AND DISCUSSION

In this section, three experiments with deep learning classifiers and pre-trained GloVe are
used to detect cyberbullying. All experiments were implemented in Python, one of the most widely used
programming languages in computer science. Each experiment is explained separately in sections 5.1 to 5.3.

5.1. The first experiment

In the first experiment, the Arabic pre-trained GloVe corpus of 256 D was applied to the Arabic
cyberbullying dataset. The dataset was trained and tested with a batch size of 256 and 10 epochs, and was
split into 80% for training and 20% for testing. The accuracy results of this experiment are shown in Table 2.

Table 2. The accuracy of deep learning classifiers with Arabic GloVe corpus 256D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic Cyberbullying Dataset | Original | 83.79% | 85.77% | 86.50% | 87.83% | 86.16%
 | Normalized | 84.35% | 85.85% | 85.77% | 86.30% | 86.95%
 | Light stemming | 84.24% | 85.85% | 86.19% | 86.56% | 86.73%
 | Root stemming | 84.33% | 85.71% | 85.40% | 85.63% | 85.29%
 | Lemmatization | 80.09% | 86.25% | 86.16% | 86.22% | 86.13%

From the classifier point of view, the best accuracy on the Arabic cyberbullying dataset, 87.83%, was
achieved by the GRU classifier on the original dataset. Across the remaining results, GRU and BiGRU mostly
outperformed the other classifiers. Root stemming mostly produced lower accuracy than the other
preprocessing operations in this experiment. In contrast, BiGRU and RNN achieved their best results on the
normalized dataset.
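
The observations above can be checked directly against the figures in Table 2. The sketch below copies the table into a dictionary and reports the best classifier for each preprocessing variant and the preprocessing variant on which each classifier peaks. The helper names are illustrative and the snippet is only a reading aid, not part of the original experiments.

```python
# Accuracy (%) copied from Table 2 (Arabic dataset, Arabic GloVe 256 D).
CLASSIFIERS = ["RNN", "LSTM", "BiLSTM", "GRU", "BiGRU"]
TABLE_2 = {
    "Original":       [83.79, 85.77, 86.50, 87.83, 86.16],
    "Normalized":     [84.35, 85.85, 85.77, 86.30, 86.95],
    "Light stemming": [84.24, 85.85, 86.19, 86.56, 86.73],
    "Root stemming":  [84.33, 85.71, 85.40, 85.63, 85.29],
    "Lemmatization":  [80.09, 86.25, 86.16, 86.22, 86.13],
}


def best_per_row(table):
    """Map each preprocessing variant to its (classifier, accuracy) maximum."""
    return {
        prep: max(zip(CLASSIFIERS, scores), key=lambda pair: pair[1])
        for prep, scores in table.items()
    }


def best_preprocessing_per_classifier(table):
    """Map each classifier to the (preprocessing, accuracy) where it peaks."""
    result = {}
    for col, clf in enumerate(CLASSIFIERS):
        candidates = ((prep, scores[col]) for prep, scores in table.items())
        result[clf] = max(candidates, key=lambda pair: pair[1])
    return result


for prep, (clf, acc) in best_per_row(TABLE_2).items():
    print(f"{prep:15s} best: {clf} ({acc:.2f}%)")
for clf, (prep, acc) in best_preprocessing_per_classifier(TABLE_2).items():
    print(f"{clf:7s} peaks on: {prep} ({acc:.2f}%)")
```

The same two helpers can be pointed at the figures of the other result tables to compare dimensions and corpora.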

5.2. The second experiment

This experiment uses the multilingual pre-trained GloVe corpus, which includes Arabic and English, with
50 D, 100 D, and 200 D vectors. Both datasets were trained and tested with a batch size of 256 and 10
epochs, and were split into 80% for training and 20% for testing. The accuracy results of these
experiments are shown in Tables 3-5. For every dimension (50, 100, and 200), the Arabic cyberbullying
dataset achieved its best result with the GRU classifier after lemmatization. The 200 D vectors achieved the
best accuracy among the three dimensions, 86.59%, whereas 50 D and 100 D both reached 86.19%. The GRU
classifier also achieved the best accuracy on the English cyberbullying dataset for all three dimensions,
with the highest accuracy, 93.38%, obtained using the 100 D vectors on the normalized dataset.
According to the results, the GRU, LSTM, and BiGRU classifiers mostly performed better than the
rest. Root stemming failed to achieve good results on the Arabic cyberbullying dataset
compared to the other preprocessing operations. Normalization improved the accuracy on the English
cyberbullying dataset.

Table 3. The accuracy of deep learning classifiers with GloVe corpus 50D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic Cyberbullying Dataset | Original | 83.93% | 84.04% | 85.74% | 84.86% | 85.15%
 | Normalized | 84.07% | 84.19% | 84.64% | 85.29% | 83.90%
 | Light stemming | 83.42% | 84.55% | 84.44% | 85.12% | 84.61%
 | Root stemming | 83.23% | 84.52% | 84.07% | 85.20% | 83.56%
 | Lemmatization | 83.99% | 85.46% | 85.09% | 86.19% | 85.34%
English Cyberbullying Dataset | Original | 90.92% | 92.23% | 92.80% | 92.94% | 92.30%
 | Normalized | 91.14% | 92.74% | 92.81% | 93.19% | 92.85%

Table 4. The accuracy of deep learning classifiers with GloVe corpus 100D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic Cyberbullying Dataset | Original | 84.47% | 84.50% | 84.44% | 85.09% | 85.60%
 | Normalized | 84.92% | 85.20% | 85.15% | 85.34% | 85.34%
 | Light stemming | 84.47% | 85.34% | 85.71% | 85.54% | 85.23%
 | Root stemming | 84.33% | 84.47% | 85.12% | 85.03% | 84.13%
 | Lemmatization | 83.03% | 85.46% | 85.63% | 86.19% | 85.63%
English Cyberbullying Dataset | Original | 91.42% | 92.44% | 92.62% | 92.84% | 92.62%
 | Normalized | 91.74% | 92.45% | 92.96% | 93.38% | 93.14%

Table 5. The accuracy of deep learning classifiers with GloVe corpus 200D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic Cyberbullying Dataset | Original | 84.98% | 85.15% | 85.96% | 85.85% | 85.46%
 | Normalized | 84.44% | 85.54% | 84.89% | 85.99% | 85.82%
 | Light stemming | 84.92% | 85.03% | 84.92% | 85.88% | 85.68%
 | Root stemming | 82.72% | 85.34% | 84.21% | 85.65% | 85.03%
 | Lemmatization | 83.87% | 85.79% | 85.63% | 86.59% | 85.48%
English Cyberbullying Dataset | Original | 91.27% | 92.19% | 92.38% | 92.88% | 92.80%
 | Normalized | 91.70% | 92.98% | 92.44% | 93.08% | 92.66%

5.3. The third experiment

The English pre-trained GloVe corpus of 100 D was applied to the English cyberbullying dataset. The
dataset was trained and tested with a batch size of 256 and 10 epochs, and was split into 80% for training
and 20% for testing. Table 6 shows the accuracy on the test data for the different preprocessing variants
and classifiers.

Table 6. The accuracy of deep learning classifiers with English GloVe corpus 100D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
English Cyberbullying Dataset | Original | 90.50% | 91.59% | 92.32% | 92.40% | 92.19%
 | Normalized | 90.63% | 92.27% | 92.45% | 92.50% | 92.25%

After normalization, the GRU classifier achieved the highest accuracy, 92.50%, ahead of RNN, LSTM,
BiLSTM, and BiGRU. The results show that normalization improves accuracy on the English dataset.
Increasing the embedding dimension does not always improve accuracy. The Arabic 256 D corpus outperforms the
other corpora on the Arabic dataset; its best result was obtained with the GRU classifier on the original
dataset, without any preprocessing. For the second pre-trained corpus, which contains multiple languages,
50 D, 100 D, and 200 D vectors were evaluated: 100 D outperforms the other dimensions on the normalized
English dataset, while 200 D gives the best result on the Arabic dataset. The third corpus, the English
100 D GloVe, does not outperform the multilingual corpus on the English dataset (92.50% versus 93.38%).
From the classifier's point of view, GRU outperforms the other classifiers.

6. CONCLUSION
Due to the spread of cyberbullying and its adverse effects, it has
become necessary to find appropriate solutions to detect cyberbullying using modern
artificial intelligence techniques. Five deep learning classifiers (RNN, LSTM, BiLSTM, GRU, and BiGRU) were
applied to two datasets (the Arabic and English cyberbullying datasets) with three different pre-trained
GloVe corpora: the Arabic corpus of 256 D, the multilingual corpus with 50 D, 100 D, and 200 D, and the
English corpus of 100 D. The best result for the Arabic cyberbullying dataset, 87.83%, was achieved using
the 256 D GloVe corpus and the GRU classifier on the original dataset, compared with the 84.03% accuracy
reported in [12]. The best result for the English cyberbullying dataset, 93.38%, was achieved using the
100 D GloVe vectors and the GRU classifier after normalization.

REFERENCES
[1] T. Alsubait and D. Alfageh, “Comparison of machine learning techniques for cyberbullying detection on YouTube Arabic
comments,” International Journal of Computer Science & Network Security, vol. 21, no. 1, pp. 1–5, 2021.
[2] A. Ali and A. M. Syed, “Cyberbullying Detection Using Machine Learning,” Pakistan Journal of Engineering and Technology
(PakJET), vol. SI, no. 01, pp. 45–50, 2020.
[3] M. Anand and R. Eswari, “Classification of abusive comments in social media using deep learning,” in Proceedings of the 3rd
International Conference on Computing Methodologies and Communication, ICCMC 2019, Mar. 2019, pp. 974–977, doi:
10.1109/ICCMC.2019.8819734.
[4] T. H. H. Aldhyani, M. H. Al-Adhaileh, and S. N. Alsubari, “Cyberbullying identification system based deep learning algorithms,”
Electronics (Switzerland), vol. 11, no. 20, p. 3273, Oct. 2022, doi: 10.3390/electronics11203273.
[5] Z. K. Hussien and B. N. Dhannoon, “Anomaly detection approach based on deep neural network and dropout,” Baghdad Science
Journal, vol. 17, no. 2, pp. 701–709, Jun. 2020, doi: 10.21123/bsj.2020.17.2(SI).0701.
[6] V. Tyagi, A. Kumar, and S. Das, “Sentiment analysis on twitter data using deep learning approach,” in Proceedings - IEEE 2020
2nd International Conference on Advances in Computing, Communication Control and Networking, ICACCCN 2020, Dec. 2020,
pp. 187–190, doi: 10.1109/ICACCCN51052.2020.9362853.
[7] A. Q. Al-Bayati, A. S. Al-Araji, and S. H. Ameen, “Arabic sentiment analysis (ASA) using deep learning approach,” Journal of
Engineering, vol. 26, no. 6, pp. 85–93, Jun. 2020, doi: 10.31026/j.eng.2020.06.07.
[8] C. Iwendi, G. Srivastava, S. Khan, and P. K. R. Maddikunta, “Cyberbullying detection solutions based on deep learning
architectures,” Multimedia Systems, vol. 29, no. 3, pp. 1839–1852, Oct. 2020, doi: 10.1007/s00530-020-00701-5.
[9] D. R. Janardhana, C. P. Vijay, G. B. J. Swamy, and K. Ganaraj, “Feature enhancement based text sentiment classification using
deep learning model,” Oct. 2020, doi: 10.1109/ICCCS49678.2020.9277109.
[10] M. Mahat, “Detecting cyberbullying across multiple social media platforms using deep learning,” in 2021 International Conference
on Advance Computing and Innovative Technologies in Engineering, ICACITE 2021, Mar. 2021, pp. 299–301, doi:
10.1109/ICACITE51222.2021.9404736.
[11] Venkatesh, S. U. Hegde, A. S. Zaiba, and Y. Nagaraju, “Hybrid CNN-LSTM model with glove word vector for sentiment analysis
on football specific tweets,” Proceedings of the 2021 1st International Conference on Advances in Electrical, Computing,
Communications and Sustainable Technologies, ICAECT 2021, 2021, doi: 10.1109/ICAECT49130.2021.9392516.
[12] S. Almutiry and M. Abdel Fattah, “Arabic cyberbullying detection using Arabic sentiment analysis,” The Egyptian Journal of
Language Engineering, vol. 8, no. 1, pp. 39–50, Apr. 2021, doi: 10.21608/ejle.2021.50240.1017.
[13] N. A. Hamzah and B. N. Dhannoon, “The detection of sexual harassment and chat predators using artificial neural network,”
Karbala International Journal of Modern Science, vol. 7, no. 4, pp. 301–312, Dec. 2021, doi: 10.33640/2405-609X.3157.
[14] T. Hossain, H. Z. Mauni, and R. Rab, “Reducing the effect of imbalance in text classification using SVD and glove with ensemble
and deep learning,” Computing and Informatics, vol. 41, no. 1, pp. 98–115, 2022, doi: 10.31577/CAI_2022_1_98.
[15] M. A. Akbar, A. Jazlan, M. Mahbuburrashid, H. F. M. Zaki, M. N. Akhter, and A. H. Embong, “Solar thermal process parameters
forecasting for evacuated tube collectors (ETC) based on RNN-LSTM,” IIUM Engineering Journal, vol. 24, no. 1, pp. 256–268, Jan.
2023, doi: 10.31436/iiumej.v24i1.2374.
[16] P. Zheng, W. Zhao, Y. Lv, L. Qian, and Y. Li, “Health status-based predictive maintenance decision-making via LSTM and Markov
decision process,” Mathematics, vol. 11, no. 1, p. 109, Dec. 2023, doi: 10.3390/math11010109.
[17] T. A. Wotaifi and B. N. Dhannoon, “An effective hybrid deep neural network for Arabic fake news detection,” Baghdad Science
Journal, Jan. 2023, doi: 10.21123/bsj.2023.7427.
[18] P. Hu, J. Qi, J. Bo, Y. Xia, C.-M. Jiao, and M.-T. Huang, “Research on LSTM-based industrial added value prediction under the
framework of federated learning,” in Proceedings of the 2022 3rd International Conference on Big Data and Informatization
Education (ICBDIE 2022), Atlantis Press International BV, 2023, pp. 426–434.
[19] A. Pratomo, M. O. Jatmika, B. Rahmat, and Y. S. Triana, “Transfer learning implementation on BiLSTM with optimizer for
predicting non-ferrous metals prices,” 2022.
[20] D. Naik and C. D. Jaidhar, “A novel multi-layer attention framework for visual description prediction using bidirectional LSTM,”
Journal of Big Data, vol. 9, no. 1, Nov. 2022, doi: 10.1186/s40537-022-00664-6.
[21] G. Shen, Q. Tan, H. Zhang, P. Zeng, and J. Xu, “Deep learning with gated recurrent unit networks for financial sequence
predictions,” Procedia Computer Science, vol. 131, pp. 895–903, 2018, doi: 10.1016/j.procs.2018.04.298.
[22] M. Li et al., “Internet financial credit risk assessment with sliding window and attention mechanism LSTM model,” Tehnicki
Vjesnik, vol. 30, no. 1, pp. 1–7, Feb. 2023, doi: 10.17559/TV-20221110173532.
[23] Y. Liu, X. Liu, Y. Zhang, and S. Li, “CEGH: A hybrid model using CEEMD, entropy, GRU, and history attention for intraday stock
market forecasting,” Entropy, vol. 25, no. 1, p. 71, Dec. 2023, doi: 10.3390/e25010071.
[24] Z. Liu, J. Mei, D. Wang, Y. Guo, and L. Wu, “A novel damage identification method for steel catenary risers based on a novel
CNN-GRU model optimized by PSO,” Journal of Marine Science and Engineering, vol. 11, no. 1, p. 200, Jan. 2023, doi:
10.3390/jmse11010200.
[25] T. Saghi, D. Bustan, and S. S. Aphale, “Bearing fault diagnosis based on multi-scale CNN and bidirectional GRU,” Vibration, vol.
6, no. 1, pp. 11–28, Dec. 2022, doi: 10.3390/vibration6010002.

[26] T. A. Wotaifi and B. N. Dhannoon, “Improving prediction of Arabic fake news using fuzzy logic and modified random forest model,”
Karbala International Journal of Modern Science, vol. 8, no. 3, pp. 477–485, Aug. 2022, doi: 10.33640/2405-609X.3241.
[27] J. Wang, K. Fu, and C. T. Lu, “SOSNet: A graph convolutional network approach to fine-grained cyberbullying detection,” in
Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020, Dec. 2020, pp. 1699–1708, doi:
10.1109/BigData50022.2020.9378065.

BIOGRAPHIES OF AUTHORS

Noor Haydar Shaker received a bachelor's degree in computer science from
Al-Nahrain University, Iraq, in 2019. She is currently a master's student at Al-Nahrain
University, specializing in artificial intelligence. Her research focuses on deep learning
algorithms, which is also the subject of her master's thesis. She can be contacted at email:
noor.haidar21@ced.nahrainuniv.edu.iq.

Ban N. Dhannoon received a Ph.D. in computer science in 2001 from the
University of Technology, Baghdad, Iraq, with the dissertation "Fuzzy Rule Extraction". She
has been a professor in the Department of Computer Science, College of Science, Al-Nahrain
University, since 2013. Her research interests are artificial intelligence (natural language
processing, machine learning, and deep learning), digital image processing, and pattern
recognition. She can be contacted at email: ban.n.dhannoon@nahrainuniv.edu.iq.