Healthcare Analytics 3 (2023) 100198
A comparative study of retrieval-based and generative-based chatbots using Deep Learning and Machine Learning
Sumit Pandey ∗, Srishti Sharma
The NorthCap University School of Engineering & Technology, Gurugram, Haryana, 122017, India
ARTICLE INFO

Keywords: Artificial Intelligence; Chatbot; Deep Learning; Machine Learning; Mental health

ABSTRACT

Increased screen time may cause significant health impacts, including harmful effects on mental health. Studies on the association between technological obsessions and their influence on health have been conducted using Deep Learning (DL) and Machine Learning (ML) techniques. The deployment of chatbots in different industries has been proven as a game-changer. We study conversational Artificial Intelligence (AI) systems enabling operators to conduct conversations with machines that resemble those with humans. We design and develop two retrieval-based and generative-based chatbots, each with six designs. Among the retrieval-based chatbots, Vanilla Recurrent Neural Network (RNN) has an accuracy of 83.22%, Long Short Term Memory (LSTM) is 89.87% accurate, Bidirectional LSTM (Bi-LSTM) is 91.57% accurate, Gated Recurrent Unit (GRU) is 65.57% accurate, and Convolution Neural Network (CNN) is 82.33% accurate. In comparison, generative-based chatbots have encoder–decoder designs that are 94.45% accurate. The most significant distinction is that while generative-based chatbots can generate new text, retrieval-based chatbots are restricted to responding to inputs that match the best of the outputs they already know.
1. Introduction

Psychological behavioural changes caused by mental diseases may have an impact on a person’s development. People of different ages, from many cultures, and nations are susceptible to them. Frequently, odd thoughts, feelings, behaviour, and perceptions are indicators of mental illnesses. Mental health conditions include developmental and neurodegenerative illnesses like autism¹ as well as schizophrenia,² bipolar disorder,³ depression,⁴ and other psychoses.⁵ One in seven Indians, according to a 2017 study, suffer from a mental condition such as schizophrenia or bipolar disorder. As people’s awareness of mental health issues grows, many researchers are turning their attention to this area as a key area for improvement. To provide a uniform approach to mental health facilities, including treatment, support, and prevention, as well as a comprehensive means of addressing the nation’s psychological well-being, India’s first National Mental Health Survey⁶ (NMHS) was published [1]. Twelve states altogether attended the NMHS. It used both quantitative and qualitative methods to conduct assessments of adults such as Focus-Group Discussions (FGD) and Key Informant Interviews (KII) as stated in [2]. India had a population of about 150 million individuals who needed assistance, with men outnumbering women. As is commonly assumed, men are much more likely than women to experience behavioural and mental health problems. However, schizotypal, psychoneurotic diseases, mood disorders, and psychotic syndromes are associated with physical abnormalities. The majority of those with mental health issues were between the ages of 40 and 49. It was also demonstrated that the middle class and lower classes were burdened more than the wealthy [2,3]. The NMHS System has raised attention to mental health issues in individuals and increased the use of Psychological Therapies (PSIs), and prescription drugs. From 70% to 92% more people with various mental diseases are receiving treatment. Currently, 1.3% of all health spending in India is allocated to supporting psychological well-being [4].
∗ Corresponding author.
E-mail addresses: sumit18csd004@ncuindia.edu (S. Pandey), srishti@ncuindia.edu (S. Sharma).
¹ A neurodevelopmental disease called autism impacts behaviour, social interaction, and communication.
² Schizophrenia is a psychiatric disorder that hinders an individual’s ability to engage in coherent cognitive processes, experience emotions in a stable manner, and exhibit appropriate behaviours.
³ Bipolar disorder, a mental illness, is typified by alternating episodes of depressive and manic or hypomanic states.
⁴ Depression, a mental disorder, is characterized by prolonged periods of experiencing emotions such as sadness, despair, and a lack of interest in previously enjoyed activities.
⁵ A range of mental health conditions known as psychoses have an impact on a person’s capacity to think, feel, and perceive reality.
⁶ The National Mental Health Survey (NMHS) had the goal of estimating the frequency and burden of mental health disorders in India, identifying current treatment gaps, existing patterns of healthcare use, and understanding the effect and disability caused by these diseases.
https://doi.org/10.1016/j.health.2023.100198
Received 2 April 2023; Received in revised form 8 May 2023; Accepted 17 May 2023
2772-4425/© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Also, School-Based Mental Health Services (SBMHSs) must be prioritized because adolescents face comparable challenges: one in every thirty-three children and one in every eight adolescents may deal with clinical depression, which increases the risk of suicide among adolescents. Therefore, this study emphasizes the value of a multi-tiered approach at different stages to get around potential obstacles [5,6].

A multi-tiered approach can be extremely valuable in overcoming potential obstacles in any complex project, such as the development of chatbots for mental health care. In this study, the authors propose a multi-tiered approach at different stages to address the challenges faced in developing effective chatbots for mental health care. At the initial stage, the authors suggest compiling pertinent data to address the scarcity of large-scale, high-quality data. This approach involves gathering relevant data from various sources and using Machine Learning (ML) approaches to filter out irrelevant data or noise, which can help to increase data quality and decrease bias. In the next stage, with the help of Neural Networks (NN), the authors have created two different types of chatbots, a retrieval-based chatbot that searches for information and a chatbot powered by generative algorithms, which specifically address mental health issues in college students by reducing stress and, as a result, assisting in the determination of the root cause of their issues. Furthermore, the authors suggest customizing the chatbot by changing the operator’s questions and combining them with the developed chatbot, which can improve the chatbot’s ability to communicate with humans and give it a "human-like" feel.

Fig. 1. Retrieval-based chatbots.

Fig. 2. Generative-based chatbots.

With the advancement of digital technology, there is a chance to create novel and easily accessible mental health therapies, such as telepsychiatry, online counselling, and mental health chatbots. These innovations have the potential to improve access to mental health care, lessen the stigma around asking for assistance, and offer more specialized and efficient treatment alternatives. The utilization of technology can also involve the collection and analysis of vast amounts of data, which can offer insights into the occurrence and potential causes of mental health disorders, as well as the effectiveness of various treatments. We can address the escalating mental health crisis and enhance the wellness of people and communities by using technology to improve mental health.

Finally, the authors emphasize the importance of conducting follow-up studies with larger sample sizes to evaluate the effectiveness of chatbots in boosting well-being and reducing stress. This multi-tiered approach involves evaluating the chatbots’ performance at different stages and addressing potential challenges, such as the risk of placebo effects and false positives. People might have an interactive way to take part in Artificial Intelligence (AI)-driven behavioural health interventions if chatbots are used as a scalable option, as chatbots can
be available 24/7, providing immediate support and assistance during times of crisis. However, people are growing more and more dependent on digital devices for work, play, and communication as technology becomes more pervasive in modern culture. This increased reliance on technology has also been linked to several detrimental health effects, such as sedentary behaviour, poor posture, eye strain, disturbed sleep cycles, and psychological issues like anxiety, depression, and addiction. We can better understand the effects of technology on people and society as a whole by investigating the relationship between technological engrossment and health. We can then develop measures to encourage healthy technology use and lessen the detrimental consequences on health and wellbeing.

Therefore, understanding the user behaviours of chatbots for depression is a crucial first step in creating chatbot designs and sharing information about the benefits and drawbacks of chatbots. Although some chatbots’ early efficacy results have been encouraging, there is little data on how users use these chatbots. For instance, chatbots have proven to be game-changers in different industries [7–10]. However, in mental health, people tend to forget unpleasant circumstances; therefore, certain stress-related instances could go unrecognized, which makes data collection difficult when it comes to felt stress [11,12]. Therefore, because the collected data can have different structures, such as Comma Separated Values (CSV) and JavaScript Object Notation (JSON), there is a need to construct two separate kinds of chatbots: retrieval-based chatbots and generative-based chatbots, as shown in Figs. 1 and 2.

Retrieval-based models utilize a set of predetermined responses and experiences to determine the most suitable reply, taking into account the input and context provided. The evaluation standards could be anything from straightforward rule-based charting to intricate ML ensembles. These techniques select a response from a list of options rather than creating fresh text. Retrieval-based chatbots are often error-free when they are trained on a large and diverse dataset with sufficient annotation and feedback. But because they seem too stiff and the responses do not seem "human", they are constrained in their approach. Pre-determined responses are not used in complex generative-based chatbot models. They develop original ideas from the ground up. In generative-based chatbot models, automatic conversion is common; because they learn from the ground up, generative-based chatbot models are utilized to create sophisticated chatbots that are quite progressive in function.

This paper is organized as follows: a thorough literature analysis of the suggested chatbot approaches is provided in Section 2. Additionally, Section 3 explains the implementation and the dataset, along with the
Table 1
Retrieval-based Chatbots.
Column groups: Efficacy; Privacy and Confidentiality; Safety
Study | Empirical testing (a) | Information (b) | Transparency (c) | Privacy agreements (d) | Availability (e) | Transparency (f) | Traditional support (g) | Automatic conversation termination (h)
Zhang et al. [13] ✔ ✔ ✔
Moore et al. [14] ✔ ✔
Akkineni et al. [15] ✔ ✔ ✔
Shi et al. [16] ✔ ✔
Wang and Fang [17] ✔ ✔ ✔
Qian and Dou [18] ✔ ✔
Lan et al. [19] ✔ ✔
Kadam et al. [20] ✔ ✔ ✔ ✔
Wang and Fang [21] ✔
Kim et al. [22] ✔ ✔
Aksu et al. [23]
Patchava and Kiran [24] ✔ ✔
Lopez-Rodriguez et al. [25] ✔ ✔
a Empirical testing of chatbots?
b Providing information on theoretical approach of chatbots to users?
c Transparency of privacy policy in chatbots for patients?
d Providing privacy agreements to patients before starting the conversation with chatbots?
e Availability of privacy agreement review for patients during chatbot conversation?
f Transparency of chatbots to patients regarding their intelligent nature?
g Availability of traditional support in chatbots for patients?
h Automatic conversation termination by chatbots while interacting with patients?
explanation of the experimental results for the suggested retrieval-based and generative-based chatbots. A comparison of JSON and CSV files with various models is presented in Section 4. The paper concludes in Section 5 with a few observations and suggestions for further study.

2. Related works

The two main categories of conversational frameworks used to build chatbots are retrieval-based and generative-based [26,27]. The main distinction between the two is that retrieval-based chatbots find matching responses in a database of predefined conversational phrases, whereas generative-based chatbots use ML approaches to automatically generate responses.

As of now, the retrieval-based approach is used by the majority of therapeutic chatbots [28,29]. Lommatzsch and Katins [30] state that retrieval-based chatbots keep track of the conversation using dialogue management frameworks and determine what to do next depending on the responses they discover [31]. Wang et al. [32] state that many therapy chatbots manage their user conversations using pre-trained dialogue management frameworks. Retrieval-based frameworks can be divided into two categories: finite state, proposed by Sutton et al. [33] in 1998, and frame-based, proposed by Goddeau et al. [34] in 1996. Parmar et al. [35] state that a chatbot with a finite state framework restricts the dialogue to a predetermined set of steps. At each level, users are only provided with a limited number of response possibilities, and the chatbot can only react using those options. This indicates that the dialogue is constrained and does not permit natural conversation [36]. For simple, structured activities where the chatbot can direct the dialogue, a finite state framework works well [29]. The dialogue flow is not predetermined in frame-based frameworks [29]. Instead, the chatbot poses targeted queries to the user and methodically gathers data. The "slots" that make up this structured data are collections of well-known notions according to Wei et al. [37]. The chatbot then moves forward under pre-specified actions for each group concept of slots, enabling it to offer more individualized responses and manage increasingly challenging jobs. Chen et al. [38] state that this concept is frequently applied in interactions involving information-seeking, where users have information based on a set of constraints. Users providing data to fill in specified slots, such as their departure and arrival cities when looking for a route, is an example of a frame-based framework. This kind of framework, nevertheless, may have trouble adjusting to different talks [39] that do not follow a predetermined plan. Due to the non-predetermined dialogue flow, it might sometimes result in consumers revealing more information than is necessary [29].

Building interaction management frameworks for chatbots typically involves using Artificial Intelligence Markup Language (AIML) and ChatScript [26], two well-liked methods. Artificial Linguistic Internet Computer Entity (ALICE), a pioneering chatbot that was able to have basic interactions with users, was the first to use AIML. AIML, an Extensible Markup Language (XML)-compliant language, uses a tree structure to quickly match patterns and get the right answers. AIML was used to create the chatbot systems for several treatment chatbots that have been mentioned in the literature, including a Virtual Agent Equipped with Voice Communication (VICA) proposed by Sakurai et al. [40], an alcohol misuse intervention chatbot proposed by Dulin et al. [41], Barnett et al. [42], Win et al. [43], and a consultant chatbot proposed by Parviainen and Rantala [44], Bharti et al. [45], Shinde et al. [46]. Decision tree topologies have been employed by several treatment chatbots, including Vivibot, proposed by Greer et al. [47], Woebot, proposed by Fitzpatrick et al. [48], and a chatbot for post-traumatic stress disorder proposed by Han et al. [49], Chaix et al. [50], Ahn et al. [51], Tielman et al. [52]. Users were given the option to react in an option-choice manner by an embodied conversational agent for education [53]. On the other hand, the retrieval-based approach limits free dialogues [54,55] due to pre-determined outputs [27], whereas the generative-based approach enables chatbots to answer with more meaningful responses [26]. Because it relies on a decision tree mechanism, the option-choice format used by some chatbots is inappropriate for multi-linear conversations [26]. Furthermore, the usability of the system becomes more difficult to enhance since it will not successfully complete the task in cases where user inputs fail to match any information in the database [26]. Table 1 lists more retrieval-based chatbots in addition to those already discussed.

In contrast to retrieval-based chatbots, generative chatbots employ ML approaches to learn how to respond. These chatbots are trained on a huge quantity of training data [26] and use that knowledge to produce responses to users’ inputs rather than relying on pre-determined responses. RNN, LSTM, and Seq2Seq models are a few of the common AI approaches utilized in generative-based chatbots. But there have not been many studies that have used generative-based methods to create therapy chatbots. Bidirectional Encoder Representations from Transformers (BERT) [56] and the OpenAI Generative Pre-Training-2 Model
Fig. 3. CSV file converted to the JSON file.
Table 2
Generative-based chatbots.
Column groups: Efficacy; Privacy and Confidentiality; Safety
Study | Empirical testing | Information | Transparency | Privacy agreements | Availability | Transparency | Traditional support | Automatic conversation termination
See and Manning [62] ✔ ✔ ✔
Sheikh et al. [63] ✔ ✔ ✔
Hirosawa et al. [64] ✔ ✔
Sawant et al. [65] ✔ ✔
Si et al. [66] ✔ ✔ ✔
Raj and Phridviraj [67] ✔ ✔
Bachtiar et al. [68] ✔ ✔
Khadija et al. [69] ✔ ✔
(GPT-2) [57], which has been improved into the Third-Generation Autoregressive Language Model (GPT-3) [58], are two of the most sophisticated generative-based models. These models are frequently utilized in tasks involving Natural Language Processing (NLP) because they have demonstrated notable gains in producing human-like language. For producing task-oriented discourse, these models are open-source, simple to train, and adaptable [55,59]. In 2019, OpenAI introduced GPT-2, an unsupervised generative model that had undergone pre-training on a substantial unannotated dataset. This model’s advantages include the ability to support deep language models, reduce the cost of manual annotation, and avoid the need to train a new model from scratch. On language-related tasks including summarizing, reading comprehension, answering questions, and translation, the model did well according to Relc et al. [57]. The chatbot can also be enhanced with various domain data to serve certain objectives for its target customers [60,61]. The OpenAI GPT-2 model has benefits, but there are still some problems that need to be solved, such as users having trouble understanding the responses and the model producing mistakes that do not make sense in the context of the conversation according to Zhang et al. [55]. Incorporating pre-trained models that have been specifically designed for particular domains and fine-tuning them with datasets that are specific to those domains could be a solution to these problems. Table 2 lists more generative-based chatbots in addition to those already discussed.

3. Implementation and analysis

A variety of data collection techniques [70], including observational research, case studies using focus groups, and a quasi-statistical method, have been employed in the development of chatbots [71]. Kaggle [72], GitHub [73], scraping data from Reddit [74], and clinical data [75] are some sources for mental health chatbots that have a variety of datasets available. Technical details on the parts that make up the chatbot models are provided in this section. The classification issue has been addressed in the past using a variety of ML techniques. However, given their extensive feature extraction, NNs frequently seem to outperform other ML methods. The authors create a conversational AI that specifically addresses issues with college students’ mental health by reducing stress and, as a result, assisting in the discovery of the root of their difficulties. This AI-based chatbot uses an NN to classify user input against a preset set of replies and connect user input to an intent. The authors chose the Kaggle psychological well-being Frequently Asked Question (FAQ) database [76] since there was no publicly available mental health services database designed specifically for creating a chatbot. The chosen dataset’s description is shown in Table 3.

Table 3
Dataset description.
Name Mental_Health_FAQ
Category Comma-Separated Values (CSV)
No. of Rows 98
No. of Columns 3 (Question_ID, Questions, Answers)
This CSV file does not include tags for the queries, hence it cannot be used directly by a retrieval-based classification method. Thus, the authors added tags manually to the first rows of the database.

3.1. Dataset preprocessing

The conversion of CSV data into a JSON file was necessary for the development of a retrieval-based chatbot, as tags play a pivotal role in the process. Fig. 3 provides an illustration of the JSON file, which contains intent tags with defined forms and corresponding responses.
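The CSV-to-JSON conversion described in Section 3.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the column names (Question_ID, Questions, Answers) come from Table 3, while the JSON field name `patterns`, the helper `faq_csv_to_intents`, the `toy_tagger` function, and the sample rows are hypothetical. The source states only that tags were added by hand, so the tagging step is passed in as a callable.

```python
import csv
import io
import json


def faq_csv_to_intents(csv_text, tag_for_question):
    """Group FAQ rows (Question_ID, Questions, Answers) into tagged intents.

    tag_for_question is a hypothetical callable standing in for the manual
    tagging step; the source CSV itself contains no tags.
    """
    intents = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        tag = tag_for_question(row["Questions"])
        entry = intents.setdefault(
            tag, {"tag": tag, "patterns": [], "responses": []}
        )
        entry["patterns"].append(row["Questions"])
        # Keep each distinct answer once per tag.
        if row["Answers"] not in entry["responses"]:
            entry["responses"].append(row["Answers"])
    return {"intents": list(intents.values())}


# Illustrative rows only; these are not taken from the actual dataset.
sample_csv = (
    "Question_ID,Questions,Answers\n"
    "1,What is depression?,Depression is a common mood disorder.\n"
    "2,How do I know if I am depressed?,Depression is a common mood disorder.\n"
    "3,Where can I find help?,Talk to a counsellor or a doctor.\n"
)


def toy_tagger(question):
    # Hypothetical stand-in for the hand-assigned tags.
    return "depression" if "depress" in question.lower() else "help"


intents = faq_csv_to_intents(sample_csv, toy_tagger)
print(json.dumps(intents, indent=2))
```

Grouping by tag means several question phrasings can share one set of responses, which is what lets a classifier map a user input to an intent and then pick a reply from that intent.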
Fig. 3 shows an example of an intent tag; the chatbot replies with the answer given in the "responses" field of the matching intent. By transforming the outlines and replies into tags, the authors produced two data structures. During preprocessing, extra spaces and punctuation were removed from the response and outline strings, and the strings were tokenized. A vocabulary, which is a collection of unique tokens and their frequencies, was built using the tokenized terms. Using this vocabulary, a list of unique tokens is created and each token is given an index, as seen in Fig. 4.

Fig. 4. Vocabulary lists with index.

The authors used the example structure to construct the validation and training data for the model by removing the initial outline from each tag and then formalizing the architecture. The final training data has ten columns, matching the length of the longest outline string. The label array has length 16, which is the total number of tags. Each label corresponds to one text in the training data (including the puzzled tag).

3.2. Retrieval-based models

This subsection presents the outcomes of the retrieval-based chatbot employing several NN types, including the following:

∙ Vanilla Recurrent Neural Network (RNN)⁷: The Embedding Layer (EL), which is the first layer, represents each vocabulary word in the texts as an N-Dimensional Vector (NDV), where N is the number of dimensions the authors choose for the word vectors. These Feature Vectors (FV) are learned during training, and words with similar meanings have nearby positions in the vector space. Following that, a basic RNN was created using Keras Layers (KL). The model was trained for 50 epochs, and Fig. 5 illustrates the learning curves of the Vanilla RNN. Table 4 lists the model’s final learning accuracy and losses. Overfitting is revealed by the discrepancy in accuracy between validation and training. After four epochs, the validation loss increases, demonstrating that the model does not generalize well to unseen data.

Table 4
Final learning accuracies and losses of Vanilla RNN.
Metrics Value
Loss 1.1619
Accuracy 0.8322
Validation loss 2.0922
Validation accuracy 0.2013

∙ Long Short Term Memory (LSTM)⁸: The LSTM architecture outperforms the vanilla architecture because it has more gates than a simple RNN, making it more sophisticated. KLs were employed by the authors to train the architecture. Table 5 lists the final learning accuracies and losses after 50 LSTM epochs, and Fig. 6 displays the LSTM learning curves. The discrepancy between training and validation accuracy indicates overfitting. The gap increases with increasing overfitting. A low training error indicates low bias.

Table 5
Final learning accuracies and losses of LSTM.
Metrics Value
Loss 0.7420
Accuracy 0.8987
Validation loss 2.4457
Validation accuracy 0.1990

∙ Bidirectional LSTM (Bi-LSTM)⁹: A Bi-LSTM differs fundamentally from an LSTM in that it processes the input twice, once from beginning to end and once from end to beginning. Unlike a unidirectional LSTM, it uses bidirectional processing to capture information from future time steps, so that context from both the past and the future is available at any time step. Because they capture context in both directions, Bi-LSTMs ought to produce better results. The constraints on the number of input and output elements of the dense layer are the same as those of the LSTM architecture. The architecture was trained by the authors using KL’s LSTM. The final learning accuracies and losses after 50 epochs of Bi-LSTM are provided in Table 6 and the learning curves of Bi-LSTM are shown in Fig. 7. The model shows high variance, given the stark contrast between the validation and training results. However, Bi-LSTM is more accurate when compared to LSTM.

Table 6
Final learning accuracies and losses of Bi-LSTM.
Metrics Value
Loss 1.2129
Accuracy 0.9157
Validation loss 2.3987
Validation accuracy 0.1010

⁷ Using a hidden state that is updated at each time step, a vanilla RNN is a type of NN created to process sequential data. Each input in the sequence is processed along with data from previous inputs in a recursive manner. It has issues with vanishing and exploding gradients and is the most basic type of RNN.
⁸ LSTM is a type of RNN architecture that addresses the problem of vanishing gradients in conventional RNNs by incorporating memory cells that allow the network to selectively retain or discard information over time. This makes it particularly advantageous for tasks that involve sequential input.
⁹ The classic LSTM is extended by the Bi-LSTM, which concurrently captures past and future context by processing the input sequence both forward and backward across time. In NLP tasks like text categorization, sentiment analysis, and machine translation, it is frequently employed.
¹⁰ A specific kind of RNN architecture called a GRU includes gating methods to regulate the information flow between time steps. With fewer parameters and comparable performance on some applications, it was introduced as a more straightforward alternative to the LSTM design. A reset gate and an update gate are two gates in the GRU that control how much of the previous state should be forgotten and how much of the current input should be incorporated, respectively. As a result, the GRU is quicker to train and more computationally economical than the LSTM.

∙ Gated Recurrent Unit (GRU)¹⁰: The authors also utilized GRU, another well-liked NN algorithm. Given that it has a smaller number of gates than LSTMs, it operates more rapidly. Because of the GRU architecture’s reduced complexity, the authors expected the results to be less precise than those of the LSTM architecture. However, LSTM overfitting was a possibility because the authors’ database was so tiny. The restrictions on the number of elements in the ELs and dense layers are the same as in the LSTM model. The authors employed KLs to train the architecture. The final learning accuracies and losses for GRU after 50 epochs are shown in Table 7, and the learning curves in Fig. 8. The validation loss initially reduces but starts to grow after about 10 epochs, indicating overfitting. The training loss falls continuously. Table 7 demonstrates that while the model is less accurate and has a higher loss than the LSTM architecture, it still has a respectable accuracy
Fig. 5. Learning curves of Vanilla RNN.
Fig. 6. Learning curves of LSTM.
Fig. 7. Learning curves of Bi-LSTM.
6
Fig. 8. Learning curves of GRU.
Table 7 Table 8
Final learning accuracies and losses of GRU. Final learning accuracies and losses of CNN.
Metrics Value Metrics Value
Loss 1.0214 Loss 1.9127
Accuracy 0.6557 Accuracy 0.8233
Validation loss 2.8854 Validation loss 2.1981
Validation accuracy 0.1035 Validation accuracy 0.1980
of 0.6557. Additionally, with a validation accuracy of only 0.1035, the training loss and accuracy improve over time after about 20 epochs.
the generalization performance is poor. They are far less precise and The validation loss and accuracy likewise increase at the same time
damage-prone than the LSTM architecture, as seen by the final results. but begin to settle after 30 epochs. The model is not overfitting to the
The breakdowns in endorsement precisions and practising precisions, training data, as shown by the training and validation loss gaps.
which are in line with earlier findings from practised architecture, also
show that the architecture is overfitting. 3.3. Generative-based models
The authors then attempted a little more complex architecture to
see if they could prevent overfitting. The outcomes of the generative-based chatbot are displayed in this
∙ Convolution Neural Network (CNN)11: An EL, a CNN layer, and a connected layer made up the next design that was evaluated. The output of the EL is fed into a 1D Convolutional Layer (CL), which is subsequently flattened and fed into two fully connected layers. A Max Pooling Layer (MPL) follows these layers to control their sizes. These settings are the result of several iterations aimed at finding the best possible design. KLs were employed by the authors to train the architecture. Table 8 lists the final learning accuracies and losses for CNN after 50 epochs, and Fig. 9 displays the CNN learning curves. The learning curves show how the model performs when learning from the training and validation sets. The vertical axis shows the performance metric, such as loss or accuracy, while the horizontal axis shows the number of epochs. Fig. 9 shows an orange line for the validation loss and a blue line for the training loss. The validation accuracy is displayed in red, and the training accuracy is displayed in green. Initially, the accuracy is poor, and both the training and validation losses are large. As training goes on, the model gets better at fitting the training data, which lowers the training loss and raises the training accuracy. If the model becomes too complex, however, it may begin to overfit the training data, which would lower the validation accuracy and raise the validation loss. According to Fig. 9, the model has learnt from the training data when

11 It is frequently used for image and video recognition applications. A CNN is a type of NN that processes input data using filters (kernels) to extract features, convolutions to combine the features, and pooling to reduce dimensionality.

subsection and are as follows:

• Encoder–Decoder architecture: All of the models that we have seen so far have been retrieval-based approaches. The best answer to the operator’s query was chosen with the use of NNs across numerous models, and the responses were encoded. Instead of choosing from a predefined response list, the authors now generate an answer based on the training corpus. The encoder–decoder is a seq2seq paradigm that generates its output. To put it simply, it predicts a word from the operator’s input, and each subsequent word is then predicted according to the probability that it will occur. Because the tags are not necessary for producing predictions, the dataset for this design may simply be a CSV file. The ‘‘<END>’’ tag was appended to the Target Tags (TT), while the input column was left unchanged. Encoder input data, decoder input data, and decoder output data are the three matrices of One-Hot Vectors (OHV). The decoder uses two of these matrices, as the seq2seq structure does during training, to enable teacher forcing. The goal is to help the architecture predict the current target token by using the input token from the previous time step. The encoder architecture requires both an input layer that creates a matrix for storing OHVs and an LSTM layer with a predetermined number of hidden states. Except for passing the state information along with the decoder inputs, the decoder design is fundamentally comparable to the encoder architecture. The decoder operates in three steps, listed after Fig. 10.
Fig. 9. Learning curves of CNN.
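The CNN design described above (embedding output, 1D convolution, max pooling, fully connected output) can be illustrated with a minimal pure-Python sketch. This is not the authors' Keras implementation: the layer order is simplified to convolution, pooling, then one dense softmax layer, and the embedding values, filter weights, and the two tags are made-up assumptions chosen only to show how the shapes change.

```python
import math

def conv1d_relu(seq, kernel):
    """Slide one filter over a sequence of embedding vectors (no padding)."""
    width = len(kernel)
    features = []
    for start in range(len(seq) - width + 1):
        total = 0.0
        for offset in range(width):
            for x, w in zip(seq[start + offset], kernel[offset]):
                total += x * w
        features.append(max(0.0, total))  # ReLU activation
    return features

def max_pool1d(features, size):
    """Keep the largest value in each non-overlapping window."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

def dense_softmax(inputs, weights, biases):
    """Fully connected layer followed by a softmax over the tags."""
    logits = [sum(x * w for x, w in zip(inputs, row)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

# Toy embedded sentence: 5 tokens, embedding size 2 (hypothetical values).
embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]]
kernel = [[1.0, -1.0], [0.5, 0.5]]           # one hypothetical filter of width 2

feature_map = conv1d_relu(embedded, kernel)  # length 5 - 2 + 1 = 4
pooled = max_pool1d(feature_map, size=2)     # length 4 / 2 = 2

# Two hypothetical tags; with several filters, the pooled maps would be
# flattened (concatenated) before this layer.
tag_probs = dense_softmax(pooled, weights=[[1.0, 0.0], [0.0, -1.0]],
                          biases=[0.0, 0.0])
```

In a retrieval-based setting, the tag with the highest probability would then select one of the pre-written responses.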
Fig. 10. Encoder–decoder model summary.
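The three one-hot matrices used for the encoder–decoder (encoder input, decoder input, and decoder output) and their role in teacher forcing can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: tokens are whole words, sequences are left unpadded, the ‘‘<START>’’ marker is an assumption (the text only mentions ‘‘<END>’’), and the sentence pairs are invented.

```python
START, END = "<START>", "<END>"  # <START> is assumed; the paper names only <END>

def one_hot(index, size):
    """Return a vector of zeros with a single 1 at the given index."""
    vec = [0] * size
    vec[index] = 1
    return vec

def build_seq2seq_matrices(pairs):
    """Build encoder input, decoder input, and decoder target one-hot data."""
    sources = [src.split() for src, _ in pairs]
    targets = [[START] + tgt.split() + [END] for _, tgt in pairs]
    src_index = {tok: i for i, tok in
                 enumerate(sorted({t for s in sources for t in s}))}
    tgt_index = {tok: i for i, tok in
                 enumerate(sorted({t for s in targets for t in s}))}
    encoder_input = [[one_hot(src_index[t], len(src_index)) for t in s]
                     for s in sources]
    # Teacher forcing: the decoder input at step t is the true token t,
    # and the decoder target at step t is the true token t + 1.
    decoder_input = [[one_hot(tgt_index[t], len(tgt_index)) for t in s[:-1]]
                     for s in targets]
    decoder_target = [[one_hot(tgt_index[t], len(tgt_index)) for t in s[1:]]
                      for s in targets]
    return encoder_input, decoder_input, decoder_target, tgt_index

# Hypothetical input/response pairs, as they might appear in two CSV columns.
pairs = [("i feel sad", "i am here for you"), ("hello", "hi there")]
enc_in, dec_in, dec_out, tgt_index = build_seq2seq_matrices(pairs)
```

The decoder target is the decoder input shifted by one position, which is what lets the model learn to predict the current token from the previous one during training.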
1. Obtain the output states of the encoder.
2. Pass the output states to the decoder so that it can decode the sentence word by word.
3. After decoding each word, update the decoder’s hidden state so that previously decoded words can be used to assist in decoding new words.

In Fig. 10, the model summary is displayed. KLs were employed by the authors to train the architecture. Table 9 lists the final learning accuracies and losses after 50 epochs, and Fig. 11 displays the learning curves for the encoder–decoder. The training and validation curves can be used to draw particular inferences. The smaller the distance between the validation and training curves, the better the architecture fits the datasets. The notable discrepancy between the loss curves indicates that an overfitting condition is emerging. A training loss that keeps declining until the end of the epochs may likewise point to a poorly fitted architecture.

Table 9
Final learning accuracies and losses of encoder–decoder.
Metrics               Value
Loss                  0.2010
Accuracy              0.9445
Validation loss       2.1554
Validation accuracy   0.7017

4. Comparative analysis

In this section, a comparison between file formats and a comparison between models are described for mental health chatbots.

4.1. Comparison between JSON file and CSV file

For a retrieval-based chatbot, the authors used a JSON file, and for a generative chatbot, they used a CSV file. Examining how a
Fig. 11. Learning curves of encoder–decoder.
retrieval-based chatbot chooses its responses helps to explain this. A retrieval-based chatbot is trained to provide the best response from a collection of pre-written responses. This approach is best suited for situations where the range of possible user inputs is limited and well-defined, as it can quickly provide accurate responses without the need for extensive training data. However, it may struggle to handle more complex or open-ended interactions where the user’s input is less predictable. Therefore, each of the predetermined responses has a tag, and the input to the output mapping must be a tag. In other words, the operator’s input is classified by the tag assigned to it (the context). Following the identification of the most appropriate tag, the operator is provided with one of the pre-established responses. Because of this structured and organized design, storing such information in a JSON file is more straightforward. A generative chatbot, on the other hand, ‘‘creates’’ answers from the ground up. Final predictions for the following word are therefore generated by mapping the input, the output, and the prior words. A selective approach is used to choose the next word at each step based on the predicted outcomes and the previous words. As only the input-text and output-text columns are necessary for this data format, capturing it in a CSV file is uncomplicated. Compared to a JSON file, inserting or removing data is also more straightforward in this case.

4.2. Model comparison

To take into account the two different chatbot varieties (retrieval-based chatbots and generative-based chatbots), the authors designed six architectures and compared them. The six architectures share some features and differ in others, as follows:

1. CNN performs very well among the retrieval-based chatbots (LSTM, GRU, CNN, Bi-LSTM, and vanilla RNN) in terms of overfitting and accuracy. Overfitting is characterized by the discrepancy between validation loss and training loss. Compared with those of the other models, CNN’s curves are fairly uniform.
2. When comparing RNNs, LSTM, GRU, CNN, Bi-LSTM, and vanilla RNN all perform better than the others. GRU is a more recent technique that performs better mathematically than LSTM. On small amounts of training data, GRUs outperform LSTMs in terms of growth and implementation. GRUs are often used, which makes it less difficult to modify them; for example, additional gates can be incorporated into the network if more input is needed.
3. In comparing LSTM and Bi-LSTM, the authors observe that the latter tends to minimize loss during the later stages of training. This is supported by the descending loss curve, which results in a smaller gap between the validation loss and training loss curves.

Fig. 12 illustrates how the generative-based chatbot, which is based on the encoder–decoder model, performs better than the retrieval-based chatbots when the two kinds of model are compared. Compared with the earlier chatbots, it has a much higher validation accuracy. However, the considerable gap between validation and training loss, together with the low training loss after training, indicates that the chatbot is still overfitting, and minimizing the loss is crucial. This might be a result of the chatbot’s current rudimentary state, which prevents it from learning long sequences. The authors suggest solving this issue by developing attention layers that improve prediction and retain only the pertinent preceding information.

5. Conclusions and future works

ML has the potential to enhance the delivery of mental health services, but the efficacy of current approaches is unclear due to a dearth of high-quality data. The initial step towards addressing this is to acquire relevant data through techniques such as topic-noise modelling. The chatbot can be trained and validated once sufficient data has been gathered, and the authors can then consider the trade-off between bias and variance. Instead of looking for a larger dataset, more research can give a deeper understanding of the data, and the suggested approach can be changed to get the best results. Users can pick between retrieval-based and generative-based chatbots. While generative-based chatbots allow for more experimentation through the addition of new layers like transformer structures and attention layers, retrieval-based chatbots demand annotated input from a medical expert. The Keras-facilitated EL is used by the authors to make grammatical inspection easier, but other pre-trained structures, including Global Vectors for Word Representation (GloVe), can improve the model’s base layer. The generative chatbot can be combined with the operator’s questions in the current database, which is an Excel file with responses from operators and bots. The ultimate objective is to build a chatbot that interacts with people in a ‘‘human-like’’ way. This might be included in a web application or a mobile application. There are disadvantages to this strategy, though. To determine whether the chatbot is truly helpful in enhancing well-being and reducing stress, further research with larger sample sizes is required. The chatbot should be available to participants for
Fig. 12. Generative-based chatbots vs. retrieval-based chatbots.
as long as necessary, and follow-up measurements at one, three, and six months should be included to see if benefits hold up over time. The authors note that future studies with more advanced designs may be required to uncover ‘‘hidden’’ cases of depression that the chatbot failed to correctly identify and that the lack of an effective control group increases the potential for placebo effects.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.
