JAICOB: A Data Science Chatbot
ABSTRACT The application of natural language to improve students' interaction with information systems has been shown to be beneficial. In particular, advances in cognitive computing enable a new way of interaction that accelerates insight from existing information sources, thereby contributing to the process of learning. This work investigates the application of cognitive computing in blended learning environments. We propose a modular cognitive agent architecture for pedagogical question answering, featuring social dialogue (small talk) and improved for a specific knowledge domain. This system has been implemented as a personal agent to assist students in learning Data Science and Machine Learning techniques. Its implementation includes the training of machine learning models and natural language understanding algorithms in a human-like interface. The effectiveness of the system has been validated through an experiment.
The rest of the paper is organized as follows. Section II analyses related work on chatbots and the techniques applied in their development. Section III describes the different modules of the architecture and how they are interconnected. Section IV describes the evaluation process and results. Finally, Section V summarizes the learnings of this article, presents conclusions, and outlines future work.

II. RELATED WORK
A comprehensive systematic review of the use of chatbots in education is provided in a recent survey [9]. The authors identify several perspectives for analyzing current research following the theoretical model of Technology-Mediated Learning (TML) [10]: structure (input), learning process (process), and learning outcome (output). Regarding the input perspective, several dimensions have been identified [9]: student profile, educational settings, and chatbot technology. Learning outputs depend on individual student characteristics such as personality traits and technological skills, as well as educational and social background [9].

Some research works claim that chatbot technology is so disruptive that it will eliminate the need for websites and apps [11]. Chatbots have been used in different educational settings, such as language learning [12], health-related coaching agents [13], chatbots designed to provide feedback to students [14], programming language learning [15], administrative support [16], or increasing students' motivation [17]. These examples leave aside open-domain solutions such as Amazon's or Google's [18], which aim to answer any kind of question rather than cover a specific area of knowledge. While these types of chatbots are astoundingly ambitious and function with near-human precision, they sometimes come at a very high cost. Closed-domain question answering systems benefit from the ability to respond with more profound and specific knowledge [19], and can also achieve high quality at a lower complexity cost.

Design aspects of chatbots can influence the learning process. Flow-based chatbots, like [20]–[22], also called rule-based, can require an extensive database of questions and answers and need a clear flow of conversation that, if the user decides not to follow it, can result in a bad experience. A study on chatbots of this type [23] concludes that they are quite limited to human direction and control. They can be built with frameworks like Landbot.ai,1 or with simple coding skills, but require great sophistication to work correctly; this is where their limitations lie. An extension of this kind of bot is button-based, like HelloFreshus,2 which avoids the possibility of exiting the pre-planned flow. These can work well but can be very limited in scope and depth.

1 https://landbot.io
2 https://chatfuel.com/bot/HelloFreshus

On the other hand, artificial intelligence-based chatbots can better understand student intents. Even the simplest non-rule-based natural language understanding methods significantly outperform the most carefully crafted rule-based systems [24], because they can achieve a more profound understanding of the intent and the requested information thanks to machine learning techniques [25]. The most common and effective approach [26], explained in greater detail in Section III, is based on intent-entity matching and a Knowledge Base (KB).

Another aspect to take into account is whether chatbots are text-based or voice-based. Users tend to use longer sentences with voice-based chatbots and prefer reading expanded answers in a text-like manner. However, there is no significant difference in perceived effectiveness, learnability, and humanness between text-based and voice-based chatbots [27].

III. PROPOSED ARCHITECTURE FOR THE COGNITIVE BOT
The first step in designing the proposed architecture was to identify the way students learn and the types of questions they ask. Different requirements for different types of learning (inductive and deductive) [28] were identified, stemming from the nature of students' curiosity and the specifics of the topic. The following pedagogical solutions were identified:

1) A definition of a concept is a consequence of the usual teaching style, which is deductive, starting from the main concepts and developing towards the applications. It is part of the process of learning, but cannot be the whole process. In the Oliver model [29], definitions provide learning content.
2) As stated in [30], the learning of programming techniques can be enhanced by using code examples based on analogy [31] and induction. Also, learning is significantly facilitated by examples in initial coding attempts. Furthermore, surveys suggest that engineering students usually view themselves as inductive learners [28]. In the Oliver model [29], examples can provide learner support.
3) Lastly, the human need for small talk, such as joking and asking for the weather, must be satisfied to provide a more significant communication source [2].

With the pedagogical needs of the student identified, the architecture was designed. The several steps involved in the process are explained below and represented in Figure 1.

A Knowledge Base (KB) was populated with pertinent information regarding the topic at hand, to satisfy the requests for definitions and examples. The Question Answering (QA) module is designed to extract meaning from all this data, with the pedagogical requirements in mind.

To analyze the student's question, we use the Speech Act Classifier, which selects the module to which the question must be delegated; its operation is explained in greater detail in Section III-B. If small talk is detected, the message is passed on to the Small Talk module; if a question regarding Data Science is detected, it is passed to the QA module.
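To make this routing step concrete, the following minimal Python sketch shows how an incoming message could be dispatched either to a small-talk handler or to the QA module. The names (classify_speech_act, route_message, SMALL_TALK_CUES) and the keyword-based decision rule are hypothetical simplifications for illustration only; the actual system relies on a trained Speech Act Classifier, as described in Section III-B.

import re

# Toy cue list standing in for a trained speech-act classifier.
SMALL_TALK_CUES = {"hello", "hi", "joke", "weather", "thanks", "bye"}

def classify_speech_act(message: str) -> str:
    """Decide whether a message is small talk or a domain question."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    return "small_talk" if tokens & SMALL_TALK_CUES else "question"

def small_talk_module(message: str) -> str:
    # Placeholder social response.
    return "Happy to chat! Ask me anything about Data Science."

def qa_module(message: str) -> str:
    # In the real system this would query the Knowledge Base for a
    # definition or a code example; here we only acknowledge the question.
    return f"Looking up an answer for: {message!r}"

def route_message(message: str) -> str:
    """Dispatch a student message to the Small Talk or QA module."""
    if classify_speech_act(message) == "small_talk":
        return small_talk_module(message)
    return qa_module(message)

if __name__ == "__main__":
    print(route_message("Hi, can you tell me a joke?"))
    print(route_message("What is overfitting in machine learning?"))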
FIGURE 2. QA architecture.
FIGURE 3. QA example.
2) EXAMPLE ANSWERING
When the Answer Type is of the example type, a more complex type of search is needed. A search is performed across the documentation text to match the keywords of the query. When a match is found, the corresponding code snippet is returned with the appropriate formatting. Examples of this type of question are:
• How is a dataframe defined in Pandas?
• How can I implement a k-fold using scikit?
This module is implemented as a DialogFlow agent with an intent trained to detect example queries. The slot, in this case, is more open, so no Entity is defined. The example can be of any kind. The result can be seen in Figure 5.

FIGURE 6. Small talk examples.
TABLE 3. Intent distribution.
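To illustrate the keyword search described above, the following Python sketch matches query keywords against a small documentation index and returns the associated code snippet, for instance for the Pandas and scikit-learn questions listed earlier. The index, the retrieve_example function, and the overlap-based matching rule are invented for illustration; they are not Jaicob's actual DialogFlow-based implementation.

import re

# Hypothetical documentation index: keywords paired with code snippets.
SNIPPET_INDEX = [
    {
        "keywords": {"dataframe", "define", "pandas"},
        "snippet": "import pandas as pd\n"
                   "df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})",
    },
    {
        "keywords": {"k-fold", "kfold", "cross-validation", "scikit"},
        "snippet": "from sklearn.model_selection import KFold\n"
                   "kf = KFold(n_splits=5, shuffle=True, random_state=0)",
    },
]

def retrieve_example(query: str) -> str:
    """Return the code snippet whose keywords best overlap the query."""
    words = set(re.findall(r"[a-z0-9-]+", query.lower()))
    best = max(SNIPPET_INDEX, key=lambda entry: len(entry["keywords"] & words))
    if not best["keywords"] & words:
        return "Sorry, I could not find a matching example."
    return best["snippet"]

print(retrieve_example("How is a dataframe defined in Pandas?"))
print(retrieve_example("How can I implement a k-fold using scikit?"))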
B. PARTICIPANTS
The experiment was conducted with 50 participants, all of them with technical backgrounds. All of them were unaware of the inner workings of Jaicob. They were asked to use the chatbot as a tool to answer any questions or doubts that might arise in understanding Data Science related topics or writing the corresponding code.

The median age of the participants is 22 years. 51% of them were studying a bachelor's degree in Telecommunications Engineering, and the rest were pursuing a master's degree or higher studies.

Regarding their technological background, 54% of the participants had developed and implemented some machine learning programs. The rest had some basic knowledge of the field.

FIGURE 7. Structural paths in the applied PLS model.

• Behavioral intentions (BI) refers to users recommending others to use the bot.
• Satisfaction (SS) refers to the user's feeling after using the bot.
• Utilitarian value (UV) refers to the value it provides to the user.
Research [39] suggests that the quality of information on an e-commerce website has a positive impact on perceived value. Reference [40] suggests that accurate information can help users make better decisions, thus improving utilitarian value. According to [41], utilitarian value increases when the interaction with the process improves. These hypotheses are proposed:
H1. Perceptions of better answer accuracy improve utilitarian value.
User satisfaction is influenced by the human-likeness of the chatbot [36]. Also, [2] state that people are more inclined to send messages to a chatbot that handles this type of small talk well. A website's social dimension is another important antecedent of perceived value [42]. Research [43], [44] reveals that there is a direct link between perceived sociability and satisfaction.
H2. The social handling of the bot improves the overall satisfaction of the user.
H3. The social handling of the bot improves the utilitarian value a user perceives.
H4. Good social handling improves the behavioral intentions of users after using the bot.
Utilitarian value is central to user satisfaction and behavioral intentions. If the perceived value is low, the user will probably switch to other sources [39].
H5. A higher perceived answer accuracy increases positive behavioral intentions.
H6. A higher perceived utilitarian value increases positive behavioral intentions.
Perceived utilitarian value also enhances satisfaction [40]. Research [45] demonstrates that utilitarian value can improve the final user satisfaction:
H7. Perceived utilitarian value has a positive effect on user satisfaction.
H8. Perceived answer accuracy has a positive effect on user satisfaction.

D. RESULTS
The testers made an average of 15.86 queries per session. The intent that matched most of the queries was related to code example requests, which means that users used the bot for its intended purpose. It was followed by the Definition intent and then the complex intent. Also, 26.7% of the queries resulted in small talk handling. The distribution can be seen in Table 3.

The results of the PLS modeling, performed with SmartPLS 3.0 [46], meet the requirements: the sample size must be at least ten times the largest number of structural paths directed at a particular construct in the structural model. There are three paths directed to Behavioral Intentions and Satisfaction in this model, so the minimum sample size should be 30, and the sample size is above this minimum.

To test the experiment's internal coherence, and therefore its reliability, we look at the outer loadings. These coefficients need to meet a threshold for every measure that points to the latent variables. All the measures met this reliability index, as shown in Table 4. The PLS analysis also provides us with the Composite Reliability of each latent variable. This index surpassed the minimum acceptable value of .70 in all variables, with all of them above .85.

The average variance extracted (AVE) for each variable must surpass a threshold of .50 [43], [47], and its square root must be much larger than the correlation of the specific construct with any other construct in the model. All the latent variables surpass a .70 AVE, as shown in Table 4.

TABLE 5. Discriminant validity.

Table 5 shows that the square roots of the AVE (on the diagonal) are higher than any other values, in support of the discriminant validity of the measurement scales [38]. Discriminant validity indicates the extent to which a given construct (variable) differs from the other latent constructs; it requires that each measurement item correlates weakly with all constructs except the one with which it is theoretically associated. The results in Table 5 support the validity of the measurement scales.

FIGURE 8. Path coefficients.

All the direct hypotheses received support, except for H4, as shown in Figure 8. From these results, we can extract some insights, such as the impact that Answering Accuracy has on all the other variables. Therefore, the quality of the system and its ability to respond effectively is what makes the difference for overall user Satisfaction, Utilitarian Value, and Behavioral Intentions (H1, H5, H8). Also, the perceived Utilitarian Value has a positive effect on Behavioral Intentions and Satisfaction (H6, H7). Surprisingly, Social Handling was not significant for positive Behavioral Intentions (H4), in contrast with its effect on Utilitarian Value and Satisfaction (H2, H3).
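As a worked illustration of the reliability and validity criteria above, the following Python sketch computes composite reliability, AVE, and the Fornell-Larcker comparison from a set of standardized loadings. The loadings and construct correlations are invented numbers for illustration only, not the values obtained in this study.

from math import sqrt

# Composite reliability (CR) should exceed .70, AVE should exceed .50,
# and sqrt(AVE) should exceed the construct's correlations with the
# other constructs (Fornell-Larcker criterion).

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized outer loadings for one latent variable.
loadings = [0.88, 0.91, 0.85]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)

# Hypothetical correlations of this construct with the other constructs.
correlations_with_others = [0.52, 0.61, 0.47]

print(f"CR  = {cr:.3f} (threshold .70)")
print(f"AVE = {ave:.3f} (threshold .50)")
print(f"sqrt(AVE) = {sqrt(ave):.3f} > max correlation "
      f"{max(correlations_with_others):.2f}: "
      f"{sqrt(ave) > max(correlations_with_others)}")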
[33] J. Y. Lee, W. Paik, and S. Joo, "Information resource selection of undergraduate students in academic search tasks," Inf. Res., Int. Electron. J., vol. 17, no. 1, p. n1, 2012.
[34] B. Samei, H. Li, F. Keshtkar, V. Rus, and A. C. Graesser, "Context-based speech act classification in intelligent tutoring systems," in Proc. Int. Conf. Intell. Tutoring Syst. Cham, Switzerland: Springer, 2014, pp. 236–241.
[35] E. N. Forsyth and C. H. Martell, "Lexical and discourse analysis of online chat dialog," in Proc. Int. Conf. Semantic Comput. (ICSC), Sep. 2007, p. 19.
[36] M. Xuetao, F. Bouchet, and J.-P. Sansonnet, "Impact of agent's answers variability on its believability and human-likeness and consequent chatbot improvements," in Proc. AISB, 2009, pp. 31–36.
[37] J. Finch, "Naming names: Kinship, individuality and personal names," Sociology, vol. 42, no. 4, pp. 709–725, Aug. 2008.
[38] M. S. Ben Mimoun and I. Poncin, "A valued agent: How ECAs affect website customers' satisfaction and behaviors," J. Retailing Consum. Services, vol. 26, pp. 70–82, Sep. 2015.
[39] M. Magni, M. Susan Taylor, and V. Venkatesh, "'To play or not to play': A cross-temporal investigation using hedonic and instrumental perspectives to explain user intentions to explore a technology," Int. J. Hum.-Comput. Stud., vol. 68, no. 9, pp. 572–588, Sep. 2010.
[40] C. Kim, R. D. Galliers, N. Shin, J.-H. Ryoo, and J. Kim, "Factors influencing Internet shopping value and customer repurchase intention," Electron. Commerce Res. Appl., vol. 11, no. 4, pp. 374–387, Jul. 2012.
[41] T. Ahn, S. Ryu, and I. Han, "The impact of Web quality and playfulness on user acceptance of online retailing," Inf. Manage., vol. 44, no. 3, pp. 263–275, Apr. 2007.
[42] J. Lawler, P. Vandepeutte, and A. Joseph, "An exploratory study of apparel dress model technology on European Web sites," J. Inf., Inf. Technol., Organizations, vol. 2, no. 244, pp. 31–46, Nov. 2007.
[43] S. S. Al-Gahtani, G. S. Hubona, and J. Wang, "Information technology (IT) in Saudi Arabia: Culture and the acceptance and use of IT," Inf. Manage., vol. 44, no. 8, pp. 681–691, Dec. 2007.
[44] O. Turel, A. Serenko, and N. Bontis, "User acceptance of hedonic digital artifacts: A theory of consumption values perspective," Inf. Manage., vol. 47, no. 1, pp. 53–59, Jan. 2010.
[45] T. W. Traphagan, Y.-H.-V. Chiang, H. M. Chang, B. Wattanawaha, H. Lee, M. C. Mayrath, J. Woo, H.-J. Yoon, M. J. Jee, and P. E. Resta, "Cognitive, social and teaching presence in a virtual world and a text chat," Comput. Educ., vol. 55, no. 3, pp. 923–936, Nov. 2010.
[46] C. Ringle, S. Wende, and J.-M. Becker, "SmartPLS 3," SmartPLS GmbH, Bönningstedt, Germany, Tech. Rep., Jan. 2015.
[47] C. Fornell and D. F. Larcker, "Structural equation models with unobservable variables and measurement error: Algebra and statistics," J. Marketing Res., vol. 18, no. 3, pp. 382–388, 1981, doi: 10.1177/002224378101800313.

DANIEL CARLANDER-REUTERFELT is currently pursuing the master's degree in telecommunications engineering with the Universidad Politécnica de Madrid (UPM). He was awarded an honorary mention for his bachelor's thesis (Development of a Cognitive Bot for Data Science Tutoring based on a Big Data Natural Language Analytics Platform). He has been a part of the Intelligent Systems Group, UPM, since 2019. His research interests include intelligent agents, natural language processing and understanding, and time series prediction.

CARLOS A. IGLESIAS received the Telecommunications Engineering degree and the Ph.D. degree in telecommunications from the Universidad Politécnica de Madrid (UPM), Spain, in 1993 and 1998, respectively. He is currently an Associate Professor with the Telecommunications Engineering School, UPM. He has been leading the Intelligent Systems Group, UPM, since 2014. He has been the Principal Investigator on numerous research grants and contracts in the field of advanced social and IoT systems, funded by regional, national, and European bodies. His main research interests include social computing, multiagent systems, information retrieval, sentiment and emotion analysis, linked data, and web engineering.

ÓSCAR ARAQUE received the graduate and master's degrees in telecommunication engineering from the Technical University of Madrid (Universidad Politécnica de Madrid), Spain, in 2014 and 2016, respectively, where he is currently pursuing the Ph.D. degree. He is currently a Teaching Assistant with the Universidad Politécnica de Madrid. His research interests include the application of machine learning techniques to natural language processing. The main topic of his thesis is the introduction of specific domain knowledge into machine learning systems in order to enhance sentiment and emotion analysis techniques.

JUAN FERNANDO SÁNCHEZ RADA received the Ph.D. degree from the Universidad Politécnica de Madrid (UPM), Spain, in 2020. He is currently a Researcher with the Intelligent Systems Group, Universidad Politécnica de Madrid. His research interests include natural language processing (sentiment and emotion analysis), social network analysis (social context and graph embedding), the web and distributed systems (interoperability and federation), and semantic technologies (linked data, ontologies, and knowledge graphs).