Review
Natural Language Processing Techniques for Text Classification
of Biomedical Documents: A Systematic Review
Cyrille YetuYetu Kesiku * , Andrea Chaves-Villota and Begonya Garcia-Zapirain
eVida Research Group, University of Deusto, Avda/Universidades 24, 48007 Bilbao, Spain
* Correspondence: cyrille.kesiku@opendeusto.es
Abstract: The classification of biomedical literature arises in a number of critical tasks that physicians are expected to perform, and in many cases these tasks are extremely difficult. It can be applied to diagnosis and treatment, to efficient representations of concepts such as medications, procedure codes, and patient visits, and to the quick retrieval of a document, disease classification, or the detection of pathologies from clinical notes, among other uses. The goal of this systematic review is to analyze the literature on various problems in the classification of patients' medical texts according to criteria such as the quality of the evaluation metrics used, the machine learning methods applied, and the datasets employed, in order to highlight the best methods for this type of problem and to identify the associated challenges. The study covers the period from 1 January 2016 to 10 July 2022. We searched multiple databases and archives of research articles, including Web of Science, Scopus, MDPI, arXiv, IEEE, and ACM, finding 894 articles on text classification, which we filtered using inclusion and exclusion criteria. Following a thorough review, we selected 33 articles dealing with biomedical text categorization. Our investigation revealed two major issues linked to the methodology and data used for biomedical text classification: the data-centric challenge and the data quality challenge.

Keywords: text classification; biomedical document; natural language processing; biomedical text classification challenges

Citation: Kesiku, C.Y.; Chaves-Villota, A.; Garcia-Zapirain, B. Natural Language Processing Techniques for Text Classification of Biomedical Documents: A Systematic Review. Information 2022, 13, 499. https://doi.org/10.3390/info13100499

Academic Editor: Ralf Krestel

Received: 15 September 2022; Accepted: 11 October 2022; Published: 17 October 2022

1. Introduction

Attention to text data is increasing day by day across fields. In the healthcare field in particular, patient information consists mostly of medical texts or notes taken by doctors and nurses. The classification of medical text as part of extracting knowledge from medical data has gained momentum in recent years thanks to Natural Language Processing (NLP) techniques. The main approach here is to recognize patterns that explain a fact from the links between words and sentences in a text. These links convey semantic meaning and allow a good understanding of the information in the text. In healthcare, this helps in the rapid search for the causes of a disease and in correlating all the causes extracted from the text to predict the disease. Many other problems are treated by following this approach.

Since 2013, NLP research has demonstrated remarkable capabilities, with highly relevant models emerging nearly every year. Techniques based on neural network architectures have proven very effective in classification and other important natural language processing tasks [1,2]. Many other problems in healthcare use text classification, such as the International Classification of Diseases (ICD), a medical classification list published by the World Health Organization that defines the universe of diseases, disorders, injuries, and other related health conditions, as well as the standard for diagnosis classification [3,4].
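As a minimal illustration of this pattern-recognition idea (ours, not taken from any of the reviewed papers), even a simple bag-of-words representation captures word-level links that a classifier can exploit; the toy notes, labels, and function names below are hypothetical:

```python
from collections import Counter

def bag_of_words(text):
    """Represent a note as word counts, the simplest 'pattern' over its vocabulary."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Count shared word occurrences between two bag-of-words vectors."""
    return sum(min(a[w], b[w]) for w in a)

# Hypothetical toy notes: classify a new note by its closest labeled example.
labeled = {
    "cardiology": bag_of_words("patient reports chest pain and palpitations"),
    "neurology": bag_of_words("patient reports headache and blurred vision"),
}
note = bag_of_words("severe chest pain radiating to left arm")
best = max(labeled, key=lambda label: similarity(note, labeled[label]))
# 'best' names the label whose example shares the most words with the note.
```

Real systems replace the raw counts with learned representations (word embeddings, transformers), but the underlying signal, links between words, is the same.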
In this systematic review, we examine the different articles on patient medical text
classification from 1 January 2016 to 10 July 2022, in order to identify the relevant challenges
in biomedical text classification. The knowledge gained in this study will clearly map out
the methodologies and techniques for future research work. In this study, we seek to
answer the questions in Table 1.
Question | Purpose
Q2 | How are medical text classification datasets constructed? | To study the composition and description of medical texts in the classification task.
Q3 | In terms of data, what are the most common problems that medical text classification can solve? | To understand and highlight the common problems and challenges addressed in medical text-based problem solving.
Q4 | What are the most commonly used evaluation metrics for medical document classification? | To identify the metrics most used in medical document classification.
Criteria | Description
Citations | This criterion was given less weight, particularly for articles published recently, such as those from 2021 and 2022.
Figure 1. Paper selection flow diagram for text classification in biomedical domain.
M6 | Innovation | [0,1] | 1
M7 | The dataset used in the research is a benchmark or it has been made publicly available | [No,Yes] (0–1) | 1
Other quality metrics (10 points):
M8 | Performance (Accuracy): if the percentage quality of the result is between 60–70% (0.5), between 71–80% (1), between 81–90% (1.5), and 91%+ (2), otherwise (0) | 2
M9 | Citation: if the paper is cited 0 times (0), 1–4 times (0.5), and 6+ times (1) | 1
M10 | Availability of source code | [0,1] | 1
M11 | Journal ranking: if rank = Q1 then (4), rank = Q2 then (3), rank = Q3 then (2), and if rank = Q4 then (1) | 4
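The rubric above (metrics M8, M9, and M11) can be written as a small scoring function. The sketch below is illustrative only: the function names are ours, and since the table leaves the case of exactly 5 citations unspecified, we assume it scores 0.5.

```python
def score_performance(accuracy_pct):
    """M8: map a reported accuracy percentage to rubric points (max weight 2)."""
    if accuracy_pct >= 91:
        return 2.0
    if accuracy_pct >= 81:
        return 1.5
    if accuracy_pct >= 71:
        return 1.0
    if accuracy_pct >= 60:
        return 0.5
    return 0.0

def score_citations(n_citations):
    """M9: 0 citations -> 0, 1-4 -> 0.5, 6+ -> 1 (max weight 1)."""
    if n_citations == 0:
        return 0.0
    if n_citations <= 4:
        return 0.5
    if n_citations >= 6:
        return 1.0
    return 0.5  # assumption: the table does not specify 5 citations

def score_journal_rank(rank):
    """M11: Q1 -> 4, Q2 -> 3, Q3 -> 2, Q4 -> 1, anything else -> 0 (max weight 4)."""
    return {"Q1": 4, "Q2": 3, "Q3": 2, "Q4": 1}.get(rank, 0)
```

Summing these per-metric scores (together with M1–M7 and M10) yields the overall quality score used to rank the papers.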
3. Results
All the papers selected following the steps of the flow diagram in Figure 1 were
included in the analysis. Table A1 summarizes the selections made in this review paper.
All the metrics of Table 3 were applied to evaluate the selected papers, and the result of
this evaluation is in Table 4. The whole evaluation process is presented in Table A2 in Appendix A. In addition, to answer the questions in Table 1, we evaluated the different text classification databases used in each selected paper in order to discover new challenges in the data and their influence on model building. Finally, we evaluated the frequency distribution of the selected papers by location, publication database, and type (journal/conference), followed by the frequency distribution by ranking and year of publication.
Table 4. Evaluation results of the selected papers by publication type.

Rating | Journal | Conference | Total
Excellence | 3 | 6 | 9
Very good | 14 | 7 | 21
Good | 2 | 1 | 3
Sufficient | 0 | 0 | 0
Deficient | 0 | 0 | 0
Total | 19 | 14 | 33
Table 5. Performance of the most frequent text classification methods and databases used.

Table 6. Performance obtained with the different text classification methods used in each paper.

Method | Dataset | Accuracy | Precision | Recall | F1
MT-MI-BiLSTM-ATT [22] | EMR data set from a hospital (benchmark) | 93.00% | - | - | 87.00%
Random forest [19] | Text dataset from NHLS-CDW | 95.25% | 94.60% | 95.69% | 95.34%
Table 7. Number and frequency of papers by research database, by publication type (conference or journal), and by geographical distribution.

Parameters | Category | No. Papers | Percentage
Database | Arxiv | 7 | 21.2%
Database | ACM | 2 | 6.1%
Database | MDPI | 2 | 6.1%
Database | WoSc | 10 | 30.3%
Database | Scopus | 4 | 12.1%
Database | IEEE | 8 | 24.2%
Type of publication | Conference | 14 | 42.4%
Type of publication | Journal | 19 | 57.6%
Considering the ranking, most of the selected papers published in journals, i.e., 12 out of 19, were Q1. Figure 4 presents the frequencies in the analysis for the year and ranking distributions.
Parameters | Category | No. Papers | Percentage
Year | 2016 | 1 | 3.0%
Year | 2017 | 3 | 9.1%
Year | 2018 | 1 | 3.0%
Year | 2019 | 5 | 15.1%
Year | 2020 | 11 | 33.3%
Year | 2021 | 8 | 24.3%
Year | 2022 | 4 | 12.1%
Paper ranking | Q1 | 12 | 36.4%
Paper ranking | Q2 | 3 | 9.1%
Paper ranking | Q3 | 3 | 9.1%
Paper ranking | Conference | 15 | 45.5%
Figure 4. (a) Frequency by year and (b) distribution by conference and paper ranking.
4. Discussion
Text classification in the biomedical field plays an important role in the rapid search (diagnosis) for a disease from the patient record, in hospital administration, and even in choosing the treatment appropriate for a specific case, as the volume of patient medical records continues to increase significantly. Each year, new classification methods with high classification accuracy are proposed, while the performance of older NLP methods [38–40] is enhanced through alternative approaches such as optimization, other types of algorithms based on transformer architectures [12,40–42] and XLNet [43], data-centric techniques, and many others. The data-centric technique presents a challenge in enhancing the performance
of biomedical text classification methods [44]. The observation is that the majority of
methods have been pre-trained with text databases in a generic context without any prior
specificity. In other words, a model that has been pre-trained with biomedical data will
adapt better when re-trained with new data for a biomedical domain. In this context, we
discuss the data-centric problem, which must be a key consideration when developing
models tailored to specific cases. Another challenge in the classification of biomedical texts is data quality. We found two kinds of datasets in the articles reviewed: those made public by research institutes and labs [7,9,13,15–17,21], and those without any accessible reference (marked "benchmark" here) for which no further information is available. When training models to obtain good results, it is important to consider how good the data [45] are. This quality can be ensured by attending to the whole process of collecting and preprocessing the data until it is ready to be used for classification tasks.
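The collection-to-classification preparation just described can be sketched as a small pipeline; the steps below (lowercasing, masking obvious identifiers, tokenization) are a generic illustration of ours, not the procedure of any specific reviewed study, and the regexes are far simpler than real de-identification requires.

```python
import re

def preprocess_clinical_note(text):
    """Minimal cleaning pipeline: lowercase, mask obvious identifiers, tokenize."""
    text = text.lower()
    # Mask simple identifier patterns (dates and long digit runs); real
    # de-identification needs far more than these illustrative regexes.
    text = re.sub(r"\d{2}/\d{2}/\d{4}", "<date>", text)
    text = re.sub(r"\d{4,}", "<id>", text)
    # Keep word tokens and the placeholder tags introduced above.
    return re.findall(r"<\w+>|[a-z]+", text)

tokens = preprocess_clinical_note("Patient 123456 seen on 01/02/2020 for chest pain.")
# tokens is now a list of lowercase words with identifiers replaced by tags.
```

Auditing each such step, rather than only the final model, is what makes data quality verifiable.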
Before the classification task is performed, biomedical texts can come from a variety of sources [46,47]. We find data in medical reports that are already in text form, as well as notes taken by doctors or nurses during consultations that exist as scanned images. Depending on the context of the problem and the goal to be achieved, several approaches can be used with these types of data. Alternatively, the data can be represented in both formats, for example a radiology image accompanied by text that explains the image. Depending on the expected result, several methods can be combined in the text classification process for image-text data [13]. To complete these tasks, methods based on CNN architectures [48,49] are frequently used [13,50].
The classification of biomedical texts is involved in several important problems that physicians are expected to solve; these can sometimes pose large challenges across multiple steps. It can be applied in diagnosis [11,28], patient treatment [11], effective representations of concepts such as diagnoses, drugs, procedure codes, and patient visits [33], the quick retrieval of a document, disease classification [23], the detection of pathologies from clinical notes [23], and much more. In all of these respects, it is harder to classify texts in the biomedical field than in other fields in general, because biomedical texts include both medical records and medical literature, which are both important sources of clinical information. Moreover, medical texts contain hard-to-understand medical terms and measurements that cause problems of high dimensionality and data scarcity [9]. All of these problems weigh heavily on the task of classifying biomedical text.
In the biomedical text classification task, as in most classification problems in general [51], the model evaluation metrics are the same. Across the papers studied in our systematic review, the metrics identified are accuracy, recall, F1-score, precision, average precision, average recall, and average F1-score. These are the metrics most commonly used to evaluate text classification models. Each method in the papers analyzed used at least one of them, except for one paper [52], which used the Spearman correlation coefficient [53].
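These metrics are straightforward to compute from the counts of true/false positives and negatives; the sketch below (plain Python, binary case, names of our own choosing) is a minimal illustration, not code from any of the reviewed papers.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1-score for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy example: 4 of 5 predictions are correct, one positive is missed.
m = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])
```

The averaged variants (average precision, recall, and F1) extend this per-class computation to the multi-class case by macro- or micro-averaging over classes.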
Several other challenges could be described by taking into account aspects that we have not addressed in this work; those we have discussed are the most common ones in our overall study.
Looking ahead, to significantly advance research in the biomedical field, it would be preferable to make well-preserved and verified data more widely available in order to support research and overcome the data quality challenges [54–56] in biomedical classification. Because of domain drift among different institutes, cooperation between research laboratories, universities, and other research entities should be strengthened in order to create a broad network for the scientific sharing of scarce resources such as data. Joint working sessions between domain experts would be a good procedure for validating datasets as common resources for text classification research. Finally, a policy simplifying the sharing of data, which is often confidential, would be an essential point, among many others, in addressing the problem of data scarcity. Most of the models used in the papers selected in this study are based on deep learning. The interpretability of robust models is an important aspect of clinical research, as are accuracy and reliability. Whether one uses simple models based on statistical learning or robust models based on deep learning, and whatever their performance, interpretability and reliability are essential to take into account when validating results for clinical research.
Author Contributions: Conceptualization, C.Y.K., A.C.-V. and B.G.-Z.; methodology, C.Y.K. and
A.C.-V.; formal analysis, C.Y.K.; investigation, C.Y.K.; writing—original draft preparation, C.Y.K.;
writing—review and editing, C.Y.K., A.C.-V. and B.G.-Z.; supervision, B.G.-Z. All authors have read
and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Acknowledgments: The authors thank and acknowledge the eVida research group of the University
of Deusto, recognized by the Basque Government with the code IT1536-22, and the ICCRF Colon
Cancer Challenge for its untiring support and commitment to providing us with the resources
necessary to carry out the study, until its finalization.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
Table A1. Summary of the selected papers (P. = paper; J/C = journal or conference; Loc = location).

P. | Year | J/C | Loc | Database | Methods | Dataset | Best Method | Metric (Best) | Rank | Cite
[57] | 2021 | C | China | IEEE | ResNet; BERT-BiGRU; ResNet-BERTBiGRU | Text-image data (benchmark) | ResNet-BERTBiGRU | Mavg.P: 98.0; Mavg.R: 98.0; Mavg.F1: 98.0 | None | 0
[58] | 2021 | C | Indonesia | IEEE | SVM (Linear Kernel); SVM (Polynomial Kernel); SVM (RBF Kernel); SVM (Sigmoid Kernel) | EMR data from outpatient visits during 2017 to 2018 at a public hospital in Surabaya City, Indonesia (benchmark) | SVM (Sigmoid Kernel) | R: 76.46; P: 81.28; F1: 78.80; Ac: 91.0 | None | 0
[24] | 2020 | C | China | IEEE | GM; Seq2Seq; CNN; LP; HBLA-A (this model can be seen as a combination of BERT and BiLSTM) | ARXIV Academic Paper Dataset (AAPD); Reuters Corpus Volume I (RCV1-V2) | HBLA-A | Micro-P: 90.6; Micro-R: 89.2; Micro-F1: 89.9 | None | 1
[15] | 2021 | C | China | IEEE | Text CNN; BERT; ALBERT | THUCNews; iFLYTEK | BERT | Ac: 96.63; P: 96.64; R: 96.63; F1: 96.61 | None | 0
[16] | 2022 | J | Saudi Arabia | Scopus | BERT-base; BERT-large; RoBERTa-base; RoBERTa-large; DistilBERT; ALBERT-base-v2; XLM-RoBERTa-base; Electra-small; BART-large | "COVID-19 fake news dataset" by Sumit Bank; extremist-non-extremist dataset | BERT-base | Ac: 99.71; P: 98.82; R: 97.84; F1: 98.33 | Q3 | 3
[18] | 2020 | C | UK | WoSc | LSTM; Multilingual BERT-base; SCIBERT; SCIBERT 2.0 | SQuAD | LSTM | P: 98.0; R: 98.0; F1: 98.0 | None | 10
[27] | 2017 | J | USA | WoSc | Tf-Idf; CRNN | iDASH dataset; MGH dataset | CRNN | AUC: 99.1; F1: 84.5 | Q1 | 59
[21] | 2021 | J | China | WoSc | CNN; LSTM; CNN-LSTM; GRU; DC-LSTM | cMedQA medical diagnosis dataset; Sentiment140 Twitter dataset | DC-LSTM | Ac: 97.2; P: 91.8; R: 91.8; F1: 91.0 | Q3 | 1
[28] | 2020 | J | Taiwan | WoSc | CNN; CNN-based model | EMR progress notes from a medical center (benchmark) | CNN-based model | Ac: 58.0; P: 58.2; R: 57.9; F1: 58.0 | Q1 | 2
[7] | 2020 | C | UK | WoSc | BioBERT; BERT | MIMIC-III database | BioBERT | Ac: 90.05; P: 77.37; F1: 48.63 | None | 0
[19] | 2021 | J | South Africa | MDPI | Random forest; SVMLinear; SVMRadial | Text dataset from NHLS-CDW | Random forest | F1: 95.34; R: 95.69; P: 94.60; Ac: 95.25 | Q2 | 2
[8] | 2020 | J | UK | WoSc | LSTM; LSTM-RNNs; SVM; Decision Tree; RF | MIMIC-III; CSU dataset | LSTM | F1: 91.0 | Q1 | 13
[20] | 2020 | C | USA | ACM | CNN-MHA-BLSTM; CNN; LSTM | EMR text dataset (benchmark) | CNN-MHA-BLSTM | Ac: 91.99; F1: 92.03 | None | 22
[31] | 2019 | C | USA | IEEE | MLP | EMR dataset (benchmark) | MLP | Ac: 82.0; F1: 82.0 | None | 1
[12] | 2019 | C | USA | Arxiv | BERT-base; ELMo; BioBERT | PubMed abstracts; MIMIC III | BERT-base | Ac: 82.3 | None | 0
[33] | 2016 | C | USA | ACM | Med2Vec | CHOA dataset | Med2Vec | R: 91.0 | None | 378
[52] | 2018 | C | Canada | Arxiv | word2vec; Hill; dict2vec | MENd dataset; SV-d dataset | word2vec | Spearman.C.C: 65.3 | None | 37
[34] | 2017 | C | Switzerland | Arxiv | biGRU; GRU; DENSE | RCV1/RCV2 dataset | biGRU | F1: 84.0 | None | 34
[11] | 2021 | J | China | Arxiv | Logistic regression; SWAM-CAML; SWAM-text CNN | MIMIC-III full dataset; MIMIC-III 50 dataset | SWAM-text CNN | F1: 60.0 | Q1 | 6
[35] | 2022 | J | China | MDPI | LSTM; CNN; GRU; Capsule+GRU; Capsule+LSTM | Chinese electronic medical record dataset | Capsule+LSTM | F1: 73.51 | Q2 | 2
Table A2. Results of the application of the eligibility criteria to the filtered papers.
References
1. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013,
arXiv:1301.3781.
2. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In
Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December
2017; Volume 30.
3. World Health Organization. The International Classification of Diseases, 10th Revision. 2015. Available online: https://icd.who.
int/browse10/2015/en (accessed on 4 August 2021).
4. Chen, P.; Wang, S.; Liao, W.; Kuo, L.; Chen, K.; Lin, Y.; Yang, C.; Chiu, C.; Chang, S.; Lai, F. Automatic ICD-10 Coding and Training
System: Deep Neural Network Based on Supervised Learning. JMIR Med. Inform. 2021, 9, e23230. [CrossRef] [PubMed]
5. Zahia, S.; Zapirain, M.B.; Sevillano, X.; González, A.; Kim, P.J.; Elmaghraby, A. Pressure injury image analysis with machine
learning techniques: A systematic review on previous and possible future methods. Artif. Intell. Med. 2020, 102, 101742. [CrossRef]
[PubMed]
6. Urdaneta-Ponte, M.C.; Mendez-Zorrilla, A.; Oleagordia-Ruiz, I. Recommendation Systems for Education: Systematic Review.
Electronics 2021, 10, 1611. [CrossRef]
7. Amin-Nejad, A.; Ive, J.; Velupillai, S. LREC Exploring Transformer Text Generation for Medical Dataset Augmentation. In
Proceedings of the Twelfth Language Resources and Evaluation Conference, Palais du Pharo, Marseille, France, 11–16 May 2020;
Available online: https://aclanthology.org/2020.lrec-1.578 (accessed on 4 August 2021).
8. Venkataraman, G.R.; Pineda, A.L.; Bear Don’t Walk, O.J., IV.; Zehnder, A.M.; Ayyar, S.; Page, R.L.; Bustamante, C.D.; Rivas, M.A.
FasTag: Automatic text classification of unstructured medical narratives. PLoS ONE 2020, 15, e0234647. [CrossRef]
9. Qing, L.; Linhong, W.; Xuehai, D. A Novel Neural Network-Based Method for Medical Text Classification. Future Internet 2019,
11, 255. [CrossRef]
10. Gangavarapu, T.; Jayasimha, A.; Krishnan, G.S.; Kamath, S. Predicting ICD-9 code groups with fuzzy similarity based supervised
multi-label classification of unstructured clinical nursing notes. Knowl.-Based Syst. 2020, 190, 105321. [CrossRef]
11. Hu, S.; Teng, F.; Huang, L.; Yan, J.; Zhang, H. An explainable CNN approach for medical codes prediction from clinical text. BMC
Med. Inform. Decis. Mak. 2021, 21, 256. [CrossRef]
12. Peng, Y.; Yan, S.; Lu, Z. Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten
Benchmarking Datasets. arXiv 2019, arXiv:1906.05474.
13. Prabhakar, S.K.; Won, D.O. Medical Text Classification Using Hybrid Deep Learning Models with Multihead Attention. Comput.
Intell. Neurosci. 2021, 2021, 9425655. [CrossRef]
14. Pappagari, R.; Zelasko, P.; Villalba, J.; Carmiel, Y.; Dehak, N. Hierarchical Transformers for Long Document Classification. In
Proceedings of the 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Sentosa, Singapore, 14–18
December 2019; pp. 838–844. [CrossRef]
15. Fang, F.; Hu, X.; Shu, J.; Wang, P.; Shen, T.; Li, F. Text Classification Model Based on Multi-head self-attention mechanism and
BiGRU. In Proceedings of the 2021 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS), Shenyang,
China, 11–13 December 2021; pp. 357–361. [CrossRef]
16. Qasim, R.; Bangyal, W.H.; Alqarni, M.A.; Almazroi, A.A. A Fine-Tuned BERT-Based Transfer Learning Approach for Text
Classification. J. Healthc. Eng. 2022, 2022, 3498123. [CrossRef]
17. Lu, H.; Ehwerhemuepha, L.; Rakovski, C. A comparative study on deep learning models for text classification of unstructured
medical notes with various levels of class imbalance. BMC Med. Res. Methodol. 2022, 22, 181. [CrossRef]
18. Schmidt, L.; Weeds, J.; Higgins, J. Data Mining in Clinical Trial Text: Transformers for Classification and Question Answering
Tasks. arXiv 2020, arXiv:2001.11268.
19. Achilonu, O.J.; Olago, V.; Singh, E.; Eijkemans, R.M.J.C.; Nimako, G.; Musenge, E. A Text Mining Approach in the Classification
of Free-Text Cancer Pathology Reports from the South African National Health Laboratory Services. Information 2021, 12, 451.
[CrossRef]
20. Shen, Z.; Zhang, S. A Novel Deep-Learning-Based Model for Medical Text Classification. In Proceedings of the 2020 9th
International Conference on Computing and Pattern Recognition (ICCPR 2020), Xiamen, China, 30 October–1 November 2020;
Association for Computing Machinery: New York, NY, USA, 2020; pp. 267–273. [CrossRef]
21. Liang, S.; Chen, X.; Ma, J.; Du, W.; Ma, H. An Improved Double Channel Long Short-Term Memory Model for Medical Text
Classification. J. Healthc. Eng. 2021, 2021, 6664893. [CrossRef]
22. Wang, S.; Pang, M.; Pan, C.; Yuan, J.; Xu, B.; Du, M.; Zhang, H. Information Extraction for Intestinal Cancer Electronic Medical
Records. IEEE Access 2020, 8, 125923–125934. [CrossRef]
23. Gangavarapu, T.; Krishnan, G.S.; Kamath, S.; Jeganathan, J. FarSight: Long-Term Disease Prediction Using Unstructured Clinical
Nursing Notes. IEEE Trans. Emerg. Top. Comput. 2021, 9, 1151–1169. [CrossRef]
24. Cai, L.; Song, Y.; Liu, T.; Zhang, K. A Hybrid BERT Model That Incorporates Label Semantics via Adjustive Attention for
Multi-Label Text Classification. IEEE Access 2020, 8, 152183–152192. [CrossRef]
25. Pan, Y.; Wang, C.; Hu, B.; Xiang, Y.; Wang, X.; Chen, Q.; Chen, J.; Du, J. A BERT-Based Generation Model to Transform Medical
Texts to SQL Queries for Electronic Medical Records: Model Development and Validation. JMIR Med. Inform. 2021, 9, e32698.
[CrossRef]
26. Liu, K.; Chen, L. Medical Social Media Text Classification Integrating Consumer Health Terminology. IEEE Access 2019, 7,
78185–78193. [CrossRef]
27. Weng, W.H.; Wagholikar, K.B.; McCray, A.T.; Szolovits, P.; Chueh, H.C. Medical subdomain classification of clinical notes using a
machine learning-based natural language processing approach. BMC Med. Inform. Decis. Mak. 2017, 17, 155. [CrossRef]
28. Hsu, J.-L.; Hsu, T.-J.; Hsieh, C.-H.; Singaravelan, A. Applying Convolutional Neural Networks to Predict the ICD-9 Codes of
Medical Records. Sensors 2020, 20, 7116. [CrossRef]
29. Moen, H.; Hakala, K.; Peltonen, L.M.; Suhonen, H.; Ginter, F.; Salakoski, T.; Salanterä, S. Supporting the use of standardized
nursing terminologies with automatic subject heading prediction: A comparison of sentence-level text classification methods. J.
Am. Med. Inform. Assoc. 2020, 27, 81–88. [CrossRef]
30. Chintalapudi, N.; Battineni, G.; Canio, M.D.; Sagaro, G.G.; Amenta, F. Text mining with sentiment analysis on seafarers’ medical
documents. Int. J. Inf. Manag. Data Insights 2021, 1, 100005. ISSN 2667-0968. [CrossRef]
31. Al-Doulat, A.; Obaidat, I.; Lee, M. Unstructured Medical Text Classification using Linguistic Analysis: A Supervised Deep
Learning Approach. In Proceedings of the 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications
(AICCSA), Abu Dhabi, United Arab Emirates, 3–7 November 2019; pp. 1–7. [CrossRef]
32. Audebert, N.; Herold, C.; Slimani, K.; Vidal, C. Multimodal Deep Networks for Text and Image-Based Document Classification.
In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Würzburg,
Germany, 16–20 September 2020. [CrossRef]
33. Choi, E.; Bahadori, M.T.; Searles, E.; Coffey, C.; Thompson, M.; Bost, J.; Tejedor-Sojo, J.; Sun, J. Multi-layer Representation Learning
for Medical Concepts. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining (KDD ’16), San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA,
2016; pp. 1495–1504. [CrossRef]
34. Pappas, N.; Popescu-Belis, A. Multilingual hierarchical attention networks for document classification. arXiv 2017, arXiv:1707.00896.
35. Zhang, Q.; Yuan, Q.; Lv, P.; Zhang, M.; Lv, L. Research on Medical Text Classification Based on Improved Capsule Network.
Electronics 2022, 11, 2229. [CrossRef]
36. Yasunaga, M.; Leskovec, J.; Liang, P. LinkBERT: Pretraining Language Models with Document Links. In Proceedings of the 60th
Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022;
Association for Computational Linguistics: Dublin, Ireland, 2022; pp. 8003–8016.
37. Zhang, D.; Mishra, S.; Brynjolfsson, E.; Etchemendy, J.; Ganguli, D.; Grosz, B.; Lyons, T.; Manyika, J.; Niebles, J.C.; Sellitto, M.; et al.
“The AI Index 2022 Annual Report,” AI Index Steering Committee; Stanford Institute for Human-Centered AI, Stanford University:
Stanford, CA, USA, 2022.
38. Le, Q.; Mikolov, T. Distributed representations of sentences and documents. In Proceedings of the International Conference on
Machine Learning (PMLR), Bejing, China, 22–24 June 2014; pp. 1188–1196.
39. Joulin, A.; Grave, E.; Bojanowski, P.; Mikolov, T. Bag of Tricks for Efficient Text Classification. arXiv 2016, arXiv:1607.01759.
40. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. 2018.
Available online: https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf (accessed on 10 October
2022).
41. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly
Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
42. Abreu, J.; Fred, L.; Macêdo, D.; Zanchettin, C. Hierarchical Attentional Hybrid Neural Networks for Document Classification. In
Proceedings of the International Conference on Artificial Neural Networks, Munich, Germany, 17–19 September 2019. [CrossRef]
43. Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.R.; Le, Q.V. XLNet: Generalized Autoregressive Pretraining for
Language Understanding. arXiv 2019, arXiv:1906.08237.
44. Fries, J.A.; Weber, L.; Seelam, N.; Altay, G.; Datta, D.; Garda, S.; Kang, M.; Su, R.; Kusa, W.; Cahyawijaya, S.; et al. BigBIO: A
Framework for Data-Centric Biomedical Natural Language Processing. arXiv 2022, arXiv:2206.15076.
45. Zunic, A.; Corcoran, P.; Spasic, I. Sentiment Analysis in Health and Well-Being: Systematic Review. JMIR Med. Inform. 2020, 8, e16023. [CrossRef] [PubMed]
46. Aattouchi, I.; Elmendili, S.; Elmendili, F. Sentiment Analysis of Health Care: Review. E3s Web Conf. 2021, 319, 01064. [CrossRef]
47. Tai, K.S.; Socher, R.; Manning, C.D. Improved Semantic Representations From Tree-Structured Long Short-Term Memory
Networks. arXiv 2015, arXiv:1503.00075.
48. Nii, M.; Tsuchida, Y.; Kato, Y.; Uchinuno, A.; Sakashita, R. Nursing-care text classification using word vector representation and
convolutional neural networks. In Proceedings of the 2017 Joint 17th World Congress of International Fuzzy Systems Association
and 9th International Conference on Soft Computing and Intelligent Systems (IFSA-SCIS), Otsu, Japan, 27–30 June 2017; pp. 1–5.
49. Qian, Y.; Woodland, P.C. Very Deep Convolutional Neural Networks for Robust Speech Recognition. arXiv 2016, arXiv:1607.01759.
50. Zhang, Y.; Wallace, B. A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classifica-
tion. arXiv 2015, arXiv:1510.03820.
51. Hossin, M.; Sulaiman, M.N. A Review on Evaluation Metrics for Data Classification Evaluations. Int. J. Data Min. Knowl. Manag.
Process 2015, 5, 1–11. [CrossRef]
52. Bosc, T.; Vincent, P. Auto-Encoding Dictionary Definitions into Consistent Word Embeddings. In Proceedings of the 2018 Conference
on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 1522–1532. [CrossRef]
53. Spearman, C. ‘General Intelligence,’ Objectively Determined and Measured. Am. J. Psychol. 1904, 15, 201–292. [CrossRef]
54. Zhan, X.; Wang, F.; Gevaert, O. Reliably Filter Drug-Induced Liver Injury Literature With Natural Language Processing and
Conformal Prediction. IEEE J. Biomed. Health Inform. 2022, 26, 5033–5041. [CrossRef]
55. Rathee, S.; MacMahon, M.; Liu, A.; Katritsis, N.; Youssef, G.; Hwang, W.; Wollman, L.; Han, N. DILIc: An AI-based classifier to
search for Drug-Induced Liver Injury literature. bioRxiv 2022. [CrossRef]
56. Oh, J.H.; Tannenbaum, A.R.; Deasy, J.O. Automatic identification of drug-induced liver injury literature using natural language
processing and machine learning methods. bioRxiv 2022. [CrossRef]
57. Chen, Y.; Zhang, X.; Li, T. Medical Records Classification Model Based on Text-Image Dual-Mode Fusion. In Proceedings of the
2021 4th International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 28–31 May 2021; pp. 432–436.
[CrossRef]
58. Jamaluddin, M.; Wibawa, A.D. Patient Diagnosis Classification based on Electronic Medical Record using Text Mining and
Support Vector Machine. In Proceedings of the 2021 International Seminar on Application for Technology of Information and
Communication (iSemantic), Semarang, Indonesia, 18–19 September 2021; pp. 243–248. [CrossRef]
59. Yang, F.; Wang, X.; Ma, H.; Li, J. Transformers-sklearn: A toolkit for medical language understanding with transformer-based
models. BMC Med. Inform. Decis. Mak. 2021, 21, 90. [CrossRef]