Detection and Classification of Arrhythmia Using An Explainable Deep Learning Model
Journal of Electrocardiology
Abstract

Background: Early detection and intervention are the cornerstone of appropriate treatment of arrhythmia and prevention of complications and mortality. Although diverse deep learning models have been developed to detect arrhythmia, they have been criticized for their unexplainable nature. In this study, we developed an explainable deep learning model (XDM) to classify arrhythmia, and validated its performance using diverse external validation data.
Methods: In this retrospective study, the Sejong dataset comprising 86,802 electrocardiograms (ECGs) was used to develop and internally validate the XDM. The XDM, based on a neural network-backed ensemble tree, was developed with six feature modules that are able to explain the reasons for its decisions. The model was externally validated using 36,961 ECGs from four non-restricted datasets.
Results: During internal and external validation of the XDM, the average areas under the receiver operating characteristic curve (AUCs) using a 12‑lead ECG for arrhythmia classification were 0.976 and 0.966, respectively. The XDM outperformed a previous simple multi-classification deep learning model that used the same method. During internal and external validation, the AUCs of explainability were 0.925–0.991.
Conclusion: Our XDM successfully classified arrhythmia using diverse formats of ECGs and could effectively describe the reasons for its decisions. Therefore, an explainable deep learning methodology could improve accuracy compared to conventional deep learning methods, and the transparency of the XDM can be enhanced for its application in clinical practice.

Keywords: Electrocardiography; Artificial intelligence; Arrhythmia; Deep learning
© 2021 Published by Elsevier Inc.
https://doi.org/10.1016/j.jelectrocard.2021.06.006
0022-0736/© 2021 Published by Elsevier Inc.
Y.-Y. Jo, J. Kwon, K.-H. Jeon et al. Journal of Electrocardiology 67 (2021) 124–132
performance of a cardiologist [9–11]. However, the deep learning models developed in previous studies were simply black boxes that did not explain their predictions in a way that humans could understand [12]. Consequently, clinicians could not trust this new technology because it lacked transparency and interpretability, which are key to promoting its use in clinical settings. In this study, we developed and validated an explainable deep learning model (XDM) based on a neural network-backed ensemble tree (NBET), i.e., a state-of-the-art artificial intelligence (AI) technology. To the best of our knowledge, this is the first study to develop and validate an explainable AI to detect arrhythmia.

Methods

Study design and population

We conducted a retrospective multicenter cohort study in which an XDM was developed using ECGs. The Sejong ECG dataset from Mediplex Sejong Hospital (MSH) and Sejong General Hospital (SGH) was used for the development and internal validation of the XDM. In the Sejong ECG dataset, we identified patients with at least one standard digital 10-s 12‑lead ECG acquired in the supine position within the study period, and at least one outpatient department visit, general-health checkup center visit, or admission to the cardiovascular center of the two aforementioned hospitals. We excluded individuals with missing demographic, electrocardiographic, or medical records relating to diagnosis and intervention procedures, as shown in Fig. 1. Study populations from the Sejong ECG dataset were randomly split into algorithm-development (75%) and internal-validation (25%) datasets. We externally validated the developed XDM using four non-restricted ECG datasets, and 36,961 ECGs that had arrhythmia labels were used as the external validation dataset. The Physikalisch-Technische Bundesanstalt (PTB-XL) ECG dataset from Europe contained 18,065 ECGs with a sampling rate of 500 Hz [13]; the Georgia ECG challenge dataset from the United States contained 5541 ECGs with a sampling rate of 500 Hz [14]; the Chapman University ECG database from China contained 9269 ECGs with a sampling rate of 500 Hz [15]; and the China Physiological Signal Challenge (CPSC) ECG dataset from China contained 4086 ECGs with a sampling rate of 500 Hz [16]. Given that the developed XDM can be used with diverse formats of ECGs, we were able to confirm the performance of the deep learning model (DLM) using single‑lead ECG (lead I) from the validation datasets.

This study was approved by the Institutional Review Boards (IRBs) of SGH (2019–0411) and MSH (2019–083). Clinical data included digitally stored ECGs, medical records, intervention results, and demographic information from both hospitals. Both IRBs waived the need for informed consent due to the retrospective nature of the study, and the fact that only fully anonymized ECGs and health data were used.

Procedures

ECG data were used as predictor variables. The digitally stored 12‑lead ECGs in the Sejong ECG dataset were recorded over 10 s at 500 Hz, amounting to 5000 data points per lead. We removed 1 s at the beginning and end of each ECG because these areas had more artifacts than other parts. Because four open datasets were used for external validation, we used only 8 s of ECG data extracted from the middle of each ECG. For example, if the length of an external validation ECG was 30 s, we used only the 8 s extracted from the middle of those 30 s; consequently, the length of each ECG was 8 s (4,000 data points). The objective of this study was to classify arrhythmia, defined as normal sinus rhythm (NSR), atrial fibrillation or flutter (AF or AFL), junctional rhythm
Fig. 2. Architecture of the explainable deep learning model (XDM) for classifying arrhythmia.
AF: Atrial fibrillation, AFL: Atrial flutter, AV: Atrioventricular, BN: Batch normalization layer, CAVB: Complete atrioventricular block, CONV: Convolutional neural network layer, ECG:
Electrocardiography, FC: Fully connected layer, JR: Junctional rhythm, NSR: Normal sinus rhythm, PM: Pacemaker rhythm, SDM: Simple multi-classification deep learning model, SVT:
Supraventricular tachycardia, VT: Ventricular tachycardia, XDM: Explainable deep learning model, 2AVB_T2: Second degree atrioventricular block Mobitz type II, 2AVB_T1: Second
degree atrioventricular block Mobitz type I with Wenckebach phenomenon.
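The windowing rule described under Procedures (keep the middle 8 s of every recording, which for a 10-s ECG is the same as trimming 1 s from each end) can be illustrated with a minimal sketch. The function name `crop_middle` and the use of plain Python lists are assumptions made for illustration; they are not taken from the study's code.

```python
def crop_middle(signal, fs=500, keep_seconds=8):
    """Return the central `keep_seconds` of one lead sampled at `fs` Hz.

    `crop_middle` is a hypothetical helper for illustration only.
    """
    n_keep = fs * keep_seconds
    if len(signal) < n_keep:
        raise ValueError("signal is shorter than the requested window")
    # Centre the window so equal amounts are dropped at both ends.
    start = (len(signal) - n_keep) // 2
    return signal[start:start + n_keep]


# A 10-s recording at 500 Hz (5000 samples): 1 s is dropped at each end.
ten_s = list(range(5000))
window_10 = crop_middle(ten_s)

# A 30-s recording (15,000 samples): the window starts at second 11.
thirty_s = list(range(15000))
window_30 = crop_middle(thirty_s)
```

Under these assumptions every recording, whatever its original length, is reduced to 4,000 samples per lead before being passed to the model.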
(JR), supraventricular tachycardia (SVT), ventricular tachycardia (VT), complete atrioventricular block (CAVB), second degree atrioventricular block Mobitz type II (2AVB-T2), second degree atrioventricular block Mobitz type I with Wenckebach phenomenon (2AVB-T1), and pacemaker rhythm (PM). Three cardiologists re-labeled each ECG in the external validation datasets. For the Sejong ECG dataset, the cardiologists assigned labels based on medical records that included progress notes and electrophysiological study reports.

Development of an XDM for arrhythmia classification

To develop an XDM, we built modules to classify the characteristics of arrhythmia, as opposed to detecting the presence of each possible arrhythmia; we called this method an NBET. We developed six deep learning modules for the features and the final ensemble of the XDM using seven labels for each ECG based on supervised learning, as shown in Fig. 2. To this end, cardiologists labeled each ECG, not only for the
Table 1
Baseline characteristics.
Columns: Characteristics, NSR, Arrhythmia, p-value.
AF: Atrial fibrillation, AFL: Atrial flutter, CAVB: Complete atrioventricular block, JR: Junctional rhythm, NSR: Normal sinus rhythm, PM: Pacemaker rhythm, SDM: Simple multi-classification deep learning model, SVT: Supraventricular tachycardia, VT: Ventricular tachycardia, XDM: Explainable deep learning model, 2AVB_T2: Second degree atrioventricular block Mobitz type II, 2AVB_T1: Second degree atrioventricular block Mobitz type I with Wenckebach phenomenon.

Table 2
Performance of XDM and SDM for classifying arrhythmia.
AF: Atrial fibrillation, AFL: Atrial flutter, CAVB: Complete atrioventricular block, JR: Junctional rhythm, NSR: Normal sinus rhythm, PM: Pacemaker rhythm, SDM: Simple multi-classification deep learning model, SVT: Supraventricular tachycardia, VT: Ventricular tachycardia, XDM: Explainable deep learning model, 2AVB_T2: Second degree atrioventricular block Mobitz type II, 2AVB_T1: Second degree atrioventricular block Mobitz type I with Wenckebach phenomenon.
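Per-class precision, recall, and F1 score of the kind reported for Table 2, as well as a row-normalized confusion matrix, can all be derived from a matrix of counts. The sketch below shows one way to do this; the 3-class counts are made-up toy numbers, not study data, and the function names are assumptions for illustration.

```python
def per_class_scores(cm):
    """Return (precision, recall, F1) per class.

    cm[i][j] is the number of ECGs with true class i predicted as class j.
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


def normalize_rows(cm):
    """Divide each row by its total so that each true class sums to 1."""
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in cm]


# Toy counts for three classes (illustrative only).
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 10]]
scores = per_class_scores(cm)
normalized = normalize_rows(cm)
```

Row normalization makes classes of very different prevalence comparable, since each diagonal entry then equals that class's recall.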
classification of arrhythmia, but also for the presence of six features. Cardiologists binarily labeled the ground truth of each feature for each ECG. They labeled the ground truth of the irregularity feature as 1 when the ECG was irregularly irregular, indicating the absence of a regular pattern in the R waves. An irregularly irregular rhythm is characteristic of atrial fibrillation and means that regularity is never observed in the RR intervals. For example, recurrent trigeminy has an irregular rhythm but a repeating pattern of RR intervals, whereas atrial fibrillation has no pattern of RR intervals at all. Similarly, cardiologists labeled the ground truth of regular PR interval, atrioventricular dissociation, and atrioventricular sequencing as 1 when they observed a regular PR interval in the 10-s ECG, no correlation between the P wave and R wave, and 1:1 matching between the P wave and R wave, respectively. First, we developed each module to determine the features of heart rhythm, which were defined as irregularity, presence of P wave, regularity of PR interval, atrioventricular dissociation, atrioventricular sequencing, and pacemaker spike presence. Each module was developed using five residual blocks of the neural network to learn complex hierarchical non-linear representations from the data. In a residual block with four stages, two convolution layers and two batch normalization layers were repeated. We used five residual blocks and two fully connected 1-dimensional (1D) layers to develop each feature model. The second fully connected 1D layer of each module was connected to the output node, which comprised one node. The corresponding values of the output node for the six modules represent the probability of each feature of arrhythmia. These values are described as interpretable scores in Fig. 2. A sigmoid function was used at the output node of each module as the activation function because the output of the sigmoid function ranges between 0 and 1. Finally, we concatenated
Fig. 3. Confusion matrices for the explainable deep learning model (XDM) and simple multi-classification deep learning model (SDM) predictions on the internal and external validation datasets.
AF: Atrial fibrillation, AFL: Atrial flutter, AV: Atrioventricular, CAVB: Complete atrioventricular block, JR: Junctional rhythm, NSR: Normal sinus rhythm, PM: Pacemaker rhythm, SVT:
Supraventricular tachycardia, VT: Ventricular tachycardia, 2AVB_T2: Second degree atrioventricular block Mobitz type II, 2AVB_T1: Second degree atrioventricular block Mobitz type I
with Wenckebach phenomenon.
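The data flow of the ensemble, in which six single-node feature modules each produce an interpretable score in (0, 1) and those scores are combined into nine class probabilities, can be sketched in a few lines. Everything numeric here is an assumption for illustration: the weights are random and untrained, a single linear layer stands in for the multi-layer perceptron of the study, and each module's output is squashed with a logistic sigmoid (a one-node softmax would always return 1, so it could not serve as a score).

```python
import math
import random

FEATURES = ["irregularity", "P-wave presence", "regular PR interval",
            "AV dissociation", "AV sequencing", "pacemaker spike"]
CLASSES = ["NSR", "AF or AFL", "JR", "SVT", "VT",
           "CAVB", "2AVB-T2", "2AVB-T1", "PM"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    m = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, biases):
    """One fully connected layer: weights has one row per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def classify(module_logits, weights, biases):
    # Interpretable scores: one probability per feature module.
    scores = [sigmoid(z) for z in module_logits]
    # Ensemble head (simplified to one layer): nine class probabilities.
    probs = softmax(dense(scores, weights, biases))
    return scores, probs


rng = random.Random(0)  # untrained, random weights for illustration only
W = [[rng.uniform(-1, 1) for _ in FEATURES] for _ in CLASSES]
b = [0.0] * len(CLASSES)
scores, probs = classify([2.0, -3.0, -1.0, -2.0, -2.5, -4.0], W, b)
```

Because the six scores are exposed before the final layer, each one can be inspected alongside the predicted class, which is what makes the final decision traceable to named ECG features.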
Fig. 4. Performances of the XDM and SDM on internal and external validation datasets.
AF: Atrial fibrillation, AFL: Atrial flutter, AUC: Area under the receiver operating characteristic curve, AV: Atrioventricular, CAVB: Complete atrioventricular block, CI: Confidence interval,
JR: Junctional rhythm, NSR: Normal sinus rhythm, NPV: Negative predictive value, PM: Pacemaker rhythm, PPV: Positive predictive value, ROC: Receiver operating characteristics curve,
SDM: simple multi-classification deep learning model, SEN: Sensitivity, SPE: Specificity, SVT: Supraventricular tachycardia, VT: Ventricular tachycardia, XDM: explainable deep learning
model, 2AVB_T2: Second degree atrioventricular block Mobitz type II, 2AVB_T1: Second degree atrioventricular block Mobitz type I with Wenckebach phenomenon.
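Sensitivity, specificity, PPV, and NPV depend on a cutoff. One common way to choose it, and the one named in the Statistical analysis section, is to maximize Youden's J (sensitivity + specificity − 1) on development data and then apply the resulting threshold unchanged to validation data. The following is a minimal sketch of that procedure; the scores and labels are toy values, and the function names are assumptions for illustration.

```python
def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN when score >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        if s >= threshold:
            tp, fp = tp + (y == 1), fp + (y == 0)
        else:
            tn, fn = tn + (y == 0), fn + (y == 1)
    return tp, fp, tn, fn


def ratio(a, b):
    return a / b if b else float("nan")


def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing sensitivity + specificity - 1."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        j = ratio(tp, tp + fn) + ratio(tn, tn + fp) - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_t, best_j = t, j
    return best_t, best_j


def diagnostic_metrics(scores, labels, threshold):
    tp, fp, tn, fn = confusion(scores, labels, threshold)
    return {"sensitivity": ratio(tp, tp + fn),
            "specificity": ratio(tn, tn + fp),
            "ppv": ratio(tp, tp + fp),
            "npv": ratio(tn, tn + fn)}


# Toy development and validation data (illustrative only).
dev_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
dev_labels = [0, 0, 0, 1, 0, 1, 1, 1]
val_scores = [0.05, 0.3, 0.45, 0.5, 0.55, 0.95]
val_labels = [0, 0, 0, 1, 1, 1]

threshold, j = youden_threshold(dev_scores, dev_labels)
metrics = diagnostic_metrics(val_scores, val_labels, threshold)
```

Fixing the threshold on development data before touching the validation set avoids tuning the operating point to the data on which performance is reported.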
six feature modules using a multi-layer perceptron architecture to produce the final arrhythmia classification. The multi-layer perceptron included three fully connected 1D layers and two dropout layers. The third fully connected 1D layer of the multi-layer perceptron was connected to the final nine output nodes. A SoftMax function was used at the nine output nodes, which represented the probabilities for NSR, AF or AFL, JR, SVT, VT, CAVB, 2AVB-T2, 2AVB-T1, and PM. Given that we could evaluate the output values of each module, we could determine the underlying reasons for the final decision of the XDM.

As a comparative method, we developed a simple multi-classification deep learning model (SDM), which had five residual blocks and three fully connected layers. The SDM is a conventional method that has been used in previous studies to detect arrhythmias via ECG. The architectures of the XDM and SDM were determined by a grid search.

Statistical analysis

Continuous variables are presented as mean (standard deviation [SD]) and were compared using the unpaired Student's t-test or Mann-Whitney U test. Categorical variables are expressed as frequencies and percentages, and were compared using the χ2 test.

We assessed the agreement between the predictions of the XDM and the ground-truth labels confirmed by cardiologists using a normalized confusion matrix plot. For each ECG, the XDM produced a final prediction result that was compared against the ground truth of the ECG. We calculated the F1 score, precision, and recall for multi-class classification. We evaluated the detection performance for each arrhythmia using the area under the receiver operating characteristic curve (AUC). The receiver operating characteristic curve was created by plotting the true positive rate against the false positive rate. The XDM output the probability for each arrhythmia, which was compared against the corresponding ground-truth label to obtain the AUC. We applied the cutoff point to the validation data to calculate the sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). The sensitivity, specificity, PPV, and NPV were calculated at the operating point selected by Youden's J statistic in the development data [17], and the performance of the SDM was evaluated in the same manner. We then compared the performance of the SDM with that of the XDM. We verified the explainability of the DLM through further analyses. To verify the performance of each feature module, we compared the module-calculated probability with the ground-truth feature information provided by cardiologists. Exact 95% confidence intervals (CIs) were used for all of the metrics of diagnostic performance, except for the AUC. The CIs of the AUCs were determined according to Sun and Xu's optimization of the DeLong method using the pROC package in R (The R Foundation for Statistical Computing, Vienna, Austria).
Fig. 5. Performance of the XDM on the internal and external validation datasets.
AUC: Area under the receiver operating characteristic curve, AV: Atrioventricular, SEN: Sensitivity, SPE: Specificity, NPV: Negative predictive value, PPV: Positive predictive value, XDM: Explainable deep learning model.
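The AUC used to score each feature module has a direct probabilistic reading: it is the chance that a randomly chosen ECG with the feature receives a higher score than a randomly chosen ECG without it, with ties counted as one half. The sketch below computes it that way in plain Python; this is an illustrative stand-in, not the pROC routine used in the study, and the scores and labels are toy values.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy interpretable scores from one feature module and the matching
# binary ground-truth labels (illustrative only).
module_scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
cardiologist_labels = [1, 1, 1, 0, 0, 0]
module_auc = auc(module_scores, cardiologist_labels)
```

The pairwise form is quadratic in the number of ECGs, so it suits small examples; rank-based or sorted-sweep implementations give the same value more efficiently on large datasets.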
A significant difference in patient characteristics was defined as a two-sided p-value <0.001. Statistical analyses were performed using R software, version 3.4.2. In addition, we used the open-source PyTorch library as the backend and Python (version 3.6.11) for the analyses.

Visualizing the developed XDM for interpretation

To understand the model and compare it with existing medical knowledge, it was important to identify which regions had a significant effect on the decision made by the XDM. To this end, we employed a sensitivity map using a saliency method. The map was computed using the first-order gradients of the classifier probabilities with respect to the input signals. If the probability of a classifier was sensitive to a specific region of the signal, the region was considered significant in the model [18,19]. We used a gradient-weighted class activation map (Grad-CAM) as the sensitivity map, together with a guided gradient backpropagation method. The XDM showed the sensitivity map from each feature module.

Results

The Sejong ECG dataset used for development and internal validation included patients who visited MSH (March 1, 2017 to March 31, 2020) and SGH (October 1, 2019 to December 31, 2019). A total of 55,083 patients at MSH and 2257 patients at SGH were eligible for inclusion. We excluded 387 patients at MSH and 11 patients at SGH because of missing values, as shown in Fig. 1. The development dataset from the Sejong ECG dataset included 72,740 ECGs of 42,880 patients (Table 1). The performance of the algorithm was confirmed using 14,062 ECGs from 14,062 patients in the internal validation dataset from the Sejong ECG dataset. The DLM performance was externally validated using 9269, 4086, 5541, and 18,065 ECGs from the Chapman, CPSC, Georgia, and PTB-XL ECG datasets, respectively.

Table 2 shows the performance of the XDM on the internal and external validation datasets for multi-class classification. The XDM's F1 scores for NSR, AF or AFL, JR, SVT, VT, 2AVB-T1, 2AVB-T2, CAVB, and PM were 0.989, 0.961, 0.929, 0.965, 0.842, 0.887, 0.966, 0.923, and 0.959 on the internal validation dataset, respectively. The XDM's F1 scores for NSR, AF or AFL, SVT, CAVB, and PM were 0.990, 0.955, 0.777, 0.828, and 0.671 on the external validation dataset, respectively. Fig. 3 shows the confusion matrices of the XDM and SDM on the internal and external validation datasets. The AUCs on the internal and external validation datasets for detecting each arrhythmia are shown in Fig. 4. Moreover, the sensitivity, specificity, PPV, and NPV were calculated at the operating point selected by Youden's J statistic using the development data. The XDM outperformed the SDM in all measures.

As shown in Fig. 5, the AUCs for irregularity, P-wave presence, regular PR interval, atrioventricular dissociation, atrioventricular sequencing, and pacemaker spike presence were 0.984, 0.986, 0.991, 0.989, 0.982, and 0.949, respectively. To calculate the performance, we compared the interpretable score from the output node of each feature module with the ground truth of that feature as labeled by cardiologists. We employed a sensitivity map to visualize the ECG regions used to detect each ECG feature, as shown in Fig. 6. The map reveals that the XDM focused on the part of the ECG related to each module. For example, the module for determining the presence of P-waves focused on P-waves, whereas the module that made decisions on irregularity focused on QRS complexes. Furthermore, the module that was used to determine
Fig. 6. Sensitivity map of the XDM for detecting each arrhythmia feature.
The sensitivity map shows the region in which the XDM module focused attention for deciding the presence of features. The most important region is in orange and the least important
region is in blue. AV: Atrioventricular, XDM: explainable deep learning model.
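The idea behind a gradient-based sensitivity map can be shown on a toy classifier: the magnitude of the first-order gradient of the output probability with respect to each input sample indicates how strongly that sample influences the decision. The sketch below uses a plain logistic model over six samples and checks the analytic gradient against a finite-difference estimate. It is a simplified stand-in under stated assumptions, not the Grad-CAM or guided backpropagation procedure used in the study, and the weights and signal are made-up values.

```python
import math


def predict(x, w, b):
    """Toy classifier: logistic regression over the raw signal samples."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def saliency(x, w, b):
    """|dp/dx_i| for the logistic model, using dp/dx_i = p * (1 - p) * w_i."""
    p = predict(x, w, b)
    return [abs(p * (1.0 - p) * wi) for wi in w]


def numeric_saliency(x, w, b, eps=1e-6):
    """Central finite differences, used here only to check the analytic form."""
    grads = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grads.append(abs((predict(hi, w, b) - predict(lo, w, b)) / (2 * eps)))
    return grads


# Toy weights: only samples 2 and 3 influence the output (illustrative only).
w = [0.0, 0.0, 2.0, -1.5, 0.0, 0.0]
x = [0.1, 0.0, 0.8, 0.4, 0.0, -0.1]
sal = saliency(x, w, 0.0)
check = numeric_saliency(x, w, 0.0)
most_important = max(range(len(sal)), key=sal.__getitem__)
```

In a deep network the same gradient is obtained by automatic differentiation rather than a closed form, and the resulting per-sample magnitudes are what get color-coded over the ECG trace.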
the presence of a regular PR interval focused on the PR segment of each beat. The module for determining the presence of AV dissociation focused on the peaks of the P-, R-, and T-waves. The module that was used to determine the presence of AV sequencing focused on the PR and ST segments, and the module that was used to determine the presence of a pacemaker spike focused on the pacemaker spike signal.

Discussion

Although several previous studies applied deep learning algorithms to diagnose arrhythmia using ECGs, such algorithms were still black boxes; in other words, we neither understood their decisions nor knew the reasons for a particular diagnosis of arrhythmia. Our study group recently adopted the saliency map in ECGs to achieve explainability, although the method did not completely explain how the model reached its conclusions [21–23]. The saliency map only highlighted the part of the ECG that was important to the decision but did not explain why that part mattered [18]. For example, when a deep learning algorithm focused on the QRS complex to diagnose a disease in a saliency map, we were unable to determine which particular factor of the QRS complex the diagnosis was based on.

To overcome these limitations of DLMs, we adopted a state-of-the-art explainable AI technology, i.e., an NBET, in our ECG research. Our key insight was to combine neural networks with decision trees, preserving high-level interpretability while using neural networks for low-level decisions. These NBET models match the accuracy of neural networks while preserving the interpretability of a decision tree. In this study, we developed six modules for features based on deep learning. Because of this, we not only classified arrhythmia, but also elucidated the underlying reasons for the classification result. As shown in the Supplemental material, we described the correlation between features and arrhythmias. Atrial fibrillation and flutter were strongly correlated with features such as presence of P-wave and irregularity, and CAVB exhibited a strong correlation with AV dissociation. However, we were unable to elucidate the exact meaning of these correlations because we could not explore the internal process of deep learning. In our next study, we hope to reveal the exact decision process of the deep learning architecture. For example, if the XDM decided that a normal ECG demonstrated AF, we could determine the reason for the decision; such reasons may include “XDM decided that the P wave was absent (XDM could not find the P wave on input ECG)” or “XDM decided that the rhythm was irregular.” This explainability is vital for identifying and correcting errors in the model. Doctors could also identify errors if their decisions did not match that of the XDM. For example, if a doctor could not find a small P wave and judged an ECG to show AF, the XDM could assist the doctor because it could detect the P-wave in the ECG based on the value of the P-wave module and display the focal P-wave with a sensitivity map. In this study, we preserved the accuracy of XDMs while adopting explainability, and the XDM was found to outperform the SDM.

There are several limitations to the present study. First, we developed six feature modules to develop the XDM. Although we selected these six features based on current medical knowledge, it is possible to enhance the XDM performance using other ECG features. This is the next research area of our study group. Second, studies related to the clinical significance of the new technology are required for application in clinical practice. In our next study, we will verify the performance and significance of the XDM using a prospective study in daily clinical practice.

Conclusion

We developed an XDM for arrhythmia classification and confirmed that the model accurately classifies arrhythmia in diverse formats of ECGs using external validation datasets. The results indicate that the proposed XAI methodology could be used to describe the reasons for the decision made by the XDM in arrhythmia classification with high performance.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1F1A1073791).

Affiliations

JK, KHK, KHJ, SYL, JP, and BHO (Mediplex Sejong Hospital); JK, YYJ, MSJ, YJL, YHC, and JHS (Medical AI Co. Ltd.); JK and JHB (Bodyfriend Co. Ltd.).

Data availability statement

The data used in this study will be shared upon reasonable request to the corresponding author.

Declaration of Competing Interest

KHJ, KHK, SYL, JP, and BHO declare that they have no competing interests. YYJ, JK, YHC, JHS, YJL, and MSJ are researchers of Medical AI Co., a medical artificial intelligence company. JK and JHB are researchers of Bodyfriend Co. There are no products in development or marketed products to declare. This does not alter our adherence to Journal policies.

Acknowledgement

None.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jelectrocard.2021.06.006.

References

[1] Khurshid S, Choi SH, Weng L-C, Wang EY, Trinquart L, Benjamin EJ, et al. Frequency of cardiac rhythm abnormalities in a half million adults. Circ Arrhythmia Electrophysiol. 2018;11.
[2] Benjamin EJ, Blaha MJ, Chiuve SE, Cushman M, Das SR, Deo R, et al. Heart disease and stroke statistics-2017 update: a report from the American Heart Association. Circulation. 2017;135:e146–603.
[3] Go AS, Hylek EM, Phillips KA, Chang Y, Henault LE, Selby JV, et al. Prevalence of diagnosed atrial fibrillation in adults: national implications for rhythm management and stroke prevention: the AnTicoagulation and risk factors in atrial fibrillation (ATRIA) study. JAMA. 2001;285:2370–5.
[4] Corley SD, Epstein AE, DiMarco JP, Domanski MJ, Geller N, Greene HL, et al. Relationships between sinus rhythm, treatment, and survival in the Atrial Fibrillation Follow-Up Investigation of Rhythm Management (AFFIRM) study. Circulation. 2004;109:1509–13.
[5] Stewart S, Hart CL, Hole DJ, McMurray JJV. A population-based study of the long-term risks associated with atrial fibrillation: 20-year follow-up of the Renfrew/Paisley study. Am J Med. 2002;113:359–64.
[6] Orejarena LA, Vidaillet H, DeStefano F, Nordstrom DL, Vierkant RA, Smith PN, et al. Paroxysmal supraventricular tachycardia in the general population. J Am Coll Cardiol. 1998;31:150–7.
[7] Mustaqeem A, Anwar SM, Khan AR, Majid M. A statistical analysis based recommender model for heart disease patients. Int J Med Inform. 2017;108:134–45.
[8] Giebel GD, Gissel C. Accuracy of mHealth devices for atrial fibrillation screening: systematic review. JMIR Mhealth Uhealth. 2019;7:e13641.
[9] Hannun AY, Rajpurkar P, Haghpanahi M, Tison GH, Bourn C, Turakhia MP, et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25:65–9.
[10] Ribeiro AH, Ribeiro MH, Paixão GMM, Oliveira DM, Gomes PR, Canazart JA, et al. Automatic diagnosis of the 12-lead ECG using a deep neural network. Nat Commun. 2020;11:1760.
[11] van de Leur RR, Blom LJ, Gavves E, Hof IE, van der Heijden JF, Clappers NC, et al. Automatic triage of 12-lead ECGs using deep convolutional neural networks. J Am Heart Assoc. 2020;9:e015138.
[12] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–44.
[13] Wagner P, Strodthoff N, Bousseljot R-D, Kreiseler D, Lunze FI, Samek W, et al. PTB-XL, a large publicly available electrocardiography dataset. Sci Data. 2020;7:154.
[14] Perez Alday EA, Gu A, Shah AJ, Robichaux C, Wong A-KI, Liu C, et al. Classification of 12-lead ECGs: the PhysioNet/Computing in cardiology challenge 2020. Physiol Meas. 2021;41(12):124003. https://doi.org/10.1088/1361-6579/abc960.
[15] Zheng J, Zhang J, Danioko S, Yao H, Guo H, Rakovski C. A 12-lead electrocardiogram database for arrhythmia research covering more than 10,000 patients. Sci Data. 2020;7:48.
[16] Liu F, Liu C, Zhao L, Zhang X, Wu X, Xu X, et al. An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection. J Med Imaging Health Inform. 2018;8:1368–73.
[17] Schisterman EF, Perkins NJ, Liu A, Bondell H. Optimal cut-point and its corresponding Youden index to discriminate individuals using pooled blood samples. Epidemiology. 2005;16:73–81.
[18] Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision; 2017. p. 618–626.
[19] Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int J Comput Vis. 2020;128:336–59.
[20] Zimetbaum P, Goldman A. Ambulatory arrhythmia monitoring. Circulation. 2010;122:1629–36.
[21] Kwon J, Cho Y, Jeon K-H, Cho S, Kim K-H, Baek SD, et al. A deep learning algorithm to detect anaemia with ECGs: a retrospective, multicentre study. Lancet Digit Health. 2020;2:e358–67.
[22] Kwon J, Lee SY, Jeon K, Lee Y, Kim K, Park J, et al. Deep learning–based algorithm for detecting aortic stenosis using electrocardiography. J Am Heart Assoc. 2020;9.
[23] Kwon J-M, Jeon K-H, Kim HM, Kim MJ, Lim SM, Kim K-H, et al. Comparing the performance of artificial intelligence and conventional diagnosis criteria for detecting left ventricular hypertrophy using electrocardiography. Europace. 2020;22(3):412–9. https://doi.org/10.1093/europace/euz324.