Applied Soft Computing 12 (2012) 3165–3175

Contents lists available at SciVerse ScienceDirect

Applied Soft Computing


journal homepage: www.elsevier.com/locate/asoc

Myocardial infarction classification with multi-lead ECG using hidden Markov models and Gaussian mixture models

Pei-Chann Chang a,∗, Jyun-Jie Lin a, Jui-Chien Hsieh a, Julia Weng b

a Department of Information Management, Yuan Ze University, Taoyuan 32026, Taiwan, ROC
b Department of Computer Science & Engineering, Yuan Ze University, Taoyuan 32026, Taiwan, ROC

Article info

Article history:
Received 15 March 2011
Received in revised form 9 January 2012
Accepted 12 June 2012
Available online 30 June 2012

Keywords:
12-Lead ECG
Myocardial infarction
Hidden Markov models
Gaussian mixtures models
Support vector machines

Abstract

This study presents a new diagnosis system for myocardial infarction classification that converts multi-lead ECG data into a density model to increase the accuracy and flexibility of disease detection. In contrast to traditional approaches, a hybrid system with HMMs and GMMs is employed for data classification. A hybrid approach using multiple leads, i.e., leads V1, V2, V3 and V4, was developed for myocardial infarction, and HMMs were used not only to find the ECG segmentations but also to calculate the log-likelihood value, which was treated as statistical feature data of each heartbeat's ECG complex. The 4-dimension feature vector extracted by the HMMs was clustered by GMMs with different numbers of distributions (disease and normal data). An SVM classifier was also examined for comparison with our system in the experimental results. There were 1129 heartbeat samples in total from clinical data, including 582 with myocardial infarction and 547 normal. The diagnosis system achieved a sensitivity of 85.71%, a specificity of 79.82% and an accuracy of 82.50%.
© 2012 Elsevier B.V. All rights reserved.

∗ Corresponding author.
E-mail address: iepchang@saturn.yzu.edu.tw (P.-C. Chang).

1568-4946/$ – see front matter © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.asoc.2012.06.004

1. Introduction

In clinical medicine, the electrocardiogram (ECG) is one of the most widely used non-invasive diagnostic tools for cardiopulmonary diseases. The ECG monitors the patient's heartbeat and gives clinically accurate and important information about myocardial infarction (MI) with coronary disease. Each heartbeat comprises a number of distinct cardiac events. Fig. 1 shows a normal human ECG waveform. The basic components of an ECG complex are the P wave, which represents atrial depolarization; the QRS complex, which represents ventricular depolarization; and the T wave, which corresponds to the period of ventricular repolarization. The key to interpreting an ECG complex is the morphology of the signal over time, and much of the research on ECG analysis in the last decade has concentrated on it [1,2]. This research mainly dealt with ECG pattern recognition [3], segmentation [4], noise removal [5] and ischemia detection [6].

Fig. 1. Basic components of an ECG complex.

Because of the morphology of ECG signals, hidden Markov models (HMMs) have been widely adopted for classification [8], ECG delineation [9], and segmentation or component detection [10,11], and some of this work focuses on ST-segment changes. This research proposes that an ECG complex can be treated as an observation sequence of an HMM, with an observation generated at each state transition.

Clinical 12-lead ECG data are now available in most hospitals and include more detailed information about cardiac disease. The standard 12-lead ECG is composed of six frontal-plane leads, called the limb leads, and six horizontal-plane leads, also called the chest leads [7]. These leads offer 12 different angles for visualizing the activity of the heart and are named lead I, II, III, aVL, aVF, aVR, V1, V2, V3, V4, V5 and V6, respectively. It is worth noting that the ECG complex does not look the same in all the leads of the standard 12-lead system, and the shape of the constituent waves may vary depending on the lead. Mehta et al. applied support vector machines (SVMs) to 12-lead ECG data to detect and delineate the P and T waves in the ECG complex [12]. The clinical value of the 12-lead ECG for acute myocardial infarction was shown by Van't Hof et al. [13], and van der Vleuten et al. used the 12-lead ECG for ST elevation myocardial infarction (STEMI) and found the predictive value of the Q wave [14]. Thomas et al. [15] used a K-means clustering approach to generate ECG classes from the training base and reported significantly better results.

The goal of this study is to accurately identify each heartbeat by its waveform. Hence, HMMs are adopted to calculate the likelihood value of each heartbeat. For multi-lead ECG data, each case has several likelihood values, which form a higher-dimensional decision space. These likelihood values are treated as different features for each specific case. Then, two well-known classification algorithms, SVMs and Gaussian mixture models (GMMs), are applied for the diagnosis according to the feature inputs from the HMMs. Finally, this study compares the performance of these two approaches using actual clinical data.

SVMs are adopted in this study because of their excellent discrimination performance in high-dimensional spaces [16,17]. Most SVMs are designed for two-class discrimination as binary classifiers, so they are suitable for classifying MI and non-MI data. SVMs rely on the optimal hyper-plane as the decision surface so that the geometric margin between the two classes is maximized. SVMs have been used for classifying arrhythmia with reduced feature dimensions [18]. In this study, the likelihood values generated by the different HMMs are used as reduced features for the SVM classifier.

GMMs are adopted to find the clusters or density model of the data distribution based on the feature vectors [19]. In recent years, GMMs have also been employed as an automatic classifier of the ECG complex by statistically characterizing an underlying distribution of features [20].

The paper is organized as follows. Section 2 describes the 12-lead electrocardiogram, myocardial infarction and the approaches used in this paper: HMMs, GMMs and SVMs. Section 3 compares the results obtained with different parameters in GMMs and SVMs. Section 4 discusses the results and Section 5 concludes.

2. Proposed method

In the clinical assessment of chest pain, electrocardiography is an essential diagnostic adjunct to the clinical history and physical examination. A rapid and accurate diagnosis in patients with acute myocardial infarction is vital, since expeditious reperfusion therapy can improve prognosis in most patients. Myocardial infarction occurs when the blood supply to part of the heart is interrupted. This is mostly due to occlusion of a coronary artery following the rupture of a vulnerable atherosclerotic plaque. Traditionally, the 12-lead ECG is used to classify patients into one of three groups [21]:

(1) Those with ST segment elevation or new bundle branch block.
(2) Those with ST segment depression or T wave inversion.
(3) Those with a so-called non-diagnostic or normal ECG.

The data examined in this research are mainly of type (1) above. A typical ECG complex of a myocardial infarction patient is shown in Fig. 2; it shows a clear ST segment elevation.

Fig. 2. An ECG complex of myocardial infarction.

Each lead corresponds to a different aspect of the heart's activity, as listed in Table 1. This study focuses on the leads that represent the affected anterior wall and septum (an anteroseptal AMI), V1–V4, and assumes that disease and normal data behave differently because of differences in the morphology of the ECG signals.

In the early stages of acute myocardial infarction the electrocardiogram may look normal or near normal; less than half of patients with acute myocardial infarction have clear diagnostic changes on their first trace. Therefore, it is very important to identify a myocardial infarction from a patient's 12-lead ECG data at an early stage. A medical doctor can then recommend expeditious reperfusion therapy for an identified patient, and the therapy can improve prognosis significantly. However, owing to the complexity and high dimensionality of 12-lead ECG data, it is not a trivial task to accurately separate myocardial infarction data from normal data. To the best of our knowledge from the literature survey, this study is the first attempt to identify each beat by its waveform so that MI can be classified.

In this study, HMMs are used for feature extraction from each ECG complex, and GMMs and SVMs are adopted to classify myocardial infarction. The method is explained in detail in the following sections. Initially, data with myocardial infarction are used for training the HMMs. This study focuses on the leads representing the affected anterior and septal wall. Therefore, leads V1, V2, V3 and V4 are used [21], and each lead corresponds to one of 4 HMMs. These 4 HMMs are used not only to find the ECG segmentations but also to calculate the probability value (the so-called likelihood value in HMMs). The probability for each heartbeat is converted to a logarithm, the log-likelihood, and used as statistical feature data of each heartbeat's ECG complex. Next, in the classification phase (including SVMs and GMMs), the data with myocardial infarction and the normal data are used together to calculate the 4-dimension feature vectors for the pattern distribution. Both SVMs and GMMs are applied to classify a set of testing data into myocardial infarction and normal classes. The complete flow diagram for myocardial infarction classification using a 12-lead ECG is shown in Fig. 3.

Table 1
The aspect of each lead.

Leads (columns): I, II, III, aVL, aVF, aVR, V1, V2, V3, V4, V5, V6

Aspect        Leads marked (√)
Anterior      √ √
Septal        √ √
Posterior     √ √
Inferior      √ √ √
Lateral       √ √ √ √
Endocardial
Fig. 3. The complete flow diagram for myocardial infarction classification using a 12-lead ECG.

2.1. ECG beat sampling

In the 12-lead ECG system, lead II contains the most obvious heartbeat waveform. Therefore, the location of the R-wave peak in lead II is used to divide the ECG complex in each lead into separate heartbeats. This study assumes that each heartbeat contains 402 points. The number 402 was decided by past experiments according to prior knowledge. In this research, we first separate the whole known data set into single heartbeats and calculate the average length. The total length of an ECG record is 5000 points in 10 s, and the heart rate is usually between 60 and 80 beats per minute. That means the length of a single heartbeat should be larger than 375 and smaller than 500 points. Hence, according to the data set collected, we decide that each heartbeat has 400 points in the ECG complex. The setting of 400 points per beat fits all collected data and can include the P, QRS and T waves, which are the main components of an ECG complex. Another reason why all heartbeats have the same length is that, for a probability model like an HMM, the probability of each input is calculated point by point; if the lengths differ, the probability values are not comparable and cannot be used in classification. Therefore, the length of each separated heartbeat is fixed at 400 in this research, and Fig. 4 shows one instance of the data.

Fig. 4. Heartbeat segment by R-wave in 10 s.

2.2. Feature extraction: using HMMs for each lead V1–V4

This study focuses on the anterior wall and septum in myocardial infarction, so leads V1–V4 are considered as feature data for training the HMMs. In this study, 4 HMMs are adopted to learn leads V1–V4 respectively. HMMs can find the best segmentation of a heartbeat and represent it as a step-like waveform through the Viterbi algorithm; the tool employed here is the Hidden Markov Model Toolbox for Matlab [22].

An HMM is a stochastic model used for representing an underlying stochastic process that is not directly observable but can be observed through a sequence of observed symbols [23]. HMMs are often used for signal processing such as speech recognition [24] and ECG signal analysis [25]. An HMM with N states can be described via a compact notation, $\lambda = (A, B, \Pi)$, and the other symbols of the HMM are as follows.

N = number of states in the model.
T = total length of the observation sequence.

$$A = \{a_{ij}\}_{N \times N} \qquad (1)$$

where $a_{ij} = P(i_{t+1} = j \mid i_t = i)$ is the probability of being in state j at time t + 1 if the state is i at time t.

$$B = \{b_j(k)\} = \{P(v_k \text{ at } t \mid i_t = j)\}_{N \times |v_k|} \qquad (2)$$

is the probability of symbol $v_k$ being observed in state j at time t, where $v_k$ is one of the possible observation symbols.

$$\Pi = \{\pi_i\}_{N \times 1}, \quad \pi_i = P(i_1 = i) \qquad (3)$$

is the probability of being in state i at the start time (t = 1).

HMMs assume that the observed sequence is generated in the following manner: at the start time, the model starts in one of the N states, state i, with probability $\pi_i$. In the current state j, a random observation value k is selected with probability $b_j(k)$; the model then jumps from the current state i at time t to the next state j with probability $a_{ij}$, and repeats these operations until T outputs have been generated.

As shown in Fig. 1, a heartbeat can be seen as a waveform sequence. This study assumes that an ECG waveform is separated into segments: iso (baseline of the ECG), P-wave, QRS-wave and ST-segment, with Q, R, S and T modeled as separate states (Table 2). The total number of states in each HMM is 6, so the corresponding initial transition matrix is 6 by 6. The initial probability is set to 0.5 to give a fair chance to each allowed state transition. The initial setting is shown in Table 2. These segments are produced cyclically, and it is reasonable to consider each segment as a state of a left-to-right Markov model, as in Fig. 5. Because the normal heartbeats in the clinical data start from the baseline, this study sets the initial probability $\Pi$ to {1, 0, 0, 0, 0, 0}.

Table 2
Initial setting of state transition matrix (in lead V1).

      iso   P     Q     R     S     T
iso   0.5   0.5   0     0     0     0
P     0     0.5   0.5   0     0     0
Q     0     0     0.5   0.5   0     0
R     0     0     0     0.5   0.5   0
S     0     0     0     0     0.5   0.5
T     0.5   0     0     0     0     0.5

Since an ECG complex can be treated as an observation sequence, O, the segmental K-means algorithm and the Baum–Welch re-estimation formulas are used to adjust the HMMs' parameters so that $P(O \mid \lambda)$ or $P(O, I \mid \lambda)$ is maximized. In a left-to-right HMM, each state can only "jump" to itself or to its neighbor (next) state; for example, state P can move to the next state Q or stay in the same state at the next time step, but cannot move to state R or S. After training on the dataset with myocardial infarction, the state transition matrix is calculated as in Table 3. Table 3 shows important information about the state transition matrix: there is a very high probability for each state to remain in its current state and a lower probability for a "jump" to the next state. Because the ECG complex can be divided into several segments (e.g., P-wave, T-wave), when the observations are compared with the state transition matrix, a specific ECG segment should stay in the same state as calculated by the HMM. The detailed results and diagrams are illustrated in the next section.

Table 3
The calculated probabilities of state transition matrix (lead V1).

      iso      P        Q        R        S        T
iso   0.8486   0.1514   0        0        0        0
P     0        0.9487   0.0512   0        0        0
Q     0        0        0.9471   0.0528   0        0
R     0        0        0        0.9733   0.0266   0
S     0        0        0        0        0.9689   0.0310
T     0.0660   0        0        0        0        0.9339
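The per-beat log-likelihood feature and the left-to-right initialisation of Table 2 can be sketched in a few lines. The following is a minimal illustration, not the authors' Matlab toolbox code: it assumes discrete observation symbols as in Eq. (2), and the emission matrix `B`, the quantised beat `beat`, and both function names are hypothetical.

```python
import math

def left_to_right_transitions(n_states, stay=0.5):
    """Cyclic left-to-right matrix as in Table 2 (n_states >= 2):
    each state either stays put or moves to the next state, and the
    last state wraps around to the first."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        A[i][i] = stay
        A[i][(i + 1) % n_states] = 1.0 - stay
    return A

def forward_log_likelihood(obs, A, B, pi):
    """Scaled forward algorithm: log P(O | lambda) for a non-empty
    sequence of discrete symbol indices."""
    n = len(pi)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

# Hypothetical toy model: 6 states, 3 quantised amplitude symbols.
A = left_to_right_transitions(6)
pi = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # beats start from the baseline state
B = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8],
     [0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
beat = [0, 0, 1, 1, 2, 2, 0, 1, 2]
feature = forward_log_likelihood(beat, A, B, pi)
```

Running one such model per lead on the same beat would give the four values that make up the feature vector. The rescaling step matters for beats of 400 points, where an unscaled product of probabilities would underflow.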
Fig. 5. A heartbeat modeling by a HMM with 6 states and its state transitions.

When the model is trained, the Forward–Backward Procedure is used to calculate the likelihood value of each ECG complex. In this study, leads V1–V4 are each evaluated by an HMM; therefore, each myocardial infarction or normal beat has a 4-dimension vector $(x_{V1}, x_{V2}, x_{V3}, x_{V4})$. Each $x_{Vi}$ is the log-likelihood of the heartbeat's ECG complex under the corresponding HMM-Vi. The calculated values for one heartbeat are shown in Table 4. To aid visualization, Fig. 6 shows the 2-D feature space of the ECG data (circles are myocardial infarction data and crosses are normal data; the x-axis is HMM-V1 and the y-axis is HMM-V2).

Table 4
Four log-likelihood values HMMs generated.

Case   HMM-V1     HMM-V2     HMM-V3     HMM-V4
1      −1640.06   −1600.51   −1835.34   −1822.67

Fig. 6. Two-dimension (V1–V2) visualization with MI and normal data (circle is myocardial infarction data and cross is normal one). Axes: HMM-V1 (x) and HMM-V2 (y).

2.3. Data classification

This study also adopts the Viterbi algorithm to find the state sequence that maximizes the joint probability of the observation sequence and the state sequence; the state sequence is plotted together with the observation sequence (the ECG complex) for visual comparison. For a left-to-right HMM, the state sequence can be described as a step-like waveform, shown in Fig. 7. The number listed above the dotted line is the state number found by the Viterbi algorithm. Even though the result differs from the initial assumption proposed in the introduction, the step-like waveform still divides the basic components of the ECG into appropriate segments well; for example, the R-wave is state six and the T-wave is state four. The results also encourage us to do another experiment, a full state transition test, because some states are quite short in describing the components of an ECG complex, which may indicate that the transitions between states should "jump" more quickly. Figs. 8 and 9 show the results for normal and myocardial infarction data together with the related HMM state sequences.

Fig. 7. An ECG complex and its corresponding step-like waveform. Axes: Time (x) and Amp (y); legend: ECG data, States sequence.

Fig. 8. Example for result of normal data. (The solid line was waveform of a heartbeat and the dotted line was the step-like waveform of Viterbi algorithm result.)

This study uses left-to-right and full transition HMMs to find the better experimental result in myocardial infarction identification.

The clinical data contain two parts: one is the data with myocardial infarction, and the other includes data from patients with cardiopathy but not myocardial infarction, and normal cases. In this study, the HMMs are trained on myocardial infarction data only, so that disease and normal ECG complexes yield different likelihood values, because the trained HMMs have only learned about data with myocardial infarction.

In this research, the inputs to the GMM model are the 4-dimensional HMM feature data as shown in Table 4, and the output is a two-class classification of myocardial infarction, either positive or negative. An integrated diagram of this input–process–output relationship is given in Fig. 3. In addition, two widely applied classification approaches, i.e., Gaussian mixture models and support vector machines, are adopted to separate the test data into two parts. Once the model of the distribution is determined, the test data are used to decide which distribution each sample belongs to: myocardial infarction or not (the latter including both cardiopathy without myocardial infarction and normal cases).

2.3.1. Gaussian mixture models

In this phase, the distributions of the data are determined. The tool adopted in this study is NETLAB [26]; the main parameter is the number of mixture components fitted to the data, and the model of the distribution is fit by maximum likelihood using the Expectation–Maximization (EM) algorithm. GMMs can deal with overlapped classes as shown in Fig. 10. In our 12-lead ECG data, there are cases in which the data overlap in more than two clusters. Therefore, 2–6 components in GMMs are explored to show whether there is any major effect on myocardial infarction identification.

GMMs are among the most statistically mature methods for clustering and density modeling. Mixture models are a type of density model comprising a number of component functions. These mixture models include probability mixture models, parametric mixture models and continuous mixture models. GMMs have been successfully used for texture and color image analysis [27] and applied to speech recognition [28]. This study assumes that there exists a probability distribution that can represent the feature statistics of a block of each patient. The Gaussian mixture models are selected for this purpose.

For a D-dimensional feature vector, x, the mixture density used for the likelihood function is defined as

$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x) \qquad (4)$$

In our case, D is equal to 4 since we have a 4-dimensional HMM feature input. The density is a weighted linear combination of M unimodal Gaussian densities, $p_i(x)$, each parameterized by a D × 1 mean vector, $\mu_i$, and a D × D covariance matrix, $\Sigma_i$; $\omega_i$ is the mixture weight of the ith component, which satisfies $\omega_i > 0$ and

$$\sum_{i=1}^{M} \omega_i = 1 \qquad (5)$$

$$p_i(x) = \frac{1}{(2\pi)^{D/2} |\Sigma_i|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_i)^{T} \Sigma_i^{-1} (x - \mu_i) \right) \qquad (6)$$

The above model supports both full and diagonal covariance matrices, but the diagonal covariance is selected in this study for reasons of computation and data sparseness. There are two reasons. First, the density modeling of full covariance matrices can be achieved equally well by using a larger number of diagonal-covariance components. Second, diagonal covariance matrices are more efficient than full covariance matrices and, empirically, have outperformed them.

The number of components and the fitted distribution are then used to evaluate the test data against the Gaussian mixture distribution. The test results are described in detail in the next section.

2.3.2. Support vector machines

This section briefly describes the basic SVM and non-linear SVM concepts for typical two-class classification problems. Assume there is a training set with N samples $(X_i, y_i \mid X_i \in \mathbb{R}^n, y_i \in \{-1, +1\})$. In the theory of the basic SVM, a hyper-plane can be defined by the following linear function

$$f(X) = w^{T} X + b \qquad (7)$$

where w is the weight vector $\{w_1, w_2, \ldots, w_n\}$, n is the number of attributes (dimensions) and b is a bias. To obtain the separating hyper-plane with the largest margin over the training examples, the function yields f(X) ≥ 0 for y = +1 and f(X) < 0 for y = −1. In other words, training samples from the two classes are separated by the hyper-plane f(X) = 0, and the SVM classifier is based on the hyper-plane that maximizes the separating margin, as illustrated in Fig. 11.
Fig. 9. Example for results of myocardial infarction.

Fig. 10. Overlapped data clustered by GMMs (four components of data).

Fig. 11. SVM classification with a hyper-plane that maximized the separating margin between the two classes.

SVMs can be extended to classify nonlinear data through a nonlinear mapping function $\varphi(X)$ that maps the input pattern X into a higher, H-dimensional space. The modified function is as follows.

$$f(X) = W^{T} \Phi(X) = \sum_{j=1}^{H} \omega_j \varphi_j(X) + \omega_0, \quad \Phi(X) = \begin{bmatrix} 1 \\ \varphi(X) \end{bmatrix} \qquad (8)$$

where $W = [\omega_0, \omega_1, \ldots, \omega_H]^{T}$ is the weight vector.

SVMs are one kind of kernel-based learning algorithm [29]. Three admissible kernel functions K, used as nonlinear mapping functions [30], are:

Polynomial kernel of degree h:

$$K(X_i, X_j) = (X_i \cdot X_j + 1)^{h} \qquad (9)$$

Gaussian radial basis function kernel:

$$K(X_i, X_j) = \exp\left( -\frac{\|X_i - X_j\|^{2}}{2\sigma^{2}} \right) \qquad (10)$$

Sigmoid kernel:

$$K(X_i, X_j) = \tanh(\kappa X_i \cdot X_j - \delta) \qquad (11)$$

Because the clinical data are not linearly separable, non-linear SVMs are applied, and the Gaussian RBF is selected as the kernel function in this study. With the kernel mapping function, data from the two classes can be separated by a hyper-plane in the mapped feature space, found by using support vectors and margins. Fig. 12 shows the result calculated by the RBF kernel (x-axis: lead-V1; y-axis: lead-V2).
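The three kernels of Eqs. (9)–(11) are straightforward to evaluate directly. The sketch below is illustrative only: the function names, the default parameter values and the sample vectors are assumptions, except for σ = 1, which is the value this study reports for the RBF kernel.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, h=2):
    """Eq. (9): (x . y + 1)^h."""
    return (dot(x, y) + 1) ** h

def rbf_kernel(x, y, sigma=1.0):
    """Eq. (10): exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def sigmoid_kernel(x, y, kappa=1.0, delta=0.0):
    """Eq. (11): tanh(kappa x . y - delta)."""
    return math.tanh(kappa * dot(x, y) - delta)

def gram_matrix(samples, kernel):
    """Pairwise kernel values K[i][j] = kernel(samples[i], samples[j])."""
    return [[kernel(a, b) for b in samples] for a in samples]

# Made-up 2-D samples standing in for two feature dimensions.
samples = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
K = gram_matrix(samples, rbf_kernel)
```

A kernel SVM only ever sees the data through such a Gram matrix, which is why the mapping $\varphi$ never needs to be computed explicitly.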
Fig. 12. The result calculated by SVMs.

In this study, the Gaussian RBF kernel has a sigma value of 1; we use the SVM in the Bioinformatics Toolbox of Matlab, and random-selection cross-validation is used to obtain the average accuracy.

3. Experimental results

In the experiments, 6-state HMMs and 16-state HMMs are studied, each with left-to-right transition and full transition. 16-state HMMs are used to capture more detailed information about the complicated ECG complex.

The total number of heartbeats is 1129, including 582 with myocardial infarction and 547 normal. In each run, 100 beats are randomly selected from the myocardial infarction data and 100 from the normal data as test data; the rest are used as training data. Each experiment with the same parameters is repeated 30 times, with the training and test sets randomly selected in each run, to verify the robustness of this approach. Fig. 13 shows the full transition matrix calculated by 16-state HMMs. The result is similar to that of the left-to-right HMMs: each state has a very high probability of staying in itself and a lower probability of transferring to the next state. This suggests that HMMs can learn the ECG complex well regardless of the initial setting.

Fig. 13. Results of 16-state HMMs with full transition (lead V1). The peak means the state has higher probability to move to its self-state than to other states. Axes: state, state, probability.

In the GMM phase, different numbers of components are also tested to find a better result. In this study, the data with myocardial infarction and the normal data are treated as two main groups, each with its own Gaussian mixture distribution; therefore, different numbers of distributions for disease and normal data are tested. In the SVM phase, different numbers of states are tested.

The performance in this experiment is measured by three criteria: accuracy; sensitivity (SE), expressing the fraction of events correctly detected; and specificity (SP), reflecting the probability of a negative test among patients without disease. These performance measures are defined as follows [31,32].

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \qquad (12)$$

$$SE = \frac{TP}{TP + FN} \qquad (13)$$

$$SP = \frac{TN}{TN + FP} \qquad (14)$$

where TP (True Positive) is the number of matched events, FN (False Negative) is the number of events that are not detected by this approach, FP (False Positive) is the number of events detected by this approach that do not match the detector annotations, and TN (True Negative) is the number of events correctly identified as normal.

Figs. 14–17 show the averaged accuracy (after 30 runs) of the four models verified in this research. The four models combine 6 or 16 states with left-to-right or full state transition.

The best accuracy with 6-state HMMs is obtained with 6 distributions for myocardial infarction data, 4 for normal data and left-to-right state transition: sensitivity reaches 71.29%, specificity 71.72% and accuracy 71.50%. With 16-state HMMs, the best result is also obtained with left-to-right state transition, with 6 distributions for myocardial infarction data and 6 for normal data: sensitivity reaches 85.71%, specificity 79.82% and accuracy 82.50%. Table 5 gives the summary information.

For SVMs, the average accuracy of left-to-right and full state transition in 6- and 16-state HMMs is shown in Table 6.
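To make the evaluation concrete, the sketch below scores a feature vector under two class-specific diagonal-covariance mixtures (Eqs. (4)–(6)), assigns it to the class with the higher density, and computes the measures of Eqs. (12)–(14). It is an illustration under stated assumptions: the maximum-density decision rule, all parameter values and all function names are hypothetical, and fitting the mixtures with EM (done with NETLAB in this study) is not shown.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log of Eq. (6) when the covariance matrix is diagonal."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_density(x, weights, means, variances):
    """Log of Eq. (4), computed with log-sum-exp for numerical stability."""
    terms = [math.log(w) + diag_gaussian_logpdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def classify(x, gmm_mi, gmm_normal):
    """Return 1 (MI) if the MI mixture gives the higher density, else 0."""
    return 1 if gmm_log_density(x, *gmm_mi) > gmm_log_density(x, *gmm_normal) else 0

def confusion_metrics(y_true, y_pred):
    """Accuracy, SE and SP from Eqs. (12)-(14); assumes both classes occur."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {"accuracy": (tp + tn) / (tp + fn + fp + tn),
            "SE": tp / (tp + fn),
            "SP": tn / (tn + fp)}

# Made-up 2-D mixtures as (weights, means, variances); the study uses D = 4.
gmm_mi = ([0.5, 0.5], [[-2.0, -2.0], [-1.0, -1.0]], [[1.0, 1.0], [1.0, 1.0]])
gmm_normal = ([1.0], [[3.0, 3.0]], [[1.0, 1.0]])
points = [[-1.5, -1.5], [-2.2, -1.9], [2.8, 3.1], [3.5, 2.5]]
y_true = [1, 1, 0, 0]
y_pred = [classify(p, gmm_mi, gmm_normal) for p in points]
scores = confusion_metrics(y_true, y_pred)
```

Using a different number of components for each class, as in the experiments above, only changes the lengths of the parameter lists passed for each mixture.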
Fig. 14. The experimental result for 6-state and left-to-right HMM (the number of components for MI and healthy data is from two to six).

Fig. 15. The experimental result for 6-state and full state transition HMM (the number of components for MI and healthy data is from two to six).

Fig. 16. The experimental result for 16-state and left-to-right HMM (the number of components for MI and healthy data is from two to six).

Fig. 17. The experimental result for 16-state and full state transition HMM (the number of components for MI and healthy data is from two to six).

(Figs. 14–17 plot accuracy against the number of components for healthy data, from 2 to 6, with one curve for each number of MI components, MI = 2 to 6.)

Table 5
The overall accuracy of four models.

                           6 states   16 states
Left-to-right transition   72%        83%
Full transition            71%        82%

Table 6
The result from SVMs.

                           6 states   16 states
Left-to-right transition   75%        72%
Full transition            77%        71%

4. Discussions

Traditionally, a single HMM or multiple HMMs are applied in myocardial infarction for data segmentation, for example to identify the R or Q wave in ECG data, to retrieve the nth state of these waves as a feature input, or to derive the fluctuation values of the ST segment for myocardial infarction classification. In this research, however, the HMM is applied to the complete heartbeat of the ECG data. The HMM probability is then converted to a log-likelihood and treated as a feature input for the identification of myocardial infarction.

Patients with MI usually show a very obvious sudden upward or downward change in the ST segment in leads V1–V4. The basic assumption of this research is that, for a single heartbeat, the HMM log-likelihood value of an abnormal person will differ from that of a normal person in the ST segment. This HMM likelihood value is used as a feature input, and the four leads give four feature inputs in total. These 4-dimensional feature inputs for a normal person will again differ from those of an abnormal person. Therefore, GMMs and SVMs are applied as classification models to identify whether a person has a myocardial infarction.

In this study, HMMs, GMMs and SVMs are adopted to detect myocardial infarction. This study focuses on how to integrate 12-lead ECG data into a cardiology diagnosis system. HMMs are used to learn the 12-lead ECG complex, and the likelihood of the observed signals is used as the feature vector in GMMs or SVMs. Six-state and 16-state HMMs are set up: the six-state model assumes that a single heartbeat is represented by its constituent waves in the ECG data, whereas the 16-state model assumes that all wave changes within a single heartbeat should be captured through its state changes. We try two different state change sequences, i.e., left-to-right and full transition. Left-to-right assumes the waves of a heartbeat are like time-series data and that the sequence carries some hidden information. Full transition is applied only to test whether the left-to-right assumption is correct.

As shown in Fig. 6, the distribution of the 12-lead ECG data is mixed in different clusters. Therefore, a density model, i.e., GMM, is applied as a classification tool for the 4-dimensional feature input data. In GMM operation, as shown in Fig. 10, we have to assume the number of components the data come from. In myocardial infarction classification, the 12-lead ECG data come from different patients and the distributions are too complicated to identify beforehand. In particular, in four dimensions it is hard to tell how many clusters the 12-lead ECG data of normal and abnormal patients fall into. Therefore, we run our tests with 2–6 components.

The experiments show that HMMs with more states and left-to-right state changes perform better than the alternatives. As for the GMM classification, the more components there are, the higher the accuracy of the GMM model. The left-to-right transition achieves higher sensitivity and accuracy than full transition.

Owing to their ability to detect the ST segment, HMMs can calculate log-likelihood values that significantly discriminate myocardial infarction data from normal data. For 12-lead ECGs, more leads would give more detailed information and increase the success rate. This study demonstrates a successful application that combines HMMs and GMMs in one model. In this experiment, the clinical data are treated as a mixture distribution rather than as nonlinear data processed by SVMs. Advantages of our approach include: (1) a multiple-lead (4-lead) approach; (2) HMMs as a feature selection tool instead of a segmentation tool; (3) the combination of HMMs (for feature extraction) and GMMs (as a classification tool), which provides a better classification approach (with 4-dimension feature inputs); (4) the exploration of 6-state and 16-state HMMs in the experiments, in which the 16-state HMM shows its advantage over the 6-state HMM.

In the future, other effective approaches for classification such as probabilistic neural network can be applied as a classification
reason is because a single heart beat will contain more information tool and these models with different classifiers can be tested for
if it is described with more state changes in HMM likelihood val- comparisons. The ultimate goal is to derive an applicable model in
ues. In addition, the heart beat in ECG data is recorded in sequence; medical practice for medical doctors.
therefore HMM in left-to-right sequence will have better classifica-
tion accuracy. Finally, more components in GMM classifier will also
Acknowledgments
have a better accuracy since more components have more accu-
rate probability density function of data distribution. GMM with 6
The data used for the present study are obtained from the
components has the best accuracy value.
Taoyuan Armed Forces General Hospital located in Taiwan. We
GMM performs much better in 12-lead ECG classification than
want to thank Attending Physician Dr. Yeh and his clinical team
SVM. The reason is because 4-dimensional feature inputs are very
members for their general supports in providing the clinical data
complex plus the interactions among these states make it very chal-
of myocardial infarction and the 12-lead ECG data to make this
lenging to classify. The classifier of SVM is based on the kernel
research possible.
function for data mapping and a hyper-plane for space segmen-
tation. If the data mapping in kernel function cannot be clearly
separated into different distribution, the result will not be as satis- References
factory especially for 12-lead ECG data clustered together in space.
However, GMM classifier is based on the distribution of data den- [1] P. Trahanias, E. Skordalakis, Syntactic pattern recognition of the ECG, IEEE
Transactions on Pattern Analysis and Machine Intelligence 12 (1990) 648–657.
sity in space. In addition, the calculation can be based on multiple [2] F. Melgani, Y. Bazi, Classification of electrocardiogram signals with support
components overlaps. GMM have outperformed SVM in MI data vector machines and particle swarm optimization, IEEE Transactions on Infor-
classification in this research. SVM is designed to adopt different mation Technology in Biomedicine 12 (2008) 667–677.
[3] N. Maglaveras, T. Stamkopoulos, K. Diamantaras, C. Pappas, M. Strintzis, ECG
kernels to solve the non-linear problem, causing the selection of pattern recognition and classification using non-linear transformations and
kernel functions to become a big issue. Also, not of all the ker- neural networks: a review, International Journal of Medical Informatics 52
nel functions can guarantee that the non-linear hyper-plane does (1998) 191–208.
[4] A. Gacek, W. Pedrycz, A genetic segmentation of ECG signals, IEEE Transactions
solve the overlap distribution in the testing data. GMM can use the
on Biomedical Engineering 50 (2003) 1203–1208.
number of clusters to overcome the same situation alternatively. [5] M.P.S. Chawla, PCA, ICA processing methods for removal of artifacts and noise in
In GMM, the overlap situation can be regarded as the mixture data electrocardiograms: a survey and comparison, Applied Soft Computing Journal
11 (2) (2011) 2216–2226.
and recognized by the mixture models. Therefore GMM classifier
[6] T. Stamkopoulos, K. Diamantaras, N. Maglaveras, M. Strintzis, ECG analysis
is more suitable for our 12-lead ECG data. It is also true that more using nonlinear PCA neural networks for ischemia detection, IEEE Transactions
leads and more patients’ data are given. on Signal Processing 46 (1998) 3058–3067.
[7] T.B. Garcia, N.E. Holtz, Introduction to 12-Lead ECG: The Art of Interpretation,
Jones & Bartlett Publishers, 2002.
[8] D. Coast, R. Stern, G. Cano, S. Briller, An approach to cardiac arrhythmia analysis
5. Conclusion using hidden Markov models, IEEE Transactions on Biomedical Engineering 37
(1990) 826–836.
[9] S. Graja, J. Boucher, Hidden Markov tree model applied to ECG delineation, IEEE
In this study, HMMs, GMMs and SVMs are adopted to detect Transactions on Instrumentation and Measurement 54 (2005) 2163–2168.
myocardial infarction. This study focuses on how to integrate 12- [10] M. AL-Rousan, K. Assaleh, A. Tala’a, Video-based signer-independent Arabic
lead ECG data into a cardiology diagnosis system. HMMs are used sign language recognition using hidden Markov models, Applied Soft Comput-
ing Journal 9 (3) (2009) 990–999.
to learn the 12-lead ECG complex and the log-likelihood of the [11] R. Andreao, B. Dorizzi, J. Boudy, J. Mota, ST-segment analysis using hidden
observed signals are used as the feature vectors in GMMs or SVMs. Markov model beat segmentation: application to ischemia detection, Comput-
HMMs will calculate the probability of the state change in a single ers in Cardiology (2004) 381–384.
[12] S. Mehta, N. Lingayat, S. Sanghvi, Detection and delineation of P and T waves in
lead. The probability is further converted into a log likely hood value
12-lead electrocardiograms, Expert Systems 26 (2009) 125–143.
as a feature input to GMMs or SVMs model. Since there are 4 leads, [13] A. Van’t Hof, A. Liem, M. De Boer, F. Zijlstra, Clinical value of 12-lead electrocar-
there are totally 4 HMMs, i.e., 4 feature inputs for GMMs. In GMMs diogram after successful reperfusion therapy for acute myocardial infarction,
Lancet 350 (1997) 615–619.
phase, multiple Gaussian components had been verified and the
[14] P. van der Vleuten, M. Vogelzang, T. Svilaas, I. van der Horst, R. Tio, F. Zijlstra,
sensitivity of diagnosis reached to 85.71% and accuracy reached to Predictive value of Q waves on the 12-lead electrocardiogram after reperfusion
82.50%. In SVMs, the best result only reached 77%. The data proves therapy for ST elevation myocardial infarction, Journal of Electrocardiology 42
that the GMMs are more suitable than SVMs when the data distribu- (2009) 310–318.
[15] J. Thomas, C. Rose, F. Charpillet, A multi-HMM approach to ECG segmentation,
tion is overlapped as the feature space in this study. This is because Proceedings of International Conference on Tools with Artificial Intelligence,
the data is an ECG complex from a heart-beat, as a time-series data, ICTAI (2006) 609–616.
P.-C. Chang et al. / Applied Soft Computing 12 (2012) 3165–3175 3175

[16] C. Burges, A tutorial on support vector machines for pattern recognition, Data [25] L.R. Rabiner, Tutorial on hidden Markov models and selected applications in
Mining and Knowledge Discovery 2 (1998) 121–167. speech recognition, Proceedings of the IEEE 77 (2) (1989) 257–286.
[17] C.C. Chuang, Z.J. Lee, Hybrid robust support vector machines for regression with [26] R. Andreão, B. Dorizzi, J. Boudy, ECG signal analysis through hidden
outliers, Applied Soft Computing Journal 11 (1) (2011) 64–72. Markov models, IEEE Transactions on Biomedical Engineering 53 (2006)
[18] M. Song, J. Lee, S. Cho, K. Lee, S. Yoo, Support vector machine based arrhythmia 1541–1549.
classification using reduced features, International Journal of Control, Automa- [27] I.T. Nabney, Netlab, Springer, 2004.
tion and Systems 3 (2005) 571–579. [28] H. Permuter, J. Francos, I. Jermyn, Gaussian mixture models of texture
[19] J.M. Górriz, F. Segovia, J. Ramírez, A. Lassl, D. Salas-Gonzalez, GMM based SPECT and colour for image database retrieval, Proceedings of IEEE International
image classification for the diagnosis of Alzheimer’s disease, Applied Soft Com- Conference on Acoustics, Speech and Signal Processing, ICASSP (2003)
puting Journal 11 (2) (2011) 2313–2325. 569–572.
[20] K. Sung, T. Poggio, Example-based learning for view-based human face detec- [29] W. Kim, J. Hansen, Feature compensation in the cepstral domain employing
tion, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998) model combination, Speech Communication 51 (2009) 83–96.
39–51. [30] K. Müller, S. Mika, G. Rätsch, K. Tsuda, B. Schölkopf, An introduction to kernel-
[21] R. Povinelli, Towards the prediction of transient ST changes, Computers in based learning algorithms, IEEE Transactions on Neural Networks 12 (2001)
Cardiology (2005) 663–666. 181–201.
[22] R. Povinelli, American heart association guidelines for cardiopulmonary resus- [31] J. Han, M. Kamber, Data Mining: Concepts and Techniques, second ed., Morgan
citation and emergency cardiovascular care, Circulation 112 (2005) IV1–IV203. Kaufmann, 2006.
[23] K. Murphy, Hidden Markov Model (HMM) Toolbox for Matlab, 2005. [32] F. Jager, G. Moody, A. Taddei, R. Mark, Performance measures for algorithms to
[24] L. Rabiner, B. Juang, An introduction to hidden Markov models, IEEE ASSP Mag- detect transient ischemic ST segment changes, Computers in Cardiology (1992)
azine 3 (1986) 4–16. 369–372.

You might also like