Article

Intelligent Feature Selection for ECG-Based Personal Authentication Using Deep Reinforcement Learning

1 Department of Computer Engineering, Kwangwoon University, Seoul 01897, Republic of Korea
2 Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
3 Department of Electrical and Communication Engineering, Daelim University, Kyoung 13916, Republic of Korea
* Authors to whom correspondence should be addressed.
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Sensors 2023, 23(3), 1230; https://doi.org/10.3390/s23031230
Submission received: 13 December 2022 / Revised: 16 January 2023 / Accepted: 17 January 2023 / Published: 20 January 2023
(This article belongs to the Section Biomedical Sensors)

Abstract

In this study, the optimal features of electrocardiogram (ECG) signals were investigated for the implementation of a personal authentication system using a reinforcement learning (RL) algorithm. ECG signals were recorded from 11 subjects for 6 days. The datasets from the first five consecutive days (from the 1st to the 5th day) were used for training, and the dataset from the 6th day was used for testing. To search for the optimal features of ECG for the authentication problem, RL was utilized as an optimizer, and its internal model was designed based on deep learning structures. In addition, the deep learning architecture in RL was automatically constructed based on an optimization approach called Bayesian optimization hyperband. The experimental results demonstrate that the feature selection process is essential for improving the authentication performance with fewer features, which enables an efficient system in terms of computation power and energy consumption for a wearable device intended to be used as an authentication system. Support vector machines in conjunction with the optimized RL algorithm yielded accuracies, using fewer features, that were approximately 5%, 3.6%, and 2.6% higher than those associated with information gain (IG), ReliefF, and pure reinforcement learning structures, respectively. Additionally, the optimized RL yielded mostly lower equal error rate (EER) values than the other feature selection algorithms, with fewer selected features.

1. Introduction

Security issues have been considered as a critical factor for the Internet of Things (IoT) owing to privacy challenge concerns [1,2,3,4]. For instance, a smart health card generated based on an IoT platform may enhance patient security and privacy information. However, when it is hacked, security issues are raised, such as theft risk, loss, insider misuse, and unintended behavior. Knowledge-based authentication methods rely on users’ memories, whereas token-based authentication methods utilize an external device [2,5].
For example, knowledge-based authentication methods use a personal identification number (PIN) and an identity (ID)/password, and token-based ones provide one-time passwords (OTPs) and short message services (SMSs) to the users. However, both approaches could be vulnerable to a brute-force dictionary attack, that is they can be guessed, duplicated, lost, or stolen. In particular, knowledge-based authentication methods could be attacked by hackers who may guess the users’ family name, birthday, or anniversary, and token-based methods could be critically risky when the external device is lost or stolen [6,7].
To solve these issues, researchers are investigating different personal authentication approaches using biometric data. With a biometric authentication system, users do not have to remember complex passwords or hold tokens, but may access the system using unique features of their own bodies that would be difficult to be cloned, lost, or stolen [6,7,8,9,10,11,12,13]. However, some types of biometrics, such as fingerprints, irises, and faces, are still vulnerable to attack. Fingerprints could be imitated and duplicated with silicone [14,15]; the iris features could be reproduced with contact lenses and printing [16]; the face could be easily fabricated with a photograph [17]. In addition, these biometric features have a critical flaw in that they cannot be remedied if they are damaged [6,13].
The electrocardiogram (ECG) is used as one of the biometrics. It is an electrical signal generated by the sinoatrial node in the heart to stimulate the cardiac muscle to contract and relax. It consists of various peaks referred to as P, Q, R, S, and T waves (see Figure 1). Compared with other biometrics, the ECG signals cannot be easily reproduced and have higher reliability, entropy, and randomness [18,19,20,21,22]. Additionally, ECG signals are affected by various other factors, including age, gender, physical condition, structure, and obesity [23,24,25,26]. To extract ECG signal features for the implementation of the authentication system, data-driven convolutional neural network (CNN) models have either been designed [27,28,29] or feature-engineering approaches have been applied based on predefined fixed models [30,31,32,33].
The extracted features from a single-lead ECG signal have been proven to provide reliable authentication results [34,35,36]. It has also been reported that the long-term stability of the features is guaranteed for several days or even years [34,37,38]. This study also explored the long-term stability of the ECG features for a personal authentication system using ECG signals that were recorded for six days. Additionally, this study identified the ideal ECG features that were considered the most significant for the classification of a user among others. It was found that the biometric authentication task that uses a high number of significant features, also known as “costly features”, performed better than the one that used all the features extracted from the biometric signals without taking into consideration their significance in relation to the task [39,40,41,42].
However, authentication with ECG signals has some limitations. Vigorous activities such as exercise may change the ECG features [43]; drugs such as caffeine may change the ECG features [19]; emotional changes may cause difficulties in ECG-based authentication [44]; the heart rate may change every day [45]. Among these challenges, this paper addresses the day-to-day variation of ECG signals: experimental data were created to design models that are robust to it. Unlike conventional datasets, the data used in this paper comprise ECG recordings collected from each subject over a continuous six-day period. A model optimized on these data is expected to yield relatively robust results for daily varying ECG signals.
Among the algorithms used to search for the costly features, we mainly applied the reinforcement learning (RL) algorithm [46] to ECG-based personal authentication. Recently, RL has achieved considerable performance improvements with the help of deep learning models, yielding state-of-the-art results in various areas, such as healthcare, autonomous driving, and resource management [47,48,49,50,51,52]. In addition, many studies have been conducted on feature selection using RL owing to its promising optimization performance [53,54,55]. The deep neural networks in RL, commonly referred to as deep Q-learning [56], were manually constructed in previous studies [48,51,57]. However, the performance of deep neural networks varies depending on their architectures; additionally, such networks were developed mostly based on the developer's experience and intuition and may have suboptimal architectures. Thus, in this study, the networks in RL were automatically optimized using the Bayesian optimization hyperband (BOHB) method [58]. Specifically, BOHB optimized the number of layers of the neural networks, the number of nodes in each layer, the learning rate, and the optimizer in the RL algorithm. As a benchmark, two conventional feature selection algorithms, namely ReliefF [59,60] and information gain (IG) [61], were compared with the proposed method. The former is a Manhattan-distance-based feature selection algorithm that selects the significant features by calculating the sum of the distances among the instances of the features. The latter is an entropy-based feature selection algorithm that computes the entropy of each feature and determines the significant features based on the calculated entropy.
This paper is structured as follows. Section 2 elaborates on the RL deep Q-network (DQN) and BOHB algorithms for optimal feature selection. Section 3 describes the experimental methods, preprocessing, and feature extraction. The experiments conducted to demonstrate the effectiveness of optimizing the DQN via BOHB, with the DQN used as a stand-alone classifier, are described in Section 3.3. Experiments using different models are described in Section 3.4 for comparison between the optimized RL model and other costly feature selection algorithms. In Section 4, the authentication results (using RL and BOHB) are provided based on the benchmark tests with the conventional methods for both experiments.

2. Materials and Methods

2.1. Costly Features in IoT Environment

In an IoT environment, limited resources such as memory, computation, and power have always been issues [62,63,64,65,66,67]. In particular, the classification problem for a personal authentication system pertaining to wearable devices is also limited by these issues; this is referred to as classification with costly features (CwCF). Previously, the RL algorithm was designed to solve the CwCF issue to minimize the expected classification errors with incurred costs [46].

2.2. Deep Q-Network

RL is a machine learning approach in which an agent learns the optimal action and policy based on rewards from the environment. RL problems are formalized as Markov decision processes (MDPs) [68]. An MDP is described by a state $s$, an action $a$, a reward $r$, and a discount factor $\gamma$. Specifically, $s$ denotes the current state, and $a$ is the action taken in $s$. In turn, $r$ is the reward obtained from the environment when the agent takes an action, and $\gamma$, whose value ranges between 0 and 1, determines how strongly future rewards are weighted.
Q-learning tries to identify the optimal policy of the MDP by updating the Q-function [69]. In each step, the agent takes the current action $a$ in the current state $s$, receives a reward $r$ from the environment, and moves to the next state $s'$. It then obtains the maximum Q-value over the next actions $a'$ available in $s'$. Subsequently, the Q-value is updated as a weighted average of the old Q-value and the sum of the reward and the discounted maximum Q-value, with the learning rate $\alpha$ as the weight, according to Equation (1).
$$Q_{update}(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a'} Q(s', a') \right) \tag{1}$$
The rewards are obtained at t+1 based on the current state and environment.
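The update in Equation (1) can be illustrated with a small tabular sketch. This is not the authors' implementation; the function name `q_learning_update`, the list-of-lists table, and all numeric values are assumptions chosen only for illustration.

```python
def q_learning_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Apply the Equation (1) update to one (state, action) entry of a tabular Q-function."""
    best_next = max(q_table[next_state])   # max over a' of Q(s', a')
    target = reward + gamma * best_next    # r_{t+1} + gamma * max Q(s', a')
    # Weighted average of the old value and the new target, weighted by alpha.
    q_table[state][action] = (1 - alpha) * q_table[state][action] + alpha * target
    return q_table[state][action]


# Toy table with 2 states and 2 actions (made-up values).
q_table = [[0.0, 0.0], [0.5, 2.0]]
updated = q_learning_update(q_table, state=0, action=1, reward=1.0,
                            next_state=1, alpha=0.5, gamma=0.9)
```

With these toy numbers, the target is 1.0 + 0.9 × 2.0 = 2.8, so the updated entry is 0.5 × 0 + 0.5 × 2.8 = 1.4.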
The DQN applies deep neural networks to Q-learning to approximate the Q-value in more complicated environments than that of conventional Q-learning [56,70]. The loss function of the model calculates the mean-squared error (MSE) L ( θ ) based on Equation (2):
$$L(\theta) = \left( r_{t+1} + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s_t, a_t; \theta) \right)^2 \tag{2}$$
where $\theta^{-}$ are the parameters of the target network, which are held fixed between updates, and $\theta$ are the parameters of the network being trained. The target network is updated after every predefined number of epochs. In addition, the DQN utilizes the experience replay method, wherein samples of the form $(s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1})$ are stored in memory and a specific number of samples are randomly chosen to train the networks. This reduces the dependence on consecutive samples and helps avoid unwanted feedback loops.
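As a rough illustration of experience replay and the squared temporal-difference error of Equation (2), the following sketch uses only the Python standard library. The class `ReplayMemory`, the function `squared_td_error`, and the transition layout are hypothetical names chosen here; the Q-values are passed in as plain numbers rather than produced by a neural network.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer; the oldest transitions are dropped once capacity is reached."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def squared_td_error(reward, gamma, q_next_target, q_current):
    """Squared error between the fixed-target estimate and the current Q-value.

    q_next_target: Q-values of the next state from the (fixed) target network.
    q_current:     Q-value of the taken action from the network being trained.
    """
    target = reward + gamma * max(q_next_target)
    return (target - q_current) ** 2
```

A training loop would push one transition per step, sample a mini-batch once the buffer holds enough transitions, and average `squared_td_error` over that batch to obtain the MSE loss.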

2.3. Costly Feature Selection Using RL

The costly feature selection is described as follows. The variable $(x, y) \in D$ denotes one of the samples from the data distribution $D$, where the vector $x$ contains $n$ input features, $f_i \in F = \{f_1, \ldots, f_n\}$, and $y$ is its class label. In one episode, the environment randomly selects one data sample from $D$, and the agent sequentially selects the features and classes with the highest Q-value [46]. The environment is represented by a partially observable MDP (POMDP) [71], which, unlike an MDP, provides the agent with limited information about the environment. A state $s = (x, y, \bar{F}) \in S$ consists of a sample $(x, y)$ and the set of features $\bar{F}$ selected so far by the agent, where $S$ is the state space. For an action $a \in A$ ($A = A_c \cup A_f$), $A_c$ is the set of actions taken to conduct the classification, and $A_f$ is the set of actions taken to select a feature in the feature set. The episode ends when the agent selects a classification action in $A_c$; it then receives a reward of 0 if the sample is correctly classified and −1 if it is not. When the agent selects an action in $A_f$ to select a feature, it receives a reward of $-\lambda c(f_i)$, where $c(f_i)$ is the cost of $f_i$. The reward function $r: S \times A \to \mathbb{R}$ is represented mathematically as follows.
$$r((x, y, \bar{F}), a) = \begin{cases} -\lambda c(f_i) & \text{if } a \in A_f,\ a = f_i \\ 0 & \text{if } a \in A_c \text{ and } a = y \\ -1 & \text{if } a \in A_c \text{ and } a \neq y \end{cases} \tag{3}$$
The value of $\lambda$ provides a trade-off between the precision and average cost for this RL model. As $\lambda$ increases, each selected feature is penalized more heavily, so the total cost is reduced and the episodes become shorter. The transition function is defined as $t: S \times A \to S \cup \{T\}$.
$$t((x, y, \bar{F}), a) = \begin{cases} T & \text{if } a \in A_c \\ (x, y, \bar{F} \cup a) & \text{if } a \in A_f \end{cases} \tag{4}$$
where T is the terminal state. When the agent selects a feature as an action, it adds the currently selected feature to F ¯ . If the agent selects an action to derive the classification result, it ends the episode.
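The reward and transition functions above can be sketched directly in code. The encoding below is an assumption made for illustration only: actions are `(kind, value)` tuples, the selected-feature set is a `frozenset` of feature indices, and `costs` and `lam` stand for $c(f_i)$ and $\lambda$.

```python
TERMINAL = "T"  # placeholder object for the terminal state


def reward(state, action, costs, lam):
    """Reward: -lam * cost for a feature action, 0 or -1 for a classification action."""
    _x, y, _selected = state
    kind, value = action
    if kind == "feature":
        return -lam * costs[value]
    return 0.0 if value == y else -1.0


def transition(state, action):
    """Transition: classification ends the episode, a feature action grows the selected set."""
    x, y, selected = state
    kind, value = action
    if kind == "classify":
        return TERMINAL
    return (x, y, selected | {value})


# Toy episode: two features with unit cost, true class label 3 (made-up values).
s0 = ((0.8, 1.2), 3, frozenset())
r0 = reward(s0, ("feature", 0), costs=[1.0, 1.0], lam=0.1)
s1 = transition(s0, ("feature", 0))
r1 = reward(s1, ("classify", 3), costs=[1.0, 1.0], lam=0.1)
end = transition(s1, ("classify", 3))
```

In this toy episode the agent pays 0.1 for acquiring feature 0 and then receives 0 for a correct classification, so a larger `lam` makes every extra feature more expensive relative to the −1 penalty for a wrong answer.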
In this paper, we designed a feature selection model using the DQN algorithm, one of the promising RL models. If only the features are included in the action space of the DQN model, the model acts as an optimizer [72]; by including both the features and the subject numbers, the model can act as both a feature selector and a classifier, that is, as a pure RL model [46]. The procedure of this algorithm is shown in Algorithm 1.
Algorithm 1 Procedure of DQN Optimizer and Classifier.
Initialize replay memory
Initialize action value function Q with random weights
for episode = 1, M do
    for t = 1, T do
        With probability ϵ, select a random action
        if the selected action is a feature:
            Execute action in emulator, and observe reward
            Set state and preprocess policy
            Store transition in replay memory
            Perform a gradient descent step
        if the selected action is a subject number:
            Execute action in the emulator, and observe reward
            Set state and preprocess policy
            Store transition in replay memory
    end for
end for

2.4. Hyperparameter Optimization

The performance of machine learning algorithms relies on their hyperparameter settings. A machine learning algorithm can be represented as a function $g: X \to \mathbb{R}$ of its hyperparameters $x \in X$. The hyperparameter optimization (HPO) task aims to identify the optimal hyperparameters $x^{*} \in \operatorname{argmin}_{x \in X} g(x)$. However, $g(x)$ usually cannot be observed directly owing to its randomness and uncertainty; thus, it is assumed to be observable only through noisy observations $y(x) = g(x) + \epsilon$, with $\epsilon \sim N(0, \sigma_{noise}^{2})$ [58,73,74].

2.5. Bayesian Optimization

In each iteration $i$, Bayesian optimization (BO) builds a probabilistic model $p(g \mid D)$ of the objective function $g$ using the Gaussian process, which is based on the already known (observation) dataset $D = \{(x_0, y_0), \ldots, (x_{i-1}, y_{i-1})\}$ [58,73,75]. BO applies the acquisition function $a: X \to \mathbb{R}$ based on the current model $p(g \mid D)$, and the model considers a tradeoff between the processes of exploration and exploitation; iterations are conducted based on the following three steps:
(1) Select the observation at which the acquisition function is maximum, $x_{select} = \operatorname{argmax}_{x \in X} a(x)$;
(2) Evaluate the objective function, $y_{select} = g(x_{select}) + \epsilon$;
(3) Augment the dataset with the selected observation, $D = D \cup \{(x_{select}, y_{select})\}$.
During the process, the model tries to identify the best observation $x_{best} = \operatorname{argmin}_{x \in D} g(x)$.

2.6. Hyperband

Hyperband is a resource allocation method for hyperparameter optimization that treats the problem as pure-exploration configuration evaluation and adapts the resources given to each configuration [58,76]. It uses a principled early stopping strategy: rather than training all configurations uniformly, it evaluates a larger number of configurations on small budgets and allocates more resources only to those that perform well, with the aim of quickly identifying superior hyperparameters.
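The early stopping idea can be illustrated with one round of successive halving, the subroutine that Hyperband runs with several different starting budgets. The sketch below is a simplified, single-bracket illustration and not the full Hyperband procedure; the function name, the reduction factor `eta=3`, and the toy loss are assumptions.

```python
import math


def successive_halving(configs, evaluate, min_budget, eta=3):
    """Keep the best 1/eta of the configurations per round while growing the budget by eta.

    evaluate(config, budget) must return a loss (lower is better).
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(ranked) // eta)   # early-stop the worst configurations
        survivors = ranked[:keep]
        budget *= eta                       # spend more only on the survivors
    return survivors[0]


def toy_loss(learning_rate, budget):
    # Made-up objective: best near 0.03, and a larger budget lowers the loss.
    return abs(math.log10(learning_rate) - math.log10(0.03)) + 1.0 / budget


candidates = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
best = successive_halving(candidates, toy_loss, min_budget=1, eta=3)
```

Starting from nine candidate learning rates, the first round keeps three and the second keeps one, so most of the budget is spent on the configurations that looked promising at low cost.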

2.7. BOHB Hyperparameter Optimization

BOHB [58] is an HPO method that combines BO and the hyperband (HB). The BO process in BOHB uses a tree Parzen estimator (TPE) [77], which models a density function using a kernel density estimator. Algorithm 2 displays the procedure of the BOHB algorithm. Both feature selection using the BO algorithm and hyperparameter optimization using the HB algorithm are conducted simultaneously. Although the algorithm follows the budget selection approach of the HB, it guides the search by replacing the random sampling with a BO component. BOHB often finds a good solution much faster than BO and converges to the best solution much faster than hyperband. In this study, the hyperparameter optimization method was applied to determine the number of hidden layers, learning rate, and optimizer for the DQN. The entire procedure of this algorithm is shown in Figure 2. State $s$ consists of tuples $(\bar{x}, m)$, where $\bar{x}$ is the masked version of the original feature vector $x$ and is defined by the mask vector $m$; the latter is composed of binary entries (0 or 1) that indicate the indices of the selected features.
Algorithm 2 Procedure of BOHB algorithm.
Input the maximum budget $R$ and the setting $\eta$
Initialize the number of settings $S_{max} = \lceil \log_{\eta} R \rceil$
for $s = S_{max}$ to 0 do
    Set current configuration $A$
    for $i = 0$ to $s$ do
        Select hyperparameter configuration $A_i$
        Get loss $L$ using configuration $A_i$
        $A = \min(L(A), L(A_i))$
    end for
    for $t = 1$ to $T$ do
        Calculate a probability function $p(g \mid D)$ using Gaussian process
        Select observation $x_{select} = \operatorname{argmax}_{x \in X} a(x)$
        Evaluate the objective function $y_{select} = g(x_{select}) + \epsilon$
        Add to dataset $D = D \cup \{(x_{select}, y_{select})\}$
        Update best observation $x_{best} = \operatorname{argmin}_{x \in D} g(x)$
    end for
end for
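$$\bar{x}_i = \begin{cases} x_i & \text{if } f_i \in \bar{F} \\ 0 & \text{otherwise} \end{cases} \tag{5}$$
$$m_i = \begin{cases} 1 & \text{if } f_i \in \bar{F} \\ 0 & \text{otherwise} \end{cases} \tag{6}$$
The agent selects $Q_{class}$ or $Q_{feature}$ corresponding to the current state. The previously selected features cannot be chosen again owing to the mask vector $m$.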
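A minimal sketch of the masked state and of excluding already selected features from the greedy choice is given below. The helper names `masked_state` and `best_allowed_action`, the action encoding, and the Q-values are hypothetical and only illustrate the masking logic, not the authors' network.

```python
def masked_state(x, selected):
    """Build (x_bar, m): unselected features are zeroed, m marks the selected indices."""
    x_bar = [xi if i in selected else 0.0 for i, xi in enumerate(x)]
    m = [1 if i in selected else 0 for i in range(len(x))]
    return x_bar, m


def best_allowed_action(q_feature, q_class, m):
    """Greedy action over class actions and the feature actions not yet selected."""
    candidates = [(q, ("feature", i)) for i, q in enumerate(q_feature) if m[i] == 0]
    candidates += [(q, ("classify", c)) for c, q in enumerate(q_class)]
    return max(candidates, key=lambda item: item[0])[1]


# Toy example: feature 1 was already selected, so its high Q-value is ignored.
x_bar, m = masked_state([0.2, 1.5, -0.7], selected={1})
action = best_allowed_action(q_feature=[0.1, 9.0, 0.3], q_class=[0.2, 0.25], m=m)
```

Here feature 1 has the largest Q-value but is masked out, so the greedy choice falls on feature 2.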

3. Experiments

3.1. ECG Measurement Experiments

An experiment was conducted to generate a dataset to train and evaluate the proposed model. To record the ECG signals from the subjects, a commercially available real-time recording system (MP36, Biopac Systems, Goleta, CA, USA) was used at a sampling rate of 1000 Hz. Eleven male subjects participated, and their ECG signals were recorded for 10 min on each of six days at random times between 10:00 a.m. and 4:00 p.m. The subjects were seated in a comfortable chair in an enclosed space and kept in a relaxed state. During the experiment, ECG signals were recorded from the left wrist with reference to the right wrist and with a ground electrode on the ankle, a configuration known as the driven-right leg [30,78]. A software bandpass finite impulse response (FIR) filter with a passband between 1 Hz and 35 Hz was used to minimize the ambient noise components [79,80,81]. Figure 3 illustrates the noise reduction achieved by the bandpass filter. The subjects had an average weight of 73.20 kg (±9.2 kg), an average height of 174.6 cm (±6.8 cm), an average BMI of 23.93 (±1.93), and an average age of 27.4 years (±5.1).

3.2. Feature Extraction

The features of the ECG signals were extracted using the information of the P, Q, R, S, and T peaks. For the automatic peak extraction, the Pan and Tompkins algorithm [82] was applied to the ECG signals. The features were defined based on the amplitudes, intervals, slopes, and angles of the peaks; in total, 31 features were derived from combinations of all peak points [30,31,32,33]. Figure 1 displays a typical ECG pattern with the five peaks; the extracted features are listed in Table 1. The amplitude features were extracted as ratios among the peaks.

3.3. Evaluation of BOHB-Optimized DQN Authentication Algorithm

Experiments were conducted to investigate whether the BOHB optimization of the DQN could improve the authentication performance. These experiments evaluated the stand-alone performance of the BOHB-optimized DQN model used as an RL classifier. BOHB was applied to optimize the DQN for the RL-based, costly feature selection algorithm. In this experiment, the hyperparameters in the DQN (to be optimized) included the number of layers, nodes in each layer, learning rate, optimizer, and stochastic gradient descent (SGD) momentum, as summarized in Table 2. The minimum budget of BOHB was set to one and the maximum budget to nine. Table 3 shows the number of beats obtained from each subject for training and evaluating the proposed model. During the training process, the synthetic minority oversampling technique (SMOTE) [83], an oversampling method for data augmentation, was applied to improve performance [45]. SMOTE generates a new sample from the distance between selected samples within the same group, which are chosen by applying the K-nearest neighbor (KNN) [84] algorithm. The datasets recorded from the 1st to the 5th day were used for training, and the recordings of the 6th day were used for testing. Additionally, five-fold cross-validation was used to evaluate the generalization of the trained DQN model.
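The core interpolation step of SMOTE can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the configuration used in the study: the function name `smote_sample`, the squared Euclidean distance, and `k=2` are choices made here, and the input points must be distinct list objects.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and one of its k nearest neighbors."""
    rng = rng or random.Random()
    base = rng.choice(minority)

    def squared_distance(point):
        return sum((a - b) ** 2 for a, b in zip(base, point))

    # k nearest neighbors of the base sample within the same class.
    neighbors = sorted((p for p in minority if p is not base), key=squared_distance)[:k]
    neighbor = rng.choice(neighbors)
    gap = rng.random()  # position along the segment from base to neighbor, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbor)]


# Toy minority class: four points on the unit square (made-up data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rng = random.Random(0)
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(20)]
```

Because every synthetic point lies on a segment between two existing samples of the same class, the augmented data stay inside the region already covered by that class.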

3.4. Evaluation of Costly Feature Selection Algorithms

This experiment was designed to evaluate the costly feature selection performance of the RL model. In this experiment, the proposed RL-based, costly feature selection algorithm was compared with the conventional feature selection methods, including ReliefF [59,60] and IG [61]. For this comparison, the RL algorithm model (DQN) was utilized only for the selection of costly features; the selected features were then fed into conventional classifiers for evaluation. Furthermore, an effective classifier for the authentication problem was also evaluated using support vector machines (SVMs) [85] and random forest (RF) [86]. The SVM and RF were chosen based on their promising performances in various machine learning problems, such as feature-based classification [87,88], image classification [89,90], and anomaly detection [91,92]. To evaluate the SVM and RF classifiers, the models were trained on sequentially accumulated data from the 1st to the 5th day of the 11 subjects, which served as the training and verification dataset, and personal authentication was then tested on the ECG signals recorded on the 6th day.

4. Results

4.1. Results of BOHB Optimized DQN Authentication Algorithm

Figure 4 shows the testing accuracy of the ECG-based authentication task using the optimized and non-optimized DQN models. The feature selection and authentication were performed simultaneously through the DQN models, and each training dataset was incrementally increased with the following day’s dataset. The results are the averaged authentication results across all subjects. Overall, the average accuracy of the non-optimized RL method was 95.3%, while that of the optimized method was 97.4%. In particular, note the significant improvement of the F1-score using the optimized DQN (95.2%), which was 10.9% higher than that of the non-optimized DQN (84.2%).

4.2. Results of Costly Feature Selection Algorithms

Figure 5 and Figure 6 depict the classification results, namely the accuracy and F1-score performance indices, using the SVM and RF, respectively, where the indices were averaged over 10 simulation repetitions on the test dataset. In the figures, four different costly feature selection algorithms are compared: DQN with BOHB optimization (optimized RL), DQN without any optimization process (RL), and the ReliefF and IG feature selection algorithms. For the optimized DQN classifier, the model with the highest validation accuracy was chosen, and the feature selection was conducted. The x-axis of the subplots in Figure 5 and Figure 6 displays the average number of selected features for the model's final decision, while the y-axis displays the accuracy and F1-score performance indices. In Figure 5, the optimized DQN feature selection combined with the SVM classifier outperformed the other feature selection algorithms, with accuracies of 96.5%, 97.2%, 98.1%, 98.3%, and 98.5%, and F1-scores of 75%, 75.9%, 74.8%, 84.6%, and 91.1%. The corresponding average numbers of selected costly features were approximately 3.9, 3.7, 2.9, 4.5, and 5.2.
In Figure 6, with the RF as the classifier and the optimized RL as the feature selection algorithm, the accuracy and F1-score were higher using the 1st–3rd day training dataset. With the 1st–5th days training dataset, the accuracy and F1-score of ReliefF were higher than those of the optimized DQN model, but ReliefF required more features: 6.5, 8.2, and 7.8 for ReliefF compared with 4.3, 5.2, and 6.2 for the optimized DQN.
As shown in Figure 5 and Figure 6 the “Optimized Reinforcement Learning” method proposed in this paper reported higher accuracy and F1-score when using the same number of features compared to other methods. These were the result of the model’s selection of the most-optimized features from possible combinations of ECG features, demonstrating that the model’s optimization through reinforcement learning was effective to improve the authentication task.
The equal error rate (EER) is the error rate at the operating point where the false acceptance rate (FAR) and the false rejection rate (FRR) are equal [93]. The FAR and FRR are calculated using Equation (7), where FP, TN, FN, and TP denote false positive, true negative, false negative, and true positive outcomes, respectively.
$$FAR = \frac{FP}{FP + TN}, \qquad FRR = \frac{FN}{FN + TP} \tag{7}$$
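Equation (7) and the search for the equal-error operating point can be sketched as follows. The threshold sweep is one common way to approximate the EER from match scores and is an assumption here, as are the function names and the toy scores; the source does not state how the EER was computed in practice.

```python
def far_frr(fp, tn, fn, tp):
    """False acceptance rate and false rejection rate from confusion counts, as in Equation (7)."""
    return fp / (fp + tn), fn / (fn + tp)


def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep decision thresholds and return (EER estimate, threshold) where FAR and FRR are closest."""
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        fp = sum(s >= threshold for s in impostor_scores)   # impostors accepted
        tn = len(impostor_scores) - fp
        fn = sum(s < threshold for s in genuine_scores)     # genuine users rejected
        tp = len(genuine_scores) - fn
        far, frr = far_frr(fp, tn, fn, tp)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Toy similarity scores (made-up values): higher means "more likely the claimed user".
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
```

For these toy scores, a threshold of 0.6 accepts one of four impostors and rejects one of four genuine attempts, so FAR and FRR are both 0.25 and the estimated EER is 25%.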
Figure 7 illustrates the EER results of all combinations among the costly feature selection algorithms and classifiers. The x-axis of the subplot in Figure 7 displays the average number of selected features for the model’s final decision, while the y-axis displays the EER value. The best performance with the fewest number of features and lowest EER could be obtained using the optimized DQN with SVM using the training dataset recorded from the 1st to the 3rd day, that is using approximately three features and an EER of 4.7%. Although the lowest EER was obtained with ReliefF and RF using all five-day training datasets, more features (1.5-times) were used than those of the second-best method (see Figure 7e).

5. Discussion

In this study, a personal authentication task was conducted based on ECG signals recorded for 6 days. The ReliefF and information gain algorithms are representative conventional feature selection methods, which are simpler than the optimized and pure RL methods. Although their accuracy incrementally improved, as shown in Figure 5 and Figure 6, they were not reliable in terms of the F1-score and equal error rate. The optimized model based on the method proposed in this study yielded higher performance than the conventional feature selection approaches. This suggests that the ECG signal is a feasible basis for implementing a biometric authentication system in daily life.
The optimized RL using BOHB produced the most-efficient and best performance in selecting costly features compared with other conventional methods, as proven by the accuracy, F1-score, and EER outcomes of the authentication tasks. This also proved the effectiveness of the model optimization process, commonly referred to as the automatic machine learning (Auto ML) process [94], based on the feature selection tasks using the RL algorithms. The results in Figure 5 and Figure 6 show that the proposed costly feature selection method could yield different performances depending on the classifier of the machine learning algorithm. The proposed approach was clearly improved by the SVM model compared with the RF model. This implies the optimal combination of the costly feature selection method and the classifier for the ECG-based authentication task. The RL model performed best with the two machine learning classifiers, thus implying that the costly feature selection method proposed in this study could be optimal in yielding the improvement of the authentication performance. This could be supported by the various optimization studies based on the RL algorithms, which typically perform better than other approaches [8,48,95].
It is noted that the suggested feature selection method outperformed the others (see Figure 5 and Figure 6). In particular, from the perspective of the F1-score, the non-optimized RL model yielded similar results to the other traditional feature selection methods, while the optimized BOHB-based model yielded improved results. This trend may indicate that the optimized DQN model could select significant features even with a small number of datasets. In addition, the proposed model selected a relatively small number of features compared with those selected by the other methods. During the learning process of the RL algorithm, the received rewards decreased as the number of learning episodes increased; this resulted in an automatic termination of the feature selection process at the appropriate level of training, while the traditional models require stopping the selection of features manually based on the experience of the model designer. This automatic stopping property of the RL algorithm could provide an efficient approach to saving learning and resources.
Figure 8 displays, for each feature, the number of subjects for whom that feature was selected by the optimized RL as the training dataset increased. Note that some specific features, such as the “QS slope”, were selected for more subjects than the others as the training data increased. The model that selected the QS slope using the training dataset recorded from the 1st to 5th days produced the best accuracy, F1-score, and EER. Additionally, the selection of the QR slope, RT amplitude, and PT amplitude gradually increased as more training datasets were included.
We recorded ECG data from the subjects for six days. Among them, the data from the 6th day were used as the test dataset and were evaluated based on various feature selection and classification algorithms. Among the various feature selection algorithms, the BOHB-optimized DQN algorithm produced the most-improved results when combined with the SVM model. When there were adequate data for the training, the accuracy converged to values greater than 90%. The results obtained with the optimal number of features suggest that the ECG-based personal authentication model could be implemented with a compact structure in edge devices, such as smartwatches and mobile devices. To demonstrate this implementation, the machine-learning-based classifiers (SVM, RF) used in this paper were run on a Raspberry Pi 4 board, confirming that about 1000 heartbeats can be processed in less than 10 seconds. This is possible because we optimized the costly features and classified them using relatively lightweight classifiers, rather than optimizing complex neural networks. Thus, this study demonstrated that this personal authentication model could be utilized in various embedded equipment types or low-power environments.

6. Conclusions

In this study, an RL-based personal authentication model and its optimization were proposed. These yielded significant performance enhancements compared with the conventional methods. Furthermore, they can be applied to various embedded systems using machine learning classifiers with relatively low resource consumption, such as the SVM and RF algorithms. In a follow-up study, the proposed model will be investigated further to identify the physiological meanings of the ECG features, such as the QS slope and RR interval, when used for personal authentication purposes.

Author Contributions

Conceptualization, S.B., H.Y., J.K. and C.P.; methodology, J.K.; software, J.K. and S.B.; validation, S.B., J.K. and H.Y.; formal analysis, J.K.; investigation, J.K. and C.P.; resources, H.Y. and G.Y.; data curation, H.Y., G.Y. and Y.C.; writing, S.B. and J.K.; visualization, J.K.; supervision, C.P., I.S. and Y.C.; review and editing, C.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2017R1A5A1015596) and the Technology Innovation Program (RS-2022-00154678, Development of Intelligent Sensor Platform Technology for Connected Sensor) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea), and the Excellent Researcher Support Project of Kwangwoon University in 2022.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and was approved by the Institutional Review Board of Kwangwoon University (IRB No. 7001546-20200102-HR(SB)-001-03, 17 January 2020).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data are not publicly available due to ethical issues.

Acknowledgments

We would like to thank Peter Flach for his help in the pursuit and completion of this study.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of the data; in the writing of the manuscript; nor in the decision to publish the results.

References

  1. Alaba, F.A.; Othman, M.; Hashem, I.A.T.; Alotaibi, F. Internet of Things security: A survey. J. Netw. Comput. Appl. 2017, 88, 10–28. [Google Scholar] [CrossRef]
  2. Wang, C.; Wang, Y.; Chen, Y.; Liu, H.; Liu, J. User authentication on mobile devices: Approaches, threats and trends. Comput. Netw. 2020, 170, 107118. [Google Scholar] [CrossRef]
  3. Sicari, S.; Rizzardi, A.; Grieco, L.A.; Coen-Porisini, A. Security, privacy and trust in Internet of Things: The road ahead. Comput. Netw. 2015, 76, 146–164. [Google Scholar] [CrossRef]
  4. Kumar, J.S.; Patel, D.R. A survey on internet of things: Security and privacy issues. Int. J. Comput. Appl. 2014, 90, 20–26. [Google Scholar]
  5. Sandhu, R.; Samarati, P. Authentication, access control, and audit. ACM Comput. Surv. (CSUR) 1996, 28, 241–243. [Google Scholar] [CrossRef]
  6. Jain, A.K.; Ross, A.; Prabhakar, S. An introduction to biometric recognition. IEEE Trans. Circuits Syst. Video Technol. 2004, 14, 4–20. [Google Scholar] [CrossRef] [Green Version]
  7. O’Gorman, L. Comparing passwords, tokens, and biometrics for user authentication. Proc. IEEE 2003, 91, 2021–2040. [Google Scholar] [CrossRef]
  8. Frischholz, R.W.; Dieckmann, U. BiolD: A multimodal biometric identification system. Computer 2000, 33, 64–68. [Google Scholar] [CrossRef] [Green Version]
  9. Unar, J.; Seng, W.C.; Abbasi, A. A review of biometric technology along with trends and prospects. Pattern Recognit. 2014, 47, 2673–2688. [Google Scholar] [CrossRef]
  10. Pankanti, S.; Bolle, R.M.; Jain, A. Biometrics: The future of identification [guest eeditors’ introduction]. Computer 2000, 33, 46–49. [Google Scholar] [CrossRef]
  11. Jain, A.K.; Ross, A.; Pankanti, S. Biometrics: A tool for information security. IEEE Trans. Inf. Forensics Secur. 2006, 1, 125–143. [Google Scholar] [CrossRef]
  12. De Luis-Garcia, R.; Alberola-Lopez, C.; Aghzout, O.; Ruiz-Alzola, J. Biometric identification systems. Signal Process. 2003, 83, 2539–2557. [Google Scholar] [CrossRef]
  13. Prabhakar, S.; Pankanti, S.; Jain, A.K. Biometric recognition: Security and privacy concerns. IEEE Secur. Priv. 2003, 1, 33–42. [Google Scholar] [CrossRef]
  14. Maltoni, D.; Maio, D.; Jain, A.K.; Prabhakar, S. Handbook of Fingerprint Recognition; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  15. Van der Putte, T.; Keuning, J. Biometrical fingerprint recognition: Don’t get your fingers burned. In Smart Card Research and Advanced Applications; Springer: Berlin/Heidelberg, Germany, 2000; pp. 289–303. [Google Scholar]
  16. Gupta, P.; Behera, S.; Vatsa, M.; Singh, R. On iris spoofing using print attack. In Proceedings of the 2014 22nd International Conference on Pattern Recognition, Stockholm, Sweden, 24–28 August 2014; pp. 1681–1686. [Google Scholar]
  17. Tolosana, R.; Vera-Rodriguez, R.; Fierrez, J.; Morales, A.; Ortega-Garcia, J. Deepfakes and beyond: A survey of face manipulation and fake detection. arXiv 2020, arXiv:2001.00179. [Google Scholar] [CrossRef]
  18. Singh, Y.N.; Singh, S.K. Vitality detection from biometrics: State-of-the-art. In Proceedings of the 2011 World Congress on Information and Communication Technologies, Mumbai, India, 11–14 December 2011; pp. 106–111. [Google Scholar]
  19. Odinaka, I.; Lai, P.H.; Kaplan, A.D.; O’Sullivan, J.A.; Sirevaag, E.J.; Rohrbaugh, J.W. ECG biometric recognition: A comparative analysis. IEEE Trans. Inf. Forensics Secur. 2012, 7, 1812–1824. [Google Scholar] [CrossRef]
  20. Singh, Y.N.; Singh, S.K.; Ray, A.K. Bioelectrical signals as emerging biometrics: Issues and challenges. ISRN Signal Process. 2012, 2012, 712032. [Google Scholar] [CrossRef] [Green Version]
  21. Li, S.Z. Encyclopedia of Biometrics: I-Z; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009; Volume 2. [Google Scholar]
  22. Karimian, N.; Wortman, P.A.; Tehranipoor, F. Evolving authentication design considerations for the internet of biometric things (IoBT). In Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, Pittsburgh, PA, USA, 1–7 October 2016; pp. 1–10. [Google Scholar]
  23. Hoekema, R.; Uijen, G.J.; Van Oosterom, A. Geometrical aspects of the interindividual variability of multilead ECG recordings. IEEE Trans. Biomed. Eng. 2001, 48, 551–559. [Google Scholar] [CrossRef]
  24. Van Oosterom, A.; Hoekema, R.; Uijen, G. Geometrical factors affecting the interindividual variability of the ECG and the VCG. J. Electrocardiol. 2000, 33, 219–227. [Google Scholar] [CrossRef] [Green Version]
  25. Green, L.S.; Lux, R.L.; Haws, C.W.; Williams, R.R.; Hunt, S.C.; Burgess, M.J. Effects of age, sex, and body habitus on QRS and ST-T potential maps of 1100 normal subjects. Circulation 1985, 71, 244–253. [Google Scholar] [CrossRef] [Green Version]
  26. Frank, S.; Colliver, J.A.; Frank, A. The electrocardiogram in obesity: Statistical analysis of 1029 patients. J. Am. Coll. Cardiol. 1986, 7, 295–299. [Google Scholar] [CrossRef] [Green Version]
  27. Labati, R.D.; Muñoz, E.; Piuri, V.; Sassi, R.; Scotti, F. Deep-ECG: Convolutional neural networks for ECG biometric recognition. Pattern Recognit. Lett. 2019, 126, 78–85. [Google Scholar] [CrossRef]
  28. Zhang, Q.; Zhou, D.; Zeng, X. HeartID: A multiresolution convolutional neural network for ECG-based biometric human identification in smart health applications. IEEE Access 2017, 5, 11805–11816. [Google Scholar] [CrossRef]
  29. Hammad, M.; Zhang, S.; Wang, K. A novel two-dimensional ECG feature extraction and classification algorithm based on convolution neural network for human authentication. Future Gener. Comput. Syst. 2019, 101, 180–196. [Google Scholar] [CrossRef]
  30. Biel, L.; Pettersson, O.; Philipson, L.; Wide, P. ECG analysis: A new approach in human identification. IEEE Trans. Instrum. Meas. 2001, 50, 808–812. [Google Scholar] [CrossRef] [Green Version]
  31. Singh, Y.N.; Gupta, P. Biometrics method for human identification using electrocardiogram. In Proceedings of the International Conference on Biometrics, Alghero, Italy, 2–5 June 2009; pp. 1270–1279. [Google Scholar]
  32. Israel, S.A.; Irvine, J.M.; Cheng, A.; Wiederhold, M.D.; Wiederhold, B.K. ECG to identify individuals. Pattern Recognit. 2005, 38, 133–142. [Google Scholar] [CrossRef]
  33. Arteaga-Falconi, J.S.; Al Osman, H.; El Saddik, A. ECG authentication for mobile devices. IEEE Trans. Instrum. Meas. 2015, 65, 591–600. [Google Scholar] [CrossRef]
  34. Wübbeler, G.; Stavridis, M.; Kreiseler, D.; Bousseljot, R.D.; Elster, C. Verification of humans using the electrocardiogram. Pattern Recognit. Lett. 2007, 28, 1172–1175. [Google Scholar] [CrossRef]
  35. Shen, T.W.; Tompkins, W.; Hu, Y. One-lead ECG for identity verification. In Proceedings of the Second Joint 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society Engineering in Medicine and Biology, Houston, TX, USA, 23–26 October 2002; Volume 1, pp. 62–63. [Google Scholar]
  36. Gutta, S.; Cheng, Q. Joint feature extraction and classifier design for ECG-based biometric recognition. IEEE J. Biomed. Health Inform. 2015, 20, 460–468. [Google Scholar] [CrossRef]
  37. Odinaka, I.; Lai, P.H.; Kaplan, A.D.; O’Sullivan, J.A.; Sirevaag, E.J.; Kristjansson, S.D.; Sheffield, A.K.; Rohrbaugh, J.W. ECG biometrics: A robust short-time frequency analysis. In Proceedings of the 2010 IEEE International Workshop on Information Forensics and Security, Seattle, WA, USA, 12–15 December 2010; pp. 1–6. [Google Scholar]
  38. Chan, A.D.; Hamdy, M.M.; Badre, A.; Badee, V. Wavelet distance measure for person identification using electrocardiograms. IEEE Trans. Instrum. Meas. 2008, 57, 248–253. [Google Scholar] [CrossRef]
  39. Liau, H.F.; Isa, D. Feature selection for support vector machine-based face-iris multimodal biometric system. Expert Syst. Appl. 2011, 38, 11105–11111. [Google Scholar] [CrossRef]
  40. Sun, Z.; Wang, L.; Tan, T. Ordinal feature selection for iris and palmprint recognition. IEEE Trans. Image Process. 2014, 23, 3922–3934. [Google Scholar] [CrossRef]
  41. Farmanbar, M.; Toygar, Ö. Feature selection for the fusion of face and palmprint biometrics. Signal Image Video Process. 2016, 10, 951–958. [Google Scholar] [CrossRef]
  42. Patro, K.K.; Jaya Prakash, A.; Jayamanmadha Rao, M.; Rajesh Kumar, P. An efficient optimized feature selection with machine learning approach for ECG biometric recognition. IETE J. Res. 2020, 68, 2743–2754. [Google Scholar] [CrossRef]
  43. Sung, D.; Kim, J.; Koh, M.; Park, K. ECG authentication in post-exercise situation. In Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju Island, Korea, 11–15 July 2017; pp. 446–449. [Google Scholar]
  44. Hwang, H.B.; Kwon, H.; Chung, B.; Lee, J.; Kim, I.Y. ECG authentication based on non-linear normalization under various physiological conditions. Sensors 2021, 21, 6966. [Google Scholar] [CrossRef]
  45. Kim, J.; Yang, G.; Kim, J.; Lee, S.; Kim, K.K.; Park, C. Efficiently Updating ECG-Based Biometric Authentication Based on Incremental Learning. Sensors 2021, 21, 1568. [Google Scholar] [CrossRef]
  46. Janisch, J.; Pevnỳ, T.; Lisỳ, V. Classification with costly features using deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 3959–3966. [Google Scholar]
  47. Gottesman, O.; Johansson, F.; Komorowski, M.; Faisal, A.; Sontag, D.; Doshi-Velez, F.; Celi, L.A. Guidelines for reinforcement learning in healthcare. Nat. Med. 2019, 25, 16–18. [Google Scholar] [CrossRef]
  48. Seok, W.; Yeo, M.; You, J.; Lee, H.; Cho, T.; Hwang, B.; Park, C. Optimal feature search for vigilance estimation using deep reinforcement learning. Electronics 2020, 9, 142. [Google Scholar] [CrossRef] [Green Version]
  49. Dulac-Arnold, G.; Mankowitz, D.; Hester, T. Challenges of real-world reinforcement learning. arXiv 2019, arXiv:1904.12901. [Google Scholar]
  50. Espeholt, L.; Soyer, H.; Munos, R.; Simonyan, K.; Mnih, V.; Ward, T.; Doron, Y.; Firoiu, V.; Harley, T.; Dunning, I.; et al. Impala: Scalable distributed deep-rl with importance weighted actor-learner architectures. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 1407–1416. [Google Scholar]
  51. Mao, H.; Alizadeh, M.; Menache, I.; Kandula, S. Resource management with deep reinforcement learning. In Proceedings of the 15th ACM Workshop on Hot Topics in Networks, Atlanta, GA, USA, 9–10 November 2016; pp. 50–56. [Google Scholar]
  52. Kiran, B.R.; Sobh, I.; Talpaert, V.; Mannion, P.; Sallab, A.A.A.; Yogamani, S.; Pérez, P. Deep reinforcement learning for autonomous driving: A survey. arXiv 2020, arXiv:2002.00444. [Google Scholar] [CrossRef]
  53. Rasoul, S.; Adewole, S.; Akakpo, A. Feature selection using reinforcement learning. arXiv 2021, arXiv:2101.09460. [Google Scholar]
  54. Liu, D.R.; Li, H.L.; Wang, D. Feature selection and feature learning for high-dimensional batch reinforcement learning: A survey. Int. J. Autom. Comput. 2015, 12, 229–242. [Google Scholar] [CrossRef]
  55. Fan, W.; Liu, K.; Liu, H.; Ge, Y.; Xiong, H.; Fu, Y. Interactive reinforcement learning for feature selection with decision tree in the loop. IEEE Trans. Knowl. Data Eng. 2021, 35, 1624–1636. [Google Scholar] [CrossRef]
  56. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing atari with deep reinforcement learning. arXiv 2013, arXiv:1312.5602. [Google Scholar]
  57. Lample, G.; Chaplot, D.S. Playing FPS games with deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31. [Google Scholar]
  58. Falkner, S.; Klein, A.; Hutter, F. BOHB: Robust and Efficient Hyperparameter Optimization at Scale. In Proceedings of the 35th International Conference on Machine Learning; Dy, J., Krause, A., Eds.; PMLR: Stockholm, Sweden, 2018; Volume 80, pp. 1437–1446. [Google Scholar]
  59. Robnik-Šikonja, M.; Kononenko, I. Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 2003, 53, 23–69. [Google Scholar] [CrossRef]
  60. Urbanowicz, R.J.; Meeker, M.; La Cava, W.; Olson, R.S.; Moore, J.H. Relief-based feature selection: Introduction and review. J. Biomed. Inform. 2018, 85, 189–203. [Google Scholar] [CrossRef]
  61. Mitchell, T. Introduction to machine learning. Mach. Learn. 1997, 7, 2–5. [Google Scholar]
  62. Sun, G.; Li, J.; Dai, J.; Song, Z.; Lang, F. Feature selection for IoT based on maximal information coefficient. Future Gener. Comput. Syst. 2018, 89, 606–616. [Google Scholar] [CrossRef]
  63. Lin, Y.; Zhu, X.; Zheng, Z.; Dou, Z.; Zhou, R. The individual identification method of wireless device based on dimensionality reduction and machine learning. J. Supercomput. 2019, 75, 3010–3027. [Google Scholar] [CrossRef]
  64. Memon, M.H.; Li, J.P.; Haq, A.U.; Memon, M.H.; Zhou, W. Breast cancer detection in the IOT health environment using modified recursive feature selection. Wirel. Commun. Mob. Comput. 2019, 2019, 5176705. [Google Scholar] [CrossRef] [Green Version]
  65. Venkatesh, B.; Anuradha, J. A review of feature selection and its methods. Cybern. Inf. Technol. 2019, 19, 3–26. [Google Scholar] [CrossRef] [Green Version]
  66. Okafor, N.U.; Alghorani, Y.; Delaney, D.T. Improving Data Quality of Low-cost IoT Sensors in Environmental Monitoring Networks Using Data Fusion and Machine Learning Approach. ICT Express 2020, 6, 220–228. [Google Scholar] [CrossRef]
  67. Jha, R.; Bhattacharjee, V.; Mustafi, A. IoT in Healthcare: A Big Data Perspective. In Smart Healthcare Analytics in IoT Enabled Environment; Springer: Berlin/Heidelberg, Germany, 2020; pp. 201–211. [Google Scholar]
  68. Sutton, R.S.; Barto, A.G. Introduction to Reinforcement Learning; MIT Press: Cambridge, UK, 1998; Volume 135. [Google Scholar]
  69. Watkins, C.J.; Dayan, P. Q-learning. Mach. Learn. 1992, 8, 279–292. [Google Scholar] [CrossRef]
  70. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef]
  71. Kaelbling, L.P.; Littman, M.L.; Cassandra, A.R. Planning and acting in partially observable stochastic domains. Artif. Intell. 1998, 101, 99–134. [Google Scholar] [CrossRef] [Green Version]
  72. Wang, Q.; Guo, Y.; Yu, L.; Chen, X.; Li, P. Deep Q-network-based feature selection for multisourced data cleaning. IEEE Internet Things J. 2020, 8, 16153–16164. [Google Scholar] [CrossRef]
  73. Shahriari, B.; Swersky, K.; Wang, Z.; Adams, R.P.; De Freitas, N. Taking the human out of the loop: A review of Bayesian optimization. Proc. IEEE 2015, 104, 148–175. [Google Scholar] [CrossRef] [Green Version]
  74. Feurer, M.; Hutter, F. Hyperparameter optimization. In Automated Machine Learning; Springer: Cham, Switzerland, 2019; pp. 3–33. [Google Scholar]
  75. Snoek, J.; Larochelle, H.; Adams, R.P. Practical bayesian optimization of machine learning algorithms. Adv. Neural Inf. Process. Syst. 2012, 25, 2951–2959. [Google Scholar]
  76. Li, L.; Jamieson, K.; DeSalvo, G.; Rostamizadeh, A.; Talwalkar, A. Hyperband: A novel bandit-based approach to hyperparameter optimization. J. Mach. Learn. Res. 2017, 18, 6765–6816. [Google Scholar]
  77. Bergstra, J.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for Hyper-Parameter Optimization. In Proceedings of the Advances in Neural Information Processing Systems 24 (NIPS 2011), Granada, Spain, 12–15 December 2011; Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., Weinberger, K.Q., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2011; Volume 24, pp. 2546–2554. [Google Scholar]
  78. García-González, M.A.; Argelagós-Palau, A.; Fernández-Chimeno, M.; Ramos-Castro, J. A comparison of heartbeat detectors for the seismocardiogram. In Proceedings of the Computing in Cardiology 2013, Zaragoza, Spain, 22–25 September 2013; pp. 461–464. [Google Scholar]
  79. Tawfik, M.M.; Kamal, H.S.T. Human identification using QT signal and QRS complex of the ECG. Online J. Electron. Electr. Eng. (OJEEE) 2011, 3, 1–5. [Google Scholar]
  80. Singh, B.; Singh, P.; Budhiraja, S. Various approaches to minimise noises in ECG signal: A survey. In Proceedings of the 2015 Fifth International Conference on Advanced Computing & Communication Technologies, Rohtak, India, 21–22 February 2015; pp. 131–137. [Google Scholar]
  81. Hammad, M.; Pławiak, P.; Wang, K.; Acharya, U.R. ResNet-Attention model for human authentication using ECG signals. Expert Syst. 2020, 38, e12547. [Google Scholar] [CrossRef]
  82. Pan, J.; Tompkins, W.J. A Real-Time QRS Detection Algorithm. IEEE Trans. Biomed. Eng. 1985, BME-32, 230–236. [Google Scholar] [CrossRef] [PubMed]
  83. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
  84. Peterson, L.E. K-nearest neighbor. Scholarpedia 2009, 4, 1883. [Google Scholar] [CrossRef]
  85. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Appl. 1998, 13, 18–28. [Google Scholar] [CrossRef] [Green Version]
  86. Ho, T.K. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 832–844. [Google Scholar]
  87. Bazi, Y.; Melgani, F. Toward an optimal SVM classification system for hyperspectral remote sensing images. IEEE Trans. Geosci. Remote Sens. 2006, 44, 3374–3385. [Google Scholar] [CrossRef]
  88. Pal, M.; Foody, G.M. Feature selection for classification of hyperspectral data by SVM. IEEE Trans. Geosci. Remote Sens. 2010, 48, 2297–2307. [Google Scholar] [CrossRef] [Green Version]
  89. Foody, G.M.; Mathur, A. A relative evaluation of multiclass image classification by support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1335–1343. [Google Scholar] [CrossRef] [Green Version]
  90. Bosch, A.; Zisserman, A.; Munoz, X. Image classification using random forests and ferns. In Proceedings of the 2007 IEEE 11th International Conference on Computer Vision, Rio De Janeiro, Brazil, 14–21 October 2007; pp. 1–8. [Google Scholar]
  91. Patcha, A.; Park, J.M. An overview of anomaly detection techniques: Existing solutions and latest technological trends. Comput. Netw. 2007, 51, 3448–3470. [Google Scholar] [CrossRef]
  92. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. (CSUR) 2009, 41, 1–58. [Google Scholar] [CrossRef]
  93. Malik, J.; Girdhar, D.; Dahiya, R.; Sainarayanan, G. Reference threshold calculation for biometric authentication. IJ Image Graph. Signal Process. 2014, 2, 46–53. [Google Scholar] [CrossRef]
  94. He, X.; Zhao, K.; Chu, X. AutoML: A survey of the state-of-the-art. Knowl.-Based Syst. 2021, 212, 106622. [Google Scholar] [CrossRef]
  95. Mazyavkina, N.; Sviridov, S.; Ivanov, S.; Burnaev, E. Reinforcement learning for combinatorial optimization: A survey. Comput. Oper. Res. 2021, 134, 105400. [Google Scholar] [CrossRef]
Figure 1. ECG feature extraction. Features were extracted from the amplitudes, intervals, angles, and slopes of the P, Q, R, S, and T peaks and the combinations of their peak points.
Figure 2. Costly feature selection and classification model based on the reinforcement learning algorithm. $Q_{class}$ denotes the optional action with which the feature selection and classification are performed. Without $Q_{class}$, only the feature selection task is conducted.
Figure 3. Noise-canceled electrocardiogram (ECG) signal obtained using an FIR filter. Note that the noisy components in the raw ECG signal (blue) are attenuated to derive the actual signal (red).
Figure 4. Testing results of RL based on the costly feature selection algorithm calculated using the accuracy (upper) and F1-score (lower) and plotted as a function of tested days.
Figure 5. Support vector machine (SVM) results using the costly features. The accuracy and F1-score values across all subjects are illustrated with their variances in shades (corresponding to the training days). Each training dataset includes the electrocardiogram (ECG) signals recorded on the (a,b) 1st day, (c,d) 1st–2nd days, (e,f) 1st–3rd days, (g,h) 1st–4th days, and (i,j) 1st–5th days.
Figure 6. Random forest (RF) results using the costly features. The accuracy and F1-score values across all subjects are illustrated with their variances in shades (corresponding to the training days). Each training dataset includes the ECG signals recorded on the (a,b) 1st day, (c,d) 1st–2nd days, (e,f) 1st–3rd days, (g,h) 1st–4th days, and (i,j) 1st–5th days.
Figure 7. Equal error rate (EER) results using the costly features with SVM or RF. The EER values across all subjects are illustrated with their variances in shades (corresponding to the training days). Each training dataset includes the ECG signals recorded on the (a) 1st day, (b) 1st–2nd days, (c) 1st–3rd days, (d) 1st–4th days, and (e) 1st–5th days.
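For reference, the EER is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below shows one common way to estimate it from match scores by sweeping a decision threshold; it is not necessarily the procedure used for Figure 7, and the function `equal_error_rate` and the score values are hypothetical.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER by sweeping a threshold over all observed scores.

    genuine:  scores of legitimate attempts (higher means more likely accepted)
    impostor: scores of impostor attempts
    Returns (eer, threshold) at the point where |FAR - FRR| is smallest.
    """
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        # A sample is accepted when its score is at least the threshold.
        frr = sum(score < threshold for score in genuine) / len(genuine)
        far = sum(score >= threshold for score in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Made-up scores for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.3, 0.5, 0.2]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With a finite set of scores the two error rates rarely cross exactly, so the sketch reports the average of FAR and FRR at the closest point; interpolating between thresholds is another reasonable choice.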
Figure 8. Feature selection results based on optimized RL.
Table 1. Features extracted from an electrocardiogram (ECG) signal.

| Category | Features |
| --- | --- |
| Amplitude | R–P Amplitude, R–S Amplitude, R–T Amplitude, P–S Amplitude, P–T Amplitude, S–T Amplitude, R–Q Amplitude, Q–T Amplitude, Q–S Amplitude, P–Q Amplitude |
| Interval | R–P Interval, R–Q Interval, R–S Interval, R–T Interval, P–Q Interval, P–S Interval, P–T Interval, Q–S Interval, Q–T Interval, S–T Interval, R–R Interval, R–T Interval |
| Slope | P–R Slope, R–S Slope, S–T Slope, Q–R Slope, P–Q Slope, Q–S Slope |
| Angle | Q Angle, R Angle, S Angle |
Table 2. Search space of the DQN hyperparameters to be optimized using the BOHB algorithm.

| Hyperparameter | Min | Max | Default |
| --- | --- | --- | --- |
| Number of layers | 1 | 4 | 2 |
| Number of nodes in each layer | 16 | 64 | 32 |
| Learning rate | 0.001 | 0.1 | 0.01 |
| Optimizer | one of Adam, SGD, RMSprop | | |
| SGD momentum | 0 | 0.99 | 0.9 |
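The search space in Table 2 can be written down as a small configuration object from which candidate DQN settings are drawn. The sketch below uses plain random sampling from the Python standard library and does not implement BOHB itself; the names `SEARCH_SPACE` and `sample_config`, the log-uniform sampling of the learning rate, and the choice to sample momentum only when the optimizer is SGD are assumptions made for illustration.

```python
import math
import random

# Ranges copied from Table 2; the dictionary layout itself is hypothetical.
SEARCH_SPACE = {
    "num_layers": {"min": 1, "max": 4, "default": 2},
    "nodes_per_layer": {"min": 16, "max": 64, "default": 32},
    "learning_rate": {"min": 0.001, "max": 0.1, "default": 0.01},
    "optimizer": {"choices": ["Adam", "SGD", "RMSprop"]},
    "sgd_momentum": {"min": 0.0, "max": 0.99, "default": 0.9},
}


def sample_config(rng):
    """Draw one random configuration from SEARCH_SPACE."""
    layers = SEARCH_SPACE["num_layers"]
    nodes = SEARCH_SPACE["nodes_per_layer"]
    lr = SEARCH_SPACE["learning_rate"]
    config = {
        "num_layers": rng.randint(layers["min"], layers["max"]),
        "nodes_per_layer": rng.randint(nodes["min"], nodes["max"]),
        # Assumption: learning rates are sampled on a log scale.
        "learning_rate": 10 ** rng.uniform(math.log10(lr["min"]), math.log10(lr["max"])),
        "optimizer": rng.choice(SEARCH_SPACE["optimizer"]["choices"]),
    }
    # Assumption: momentum is only meaningful when SGD is chosen.
    if config["optimizer"] == "SGD":
        momentum = SEARCH_SPACE["sgd_momentum"]
        config["sgd_momentum"] = rng.uniform(momentum["min"], momentum["max"])
    return config


rng = random.Random(42)
candidates = [sample_config(rng) for _ in range(5)]
for candidate in candidates:
    print(candidate)
```

A BOHB run would replace the uniform draws with a model-based sampler and evaluate candidates under increasing training budgets, but the search space it explores would have the same shape.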
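Table 3. The number of beat data for each subject from Day 1 to Day 6.

| Subject No. | Day 1 | Day 2 | Day 3 | Day 4 | Day 5 | Day 6 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 3055 | 3037 | 3067 | 3071 | 3132 | 2931 |
| 2 | 2839 | 3313 | 3294 | 3231 | 3181 | 3075 |
| 3 | 2923 | 3150 | 2949 | 3606 | 2962 | 2805 |
| 4 | 3130 | 2925 | 3339 | 3075 | 2832 | 3099 |
| 5 | 3021 | 3423 | 3129 | 2982 | 3034 | 3399 |
| 6 | 3622 | 3399 | 3131 | 2931 | 3178 | 3147 |
| 7 | 3093 | 2955 | 3172 | 3062 | 3264 | 3414 |
| 8 | 3303 | 3321 | 3284 | 2751 | 2931 | 3255 |
| 9 | 2667 | 2771 | 3034 | 2806 | 2994 | 3063 |
| 10 | 3007 | 3377 | 2898 | 3131 | 2985 | 3327 |
| 11 | 3429 | 3693 | 3281 | 3522 | 3367 | 3491 |
| Total | 34,089 | 35,364 | 34,578 | 34,168 | 33,860 | 35,006 |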
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Baek, S.; Kim, J.; Yu, H.; Yang, G.; Sohn, I.; Cho, Y.; Park, C. Intelligent Feature Selection for ECG-Based Personal Authentication Using Deep Reinforcement Learning. Sensors 2023, 23, 1230. https://doi.org/10.3390/s23031230
