Abstract
Living beings are subjected to many hazards during the course of their lives. Owing to its high mortality rate, heart disease (HD) is among the leading hazards. It is one of the world’s most critical diseases because of its complex diagnosis and expensive treatment. It has severely affected the health care sectors of developing as well as developed countries. Inadequate preventive measures, diagnostic shortcomings, inefficient medical support, and a lack of medical staff and advancements have had severe impacts on developing countries. This paper presents the state of the art of intelligent solutions for HD detection, together with an empirical analysis of machine learning algorithms on an electrocardiogram-based arrhythmia dataset for disease detection. A critical investigation is performed using eight machine learning algorithms (Support Vector Machine, K-Nearest Neighbors, Random Forest, Extra Tree, Bagging, Decision Tree, Linear Regression, and Adaptive Boosting) under imbalanced and balanced class paradigms. The performance of these algorithms is evaluated with four metrics: precision, recall, accuracy, and F1-score. The empirical analysis offers an interesting insight into the structure of the dataset. Initially, in the imbalanced binary-class setting, the majority class is classified more accurately than the minority class because the training dataset contains many more majority-class tuples than minority-class tuples. The paper uses the Synthetic Minority Over-sampling Technique (SMOTE) for data balancing. Balancing increases not only the overall accuracy of the algorithms but also the accuracy on each individual class. Hence, the accuracy of the minority class is not sacrificed.
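The two ideas in the abstract, that overall accuracy can hide poor minority-class performance and that SMOTE balances classes by interpolating between minority samples, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `smote` and `per_class_report`, the toy labels, and the toy 2-D points are all hypothetical, and the neighbour search is a simplified brute-force version of the interpolation idea behind SMOTE.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours.
    (Hypothetical helper; a simplified sketch of the SMOTE idea.)"""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Brute-force k nearest neighbours of `base` inside the minority class.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def per_class_report(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy imbalanced labels: 9 majority (0) samples and 1 minority (1) sample.
y_true = [0] * 9 + [1]
y_pred = [0] * 10  # a model that always predicts the majority class
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
minority_scores = per_class_report(y_true, y_pred, label=1)

# Oversample a toy 2-D minority class from 4 points to 10.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
balanced_minority = minority + smote(minority, n_new=6, k=3)
```

In this toy case the always-majority model reaches 90% accuracy while its precision, recall, and F1-score on the minority class are all zero, which is why per-class metrics are reported alongside accuracy. In practice an established library implementation of SMOTE would normally be preferred over a hand-written one, and oversampling should be applied to the training split only.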
Cite this article
Ketu, S., Mishra, P.K. Empirical Analysis of Machine Learning Algorithms on Imbalance Electrocardiogram Based Arrhythmia Dataset for Heart Disease Detection. Arab J Sci Eng 47, 1447–1469 (2022). https://doi.org/10.1007/s13369-021-05972-2
DOI: https://doi.org/10.1007/s13369-021-05972-2