RETRACTED ARTICLE: Optimal feature selection through a cluster-based DT learning (CDTL) in heart disease prediction

  • Special Issue
  • Evolutionary Intelligence

This article was retracted on 12 December 2022

This article has been updated

Abstract

In rural areas, centers for treating cardiovascular disease are lacking. Due to this, around 12 million deaths worldwide are reported by the WHO. The principal cause of coronary disease is the habit of smoking. Machine learning (ML) classifiers are applied to predict the risk of cardiovascular disease. However, ML models have some inherent problems: they are sensitive to feature selection, to the choice of splitting attribute, and to prediction on imbalanced datasets. Most large datasets have multi-class labels, but the classes occur in different proportions. In this paper, we evaluate our system on the Cleveland heart disease samples from the UCI repository. Our cluster-based decision tree (DT) learning (CDTL) includes five key stages. First, the original set is partitioned by target label distribution. Second, from the high-distribution samples, the other possible class combinations are formed. Third, for each class-set combination, the significant features are identified through entropy. Fourth, using these significant features, an entropy-based partition is made. Finally, on these entropy clusters, the performance of a random forest (RF) classifier in predicting heart disease is evaluated using the significant features and using all features. With our CDTL approach, the RF classifier achieves 89.30% prediction accuracy, improved from 76.70% without CDTL. Hence, the error rate of RF with CDTL is reduced from 23.30 to 9.70%.
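
The abstract does not spell out how entropy is used to identify significant features. As a minimal sketch, assuming the standard information-gain criterion from decision tree learning (an assumption, not the authors' confirmed procedure), categorical features could be ranked as shown here. The toy rows, the feature names `chest_pain` and `smoker`, and all function names are hypothetical and are not taken from the Cleveland dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        # Collect the class labels that fall under each value of the feature.
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature, gain) pairs sorted from most to least informative."""
    gains = {f: information_gain(rows, labels, f) for f in rows[0]}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: 1 = disease present, 0 = absent.
rows = [
    {"chest_pain": "typical", "smoker": "yes"},
    {"chest_pain": "typical", "smoker": "no"},
    {"chest_pain": "none", "smoker": "yes"},
    {"chest_pain": "none", "smoker": "no"},
]
labels = [1, 1, 0, 0]

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy set, `chest_pain` separates the two classes perfectly while `smoker` does not separate them at all, so a threshold on the gain would keep only the former. Continuous attributes would first need to be discretized, which this sketch leaves out.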



Author information

Corresponding author

Correspondence to G. Magesh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article has been retracted. Please see the retraction notice for more detail: https://doi.org/10.1007/s12065-022-00807-x

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Magesh, G., Swarnalatha, P. RETRACTED ARTICLE: Optimal feature selection through a cluster-based DT learning (CDTL) in heart disease prediction. Evol. Intel. 14, 583–593 (2021). https://doi.org/10.1007/s12065-019-00336-0

  • DOI: https://doi.org/10.1007/s12065-019-00336-0
