A Safe-Region Imputation Method for Handling Medical Data with Missing Values
Abstract
1. Introduction
- (1) Developing a safe-region imputation method for handling MVs, and comparing its performance with the k-nearest neighbors and multiple imputation by chained equations (MICE) methods [3];
- (2) Using four attribute selection methods to select the important attributes, and then integrating the attributes selected by the four methods;
- (3) Collecting four open medical datasets from UCI and one international stroke trial dataset for the experiments;
- (4) Comparing the generated rules and accuracy before and after imputation, and providing the results to medical stakeholders as a reference.
2. Related Work
2.1. Medical Data Imputation
2.2. Attribute Selection
2.2.1. Correlation Attribute
2.2.2. Information Gain
2.2.3. Gain Ratio
2.2.4. ReliefF
2.3. Imputation Methods
2.3.1. kNN Imputation
2.3.2. Multiple Imputation
2.4. Classification Techniques
2.4.1. Random Forest
2.4.2. Decision Tree C4.5
2.4.3. Reduced Error Pruning Tree
2.4.4. Logistic Model Tree
3. Proposed Method
- (1) Identifying the data point area
- (2) Imputing the missing value
Algorithm 1. Safe-region imputation
    Input: c_ins (complete instances), i_ins (incomplete instances), k (number of nearest points)
    Output: new_ins (imputed instances)
    # calculate the pairwise distances among complete instances
    for ith in c_ins:
        for jth in c_ins:
            if ith != jth then
                dis[ith][jth] = distance_complete(c_ins[ith], c_ins[jth])
    set top_dis[k] as the top-k distance array
    top_dis = get_shortest(dis, k)    # keep the k minimal-distance points
    # impute the missing values
    set min_dis as the minimum-distance variable
    set min_index as the minimum-distance data point
    missTh = 0.05
    failTh = 0
    for ith in i_ins:
        for jth in top_dis:
            i_dis = distance_incomplete(i_ins[ith], c_ins[jth])
            if i_dis < min_dis or min_dis is null then
                min_dis = i_dis
                min_index = jth
    for th from 0.01 to 3.00 step 0.01:
        for ath in len(i_ins.attr):
            for ith in i_ins:
                if i_ins[ith].attr[ath] is missing then    # attribute ath is missing in instance ith
                    set score = 0
                    for jth in top_dis:
                        if top_dis[jth].attr[ath] < th then
                            score++
                    if score == 0 then
                        failTh++
                    else
                        set new_val = 0
                        for s in score:
                            new_val = new_val + c_ins[top_dis[s]].attr[ath]
                        new_val = new_val / score
                        new_ins[ith].attr[ath] = new_val
                else
                    new_ins[ith].attr[ath] = i_ins[ith].attr[ath]
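Algorithm 1 can be sketched in runnable form. The helper names below are assumptions, and the pseudocode is ambiguous about whether the threshold `th` bounds an attribute value or a distance; this sketch reads it as a distance bound that widens until at least one donor appears:

```python
import numpy as np

def safe_region_impute(complete, incomplete, k=5, thresholds=np.arange(0.01, 3.01, 0.01)):
    """Sketch of the safe-region idea: for each missing attribute, average the
    values of the k nearest complete instances that fall within a growing
    distance threshold (the "safe region"). Missing values are None."""
    complete = np.asarray(complete, dtype=float)
    imputed = [row[:] for row in incomplete]
    for row in imputed:
        # distance from this incomplete row to every complete row,
        # computed over the observed (non-missing) attributes only
        obs = [j for j, v in enumerate(row) if v is not None]
        target = np.array([row[j] for j in obs])
        dists = np.sqrt(((complete[:, obs] - target) ** 2).sum(axis=1))
        neighbors = np.argsort(dists)[:k]          # k nearest complete instances
        for a, v in enumerate(row):
            if v is not None:
                continue
            for th in thresholds:                   # widen the region until donors appear
                donors = [n for n in neighbors if dists[n] < th]
                if donors:
                    row[a] = float(np.mean(complete[donors, a]))
                    break
    return imputed
```

If no neighbor ever falls inside the largest region, the value is left missing, which corresponds to incrementing `failTh` in the pseudocode.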
4. Experiment and Results
4.1. Experimental Environment and Parameter
4.2. Experimental Datasets
- (1) Diabetes: The diabetes dataset is from 130 US hospitals and spans 10 years (1999–2008); it includes 50 attributes (including one class attribute) and 101766 records with 2273 MVs. The attributes cover factors related to diabetic patients and factors related to their hospital readmission [47].
- (2) Audiology: This is the standardized version of the original audiology dataset donated by Ross Quinlan and mainly studies data on the types of hearing disorders in patients with hearing impairment. This dataset has 69 attributes (including one class attribute) and 200 records with 291 MVs [48].
- (3) Thyroid disease: This dataset is from Ross Quinlan, Garavan Institute in Sydney, Australia [14]. This study used 2800 instances with 28 attributes (including one class attribute) and 1756 MVs as an experimental dataset.
- (4) Breast Cancer: Wolberg, Street, and Mangasarian [49] created the original breast cancer dataset (Wisconsin). The numerical attributes are computed from digitized images of fine-needle aspirates (FNA) of breast masses and describe nine characteristics of each cell nucleus (excluding the ID attribute). Hence, this study applied nine numerical attributes and one class attribute with 699 records and 16 MVs as an experimental dataset.
- (5) Stroke Disease: This dataset is from the international stroke trial database [42], which came from the largest randomized trial ever conducted in acute stroke. The original dataset has 112 attributes and 19435 instances. This study used the dataset to predict stroke deaths; hence, we deleted some irrelevant attributes and the instances of living patients, and then created two class labels (death from stroke and death from other causes). Ultimately, the experimental dataset had 69 attributes and 4239 instances with 6578 MVs.
4.3. Results of Attribute Selection
4.4. Imputation Results and Comparisons
- (1) Comparison of before and after imputation
- (2) Comparison of imputation against removing MVs for the selected attributes
4.5. Findings and Discussion
- (1) Imputation or removing missing values
- (2) Imputation after selecting attributes
- (3) Safe-region imputation
5. Conclusions and Future Research
Author Contributions
Funding
Conflicts of Interest
References
- WHO. The Top Ten Causes of Death. 2018. Available online: https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death (accessed on 10 May 2020).
- Little, R.; Rubin, D. Statistical Analysis with Missing Data; John Wiley and Sons Publishers: New York, NY, USA, 1987.
- Raghunathan, T.W.; Lepkowksi, J.M.; Van Hoewyk, J.; Solenbeger, P. A multivariate technique for multiply imputing missing values using a sequence of regression models. Surv. Methodol. 2001, 27, 85–95.
- Sterne, J.A.C.; White, I.R.; Carlin, J.B.; Spratt, M.; Royston, P.; Kenward, M.G.; Wood, A.M.; Carpenter, J.R. Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ 2009, 338, b2393.
- Purwar, A.; Singh, S.K. Hybrid prediction model with missing value imputation for medical data. Expert Syst. Appl. 2015, 42, 5621–5631.
- Bania, R.K.; Halder, A. R-Ensembler: A greedy rough set based ensemble attribute selection algorithm with kNN imputation for classification of medical data. Comput. Methods Programs Biomed. 2020, 184, 105122.
- Ozair, F.F.; Jamshed, N.; Sharma, A.; Aggarwal, P. Ethical issues in electronic health records: A general overview. Perspect. Clin. Res. 2015, 6, 73–76.
- Yelipe, U.; Porika, S.; Golla, M. An efficient approach for imputation and classification of medical data values using class-based clustering of medical records. Comput. Electr. Eng. 2018, 66, 487–504.
- Chandrashekar, G.; Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. 2014, 40, 16–28.
- Gnanambal, S.; Thangaraj, M.; Meenatchi, V.; Gayathri, V. Classification Algorithms with Attribute Selection: An evaluation study using WEKA. Int. J. Adv. Netw. Appl. 2018, 9, 3640–3644.
- Uğuz, H. A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm. Knowl. Based Syst. 2011, 24, 1024–1032.
- Lai, C.-M.; Yeh, W.-C.; Chang, C.-Y. Gene selection using information gain and improved simplified swarm optimization. Neurocomputing 2016, 218, 331–338.
- Shannon, C. A note on the concept of entropy. Bell Syst. Tech. J. 1948, 27, 379–423.
- Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
- Han, J.; Kamber, M.; Pei, J. Data Mining Concepts and Techniques, 3rd ed.; Morgan Kaufmann: Burlington, MA, USA, 2011.
- Kira, K.; Rendell, L.A. A practical approach to feature selection. In Proceedings of the Ninth International Workshop on Machine Learning; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1992; pp. 249–256.
- Zhang, M.; Ding, C.; Zhang, Y.; Nie, F. Feature selection at the discrete limit. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014.
- Cheliotis, M.; Gkerekos, C.; Lazakis, I.; Theotokatos, G. A novel data condition and performance hybrid imputation method for energy efficient operations of marine systems. Ocean Eng. 2019, 188, 106220.
- Donders, A.R.; van der Heijden, G.J.; Stijnen, T.; Moons, K.G. Review: A gentle introduction to imputation of missing values. J. Clin. Epidemiol. 2006, 59, 1087–1091.
- Enders, C.K. Applied Missing Data Analysis; Guilford Press: New York, NY, USA, 2010.
- Ghomrawi, H.M.K.; Mandl, L.A.; Rutledge, J.; Alexiades, M.M.; Mazumdar, M. Is there a role for expectation maximization imputation in addressing missing data in research using WOMAC questionnaire? Comparison to the standard mean approach and a tutorial. BMC Musculoskelet. Disord. 2011, 12, 109.
- Lin, W.-C.; Tsai, C.-F. Missing value imputation: A review and analysis of the literature (2006–2017). Artif. Intell. Rev. 2020, 53, 1487–1509.
- Cover, T.M.; Hart, P.E. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
- Batista, G.E.; Monard, M.C. An analysis of four missing data treatment methods for supervised learning. Appl. Artif. Intell. 2003, 17, 519–533.
- Zhang, S. Nearest neighbor selection for iteratively kNN imputation. J. Syst. Softw. 2012, 85, 2541–2552.
- Rubin, D.B. Multiple Imputation for Nonresponse in Surveys; Wiley: New York, NY, USA, 1987.
- van Buuren, S.; Groothuis-Oudshoorn, K. Mice: Multivariate Imputation by Chained Equations in R. J. Stat. Softw. 2011, 45, 1–67.
- Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Hall, M. Correlation-Based Feature Selection for Machine Learning; The University of Waikato: Hamilton, New Zealand, 1999.
- Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: Los Altos, CA, USA, 1993.
- Elomaa, T.; Kaariainen, M. An analysis of reduced error pruning. J. Artif. Intell. Res. 2001, 15, 163–187.
- Pham, B.T.; Prakash, I.; Singh, S.K.; Shirzadi, A.; Shahabi, H.; Bui, D.T. Landslide susceptibility modeling using Reduced Error Pruning Trees and different ensemble techniques: Hybrid machine learning approaches. Catena 2019, 175, 203–218.
- Jayanthi, S.K.; Sasikala, S. Reptree classifier for identifying link spam in web search engines. ICTACT J. Soft. Comput. 2013, 3, 498–505.
- Chen, W.; Hong, H.; Li, S.; Shahabi, H.; Wang, Y.; Wang, X.; Bin Ahmad, B. Flood susceptibility modelling using novel hybrid approach of reduced-error pruning trees with bagging and random subspace ensembles. J. Hydrol. 2019, 575, 864–873.
- Landwehr, N.; Hall, M.; Frank, E. Logistic Model Trees. Mach. Learn. 2005, 59, 161–205.
- Lee, S.; Jun, C.-H. Fast incremental learning of logistic model tree using least angle regression. Expert Syst. Appl. 2018, 97, 137–145.
- Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. Classification and Regression Trees; Wadsworth: Belmont, CA, USA, 1984.
- Cheng, C.H.; Chang, J.R.; Huang, H.H. A novel weighted distance threshold method for handling medical missing values. Comput. Biol. Med. 2020, 122, 103824.
- Sarkar, M. Fuzzy-rough nearest neighbor algorithms in classification. Fuzzy Sets Syst. 2007, 158, 2134–2152.
- Dua, D.; Graff, C. UCI Machine Learning Repository. School of Information and Computer Science, University of California. 2019. Available online: http://archive.ics.uci.edu/ml (accessed on 10 May 2020).
- Sandercock, P.A.; Niewada, M.; Członkowska, A. The International Stroke Trial database. Trials 2011, 12, 101.
- Pivato, M. Condorcet meets Bentham. J. Math. Econ. 2015, 59, 58–65.
- Rohlf, F.J.; Sokal, R.R. Statistical Tables, 3rd ed.; Freeman: New York, NY, USA, 1995.
- Moayedikia, A.; Ong, K.-L.; Boo, Y.L.; Yeoh, W.; Jensen, R. Feature selection for high dimensional imbalanced class data using harmony search. Eng. Appl. Artif. Intell. 2017, 57, 38–49.
- Sammut, C.; Webb, G.I. Encyclopedia of Machine Learning; Springer Publishing Company: Boston, MA, USA, 2010.
- Strack, B.; DeShazo, J.P.; Gennings, C.; Olmo, J.L.; Ventura, S.; Cios, K.J.; Clore, J.N. Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records. Biomed Res. Int. 2014, 2014, 1–11.
- UCI. Machine Learning Repository. 2020. Available online: https://archive.ics.uci.edu/ml/datasets/Audiology+(Standardized) (accessed on 26 July 2020).
- Wolberg, W.H.; Street, W.N.; Mangasarian, O.L. Machine learning techniques to diagnose breast cancer from fine-needle aspirates. Cancer Lett. 1994, 77, 163–171.
- Kayes, A.S.M.; Kalaria, R.; Sarker, I.H.; Islam, S.; Watters, P.A.; Ng, A.; Hammoudeh, M.; Badsha, S.; Kumara, I. A Survey of Context-Aware Access Control Mechanisms for Cloud and Fog Networks: Taxonomy and Open Research Issues. Sensors 2020, 20, 2464.
- Kayes, A.; Rahayu, W.; Watters, P.; Alazab, M.; Dillon, T.; Chang, E. Achieving security scalability and flexibility using Fog-Based Context-Aware Access Control. Future Gener. Comput. Syst. 2020, 107, 307–323.
- Chickerur, A.; Joshi, P.; Aminian, P.; Semencato, G.T.; Pournasseh, L.; Nair, P.A. Classification and Management of Personally Identifiable Data. U.S. Patent Application No. 16/252320, 26 July 2020.
Class | Safe Area | Boundary Area | Sparse Area | Outlier Area
---|---|---|---|---
cn(e), cn(i) | k ≥ 4 | 3 or 2 | 1 | 0
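The area table above maps the count of same-class nearest neighbors to a region. A minimal sketch of that mapping (the function name is assumed, not from the paper):

```python
def region_of(same_class_neighbors: int) -> str:
    """Classify a data point by how many of its k nearest neighbors
    share its class, following the safe/boundary/sparse/outlier table."""
    if same_class_neighbors >= 4:
        return "safe"
    if same_class_neighbors >= 2:   # 3 or 2 same-class neighbors
        return "boundary"
    if same_class_neighbors == 1:
        return "sparse"
    return "outlier"                # no same-class neighbor at all
```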
Imputation Method | Parameter
---|---
k Nearest Neighbor Imputation | k = {3, 5, 7, 9}
Multiple Imputation | {M = 3, Max_it = 50}
Safe Region Imputation | k = 5

Classifier | Parameter
---|---
Decision Tree (C4.5) | Confidence factor: 0.25
Random Forest | Iterations: 100
REP Tree | VarianceProp: 0.001
LMT | Boosting iterations: 2
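To make the kNN baseline in the parameter table concrete, the following is a minimal NaN-based kNN imputer with k drawn from the listed values; it is an illustrative sketch, not the implementation used in the experiments:

```python
import numpy as np

def knn_impute(X, k=5):
    """Baseline kNN imputation (sketch): fill each missing entry with the
    mean of that column over the k nearest rows, where distance is measured
    on the columns both rows observe. NaN marks a missing value."""
    X = np.asarray(X, dtype=float)
    out = X.copy()
    for i in np.argwhere(np.isnan(X).any(axis=1)).ravel():
        obs = ~np.isnan(X[i])
        # distance to every other row over the shared observed columns
        d = np.full(len(X), np.inf)
        for j in range(len(X)):
            if j == i:
                continue
            both = obs & ~np.isnan(X[j])
            if both.any():
                d[j] = np.sqrt(((X[i, both] - X[j, both]) ** 2).mean())
        nn = np.argsort(d)[:k]                     # k nearest rows
        for c in np.where(~obs)[0]:
            vals = X[nn, c]
            vals = vals[~np.isnan(vals)]           # ignore neighbors also missing c
            if vals.size:
                out[i, c] = vals.mean()
    return out
```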
Dataset | Number of Classes | Number of Attributes | Number of Instances | Number of Missing Values | Balanced Class (Ratio of Min and Max Class Instances) |
---|---|---|---|---|---|
Diabetes | 3 | 50 | 101766 | 2273 | No (0.21 = 11357/54864) |
Audiology | 24 | 69 | 200 | 291 | No (0.02 = 1/46) |
Thyroid Disease | 2 | 28 | 2800 | 1756 | No (0.04 = 119/2681) |
Breast Cancer | 2 | 10 | 699 | 16 | No (0.53 = 241/458) |
Stroke | 2 | 69 | 4239 | 6578 | Yes (0.98 = 2096/2143) |
Attribute | Correlation (Rank) | InfoGain (Rank) | GainRatio (Rank) | ReliefF (Rank) | Sum of Rank | Order |
---|---|---|---|---|---|---|
Bare Nuclei | 0.823 (1) | 0.596 (3) | 0.396 (3) | 0.272 (1) | 8 | 1 |
Uniformity of Cell Shape | 0.822 (2) | 0.670 (2) | 0.363 (5) | 0.168 (3) | 12 | 2 |
Uniformity of Cell Size | 0.821 (3) | 0.693 (1) | 0.395 (4) | 0.166 (4) | 12 | 2 |
Normal Nucleoli | 0.719 (5) | 0.480 (6) | 0.402 (2) | 0.156 (5) | 18 | 4 |
Bland Chromatin | 0.758 (4) | 0.550 (4) | 0.306 (6) | 0.148 (6) | 20 | 5 |
Single Epithelial Cell Size | 0.691 (8) | 0.525 (5) | 0.408 (1) | 0.073 (8) | 22 | 6 |
Clump Thickness | 0.715 (6) | 0.457 (8) | 0.209 (9) | 0.264 (2) | 25 | 7 |
Marginal Adhesion | 0.706 (7) | 0.462 (7) | 0.280 (8) | 0.132 (7) | 29 | 8 |
Mitoses | 0.423 (9) | 0.199 (9) | 0.297 (7) | 0.041 (9) | 34 | 9 |
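The "Sum of Rank" column above combines the four attribute selectors by ranking attributes per method (1 = best score) and summing the ranks. A minimal sketch of that aggregation, with the function name assumed:

```python
def aggregate_ranks(scores_by_method):
    """Sum-of-ranks aggregation: rank attributes within each selection
    method (higher score = better = rank 1), add the ranks, and return
    attributes sorted from smallest (best) to largest total rank."""
    attrs = list(next(iter(scores_by_method.values())).keys())
    total = {a: 0 for a in attrs}
    for scores in scores_by_method.values():
        ranked = sorted(attrs, key=lambda a: scores[a], reverse=True)
        for r, a in enumerate(ranked, start=1):
            total[a] += r
    return sorted(total.items(), key=lambda kv: kv[1])
```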
| Dataset | Imputation Method | Decision Tree | Random Forest | REP Tree | LMT |
|---|---|---|---|---|---|
| Diabetes | Removed MVs | 52.87 | 52.98 | 55.93 | 59.63 |
| | Proposed | 70.08 | 68.39 | 68.91 | 69.27 |
| | kNN | 68.18 | 68.19 | 68.41 | 69.21 |
| | Multiple | 65.21 | 65.32 | 66.01 | 66.41 |
| Audiology | Removed MVs | 62.90 | 59.27 | 57.20 | 67.69 |
| | Proposed | 89.28 | 89.12 | 87.59 | 87.94 |
| | kNN | 70.39 | 70.76 | 67.10 | 73.48 |
| | Multiple | 70.37 | 68.18 | 66.34 | 73.64 |
| Thyroid disease | Removed MVs | 96.05 | 96.83 | 96.68 | 96.61 |
| | Proposed | 99.08 | 99.18 | 99.97 | 99.08 |
| | kNN | 98.93 | 99.81 | 99.96 | 98.09 |
| | Multiple | 97.86 | 97.16 | 97.01 | 97.76 |
| Breast cancer | Removed MVs | 94.72 | 96.64 | 94.46 | 96.20 |
| | Proposed | 96.35 | 98.61 | 93.77 | 99.28 |
| | kNN | 96.61 | 97.27 | 96.71 | 97.01 |
| | Multiple | 95.03 | 97.01 | 95.72 | 96.46 |
| Stroke | Removed MVs | 56.59 | 61.05 | 58.69 | 61.54 |
| | Proposed | 62.38 | 67.94 | 68.06 | 68.39 |
| | kNN | 60.01 | 65.65 | 63.67 | 65.56 |
| | Multiple | 59.01 | 63.62 | 62.03 | 64.05 |
| Dataset | Imputation Method | Decision Tree | Random Forest | REP Tree | LMT |
|---|---|---|---|---|---|
| Diabetes | Removed MVs | 0.57 | 0.65 | 0.63 | 0.65 |
| | Proposed | 0.53 | 0.66 | 0.60 | 0.56 |
| | kNN | 0.50 | 0.64 | 0.60 | 0.54 |
| | Multiple | 0.53 | 0.60 | 0.65 | 0.54 |
| Audiology | Removed MVs | 0.88 | 0.95 | 0.50 | 0.98 |
| | Proposed | 0.56 | 0.53 | 0.53 | 0.53 |
| | kNN | 0.89 | 0.93 | 0.83 | 0.93 |
| | Multiple | 0.88 | 0.91 | 0.81 | 0.95 |
| Thyroid disease | Removed MVs | 0.84 | 0.98 | 0.89 | 0.90 |
| | Proposed | 0.53 | 0.62 | 0.51 | 0.53 |
| | kNN | 0.99 | 0.99 | 0.99 | 0.99 |
| | Multiple | 0.79 | 0.97 | 0.87 | 0.89 |
| Breast cancer | Removed MVs | 0.95 | 0.98 | 0.96 | 0.98 |
| | Proposed | 0.90 | 0.99 | 0.85 | 0.97 |
| | kNN | 0.50 | 0.66 | 0.50 | 0.50 |
| | Multiple | 0.95 | 0.90 | 0.90 | 0.93 |
| Stroke | Removed MVs | 0.61 | 0.69 | 0.64 | 0.68 |
| | Proposed | 0.62 | 0.74 | 0.66 | 0.73 |
| | kNN | 0.61 | 0.72 | 0.66 | 0.71 |
| | Multiple | 0.61 | 0.72 | 0.66 | 0.71 |
| Dataset | Imputation Method | Decision Tree | Random Forest | REP Tree | LMT |
|---|---|---|---|---|---|
| Audiology | Removed MVs | 62.36 | 57.15 | 23.36 | 67.88 |
| | kNN | 70.00 | 66.98 | 56.92 | 68.77 |
| | Multiple | 69.88 | 66.84 | 56.23 | 70.39 |
| | Proposed | 87.70 | 89.12 | 87.26 | 88.67 |
| Diabetes | Removed MVs | 55.06 | 56.14 | 57.31 | 58.83 |
| | kNN | 55.10 | 53.08 | 55.08 | 55.43 |
| | Multiple | 55.03 | 55.11 | 55.03 | 55.43 |
| | Proposed | 67.81 | 63.18 | 67.35 | 68.28 |
| Thyroid | Removed MVs | 96.39 | 96.70 | 96.66 | 96.47 |
| | kNN | 96.77 | 96.94 | 96.57 | 96.53 |
| | Multiple | 96.97 | 97.08 | 96.87 | 96.72 |
| | Proposed | 95.08 | 95.21 | 94.90 | 95.08 |
| Breast cancer | Removed MVs | 94.72 | 96.64 | 94.46 | 96.20 |
| | Multiple | 95.03 | 97.01 | 95.72 | 96.46 |
| | kNN | 96.61 | 97.27 | 96.71 | 97.01 |
| | Proposed | 96.35 | 98.61 | 93.77 | 99.28 |
| Stroke | Removed MVs | 64.60 | 62.97 | 64.31 | 66.00 |
| | kNN | 64.01 | 62.46 | 62.94 | 65.44 |
| | Multiple | 64.01 | 62.46 | 62.94 | 65.44 |
| | Proposed | 63.04 | 66.12 | 62.55 | 64.85 |
| Dataset | Imputation Method | Decision Tree | Random Forest | REP Tree | LMT |
|---|---|---|---|---|---|
| Audiology | Removed MVs | 0.88 | 0.94 | 0.50 | 0.97 |
| | kNN | 0.91 | 0.92 | 0.83 | 0.93 |
| | Multiple | 0.89 | 0.90 | 0.82 | 0.95 |
| | Proposed | 0.50 | 0.50 | 0.50 | 0.50 |
| Diabetes | Removed MVs | 0.59 | 0.64 | 0.64 | 0.67 |
| | kNN | 0.56 | 0.57 | 0.57 | 0.59 |
| | Multiple | 0.56 | 0.57 | 0.59 | 0.59 |
| | Proposed | 0.52 | 0.64 | 0.59 | 0.57 |
| Thyroid | Removed MVs | 0.81 | 0.97 | 0.87 | 0.89 |
| | kNN | 0.76 | 0.96 | 0.88 | 0.87 |
| | Multiple | 0.77 | 0.95 | 0.85 | 0.88 |
| | Proposed | 0.50 | 0.60 | 0.51 | 0.50 |
| Breast cancer | Removed MVs | 0.95 | 0.98 | 0.96 | 0.98 |
| | Multiple | 0.95 | 0.90 | 0.90 | 0.93 |
| | kNN | 0.50 | 0.66 | 0.50 | 0.50 |
| | Proposed | 0.90 | 0.99 | 0.85 | 0.97 |
| Stroke | Removed MVs | 0.65 | 0.65 | 0.64 | 0.69 |
| | kNN | 0.67 | 0.67 | 0.67 | 0.72 |
| | Multiple | 0.67 | 0.67 | 0.67 | 0.72 |
| | Proposed | 0.66 | 0.73 | 0.65 | 0.73 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Huang, S.-F.; Cheng, C.-H. A Safe-Region Imputation Method for Handling Medical Data with Missing Values. Symmetry 2020, 12, 1792. https://doi.org/10.3390/sym12111792