DOI: 10.1145/3375959.3375974
research-article

Improving Bees-based Imputation using Nearest Neighbor for Heuristic Function in Imputing Data

Published: 16 February 2020

Abstract

Data imputation is a necessary step for handling missing values so that data mining produces better results. The existing Bees algorithm-based imputation contains several random procedures, including instance selection and feature selection, and this randomness causes inconsistent results that fluctuate from iteration to iteration. This work therefore proposes applying heuristic functions in those procedures: an importance score for selecting the attribute to handle, and a probability for selecting a correlated value. These calculations give the bees a guiding direction, so fewer steps are random, which should reduce the inconsistency and fluctuation that randomness causes. In the evaluation, the proposed Bees-based imputation obtained higher accuracy than the previous Bees-based and Genetic algorithm-based imputation methods on all data sets and for all missing-data percentages from 10% to 50%. The largest improvement in accuracy, 23% on average, was found on the SPECT data set, which consists only of binary values. For data sets that mix binary and categorical values, the proposed method gained about 3-7% on average.
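The abstract does not give the formulas behind the importance score or the selection probability, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes categorical data stored as lists with `None` for missing cells, a Hamming distance over the attributes both rows have observed, and neighbor-value frequencies as the selection probability. The function names, the `k` parameter, and the "fraction of observed values" importance score are all hypothetical stand-ins chosen for illustration.

```python
import random
from collections import Counter

MISSING = None  # marker for a missing cell (an assumption of this sketch)


def attribute_order(rows):
    """Order attributes that have missing cells by a hypothetical importance
    score: the fraction of observed values (most complete attribute first)."""
    n = len(rows)
    scores = {}
    for j in range(len(rows[0])):
        observed = sum(r[j] is not MISSING for r in rows)
        if observed < n:  # only attributes that actually need imputation
            scores[j] = observed / n
    return sorted(scores, key=lambda j: -scores[j])


def hamming(a, b, skip):
    """Normalized Hamming distance over attributes observed in both rows,
    ignoring the attribute being imputed."""
    pairs = [(x, y) for j, (x, y) in enumerate(zip(a, b))
             if j != skip and x is not MISSING and y is not MISSING]
    if not pairs:
        return 1.0  # nothing to compare: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def value_probabilities(rows, i, j, k=3):
    """Probability of each candidate value for cell (i, j), taken from the
    value frequencies among the k nearest rows that have attribute j observed."""
    candidates = [r for idx, r in enumerate(rows)
                  if idx != i and r[j] is not MISSING]
    nearest = sorted(candidates, key=lambda r: hamming(rows[i], r, j))[:k]
    counts = Counter(r[j] for r in nearest)
    total = sum(counts.values())
    return {value: c / total for value, c in counts.items()}


def choose_value(probs, rng=random):
    """Draw one value according to the given probabilities, so the choice is
    guided by the neighbors rather than uniformly random."""
    values = list(probs)
    return rng.choices(values, weights=[probs[v] for v in values])[0]
```

Under these assumptions, a bee filling cell `(i, j)` would call `choose_value(value_probabilities(rows, i, j))` instead of picking a value uniformly at random. Values that occur often among similar instances become more likely, which is one plausible way to reduce the run-to-run variation the abstract describes.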


Cited By

  • (2024) Drug Recommendation based on drug details and Optimization using Bee Algorithm. 2024 IEEE International Conference on Cybernetics and Innovations (ICCI), pp. 1-6. DOI: 10.1109/ICCI60780.2024.10532744. Online publication date: 29-Mar-2024
  • (2023) Missing Values Imputation Framework for Mixed Datasets. 2023 IEEE International Conference on Cybernetics and Innovations (ICCI), pp. 1-5. DOI: 10.1109/ICCI57424.2023.10111846. Online publication date: 30-Mar-2023

    Published In

    AICCC '19: Proceedings of the 2019 2nd Artificial Intelligence and Cloud Computing Conference
    December 2019
    216 pages
    ISBN:9781450372633
    DOI:10.1145/3375959
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    In-Cooperation

    • Kobe University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Bee Algorithm
    2. Genetic Algorithm
    3. Imputation
    4. K-Nearest Neighbor
    5. Missing Values

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AICCC 2019
