A Classifier Combining Local Distance Mean and Centroid for Imbalanced Datasets

Zhao, Yingying; Liu, Xingcheng

doi:10.1007/978-3-030-41117-6_11

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 313)

Included in the following conference series:

International Conference on Communications and Networking in China

Abstract

The K-Nearest Neighbor (KNN) algorithm is widely used in practice because it is simple and easy to understand. However, the traditional KNN algorithm has shortcomings. It considers only how many of the k nearest neighbors belong to each class, and ignores the distance and spatial distribution of those k nearest training samples relative to the unknown sample. Moreover, class imbalance is a persistent challenge for the KNN algorithm. To address these problems, we propose an improved KNN classification method for class-imbalanced datasets based on local distance mean and centroid (LDMC-KNN). In the proposed scheme, a different number of nearest-neighbor training samples is selected from each class, and the unknown sample is classified according to the distances and positions of these nearest training samples. Experiments are performed on UCI datasets. The results show that the proposed algorithm is highly competitive and, in these experiments, consistently outperforms the KNN algorithm and its variants by a clear margin.
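
The idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the decision rule, so everything specific here is an assumption made for illustration and not the authors' LDMC-KNN: the function name `ldmc_style_predict`, the caller-supplied `k_per_class` mapping, and the score (mean distance to a class's nearest samples plus the distance to the centroid of those samples) are all hypothetical. The sketch only shows how per-class neighbor counts, a local distance mean, and a local centroid could be combined in one classifier.

```python
import numpy as np

def ldmc_style_predict(X_train, y_train, x, k_per_class):
    """Illustrative only; NOT the published LDMC-KNN rule.

    For each class, take its k_c nearest training samples to x, then score
    the class by (mean distance to those samples) + (distance from x to
    their centroid). The class with the smallest score is returned.
    """
    best_label, best_score = None, np.inf
    for label in np.unique(y_train):
        key = label.item()                       # plain Python scalar for dict lookup
        Xc = X_train[y_train == label]           # training samples of this class
        k = min(k_per_class[key], len(Xc))       # per-class neighbor count (assumed given)
        dists = np.linalg.norm(Xc - x, axis=1)   # Euclidean distances to x
        nearest = np.argsort(dists)[:k]
        local_mean = dists[nearest].mean()       # local distance mean
        centroid = Xc[nearest].mean(axis=0)      # centroid of the k nearest samples
        score = local_mean + np.linalg.norm(centroid - x)
        if score < best_score:
            best_label, best_score = key, score
    return best_label

# Toy imbalanced data: six samples of class 0, two samples of class 1.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
                    [0.5, 0.5], [0.2, 0.8], [3.0, 3.0], [3.5, 3.0]])
y_train = np.array([0, 0, 0, 0, 0, 0, 1, 1])
k_per_class = {0: 3, 1: 1}  # hypothetical choice: fewer neighbors for the minority class

print(ldmc_style_predict(X_train, y_train, np.array([2.8, 2.9]), k_per_class))
print(ldmc_style_predict(X_train, y_train, np.array([0.4, 0.4]), k_per_class))
```

Because each class is scored from its own nearest samples, the majority class cannot win merely by contributing more neighbors, which is the weakness of plain majority voting that the abstract points out. How the per-class neighbor counts should be chosen is not stated in the abstract; the values above are placeholders.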

Supported by the National Natural Science Foundation of China (Grant Nos. 61572534 and 61873290), the Special Project for Promoting Economic Development in Guangdong Province (Grant No. GDME-2018D004), and the Opening Project of Guangdong Province Key Laboratory of Information Security Technology under Grant 2017B030314131.

Author information

Authors and Affiliations

School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou, 510006, China
Yingying Zhao & Xingcheng Liu
School of Information Science, Xinhua College of Sun Yat-sen University, Guangzhou, 510520, China
Xingcheng Liu

Corresponding author

Correspondence to Xingcheng Liu.

Editor information

Editors and Affiliations

Shanghai University, Shanghai, China
Honghao Gao
School of Computer Software, Tianjin University, Tianjin, China
Zhiyong Feng
Hangzhou Dianzi University, Hangzhou, China
Jun Yu
Tongji University, Shanghai Shi, China
Jun Wu

Cite this paper

Zhao, Y., Liu, X. (2020). A Classifier Combining Local Distance Mean and Centroid for Imbalanced Datasets. In: Gao, H., Feng, Z., Yu, J., Wu, J. (eds) Communications and Networking. ChinaCom 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 313. Springer, Cham. https://doi.org/10.1007/978-3-030-41117-6_11

DOI: https://doi.org/10.1007/978-3-030-41117-6_11
Published: 27 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41116-9
Online ISBN: 978-3-030-41117-6
eBook Packages: Computer Science, Computer Science (R0)
