Abstract
The k-nearest-neighbor (kNN) algorithm is a simple yet effective classification method that predicts the class label of a query sample from information contained in its neighborhood. Previous variants of kNN usually treat the k nearest neighbors separately, relying on either their counts or their individual distances. However, counts and isolated distance information may be insufficient for an effective classification decision. This paper investigates the kNN method from the perspective of local distribution, on the basis of which we propose an improved implementation of kNN. The proposed method performs classification by assigning the query sample to the class with the maximum posterior probability, which is estimated from the local distribution using the Bayesian rule. Experiments were conducted on 15 benchmark datasets, and the reported results demonstrate the excellent performance and robustness of the proposed method compared with other state-of-the-art classifiers.
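To make the idea concrete, the sketch below shows one possible way to classify a query by the maximum posterior estimated from the local neighborhood. It assumes a per-class local Gaussian with diagonal covariance fitted to the k nearest neighbors and a prior taken from class proportions in that neighborhood; this is an illustrative instantiation of the general local-distribution idea, not necessarily the exact estimator used in the paper.

```python
import numpy as np

def local_distribution_knn_predict(X_train, y_train, x_query, k=15, eps=1e-9):
    """Illustrative sketch: assign the query to the class with the largest
    posterior P(c | x) ~ P(x | c) * P(c), where P(x | c) comes from a simple
    local Gaussian fitted to the class's samples among the k nearest
    neighbors, and P(c) is the class proportion within that neighborhood.
    (Assumed instantiation; the paper's local-distribution model may differ.)
    """
    # Euclidean distances from the query to every training sample.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nn_idx = np.argsort(dists)[:k]
    X_nn, y_nn = X_train[nn_idx], y_train[nn_idx]

    best_class, best_log_post = None, -np.inf
    for c in np.unique(y_nn):
        X_c = X_nn[y_nn == c]
        prior = len(X_c) / k                  # P(c) from neighborhood counts
        mu = X_c.mean(axis=0)                 # local class mean
        var = X_c.var(axis=0) + eps           # diagonal covariance (regularized)
        # Log-likelihood of the query under the local Gaussian, plus log-prior.
        log_lik = -0.5 * np.sum((x_query - mu) ** 2 / var + np.log(2 * np.pi * var))
        log_post = log_lik + np.log(prior)
        if log_post > best_log_post:
            best_class, best_log_post = c, log_post
    return best_class
```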
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mao, C., Hu, B., Moore, P., Su, Y., Wang, M. (2015). Nearest Neighbor Method Based on Local Distribution for Classification. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science(), vol 9077. Springer, Cham. https://doi.org/10.1007/978-3-319-18038-0_19
DOI: https://doi.org/10.1007/978-3-319-18038-0_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18037-3
Online ISBN: 978-3-319-18038-0
eBook Packages: Computer Science, Computer Science (R0)