Abstract
Machine learning algorithms perform differently in settings with varying levels of training set mislabeling noise. Therefore, the choice of a good algorithm for a particular learning problem is crucial. In this paper, we introduce the “Sigmoid Rule” Framework, which focuses on describing classifier behavior in noisy settings. The framework uses an existing model of the expected performance of learning algorithms as a sigmoid function of the signal-to-noise ratio in the training instances. We study the parameters of this sigmoid function using five different classifiers, namely Naive Bayes, kNN, SVM, a decision tree classifier, and a rule-based classifier. Our study leads to the definition of intuitive criteria, based on the sigmoid parameters, that can be used to compare the behavior of learning algorithms in the presence of varying levels of noise. Furthermore, we show that there exists a connection between these parameters and the characteristics of the underlying dataset, hinting at how the inherent properties of a dataset affect learning. The framework is applicable to concept drift scenarios, including modeling user behavior over time, and to the mining of noisy data series, as in sensor networks.
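To make the sigmoid model concrete, the sketch below illustrates one plausible parameterization of expected performance as a sigmoid of the training-set signal-to-noise ratio. The abstract does not specify the exact functional form or parameter names used in the paper, so the floor/ceiling/slope/midpoint parameterization here, and the two example classifiers, are illustrative assumptions only.

```python
import math

def sigmoid_performance(snr, p_min, p_max, slope, midpoint):
    """Expected classifier performance as a sigmoid of the training-set
    signal-to-noise ratio (SNR).

    p_min, p_max : performance floor (pure noise) and ceiling (clean data)
    slope        : how quickly performance recovers as noise decreases
    midpoint     : the SNR at which performance is halfway between floor
                   and ceiling
    (Parameter names are hypothetical, not taken from the paper.)
    """
    return p_min + (p_max - p_min) / (1.0 + math.exp(-slope * (snr - midpoint)))

# Two hypothetical classifiers: B has a lower midpoint, i.e. it reaches
# its half-performance point at a worse (lower) SNR, so under this model
# it tolerates mislabeling noise better than A.
clf_a = dict(p_min=0.50, p_max=0.95, slope=2.0, midpoint=1.0)
clf_b = dict(p_min=0.50, p_max=0.93, slope=2.0, midpoint=0.4)

for snr in (0.0, 0.5, 1.0, 2.0):
    pa = sigmoid_performance(snr, **clf_a)
    pb = sigmoid_performance(snr, **clf_b)
    print(f"SNR={snr:.1f}  A={pa:.3f}  B={pb:.3f}")
```

Comparing the fitted parameters (e.g. slope and midpoint) across algorithms is the kind of intuitive criterion the abstract alludes to: a lower midpoint suggests better noise tolerance, while a steeper slope suggests a sharper transition between the noise-dominated and signal-dominated regimes.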
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Mirylenka, K., Giannakopoulos, G., Palpanas, T. (2012). SRF: A Framework for the Study of Classifier Behavior under Training Set Mislabeling Noise. In: Tan, PN., Chawla, S., Ho, C.K., Bailey, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science(), vol 7301. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30217-6_10
Print ISBN: 978-3-642-30216-9
Online ISBN: 978-3-642-30217-6