Abstract
This paper presents a 4-objective evolutionary multiobjective optimization study of binary classifiers, addressing their error rates (false positives and false negatives), reliability, and complexity. The email anti-spam filtering problem serves as the example.
The two major goals of the optimization are to minimize the error rates, that is, the false negative rate and the false positive rate. Our approach considers three-way classification: the binary classifier may also leave an instance unclassified when there is not enough evidence to assign it to one of the two classes. In this case the instance is marked as suspicious but still presented to the user. The number of unclassified (suspicious) instances should be minimized, as long as this does not lead to errors; this is termed the coverage objective. The complexity of the set (ensemble) of rules needed for the anti-spam filter to operate in optimal conditions is addressed as a fourth objective. These objectives generally conflict with each other, which is why we address the problem as a 4-objective (quadcriteria) optimization problem. We assess the performance of a set of state-of-the-art evolutionary multiobjective optimization algorithms: NSGA-II, SPEA2, and the hypervolume indicator-based SMS-EMOA. Focusing on anti-spam filter optimization, we provide statistical comparisons of algorithm performance on several benchmarks and for a range of performance indicators. Moreover, the resulting 4-D Pareto hyper-surface is discussed in the context of binary classifier optimization.
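To make the four objectives concrete, the following is a minimal sketch of how a single candidate filter might be scored. It is not the chapter's implementation: the scoring model (a sum of weights over matched rules), the two thresholds `lower` and `upper`, the use of the count of non-zero weights as the complexity measure, and the names `evaluate` and `dominates` are all assumptions introduced here for illustration. A message scoring at or above `upper` is labelled spam, one at or below `lower` is labelled legitimate, and anything in between is left unclassified (suspicious).

```python
from typing import Sequence, Tuple

Objectives = Tuple[float, float, float, int]


def evaluate(weights: Sequence[float], lower: float, upper: float,
             matches: Sequence[Sequence[bool]],
             is_spam: Sequence[bool]) -> Objectives:
    """Return (fp_rate, fn_rate, unclassified_rate, n_rules), all minimized.

    Hypothetical model: a message's score is the sum of the weights of the
    rules it matches; two thresholds give a three-way decision.
    """
    fp = fn = unclassified = 0
    n_spam = sum(1 for s in is_spam if s)
    n_ham = len(is_spam) - n_spam
    for row, spam in zip(matches, is_spam):
        score = sum(w for w, hit in zip(weights, row) if hit)
        if score >= upper:        # classified as spam
            fp += int(not spam)   # legitimate mail flagged as spam
        elif score <= lower:      # classified as legitimate
            fn += int(spam)       # spam let through
        else:                     # not enough evidence: suspicious
            unclassified += 1
    n_rules = sum(1 for w in weights if w != 0)  # assumed complexity measure
    return (fp / max(n_ham, 1), fn / max(n_spam, 1),
            unclassified / max(len(is_spam), 1), n_rules)


def dominates(a: Objectives, b: Objectives) -> bool:
    """Pareto dominance for minimization: a is no worse everywhere, better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


# Toy data: three rules, five messages (two spam, three legitimate).
weights = [2.0, 1.0, 0.0]
matches = [
    [True, True, False],
    [True, False, False],
    [False, True, False],
    [False, False, True],
    [True, True, True],
]
is_spam = [True, True, False, False, False]
objectives = evaluate(weights, lower=0.5, upper=2.5,
                      matches=matches, is_spam=is_spam)
```

Widening the gap between the two thresholds in this sketch tends to lower both error rates while raising the share of unclassified messages, which illustrates why the objectives conflict and why candidate filters are compared by Pareto dominance rather than by a single score.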
Acknowledgements
This work was partially funded by the [14VI05] Contract-Programme from the University of Vigo. Iryna Yevseyeva acknowledges the Engineering and Physical Sciences Research Council (EPSRC), UK, and Government Communications Headquarters (GCHQ), UK, for funding the Choice Architecture for Information Security (ChAISe) project EP/K006568/1 as part of the Cyber Research Institute.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Basto-Fernandes, V. et al. (2018). Quadcriteria Optimization of Binary Classifiers: Error Rates, Coverage, and Complexity. In: Tantar, AA., Tantar, E., Emmerich, M., Legrand, P., Alboaie, L., Luchian, H. (eds) EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation VI. Advances in Intelligent Systems and Computing, vol 674. Springer, Cham. https://doi.org/10.1007/978-3-319-69710-9_3
DOI: https://doi.org/10.1007/978-3-319-69710-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69708-6
Online ISBN: 978-3-319-69710-9
eBook Packages: Engineering, Engineering (R0)