Feature selection for ensembles applied to handwriting recognition

Oliveira, Luiz S.; Morita, Marisa; Sabourin, Robert

doi:10.1007/s10032-005-0013-6

Original Paper
Published: 09 March 2006

Volume 8, pages 262–279 (2006)

International Journal of Document Analysis and Recognition (IJDAR)

Luiz S. Oliveira¹,
Marisa Morita² &
Robert Sabourin²

Abstract

Feature selection for ensembles has been shown to be an effective strategy for ensemble creation because it produces good feature subsets that make the classifiers of the ensemble disagree on difficult cases. In this paper we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The underpinning paradigm is “overproduce and choose”. The algorithm operates at two levels: first it performs feature selection to generate a set of classifiers, and then it chooses the best team of classifiers. To show its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former, we consider the problem of handwritten digit recognition, using three different feature sets and multi-layer perceptron neural networks as classifiers. In the latter, we consider the problem of handwritten month word recognition, using three different feature sets and hidden Markov models as classifiers. Experiments and comparisons with classical methods, such as Bagging and Boosting, demonstrate that the proposed methodology brings compelling improvements when classifiers have to work with very low error rates. Comparisons are made by considering the recognition rates only.
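The two-level "overproduce and choose" idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It makes several assumptions: the first level is reduced to Pareto filtering of a fixed toy pool of classifiers on two objectives (validation error and number of features), and the second level uses an exhaustive search over teams with plurality voting, where the paper uses a genetic algorithm at both levels. All function names, feature counts, and predictions below are hypothetical values made up for illustration, not results from the paper.

```python
from itertools import combinations

def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and strictly better somewhere (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(objectives):
    """Indices of the non-dominated objective vectors."""
    return [i for i, a in enumerate(objectives)
            if not any(dominates(b, a) for j, b in enumerate(objectives) if j != i)]

def majority_vote(predictions):
    """Combine per-classifier label lists by plurality vote; ties go to the smallest label."""
    combined = []
    for votes in zip(*predictions):
        counts = {}
        for v in votes:
            counts[v] = counts.get(v, 0) + 1
        top = max(counts.values())
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def choose_team(candidates, predictions, truth):
    """Exhaustive stand-in for the second-level search: best team by voting accuracy."""
    best_team, best_acc = None, -1.0
    for size in range(1, len(candidates) + 1):
        for team in combinations(candidates, size):
            acc = accuracy(majority_vote([predictions[i] for i in team]), truth)
            if acc > best_acc:
                best_team, best_acc = team, acc
    return best_team, best_acc

# Toy validation labels and the predictions of five hypothetical classifiers,
# each imagined as trained on a different feature subset.
truth = [0, 1, 1, 0, 1, 0, 1, 0]
predictions = [
    [0, 1, 1, 0, 1, 0, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
]
n_features = [60, 40, 20, 45, 70]  # hypothetical subset sizes

# Level 1 (overproduce): keep the candidates that are non-dominated on (error, size).
objectives = [(1 - accuracy(p, truth), n) for p, n in zip(predictions, n_features)]
front = pareto_front(objectives)

# Level 2 (choose): pick the team from the front with the best voting accuracy.
team, team_acc = choose_team(front, predictions, truth)
print(front, team, team_acc)
```

In this toy pool the three surviving classifiers make their mistakes on different samples, so the vote of the full team is more accurate than any single member. This is the kind of disagreement on difficult cases that the abstract attributes to feature selection for ensembles.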

Author information

Authors and Affiliations

Pontifícia Universidade Católica do Paraná (PUCPR), Programa de Pós-Graduação em Informática Aplicada (PPGIA), Rua Imaculada Conceição 1155, Prado Velho, 80215-901, Curitiba, PR, Brazil
Luiz S. Oliveira
École de Technologie Supérieure (ETS), Laboratoire d'Imagerie, de Vision et d'Intelligence Artificielle, 1100 rue Notre Dame Ouest, H3C 1K3, Montreal, Canada
Marisa Morita & Robert Sabourin

Corresponding author

Correspondence to Luiz S. Oliveira.

About this article

Cite this article

Oliveira, L.S., Morita, M. & Sabourin, R. Feature selection for ensembles applied to handwriting recognition. IJDAR 8, 262–279 (2006). https://doi.org/10.1007/s10032-005-0013-6

Received: 08 October 2004
Revised: 19 September 2005
Accepted: 15 November 2005
Published: 09 March 2006
Issue Date: September 2006
DOI: https://doi.org/10.1007/s10032-005-0013-6
