Abstract
Learning classifier systems (LCS) are leading evolutionary machine learning systems that employ genetic algorithms to search for a set of optimally general and correct classification rules across a variety of machine learning problems, including supervised classification data mining tasks. The curse of dimensionality is a phenomenon that arises when analysing data in high-dimensional spaces. Performance issues with increasing dimensionality of the training data, such as poor classification accuracy and stalled genetic search, are well known for learning classifier systems; however, a systematic study establishing the relationship between increasing dimensionality and the learning challenges in these systems is lacking. The aim of this paper is to analyse the behaviour of Michigan-style learning classifier systems that use the most commonly adopted and expressive interval-based rule representation under the curse of dimensionality (also known as the Hughes phenomenon). We use well-established, mathematically founded geometrical properties of high-dimensional data spaces, together with the generalisation theory of these systems, to formulate this relationship. The proposed formulations are validated experimentally on a set of synthetic two-class classification problems. The findings demonstrate that the curse of dimensionality occurs for as few as ten dimensions and negatively affects the evolutionary search with a hyper-rectangular rule representation. A number of approaches to overcome some of the difficulties uncovered by this analysis are then discussed, and three of them are analysed in more detail on the same class of synthetic two-class problems. The experimental study demonstrates the effectiveness of these approaches in handling data of increasing dimensionality.
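The geometric intuition behind the abstract's claim can be sketched numerically. A minimal illustration (not the paper's code, and independent of any particular LCS implementation) is that an interval-based rule covering a fixed fraction of each attribute's range matches an exponentially shrinking fraction of a uniformly distributed input space, so even a generous hyper-rectangle covers almost nothing by ten or so dimensions:

```python
def rectangle_coverage(per_dim_fraction: float, n_dims: int) -> float:
    """Fraction of a unit hypercube covered by a hyper-rectangle whose
    interval spans `per_dim_fraction` of the range in every dimension.

    Coverage is the product of per-dimension fractions, i.e. f ** d.
    """
    return per_dim_fraction ** n_dims


if __name__ == "__main__":
    # A rule whose intervals each span 80% of an attribute's range:
    for d in (1, 2, 10, 50):
        print(f"{d:3d} dims -> coverage {rectangle_coverage(0.8, d):.6g}")
```

With 80% coverage per dimension, the rule matches only about 10.7% of the space at 10 dimensions and roughly 1.4e-5 of it at 50, which is consistent with the paper's observation that the curse of dimensionality bites at around ten dimensions for hyper-rectangular representations.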
Notes
Strictly, a classifier refers to the combination of a rule, represented in antecedent-consequent form, and a set of rule-level performance parameters. Informally, "rule" is commonly used to refer to a classifier.
Throughout this paper, LCS will be used to refer to Michigan-style supervised LCS, unless otherwise noted.
Cite this article
Debie, E., Shafi, K. Implications of the curse of dimensionality for supervised learning classifier systems: theoretical and empirical analyses. Pattern Anal Applic 22, 519–536 (2019). https://doi.org/10.1007/s10044-017-0649-0