Instance-linked attribute tracking and feedback for michigan-style supervised learning classifier systems

R Urbanowicz, A Granizo-Mackenzie, J Moore
Proceedings of the 14th annual conference on Genetic and evolutionary …, 2012 - dl.acm.org
The application of learning classifier systems (LCSs) to classification and data mining in genetic association studies has been the target of previous work. Recent efforts have focused on (1) correctly discriminating between predictive and non-predictive attributes, and (2) detecting and characterizing epistasis (attribute interaction) and heterogeneity. While the solutions evolved by Michigan-style LCSs (M-LCSs) are conceptually well suited to address these phenomena, the explicit characterization of heterogeneity remains a particular challenge. In this study, we introduce attribute tracking, a mechanism akin to memory, for supervised learning in M-LCSs. Given a finite training set, a vector of accuracy scores is maintained for each instance in the data. Post-training, we apply these scores to characterize patterns of association in the dataset. Additionally, we introduce attribute feedback to the mutation and crossover mechanisms, probabilistically directing rule generalization based on an instance's tracking scores. We find that attribute tracking combined with clustering and visualization facilitates the characterization of epistasis and heterogeneity while uniquely linking individual instances in the dataset to etiologically heterogeneous subgroups. Moreover, these analyses demonstrate that attribute feedback significantly improves test accuracy, efficient generalization, run time, and the power to discriminate between predictive and non-predictive attributes in the presence of heterogeneity.
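The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the update rule or the feedback formula, so everything below is an assumption made for illustration: the `Rule` and `AttributeTracker` names are hypothetical, tracking scores are assumed to grow by the accuracy of each correct-set rule for every attribute that rule specifies, and feedback is assumed to keep an attribute with probability equal to its score divided by the instance's highest score.

```python
import random
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: condition maps attribute index -> required value."""
    condition: dict
    accuracy: float


class AttributeTracker:
    """One vector of accumulated accuracy scores per training instance."""

    def __init__(self, n_instances, n_attributes):
        self.scores = [[0.0] * n_attributes for _ in range(n_instances)]

    def update(self, instance_id, correct_set):
        # Assumed update: each rule that correctly classifies the instance
        # adds its accuracy to every attribute it specifies.
        row = self.scores[instance_id]
        for rule in correct_set:
            for attr in rule.condition:
                row[attr] += rule.accuracy

    def feedback_weights(self, instance_id):
        # Assumed normalization: divide by the instance's highest score.
        row = self.scores[instance_id]
        top = max(row)
        if top == 0.0:
            return [0.5] * len(row)  # no information yet: neutral weights
        return [s / top for s in row]


def generalize_with_feedback(rule, weights, rng):
    # Keep each specified attribute with probability equal to its weight,
    # so poorly tracked attributes are more likely to be generalized away.
    kept = {a: v for a, v in rule.condition.items() if rng.random() < weights[a]}
    return Rule(condition=kept, accuracy=rule.accuracy)


if __name__ == "__main__":
    tracker = AttributeTracker(n_instances=2, n_attributes=3)
    tracker.update(0, [Rule({0: 1, 2: 0}, 0.5), Rule({0: 1}, 0.75)])
    weights = tracker.feedback_weights(0)
    child = generalize_with_feedback(Rule({0: 1, 1: 1}, 1.0), weights, random.Random(0))
    print(tracker.scores[0], weights, child.condition)
```

After training, the per-instance score vectors in `tracker.scores` are the kind of data that could be passed to a clustering routine to look for heterogeneous subgroups, as the abstract describes; that step is not shown here.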