Statistical independence and novelty detection with information preserving nonlinear maps

L Parra, G Deco, S Miesbach - Neural Computation, 1996 - ieeexplore.ieee.org
According to Barlow (1989), feature extraction can be understood as finding a statistically independent representation of the probability distribution underlying the measured signals. The search for a statistically independent representation can be formulated by the criterion of minimal mutual information, which reduces to decorrelation in the case of Gaussian distributions. If non-Gaussian distributions are to be considered, minimal mutual information is the appropriate generalization of the decorrelation used in linear Principal Component Analysis (PCA). We also generalize to nonlinear transformations by demanding only perfect transmission of information. This leads to a general class of nonlinear transformations, namely symplectic maps. Conservation of information allows us to consider only the statistics of single coordinates. The resulting factorial representation of the joint probability distribution yields a density estimate. We apply this concept to the real-world problem of electrical motor fault detection, treated as a novelty detection task.
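To make the chain of ideas concrete, the numerical sketch below pairs a hand-built volume-preserving shear map (in two dimensions, a unit Jacobian determinant is exactly the symplectic, information-preserving property the abstract invokes) with a factorial per-coordinate density model and a likelihood-threshold novelty test. The function names and the fixed shear parameters a and b are illustrative assumptions: the paper instead learns the symplectic map by minimizing the mutual information between the output coordinates, and its motor-fault experiment is not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)

    def symplectic_shear(x, a, b):
        # Two nonlinear shears in sequence: (q, p) -> (q + a*tanh(p), p),
        # then (q, p) -> (q, p + b*tanh(q)). Each shear has Jacobian
        # determinant 1, so their composition preserves volume, i.e. it
        # transmits information perfectly, as the abstract requires.
        q, p = x[:, 0], x[:, 1]
        q = q + a * np.tanh(p)
        p = p + b * np.tanh(q)
        return np.column_stack([q, p])

    def factorial_log_density(y, mus, sigmas):
        # Log-density of a factorial model: independent 1-D Gaussians per
        # output coordinate. Volume preservation means this is also a
        # valid log-density for the untransformed inputs.
        z = (y - mus) / sigmas
        return np.sum(-0.5 * z**2 - np.log(sigmas * np.sqrt(2.0 * np.pi)),
                      axis=1)

    # "Normal" operating data: a correlated 2-D cloud standing in for the
    # measured signals.
    x_train = rng.normal(size=(2000, 2)) @ np.array([[1.0, 0.8],
                                                     [0.0, 0.6]])

    # Fixed, hand-picked shear parameters (an assumption for this sketch;
    # the paper learns the map by minimizing mutual information).
    a, b = 0.5, -0.3
    y_train = symplectic_shear(x_train, a, b)

    # Fit the factorial model coordinate-wise: conservation of information
    # is what licenses looking only at single-coordinate statistics.
    mus, sigmas = y_train.mean(axis=0), y_train.std(axis=0)

    # Novelty score: negative log-likelihood, thresholded on training data.
    train_nll = -factorial_log_density(y_train, mus, sigmas)
    threshold = np.quantile(train_nll, 0.99)

    x_test = np.array([[0.0, 0.0],    # typical point
                       [6.0, -6.0]])  # far from the training cloud
    test_nll = -factorial_log_density(symplectic_shear(x_test, a, b),
                                      mus, sigmas)
    print(test_nll > threshold)       # [False  True]: outlier flagged as novel

Because the map's Jacobian determinant is identically 1, the product of the one-dimensional output densities is also a valid density over the inputs, which is what justifies thresholding it as a novelty score.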