Exploiting Fisher and Fukunaga-Koontz Transforms in Chernoff Dimensionality Reduction

Published: 02 August 2013
  Abstract

    Knowledge discovery from big data demands effective representation of the data. However, big data are often characterized by high dimensionality, which makes knowledge discovery more difficult. Many techniques for dimensionality reduction have been proposed, including the well-known Fisher Linear Discriminant Analysis (LDA). However, the Fisher criterion is incapable of dealing with heteroscedasticity in the data. A technique based on the Chernoff criterion for linear dimensionality reduction has been proposed that is capable of exploiting heteroscedastic information in the data. While the Chernoff criterion has been shown to outperform Fisher's, a clear understanding of its exact behavior is lacking. In this article, we show precisely what can be expected from the Chernoff criterion. In particular, we show that the Chernoff criterion exploits the Fisher and Fukunaga-Koontz transforms in computing its linear discriminants. Furthermore, we show that a recently proposed decomposition of the data space into four subspaces is incomplete. We provide arguments on how to best enrich the decomposition of the data space in order to account for heteroscedasticity in the data. Finally, we provide experimental results validating our theoretical analysis.
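
    The two ingredients the abstract names can be illustrated with a small computation. The sketch below (a minimal illustration, not the authors' implementation; it assumes NumPy/SciPy, and the function names are ours) computes the Chernoff distance between two Gaussian classes, which is the quantity the Chernoff criterion maximizes in the projected space, and the Fukunaga-Koontz transform (FKT), which simultaneously diagonalizes the two class covariances so that their eigenvalues pair up as (lambda, 1 - lambda).

        import numpy as np
        from scipy.linalg import fractional_matrix_power

        def chernoff_distance(m1, S1, m2, S2, alpha=0.5):
            # Chernoff distance between N(m1, S1) and N(m2, S2) for alpha in (0, 1);
            # alpha = 0.5 reduces to the Bhattacharyya distance.
            Sa = alpha * S1 + (1.0 - alpha) * S2
            d = m1 - m2
            quad = 0.5 * alpha * (1.0 - alpha) * d @ np.linalg.solve(Sa, d)
            logdet = 0.5 * (np.linalg.slogdet(Sa)[1]
                            - alpha * np.linalg.slogdet(S1)[1]
                            - (1.0 - alpha) * np.linalg.slogdet(S2)[1])
            return quad + logdet

        def fukunaga_koontz(S1, S2):
            # Whiten S1 + S2; the whitened class covariances then share eigenvectors,
            # with eigenvalues lam for class 1 and 1 - lam for class 2, so the most
            # informative directions for one class are the least informative for the other.
            P = fractional_matrix_power(S1 + S2, -0.5)
            lam, V = np.linalg.eigh(P @ S1 @ P.T)
            return lam, P.T @ V

        # Equal means, different covariances: Fisher's between-class scatter is zero,
        # but the Chernoff distance is positive -- the heteroscedastic case.
        m = np.zeros(2)
        S1 = np.diag([4.0, 0.25])
        S2 = np.diag([0.25, 4.0])
        print(chernoff_distance(m, S1, m, S2))  # approx 0.75 > 0
        lam, _ = fukunaga_koontz(S1, S2)
        print(lam, 1.0 - lam)                   # eigenvalue pairing (lam, 1 - lam)

    In this example Fisher's criterion is blind (the class means coincide, so the between-class scatter vanishes), yet the Chernoff distance remains positive because the covariances differ; this is exactly the heteroscedastic information the abstract refers to, and the FKT eigenvalue pairing is what the article connects to the Chernoff discriminants.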


    Cited By

    • (2022) Computational Estimation by Scientific Data Mining with Classical Methods to Automate Learning Strategies of Scientists. ACM Transactions on Knowledge Discovery from Data 16(5), 1-52. DOI: 10.1145/3502736. Online publication date: 9-Mar-2022.
    • (2021) Learning Latent Variable Models with Discriminant Regularization. Agents and Artificial Intelligence, 378-398. DOI: 10.1007/978-3-030-71158-0_18. Online publication date: 14-Mar-2021.



      Published In

      ACM Transactions on Knowledge Discovery from Data, Volume 7, Issue 2
      July 2013
      107 pages
      ISSN:1556-4681
      EISSN:1556-472X
      DOI:10.1145/2499907

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 02 August 2013
      Accepted: 01 January 2013
      Revised: 01 November 2012
      Received: 01 August 2012
      Published in TKDD Volume 7, Issue 2


      Author Tags

      1. Chernoff distance
      2. FKT
      3. Feature evaluation and selection
      4. LDA
      5. dimensionality reduction

      Qualifiers

      • Research-article
      • Research
      • Refereed

