Abstract
Kernel canonical correlation analysis (CCA) aims to extract common features from a pair of multivariate data sets by maximizing the linear correlation between nonlinear mappings of the data. However, kernel CCA tends to extract features that carry little information about the original multivariates despite their high correlation, because it evaluates only the statistics of the extracted features while the nonlinear mappings have a high degree of freedom. We propose a kernel method for common feature extraction that maximizes a new objective function based on mutual information. The objective function is a linear combination of two kinds of mutual information: the mutual information between the extracted features, and the mutual information between each multivariate and its feature. A large value of the former makes the features strongly dependent on each other, while the latter prevents the features from losing information about their multivariates. We maximize the objective function with Parallel Tempering MCMC in order to overcome the problem of local maxima. We demonstrate the effectiveness of the proposed method through numerical experiments.
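From the abstract, the objective function has the following general shape. This is a reconstruction under assumed notation, not the paper's exact formula: f and g are the nonlinear mappings, I(·,·) denotes mutual information, and γ ≥ 0 is an assumed trade-off weight between the two kinds of terms.

```latex
% Assumed notation: f, g nonlinear mappings; I mutual information; gamma >= 0 trade-off weight.
J(f, g) = I\bigl(f(\mathbf{x}),\, g(\mathbf{y})\bigr)
        + \gamma \Bigl[\, I\bigl(\mathbf{x},\, f(\mathbf{x})\bigr)
        + I\bigl(\mathbf{y},\, g(\mathbf{y})\bigr) \Bigr]
```

Because such an objective is non-concave in the mapping parameters, the abstract maximizes it with Parallel Tempering MCMC. The sketch below is a minimal, generic Python implementation of Parallel Tempering for maximizing an arbitrary objective; the function name, temperature ladder, step size, and toy objective are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def parallel_tempering(objective, x0, temps=(1.0, 2.0, 4.0, 8.0),
                       n_iter=5000, step=0.1, swap_every=10, seed=0):
    """Maximize `objective` with a small parallel-tempering sampler.

    Each replica targets exp(objective(x) / T) via Metropolis updates;
    neighbouring replicas occasionally swap states so that hot chains
    can carry the cold chain out of local maxima.
    """
    rng = np.random.default_rng(seed)
    states = [np.asarray(x0, dtype=float).copy() for _ in temps]
    values = [float(objective(s)) for s in states]
    best_x, best_v = states[0].copy(), values[0]

    for t in range(n_iter):
        # Within-replica Metropolis step at temperature T.
        for k, T in enumerate(temps):
            prop = states[k] + step * rng.standard_normal(states[k].shape)
            v = float(objective(prop))
            if np.log(rng.random()) < (v - values[k]) / T:
                states[k], values[k] = prop, v
        # Exchange move between neighbouring temperatures.
        if t % swap_every == 0:
            for k in range(len(temps) - 1):
                log_acc = (1.0 / temps[k] - 1.0 / temps[k + 1]) \
                          * (values[k + 1] - values[k])
                if np.log(rng.random()) < log_acc:
                    states[k], states[k + 1] = states[k + 1], states[k]
                    values[k], values[k + 1] = values[k + 1], values[k]
        # Track the best state visited by the coldest (T = 1) replica.
        if values[0] > best_v:
            best_x, best_v = states[0].copy(), values[0]
    return best_x, best_v

# Toy usage: a multimodal objective whose global maximum is near x = pi.
if __name__ == "__main__":
    obj = lambda x: -0.1 * (x[0] - 3.0) ** 2 + np.cos(4.0 * x[0])
    x_best, v_best = parallel_tempering(obj, x0=[-5.0], n_iter=20000)
    print(x_best, v_best)
```

Each replica targets exp(J(x)/T) with Metropolis updates; the exchange moves let hot, exploratory replicas hand promising states down the temperature ladder, so the coldest chain can escape local maxima that would trap a single Metropolis chain.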
Cite this paper
Araki, T., Hino, H., Akaho, S. (2014). A Kernel Method to Extract Common Features Based on Mutual Information. In: Loo, C.K., Yap, K.S., Wong, K.W., Teoh, A., Huang, K. (eds) Neural Information Processing. ICONIP 2014. Lecture Notes in Computer Science, vol 8835. Springer, Cham. https://doi.org/10.1007/978-3-319-12640-1_4