Supervised probabilistic principal component analysis

Authors:

Hans-Peter Kriegel,

Mingrui Wu

KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 464 - 473

https://doi.org/10.1145/1150402.1150454

Published: 20 August 2006

Abstract

Principal component analysis (PCA) has been extensively applied in data mining, pattern recognition and information retrieval for unsupervised dimensionality reduction. When labels of data are available, e.g., in a classification or regression task, PCA is however not able to use this information. The problem is more interesting if only part of the input data are labeled, i.e., in a semi-supervised setting. In this paper we propose a supervised PCA model called SPPCA and a semi-supervised PCA model called S²PPCA, both of which are extensions of a probabilistic PCA model. The proposed models are able to incorporate the label information into the projection phase, and can naturally handle multiple outputs (i.e., in multi-task learning problems). We derive an efficient EM learning algorithm for both models, and also provide theoretical justifications of the model behaviors. SPPCA and S²PPCA are compared with other supervised projection methods on various learning tasks, and show not only promising performance but also good scalability.
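
The abstract names an EM learning algorithm but does not state the model equations. As a minimal sketch, assume a linear-Gaussian form in which a shared latent vector z ~ N(0, I) generates both the inputs, x = Wx z + mu_x + e_x, and the outputs, y = Wy z + mu_y + e_y, with isotropic noise variances sx2 and sy2. Under that assumption, the EM updates and the input-only projection could look like the code below. The function names `fit_sppca` and `project`, the variance floor, and the random initialization are illustrative choices made here, not taken from the paper, and the semi-supervised S²PPCA case (unlabeled rows) is not covered.

```python
import numpy as np


def _noise(Vc, W, Z, C):
    """Average expected squared residual per entry for one block (x or y)."""
    n, d = Vc.shape
    resid = np.sum(Vc ** 2) - 2.0 * np.sum((Vc @ W) * Z) + np.trace(W.T @ W @ C)
    return float(resid) / (n * d)


def fit_sppca(X, Y, k, n_iter=100, seed=0):
    """EM for an assumed supervised PPCA model (illustrative sketch only).

    Assumed model: z ~ N(0, I_k), x = Wx z + mu_x + e_x, y = Wy z + mu_y + e_y,
    with e_x ~ N(0, sx2 * I) and e_y ~ N(0, sy2 * I).
    """
    rng = np.random.default_rng(seed)
    n, d_x = X.shape
    d_y = Y.shape[1]
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    Wx = rng.normal(size=(d_x, k))
    Wy = rng.normal(size=(d_y, k))
    sx2 = sy2 = 1.0
    floor = 1e-8  # numerical safeguard added for this sketch

    for _ in range(n_iter):
        # E-step: Gaussian posterior of z given both x and y.
        A = np.eye(k) + Wx.T @ Wx / sx2 + Wy.T @ Wy / sy2
        S = np.linalg.inv(A)                      # posterior covariance (shared)
        Z = (Xc @ Wx / sx2 + Yc @ Wy / sy2) @ S   # posterior means, shape (n, k)
        C = n * S + Z.T @ Z                       # sum over samples of E[z z^T]

        # M-step: closed-form updates for loadings, then noise variances.
        Wx = np.linalg.solve(C, Z.T @ Xc).T
        Wy = np.linalg.solve(C, Z.T @ Yc).T
        sx2 = max(floor, _noise(Xc, Wx, Z, C))
        sy2 = max(floor, _noise(Yc, Wy, Z, C))

    return {"Wx": Wx, "Wy": Wy, "mu_x": mu_x, "mu_y": mu_y, "sx2": sx2, "sy2": sy2}


def project(model, X):
    """Posterior mean of z given x alone, for inputs whose labels are unknown."""
    Wx, sx2 = model["Wx"], model["sx2"]
    M = sx2 * np.eye(Wx.shape[1]) + Wx.T @ Wx
    return np.linalg.solve(M, Wx.T @ (X - model["mu_x"]).T).T
```

In this sketch the labels influence the projection only through the learned `Wx` and `sx2`: training uses both blocks, while `project` needs inputs alone, which is one way to read the abstract's statement that label information is incorporated into the projection phase. Passing a `Y` with several columns corresponds to the multiple-output case.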

Index Terms

Supervised probabilistic principal component analysis
1. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published In

KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2006

986 pages

ISBN:1595933395

DOI:10.1145/1150402

Conference Chair: Tina Eliassi-Rad (LLNL)

General Chair: Lyle Ungar (University of Pennsylvania)

Program Chairs: Mark Craven (University of Wisconsin), Dimitrios Gunopulos (University of California, Riverside)

Copyright © 2006 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD06

KDD06: The 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 20 - 23, 2006

Philadelphia, PA, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
