AugmentedPCA: A Python Package of Supervised and Adversarial Linear Factor Models

Carson IV, William E.; Talbot, Austin; Carlson, David

arXiv:2201.02547 (stat)

[Submitted on 7 Jan 2022]

Abstract: Deep autoencoders are often extended with a supervised or adversarial loss to learn latent representations with desirable properties, such as greater predictivity of labels and outcomes or fairness with respect to a sensitive variable. Despite the ubiquity of supervised and adversarial deep latent factor models, these methods should demonstrate improvement over simpler linear approaches to be preferred in practice. This necessitates a reproducible linear analog that still adheres to an augmenting supervised or adversarial objective. We address this methodological gap by presenting methods that augment the principal component analysis (PCA) objective with either a supervised or an adversarial objective and provide analytic and reproducible solutions. We implement these methods in an open-source Python package, AugmentedPCA, that can produce excellent real-world baselines. We demonstrate the utility of these factor models on an open-source RNA-seq cancer gene expression dataset, showing that augmenting with a supervised objective improves downstream classification performance, produces principal components with greater class fidelity, and facilitates identification of genes aligned with the principal axes of data variance, with implications for the development of specific types of cancer.
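To illustrate what augmenting the PCA objective with a supervised term can look like, here is a minimal NumPy sketch. It is not the AugmentedPCA API and not necessarily the authors' formulation: it assumes one simple objective, ||X - Z W^T||^2 + mu * ||Y - Z D^T||^2 with mu > 0, which has an analytic solution as ordinary PCA on the concatenated matrix [X, sqrt(mu) Y]. The function name `fit_joint_factors` and the weight `mu` are hypothetical names chosen for this example.

```python
import numpy as np


def fit_joint_factors(X, Y, n_components, mu=1.0):
    """Illustrative supervised linear factor model (an assumption, not the package's method).

    Minimizes ||Xc - Z W^T||_F^2 + mu * ||Yc - Z D^T||_F^2 over rank-k factors,
    where Xc and Yc are the column-centered inputs and mu > 0.
    """
    # Center both blocks, as in ordinary PCA.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Scaling Y by sqrt(mu) turns the weighted objective into plain PCA
    # on the concatenated matrix, which the truncated SVD solves exactly.
    joint = np.hstack([Xc, np.sqrt(mu) * Yc])
    U, s, Vt = np.linalg.svd(joint, full_matrices=False)

    Z = U[:, :n_components] * s[:n_components]  # factors, shape (n, k)
    V = Vt[:n_components].T                     # joint loadings, shape (p + q, k)
    W = V[: X.shape[1]]                         # loadings for X, shape (p, k)
    D = V[X.shape[1]:] / np.sqrt(mu)            # loadings for Y, undoing the scaling
    return Z, W, D
```

Larger values of `mu` pull the factors toward directions that reconstruct Y, while small values approach ordinary PCA on X. Note that the factors here are computed from both X and Y, so applying the model to unlabeled data would need a separate encoding step that this sketch does not cover.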

Comments:	NeurIPS 2021 (Learning Meaningful Representations of Life Workshop)
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Genomics (q-bio.GN)
Cite as:	arXiv:2201.02547 [stat.ML]
	(or arXiv:2201.02547v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2201.02547

Submission history

From: William Carson
[v1] Fri, 7 Jan 2022 17:08:59 UTC (6,218 KB)
