Eigenvector-based sparse canonical correlation analysis: Fast computation for estimation of multiple canonical vectors

Wang, Wenjia; Zhou, Yi-Hui

Statistics > Methodology

arXiv:2004.10231 (stat)

[Submitted on 21 Apr 2020 (v1), last revised 8 Jun 2021 (this version, v2)]

Abstract: Classical canonical correlation analysis (CCA) requires matrices to be low dimensional, i.e., the number of features cannot exceed the sample size. Recent developments in CCA have mainly focused on the high-dimensional setting, where the number of features in both matrices under analysis greatly exceeds the sample size. These approaches impose penalties in optimization problems that must be solved iteratively, and they estimate multiple canonical vectors sequentially. In this work, we provide an explicit link between sparse multiple regression and sparse canonical correlation analysis, and an efficient algorithm that can estimate multiple canonical pairs simultaneously rather than sequentially. Furthermore, the algorithm naturally allows parallel computing. These properties make the algorithm much more efficient. We provide theoretical results on the consistency of canonical pairs. The algorithm and theoretical development are based on solving an eigenvector problem, which significantly differentiates our method from existing methods. Simulation results support the improved performance of the proposed approach. We apply eigenvector-based CCA to the analysis of GTEx thyroid histology images, the analysis of SNPs and RNA-seq gene expression data, and a microbiome study. The real data analysis also shows improved performance compared to traditional sparse CCA.
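To illustrate why an eigenvector formulation yields several canonical pairs at once, the following is a minimal sketch of classical (non-sparse, low-dimensional) CCA written as a symmetric eigenproblem. It is not the sparse algorithm proposed in the paper; the function name `classical_cca_eig`, the `ridge` stabilizer, and the Cholesky-based whitening are assumptions made purely for illustration.

```python
import numpy as np

def classical_cca_eig(X, Y, k=2, ridge=1e-8):
    """Illustrative classical CCA via one symmetric eigen-decomposition.

    Hypothetical helper, not the paper's method. Assumes n > p and n > q
    (the low-dimensional setting) and k <= min(p, q).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance blocks; the small ridge only guards against
    # numerical singularity and is an assumption of this sketch.
    Sxx = Xc.T @ Xc / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whitening through Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    K = np.linalg.solve(Lx, Sxy)        # Lx^{-1} Sxy
    K = np.linalg.solve(Ly, K.T).T      # Lx^{-1} Sxy Ly^{-T}

    # Eigenvalues of K K^T are the squared canonical correlations;
    # all eigenvectors are obtained in a single call.
    evals, U = np.linalg.eigh(K @ K.T)
    idx = np.argsort(evals)[::-1][:k]
    rho = np.sqrt(np.clip(evals[idx], 0.0, 1.0))
    U = U[:, idx]

    # Map back to the original coordinates.
    A = np.linalg.solve(Lx.T, U)                    # a = Lx^{-T} u
    V = K.T @ U / np.maximum(rho, 1e-12)            # paired unit vectors
    B = np.linalg.solve(Ly.T, V)                    # b = Ly^{-T} v
    return rho, A, B
```

With data matrices `X` (n by p) and `Y` (n by q), a call such as `rho, A, B = classical_cca_eig(X, Y, k=2)` returns the leading canonical correlations together with the corresponding columns of canonical vectors, all from one decomposition rather than a sequential deflation loop. The high-dimensional, sparse setting treated in the paper requires additional machinery that this sketch does not attempt to reproduce.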

Subjects:	Methodology (stat.ME); Applications (stat.AP)
Cite as:	arXiv:2004.10231 [stat.ME]
	(or arXiv:2004.10231v2 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.2004.10231

Submission history

From: Wenjia Wang
[v1] Tue, 21 Apr 2020 18:32:28 UTC (3,083 KB)
[v2] Tue, 8 Jun 2021 09:10:19 UTC (2,491 KB)