Blockwise Principal Component Analysis for monotone missing data imputation and dimensionality reduction

Do, Tu T.; Vu, Mai Anh; Ly, Hoang Thien; Nguyen, Thu; Hicks, Steven A.; Riegler, Michael A.; Halvorsen, Pål; Nguyen, Binh T.

arXiv:2305.06042v1 (cs)

[Submitted on 10 May 2023 (this version), latest version 10 Jan 2024 (v2)]

Abstract: Monotone missing data is a common problem in data analysis. However, imputation combined with dimensionality reduction can be computationally expensive, especially as datasets grow in size. To address this issue, we propose a Blockwise principal component analysis Imputation (BPI) framework for dimensionality reduction and imputation of monotone missing data. The framework conducts Principal Component Analysis (PCA) on the observed part of each monotone block of the data, merges the resulting principal components, and then imputes the merged components using a chosen imputation technique. BPI can work with various imputation techniques and can significantly reduce imputation time compared to conducting dimensionality reduction after imputation, which makes it a practical and efficient approach for large datasets with monotone missing data. Our experiments validate the improvement in speed. They also show that while applying MICE imputation directly to the missing data may not yield convergence, applying BPI with MICE may lead to convergence.
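The workflow described in the abstract (PCA per block on the observed rows, merge the components, then impute) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `pca_scores`, `blockwise_pca_impute`, and `mean_impute` are hypothetical, the column blocks and component counts are supplied by hand, and simple column-mean imputation stands in for the "chosen imputation technique" (the abstract mentions MICE as one option). The sketch also assumes that, within each block, a row is either fully observed or fully missing.

```python
import numpy as np

def pca_scores(block, n_components):
    """Project fully observed rows onto their top principal components."""
    centered = block - block.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def blockwise_pca_impute(X, blocks, n_components, impute):
    """Reduce each column block on its observed rows, merge, then impute.

    Assumption: in each block a row is either fully observed or fully missing.
    """
    n_rows = X.shape[0]
    reduced = []
    for cols, k in zip(blocks, n_components):
        block = X[:, cols]
        observed = ~np.isnan(block).any(axis=1)
        scores = np.full((n_rows, k), np.nan)  # unobserved rows stay missing
        scores[observed] = pca_scores(block[observed], k)
        reduced.append(scores)
    merged = np.hstack(reduced)  # low-dimensional matrix, still with gaps
    return impute(merged)

def mean_impute(Z):
    """Stand-in imputer (an assumption): fill gaps with column means."""
    col_means = np.nanmean(Z, axis=0)
    return np.where(np.isnan(Z), col_means, Z)

# Toy monotone pattern: the second block is observed only for the first 60 rows.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
X[60:, 4:] = np.nan
Z = blockwise_pca_impute(X, [slice(0, 4), slice(4, 8)], [2, 2], mean_impute)
print(Z.shape)  # (100, 4)
```

Because imputation runs on the merged low-dimensional matrix rather than on the original columns, the imputer handles fewer features, which is consistent with the speed-up the abstract reports. Any imputer that accepts a NaN-containing array and returns a completed one could be passed in place of `mean_impute`.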

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2305.06042 [cs.LG]
	(or arXiv:2305.06042v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2305.06042

Submission history

From: Thu Nguyen Ms.
[v1] Wed, 10 May 2023 10:51:36 UTC (91 KB)
[v2] Wed, 10 Jan 2024 15:25:45 UTC (243 KB)
