Masked Modeling for Self-supervised Representation Learning on Vision and Beyond

Li, Siyuan; Zhang, Luyuan; Wang, Zedong; Wu, Di; Wu, Lirong; Liu, Zicheng; Xia, Jun; Tan, Cheng; Liu, Yang; Sun, Baigui; Li, Stan Z.

arXiv:2401.00897 (cs)

[Submitted on 31 Dec 2023 (v1), last revised 9 Jan 2024 (this version, v2)]

Abstract: As the deep learning revolution marches on, self-supervised learning has garnered increasing attention in recent years thanks to its remarkable representation learning ability and low dependence on labeled data. Among the varied self-supervised techniques, masked modeling has emerged as a distinctive approach in which a proportion of the original data is masked during training and the model learns to predict the masked parts. This paradigm enables deep models to learn robust representations and has demonstrated exceptional performance in computer vision, natural language processing, and other modalities. In this survey, we present a comprehensive review of the masked modeling framework and its methodology. We elaborate on the details of techniques within masked modeling, including diverse masking strategies, recovering targets, network architectures, and more. We then systematically investigate its wide-ranging applications across domains and explore the commonalities and differences between masked modeling methods in different fields. We conclude by discussing the limitations of current techniques and pointing out several potential avenues for advancing masked modeling research. A paper list project accompanying this survey is available at this https URL.
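The core idea in the abstract, hiding a fixed proportion of the input and training a model to predict the hidden parts, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the survey: the helper names (`random_mask`, `apply_mask`, `masked_mse`), the 0.75 mask ratio, the zero fill value, the choice of raw values as the recovering target, and the trivial stand-in "model" that predicts the mean of the visible tokens.

```python
import random


def random_mask(num_tokens, mask_ratio, rng):
    """Pick round(num_tokens * mask_ratio) positions to hide (uniform random masking)."""
    num_masked = round(num_tokens * mask_ratio)
    return sorted(rng.sample(range(num_tokens), num_masked))


def apply_mask(tokens, masked_idx, mask_value=0.0):
    """Replace the masked positions with a placeholder value (assumed to be 0.0 here)."""
    hidden = set(masked_idx)
    return [mask_value if i in hidden else t for i, t in enumerate(tokens)]


def masked_mse(prediction, target, masked_idx):
    """Mean squared error computed over the masked positions only."""
    if not masked_idx:
        return 0.0
    return sum((prediction[i] - target[i]) ** 2 for i in masked_idx) / len(masked_idx)


rng = random.Random(0)                      # fixed seed so the sketch is repeatable
tokens = [0.1 * i for i in range(16)]       # toy 1-D "signal" standing in for patches or tokens
masked_idx = random_mask(len(tokens), 0.75, rng)
corrupted = apply_mask(tokens, masked_idx)  # what the model would be given as input

# Hypothetical stand-in for a trained network: predict the mean of the visible tokens.
hidden = set(masked_idx)
visible = [t for i, t in enumerate(tokens) if i not in hidden]
mean_visible = sum(visible) / len(visible)
prediction = [mean_visible] * len(tokens)

loss = masked_mse(prediction, tokens, masked_idx)
```

In a real method the stand-in predictor would be a neural network, and the masking strategy and recovering target are exactly the design choices the survey reviews; restricting the loss to masked positions is one common option rather than the only one.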

Comments:	Preprint v2 (fix typos and citations). GitHub project at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2401.00897 [cs.CV]
	(or arXiv:2401.00897v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2401.00897

Submission history

From: Siyuan Li
[v1] Sun, 31 Dec 2023 12:03:21 UTC (20,885 KB)
[v2] Tue, 9 Jan 2024 16:09:47 UTC (20,298 KB)
