Hybrid BYOL-ViT: Efficient approach to deal with small datasets

Naimi, Safwen; van Leeuwen, Rien; Souidene, Wided; Saoud, Slim Ben

arXiv:2111.04845 (cs)

[Submitted on 8 Nov 2021 (v1), last revised 15 Nov 2021 (this version, v2)]

Abstract: Supervised learning can learn large representational spaces, which are crucial for handling difficult learning tasks. However, due to the design of the model, classical image classification approaches struggle to generalize to new problems and new situations when dealing with small datasets. In fact, supervised learning can lose the location of image features, which leads to supervision collapse in very deep architectures. In this paper, we investigate how self-supervision with strong and sufficient augmentation of unlabeled data can effectively train the first layers of a neural network, even better than supervised learning, with no need for millions of labeled examples. The main goal is to disconnect pixel data from annotation by obtaining generic, task-agnostic low-level features. Furthermore, we look into Vision Transformers (ViT) and show that the low-level features derived from a self-supervised architecture can improve the robustness and the overall performance of this emerging architecture. We evaluated our method on one of the smallest open-source datasets, STL-10, and obtained a significant performance boost, from 41.66% to 83.25%, when feeding low-level features from a self-supervised learning architecture to the ViT instead of the raw images.
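To make the idea of feeding low-level features to the ViT instead of raw images more concrete, the sketch below works out how the ViT token sequence changes in the two cases. It is only a shape calculation under stated assumptions, not the authors' implementation: the helper `vit_token_shape`, the patch sizes, and the 24x24x256 feature-map shape (roughly what a stride-4 convolutional prefix with 256 output channels would produce on a 96x96 input) are hypothetical choices for illustration. The only fact taken as given is that STL-10 images are 96x96 RGB.

```python
def vit_token_shape(height, width, channels, patch_size):
    """Return (num_tokens, token_dim) when a height x width x channels input
    is cut into non-overlapping square patches and each patch is flattened."""
    if height % patch_size or width % patch_size:
        raise ValueError("patch_size must divide both height and width")
    num_tokens = (height // patch_size) * (width // patch_size)
    token_dim = patch_size * patch_size * channels
    return num_tokens, token_dim


# Baseline: the ViT sees raw STL-10 images (96x96 RGB).
# The patch size of 16 is an assumption, not a value from the abstract.
raw_tokens = vit_token_shape(96, 96, 3, patch_size=16)

# Hybrid: the ViT sees a low-level feature map from a self-supervised network.
# The 24x24x256 shape and the patch size of 4 are hypothetical.
feature_tokens = vit_token_shape(24, 24, 256, patch_size=4)

print("raw image tokens:  ", raw_tokens)
print("feature map tokens:", feature_tokens)
```

With these assumed shapes, both inputs yield the same number of tokens, but each token in the hybrid case summarizes learned low-level features rather than raw pixel values, which is the substitution the abstract describes.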

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as: arXiv:2111.04845 [cs.CV]
(or arXiv:2111.04845v2 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2111.04845

Submission history

From: Safwen Naimi
[v1] Mon, 8 Nov 2021 21:44:31 UTC (6,190 KB)
[v2] Mon, 15 Nov 2021 16:54:48 UTC (6,190 KB)
