Generic Knowledge Boosted Pre-training For Remote Sensing Images

Z Huang, M Zhang, Y Gong, Q Liu, Y Wang
IEEE Transactions on Geoscience and Remote Sensing, 2024 - ieeexplore.ieee.org
Deep learning models are essential for scene classification, change detection, land cover segmentation, and other remote sensing (RS) image understanding tasks. Most backbones of existing RS deep learning models are typically initialized by pretrained weights obtained from ImageNet pretraining (IMP). However, domain gaps exist between RS images and natural images (e.g., ImageNet), making deep learning models initialized by pretrained weights of IMP perform poorly for RS image understanding. Although some pretraining methods are studied in the RS community, current RS pretraining (RSP) methods face the problem of vague generalization by only using RS images. In this article, we propose a novel RSP framework, generic knowledge boosted RSP (GeRSP), to learn robust representations from RS and natural images for RS understanding tasks. GeRSP contains two pretraining branches: 1) a self-supervised pretraining branch is adopted to learn domain-related representations from unlabeled RS images and 2) a supervised pretraining branch is integrated into GeRSP for general knowledge learning from labeled natural images. Moreover, GeRSP combines two pretraining branches using a teacher–student architecture to simultaneously learn representations with general and special knowledge, which generates a powerful pretrained model for deep learning model initialization. Finally, we evaluate GeRSP and other RSP methods on three downstream tasks, i.e., object detection, semantic segmentation, and scene classification. The extensive experimental results consistently demonstrate that GeRSP can effectively learn robust representations in a unified manner, improving the performance of RS downstream tasks. Code and pretrained models: https://github.com/floatingstarZ/GeRSP .
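
The two-branch objective described above can be illustrated with a minimal sketch. The abstract does not state the exact losses or the teacher update rule, so everything here is an assumption made for illustration: an InfoNCE-style contrastive loss stands in for the self-supervised branch on RS images, a plain cross-entropy stands in for the supervised branch on natural images, the teacher is updated as an exponential moving average of the student, and the names `info_nce`, `ema_update`, `combined_loss`, `temperature`, `momentum`, and `weight` are hypothetical. For the authors' actual implementation, see the repository linked above.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample given raw class logits (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, keys, positive_index, temperature=0.2):
    """Assumed self-supervised loss: the student embedding `query` should be
    closer to the teacher embedding keys[positive_index] than to other keys."""
    logits = [cosine(query, k) / temperature for k in keys]
    return softmax_cross_entropy(logits, positive_index)


def ema_update(teacher, student, momentum=0.99):
    """Assumed teacher update: exponential moving average of student weights."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def combined_loss(rs_query, rs_keys, positive_index,
                  natural_logits, natural_label, weight=1.0):
    """Sum of the RS self-supervised term and the natural-image supervised term.
    The weighting scheme is an assumption, not taken from the paper."""
    ssl_term = info_nce(rs_query, rs_keys, positive_index)
    sup_term = softmax_cross_entropy(natural_logits, natural_label)
    return ssl_term + weight * sup_term


if __name__ == "__main__":
    # Toy values only: one RS embedding pair and one labeled natural image.
    loss = combined_loss(
        rs_query=[1.0, 0.0],
        rs_keys=[[0.9, 0.1], [0.0, 1.0]],
        positive_index=0,
        natural_logits=[2.0, 0.5, -1.0],
        natural_label=0,
    )
    print(f"combined loss: {loss:.4f}")
```

In this sketch both terms would be backpropagated through a shared student backbone, which is one plausible reading of learning general and domain-specific knowledge simultaneously; the abstract itself does not confirm that detail.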