CS-HSNet: A cross-Siamese change detection network based on hierarchical-split attention

Q Ke, P Zhang - IEEE Journal of Selected Topics in Applied …, 2021 - ieeexplore.ieee.org
IEEE Journal of Selected Topics in Applied Earth Observations and …, 2021
Change detection methods for optical remote sensing images play an important role in environmental resource management. Although recent deep learning methods demonstrate impressive ability, they typically construct networks in one of two ways: first, extracting bitemporal features separately; second, fusing the bitemporal images before forwarding them into a single-level network. Both approaches severely neglect the spatial-temporal feature correlation between bitemporal images. In addition, most existing methods represent multiscale feature pairs in a layer-wise manner, as in ResNet, and fail to consider the inner multilevel structure. In this work, we propose a new siamese feature encoder backbone for change detection, named cross-siamese Res2Net (CSRes2Net), which establishes crossed and hierarchical residual-like connections within a single residual block. CSRes2Net represents dual features in a fine-grained manner and facilitates the flow of bitemporal features. Furthermore, recent learning-based methods have designed spatial-temporal relation modules that capture pixel-level pairwise relationships and channel dependencies with self-attention mechanisms, but they consider spatial and channel correlations only separately and require excessive parameters. We therefore propose a lightweight cross spatial-channel triplet attention module that captures cross-dimensional long-range relationships among three combinations: channel with height, channel with width, and channel with channel. Finally, we propose a hierarchical-split block that generates multiscale feature representations in a coarse-to-fine fashion. Experimental results on the LEVIR-CD and season-varying change detection datasets show that the proposed method outperforms most state-of-the-art models.
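The idea of crossed and hierarchical residual-like connections inside one block can be illustrated with a minimal sketch. The abstract does not specify the wiring, so everything below is an assumption made for illustration: the function names, the use of flat Python lists as stand-ins for channel groups, the placeholder `transform` (where a real network would use a convolution), and in particular the choice to feed each stream the previous group output of the other stream. The hierarchical part follows the general Res2Net pattern of passing one group's output into the next group.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def split_groups(x: Vector, scales: int) -> List[Vector]:
    """Split a channel vector into `scales` equally sized groups."""
    if len(x) % scales != 0:
        raise ValueError("channel count must be divisible by scales")
    width = len(x) // scales
    return [x[i * width:(i + 1) * width] for i in range(scales)]


def add(u: Vector, v: Vector) -> Vector:
    """Element-wise sum of two equally sized vectors."""
    return [a + b for a, b in zip(u, v)]


def cross_hierarchical_block(
    xa: Vector,
    xb: Vector,
    scales: int,
    transform: Callable[[Vector], Vector],
) -> Tuple[Vector, Vector]:
    """Hypothetical two-stream block with hierarchical, crossed connections.

    Group 0 passes through unchanged, group 1 is transformed directly, and
    every later group is first summed with the previous group output of the
    OTHER stream (an assumed form of crossing, not the paper's definition).
    """
    groups_a = split_groups(xa, scales)
    groups_b = split_groups(xb, scales)
    out_a: List[Vector] = [groups_a[0]]
    out_b: List[Vector] = [groups_b[0]]
    for i in range(1, scales):
        in_a, in_b = groups_a[i], groups_b[i]
        if i > 1:
            # Both inputs are built before either output list is extended,
            # so each stream sees the other's output from step i - 1.
            in_a = add(in_a, out_b[-1])
            in_b = add(in_b, out_a[-1])
        out_a.append(transform(in_a))
        out_b.append(transform(in_b))
    ya = [v for group in out_a for v in group]
    yb = [v for group in out_b for v in group]
    # Residual shortcut around the whole block, one per stream.
    return add(ya, xa), add(yb, xb)


if __name__ == "__main__":
    # Placeholder transform standing in for a learned convolution.
    halve = lambda v: [0.5 * t for t in v]
    ya, yb = cross_hierarchical_block(
        [1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0], 4, halve
    )
    print(ya, yb)
```

With this wiring, later groups of each stream depend on earlier groups of the other stream, which is one way bitemporal features could interact at several granularities within a single block.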