CT-Net: Asymmetric compound branch Transformer for medical image segmentation

N Zhang, L Yu, D Zhang, W Wu, S Tian, X Kang, M Li - Neural Networks, 2024 - Elsevier
Abstract
The Transformer architecture has been widely applied to image segmentation because of its powerful ability to capture long-range dependencies. However, it is relatively weak at capturing local features and requires a large amount of training data. Medical image segmentation, by contrast, places high demands on local features and is often performed on small datasets, so existing Transformer networks suffer a significant drop in performance when applied directly to this task. To address these issues, we design a new medical image segmentation architecture called CT-Net. It effectively extracts local and global representations through an asymmetric asynchronous branch parallel structure, while avoiding unnecessary computational cost. In addition, we propose a high-density information fusion strategy that efficiently fuses the features of the two branches using a fusion module of only 0.05M parameters. This strategy ensures high portability and makes it possible to apply transfer learning directly, mitigating dataset dependency. Finally, we design a parameter-adjustable multi-perceptive loss function for this architecture that optimizes training from both pixel-level and global perspectives. We evaluate the network on 5 different tasks across 9 datasets. Compared to SwinUNet, CT-Net improves IoU by 7.3% and 1.8% on the GlaS and MoNuSeg datasets respectively, and improves the average DSC on the Synapse dataset by 3.5%.
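The abstract does not give the exact form of the multi-perceptive loss. A minimal sketch, assuming it combines a pixel-level term (binary cross-entropy) with a global term (Dice loss) under an adjustable weight `alpha` (the weighting scheme and names here are illustrative, not from the paper):

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    # Pixel-level perspective: mean per-pixel binary cross-entropy.
    p = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(p) + (1 - target) * np.log(1 - p))))

def dice_loss(pred, target, eps=1e-6):
    # Global perspective: 1 - Dice overlap between predicted and true masks.
    inter = np.sum(pred * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps))

def multi_perceptive_loss(pred, target, alpha=0.5):
    # alpha is the adjustable parameter trading off the two perspectives
    # (hypothetical combination; the paper's exact formulation may differ).
    return alpha * bce_loss(pred, target) + (1.0 - alpha) * dice_loss(pred, target)

# Example: a perfect prediction yields a near-zero loss,
# while an inverted prediction yields a large one.
target = np.array([[1.0, 0.0], [0.0, 1.0]])
good = multi_perceptive_loss(target.copy(), target)
bad = multi_perceptive_loss(1.0 - target, target)
```

Combining the two terms is a common design for segmentation on small datasets: the Dice term counters class imbalance at the mask level, while the cross-entropy term keeps per-pixel gradients well behaved.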