Unite-Divide-Unite: Joint Boosting Trunk and Structure for High-accuracy Dichotomous Image Segmentation

Authors: Jialun Pei, Zhangjun Zhou, Yueming Jin, He Tang, Pheng-Ann Heng

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 2139 - 2147

https://doi.org/10.1145/3581783.3611811

Published: 27 October 2023

Abstract

High-accuracy Dichotomous Image Segmentation (DIS) aims to pinpoint category-agnostic foreground objects in natural scenes. The main challenge for DIS is identifying the dominant area with high accuracy while also rendering detailed object structure. Directly using a general encoder-decoder architecture, however, may result in an oversupply of high-level features and neglect the shallow spatial information necessary for partitioning meticulous structures. To fill this gap, we introduce a novel Unite-Divide-Unite Network (UDUN) that restructures complementary features and arranges them into two branches to simultaneously boost the effectiveness of trunk and structure identification. The proposed UDUN has several strengths. First, a dual-size input feeds into the shared backbone to produce more holistic and detailed features while keeping the model lightweight. Second, a simple Divide-and-Conquer Module (DCM) is proposed to decouple multiscale low- and high-level features into our structure decoder and trunk decoder, which obtain structure and trunk information respectively. Moreover, we design a Trunk-Structure Aggregation module (TSA) in our union decoder that performs cascade integration for uniform high-accuracy segmentation. As a result, UDUN performs favorably against state-of-the-art competitors in all six evaluation metrics on overall DIS-TE, for example achieving a weighted F-measure of 0.772 and an HCE of 977. Using 1024×1024 input, our model enables real-time inference at 65.3 fps with ResNet-18. The source code is available at https://github.com/PJLallen/UDUN.
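
The "unite, then divide" data flow described in the abstract can be loosely sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation (that is in the linked repository): average pooling stands in for both image resizing and the backbone stages, and the names `downsample`, `shared_backbone`, `divide`, `unite_divide`, the four-stage depth, and the `split` index are all hypothetical. It shows only the routing idea: one backbone is applied to two input sizes, shallow features go to a structure branch, and deep features go to a trunk branch.

```python
from typing import Dict, List

Image = List[List[float]]


def downsample(img: Image, factor: int = 2) -> Image:
    """Average-pool by `factor`; a stand-in for resizing or a backbone stage."""
    h, w = len(img), len(img[0])
    area = factor * factor
    return [
        [
            sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor)) / area
            for x in range(0, w - factor + 1, factor)
        ]
        for y in range(0, h - factor + 1, factor)
    ]


def shared_backbone(img: Image, stages: int = 4) -> List[Image]:
    """Toy shared backbone (assumption): each stage halves the resolution."""
    feats: List[Image] = []
    cur = img
    for _ in range(stages):
        cur = downsample(cur, 2)
        feats.append(cur)
    return feats


def divide(feats_large: List[Image], feats_small: List[Image], split: int = 2) -> Dict[str, List[Image]]:
    """Route shallow features to the structure branch and deep ones to the trunk branch.

    The split index is a hypothetical choice made for this sketch.
    """
    return {
        "structure": feats_large[:split] + feats_small[:split],
        "trunk": feats_large[split:] + feats_small[split:],
    }


def unite_divide(img: Image) -> Dict[str, List[Image]]:
    """Feed a dual-size input through the same backbone, then divide the features."""
    small = downsample(img, 2)            # second, smaller copy of the input
    feats_large = shared_backbone(img)    # same function reused: shared weights in spirit
    feats_small = shared_backbone(small)
    return divide(feats_large, feats_small)


if __name__ == "__main__":
    image = [[float((x + y) % 2) for x in range(32)] for y in range(32)]
    routes = unite_divide(image)
    for name, feats in routes.items():
        print(name, [(len(f), len(f[0])) for f in feats])
```

A final "unite" step, corresponding to the union decoder with TSA, would merge the outputs of the two branches; it is omitted here because the abstract does not specify how the aggregation is computed.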

Supplemental Material

MP4 File

This video provides a comprehensive introduction to UDUN, a novel Unite-Divide-Unite Network for high-accuracy dichotomous image segmentation. The video's structure is as follows: We start with an overview of the High-accuracy Dichotomous Image Segmentation (DIS) task and present our unique insights and motivations. Next, we look in depth at the overall architecture of our UDUN network and provide a thorough explanation of its workflow. Then, we conduct a comprehensive comparison with other cutting-edge models, presenting both quantitative and qualitative results to highlight the superior performance of our method. Finally, we showcase a series of ablation studies to demonstrate the effectiveness of each component and illustrate real-world applications of the UDUN network in relevant scenarios.

