EfficientMatting: Bilateral Matting Network for Real-Time Human Matting

Published: 03 November 2024

Abstract

Recent human matting methods typically suffer from two drawbacks: 1) high computation overhead caused by multiple stages, and 2) limited practical application due to the need for auxiliary guidance (e.g., trimap, mask, or background). To address these issues, we propose EfficientMatting, a real-time human matting method using only a single image as input. Specifically, EfficientMatting incorporates a bilateral network composed of two complementary branches: a transformer-based context information branch and a CNN-based spatial information branch. Furthermore, we introduce three novel techniques to enhance model performance while maintaining high inference efficiency. Firstly, we design a Semantic Guided Fusion Module (SGFM), which empowers the model to dynamically acquire valuable features with the assistance of context information. Secondly, we design a lightweight Detail Preservation Module (DPM) to achieve detail preservation and mitigate image artifacts during the upsampling process. Thirdly, we introduce the Supervised-Enhanced Training Strategy (SETS) to explicitly provide supervision on hidden features. Extensive experiments on P3M-10k, Human-2K, and PPM-100 datasets show that EfficientMatting outperforms state-of-the-art real-time human matting methods in terms of both model performance and inference speed.
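The abstract does not spell out the internals of the Semantic Guided Fusion Module, but the stated idea is that context-branch features guide the selection of spatial-branch features. The sketch below is only an illustration of that general pattern, not the paper's actual design: low-resolution context features are upsampled and passed through a sigmoid to form a gate that reweights the high-resolution spatial features. The function names, feature shapes, and the choice of nearest-neighbour upsampling are all assumptions for the sake of the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def upsample_nearest(x, factor):
    # Nearest-neighbour upsampling along the two spatial axes (H, W, C layout).
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def semantic_guided_fusion(spatial_feat, context_feat):
    # Hypothetical SGFM-style fusion: context features act as a gate in (0, 1)
    # that reweights the high-resolution spatial features elementwise.
    factor = spatial_feat.shape[0] // context_feat.shape[0]
    gate = sigmoid(upsample_nearest(context_feat, factor))
    return spatial_feat * gate

# Toy features: a 32x32 spatial map and an 8x8 context map, 16 channels each.
rng = np.random.default_rng(0)
spatial = rng.standard_normal((32, 32, 16))
context = rng.standard_normal((8, 8, 16))
fused = semantic_guided_fusion(spatial, context)
print(fused.shape)  # (32, 32, 16)
```

Because the gate is a sigmoid, fusion can only attenuate spatial features, never amplify them; a learned fusion module in the actual network would of course use trained convolution weights rather than raw features as the gate.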




            Published In

            Pattern Recognition and Computer Vision: 7th Chinese Conference, PRCV 2024, Urumqi, China, October 18–20, 2024, Proceedings, Part XII
October 2024, 595 pages
ISBN: 978-981-97-8857-6
DOI: 10.1007/978-981-97-8858-3

            Publisher

            Springer-Verlag

            Berlin, Heidelberg


            Author Tags

            1. EfficientMatting
            2. Semantic guided fusion module
            3. Detail preservation module
            4. Supervised-enhanced training strategy
