Hierarchical and Progressive Image Matting

Authors:

Xin Yang

ACM Transactions on Multimedia Computing, Communications and Applications, Volume 19, Issue 2

Article No.: 52, Pages 1 - 23

https://doi.org/10.1145/3540201

Published: 06 February 2023

Abstract

Most matting research relies on advanced semantics to achieve high-quality alpha mattes and typically combines low-level features directly to complement alpha details. However, we argue that appearance-agnostic integration can only provide biased foreground (FG) details and that alpha mattes require the aggregation of features from different levels for better pixel-wise opacity perception. In this article, we propose an end-to-end hierarchical and progressive attention matting network (HAttMatting++), which can better predict the opacity of the FG from single RGB images without additional input. Specifically, we utilize channel-wise attention (CA) to distill pyramidal features and employ spatial attention (SA) at different levels to filter appearance cues. This progressive attention mechanism can estimate alpha mattes from adaptive semantics and semantics-indicated boundaries. We also introduce a hybrid loss function that fuses structural similarity, mean square error, adversarial loss, and sentry supervision to guide the network toward a better overall FG structure. In addition, we construct a large-scale and challenging image matting dataset comprising 59,000 training images and 1,000 test images (a total of 646 distinct FG alpha mattes), which can further improve the robustness of our hierarchical and progressive aggregation model. Extensive experiments demonstrate that the proposed HAttMatting++ can capture sophisticated FG structures and achieve state-of-the-art performance with single RGB images as input.
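
To make the hybrid loss more concrete, the following is a minimal sketch of how two of its four terms, mean square error and structural similarity, might be combined for a predicted alpha matte. It is an illustration under stated assumptions, not the article's implementation: the function names (mse, global_ssim, hybrid_loss) and the weights w_mse and w_ssim are hypothetical, SSIM is computed over a single window covering the whole matte rather than over local windows, and the adversarial and sentry supervision terms are omitted because the abstract does not specify them.

```python
def _mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def mse(pred, target):
    """Mean square error between two flat lists of alpha values in [0, 1]."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return _mean([(p - t) ** 2 for p, t in zip(pred, target)])


def global_ssim(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over one window spanning the whole matte (a simplification).

    c1 and c2 are the usual stabilising constants for a dynamic range of 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    mu_p, mu_t = _mean(pred), _mean(target)
    var_p = _mean([(p - mu_p) ** 2 for p in pred])
    var_t = _mean([(t - mu_t) ** 2 for t in target])
    cov = _mean([(p - mu_p) * (t - mu_t) for p, t in zip(pred, target)])
    numerator = (2 * mu_p * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return numerator / denominator


def hybrid_loss(pred, target, w_mse=1.0, w_ssim=1.0):
    """Weighted sum of MSE and (1 - SSIM); the weights are hypothetical."""
    return w_mse * mse(pred, target) + w_ssim * (1.0 - global_ssim(pred, target))


if __name__ == "__main__":
    alpha_gt = [0.0, 0.2, 0.8, 1.0]    # toy ground-truth alpha values
    alpha_pred = [0.1, 0.2, 0.7, 1.0]  # toy predicted alpha values
    print(hybrid_loss(alpha_pred, alpha_gt))
```

In this sketch the MSE term penalises per-pixel opacity errors, while the (1 - SSIM) term penalises differences in overall structure, so the loss is close to zero only when the prediction matches the ground truth in both respects.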

Index Terms

Computing methodologies
  Artificial intelligence
    Computer vision
      Computer vision problems
        Image segmentation

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 19, Issue 2

March 2023

540 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3572860

Editor:
Abdulmotaleb El Saddik
Mohamed Bin Zayed University of Artificial Intelligence, UAE and University of Ottawa, Canada

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 February 2023

Online AM: 11 June 2022

Accepted: 23 May 2022

Revised: 11 May 2022

Received: 29 August 2021

Published in TOMM Volume 19, Issue 2

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
Innovation Technology Funding of Dalian
Open Research Fund of Beijing Key Laboratory of Big Data Technology for Food Safety
National Key Research and Development Program of China
