
FactorMatte: Redefining Video Matting for Re-Composition Tasks

Published: 26 July 2023

Abstract

We propose Factor Matting, an alternative formulation of the video matting problem in terms of counterfactual video synthesis that is better suited for re-composition tasks. The goal of factor matting is to separate the contents of a video into independent components, each representing a counterfactual version of the scene where the contents of other components have been removed. We show that factor matting maps well to a more general Bayesian framing of the matting problem that accounts for complex conditional interactions between layers. Based on this observation, we present a method for solving the factor matting problem that learns augmented patch-based appearance priors to produce useful decompositions even for video with complex cross-layer interactions like splashes, shadows, and reflections. Our method is trained per-video and does not require external training data or any knowledge about the 3D structure of the scene. Through extensive experiments, we show that it is able to produce useful decompositions of scenes with such complex interactions while performing competitively on classical matting tasks as well. We also demonstrate the benefits of our approach on a wide range of downstream video editing tasks. Our project website is at: https://factormatte.github.io/.
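The problem the abstract describes builds on the classical compositing equation I = αF + (1 − α)B, in which a frame is recovered by blending each layer's color with its alpha matte over the layers behind it. The sketch below illustrates that classical back-to-front "over" compositing only; function names are ours, and the paper's factor matting formulation goes beyond this linear model by accounting for conditional cross-layer interactions such as splashes, shadows, and reflections.

```python
def composite(background, layers):
    """Classical back-to-front 'over' compositing of scalar pixel values.

    background: pixel value of the rearmost layer.
    layers: list of (value, alpha) pairs, ordered back to front.
    Applies I = alpha * F + (1 - alpha) * B at each step.
    """
    out = background
    for value, alpha in layers:
        out = alpha * value + (1.0 - alpha) * out
    return out

# A half-transparent white layer over a black background yields mid-gray.
print(composite(0.0, [(1.0, 0.5)]))  # 0.5
```

Under this model each layer's contribution is independent of the others, which is exactly the assumption that breaks down for effects like reflections; that gap motivates the Bayesian framing described above.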

Supplementary Material

  • ZIP File (papers_604-supplemental.zip): supplemental material
  • MP4 File (papers_604_VOD.mp4): presentation


Cited By

  • (2024) Real-Time Multi-Person Video Synthesis with Controllable Prior-Guided Matting. Sensors 24, 9, 2795. DOI: 10.3390/s24092795. Online publication date: 27-Apr-2024.
  • (2024) Matting Algorithm with Improved Portrait Details for Images with Complex Backgrounds. Applied Sciences 14, 5, 1942. DOI: 10.3390/app14051942. Online publication date: 27-Feb-2024.
  • (2023) OmnimatteRF: Robust Omnimatte with 3D Background Modeling. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 23414-23423. DOI: 10.1109/ICCV51070.2023.02145. Online publication date: 1-Oct-2023.
  • (2023) Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 7709-7719. DOI: 10.1109/ICCV51070.2023.00712. Online publication date: 1-Oct-2023.


Published In

ACM Transactions on Graphics  Volume 42, Issue 4
August 2023
1912 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/3609020
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 July 2023
Published in TOG Volume 42, Issue 4


Author Tags

  1. matting
  2. video matting
  3. compositing
  4. video layer decomposition

Qualifiers

  • Research-article

