DOI: 10.1145/3551626.3564943

Asymmetric Label Propagation for Video Object Segmentation

Published: 13 December 2022

Abstract

Semi-supervised video object segmentation aims to segment foreground objects across a video sequence given their masks in the first frame. Motion between adjacent frames tends to be smooth, yet object appearance can change substantially in later frames due to clutter or occlusion. Most existing works segment a video frame by referring equally to the segmentation masks of its previous frame and the first frame, and are therefore prone to unreliable matching and accumulated segmentation errors. To alleviate this issue, this paper proposes to treat the first and previous frames differently so that motion and appearance cues are leveraged reliably, and presents an Asymmetric Label Propagation (ALP) method. ALP consists of a Confidence-guided Local Propagation (CLP) module and a Global Label Matching (GLM) module. CLP propagates labels from the previous frame to the current frame based on local affinity and appearance matching uncertainty. To further recover potentially missing objects and alleviate error accumulation, GLM matches the current frame to both the foreground and background of the first frame, and adaptively fuses the matching results. The CLP and GLM outputs are fused to generate object-specific feature maps for multi-object segmentation. Extensive experiments on the DAVIS and YouTube-VOS datasets demonstrate the effectiveness of the proposed method.
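
The abstract gives no equations, so the following is only a toy sketch of the asymmetric idea, not the authors' network. It assumes per-pixel feature vectors are already available, uses a softmax over a small spatial window as the "local affinity", takes the largest softmax weight as a stand-in for matching confidence, and fuses the two branches with a confidence-weighted average. All function names, the window radius, the temperature, and the fusion rule are hypothetical choices made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def local_propagate(prev_feat, prev_label, cur_feat, radius=1, temperature=0.1):
    """Propagate soft labels from the previous frame through a local window.

    Returns (labels, conf). conf is the largest affinity weight in the
    window, an assumed proxy for appearance matching certainty.
    """
    h, w = len(cur_feat), len(cur_feat[0])
    labels = [[0.0] * w for _ in range(h)]
    conf = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            sims, labs = [], []
            for y in range(max(0, i - radius), min(h, i + radius + 1)):
                for x in range(max(0, j - radius), min(w, j + radius + 1)):
                    sims.append(dot(cur_feat[i][j], prev_feat[y][x]) / temperature)
                    labs.append(prev_label[y][x])
            # Numerically stable softmax over the window.
            top = max(sims)
            exps = [math.exp(s - top) for s in sims]
            total = sum(exps)
            weights = [e / total for e in exps]
            labels[i][j] = sum(wt * lb for wt, lb in zip(weights, labs))
            conf[i][j] = max(weights)
    return labels, conf

def global_match(first_feat, first_mask, cur_feat):
    """Match each current pixel to first-frame foreground and background.

    Assumes the first-frame mask contains at least one pixel of each class.
    Returns a foreground probability per pixel.
    """
    fg, bg = [], []
    for feat_row, mask_row in zip(first_feat, first_mask):
        for f, m in zip(feat_row, mask_row):
            (fg if m == 1 else bg).append(f)
    probs = []
    for row in cur_feat:
        out = []
        for f in row:
            s_fg = max(dot(f, g) for g in fg)
            s_bg = max(dot(f, b) for b in bg)
            # Two-way softmax over the best foreground and background scores.
            out.append(1.0 / (1.0 + math.exp(s_bg - s_fg)))
        probs.append(out)
    return probs

def fuse(local_labels, conf, global_prob):
    """Confidence-weighted average of the two branches (an assumed rule)."""
    return [[c * l + (1.0 - c) * g for l, c, g in zip(lr, cr, gr)]
            for lr, cr, gr in zip(local_labels, conf, global_prob)]

# Toy data: the object occupies column 0 in the first frame and column 1 now.
OBJ, BG = [1.0, 0.0], [0.0, 1.0]
first_feat = [[OBJ, BG, BG], [OBJ, BG, BG]]
first_mask = [[1, 0, 0], [1, 0, 0]]
cur_feat = [[BG, OBJ, BG], [BG, OBJ, BG]]

# Here the first frame also serves as the previous frame.
local, conf = local_propagate(first_feat, first_mask, cur_feat)
glob = global_match(first_feat, first_mask, cur_feat)
fused = fuse(local, conf, glob)
mask = [[1 if p > 0.5 else 0 for p in row] for row in fused]
```

In this toy setting the local branch follows the one-pixel shift because the window is small, while the global branch can still find the object anywhere in the frame, which is the asymmetry the abstract describes between the previous and first frames.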

Published In

MMAsia '22: Proceedings of the 4th ACM International Conference on Multimedia in Asia
December 2022
296 pages
ISBN:9781450394789
DOI:10.1145/3551626

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. label propagation
  2. video object segmentation

Qualifiers

  • Research-article

Funding Sources

  • Natural Science Foundation of China
  • National Key Research and Development Program of China

Conference

MMAsia '22: ACM Multimedia Asia
December 13 - 16, 2022
Tokyo, Japan

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%
