Motion-Aware Structured Matrix Factorization for Foreground Detection in Complex Scenes

Authors:

Yonghong Tian

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 16, Issue 4

Article No.: 123, Pages 1 - 23

https://doi.org/10.1145/3407188

Published: 17 December 2020

Abstract

Foreground detection is a key step in many computer vision applications. Many foreground and background models have been proposed and achieve promising performance in static scenes. However, because of challenges such as dynamic backgrounds, irregular movement, and noise, most algorithms degrade sharply in complex scenes. To address this problem, we propose a motion-aware structured matrix factorization approach (MSMF), which integrates structural and spatiotemporal motion information into a unified sparse-low-rank matrix factorization framework. Technically, it makes three main contributions. First, a variant of the structured sparsity-inducing norm is proposed to constrain both the structure and the sparsity of the foreground, which makes the model robust to the statistical variability of the underlying foreground pixels in complex scenes. Second, to capture ambiguous pixels, a spatiotemporal cube-based motion trajectory is extracted to assist the matrix factorization. Finally, to solve the optimization problem of structured matrix factorization, we develop an augmented Lagrange multiplier method with the alternating direction strategy and the Douglas-Rachford monotone operator splitting algorithm. Experiments demonstrate that the proposed approach achieves strong performance in separating irregularly moving foreground while suppressing dynamic background and noise, and that it outperforms some state-of-the-art algorithms.
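To make the sparse-low-rank idea in the abstract concrete, the following is a minimal sketch of the plain baseline it builds on: robust PCA (principal component pursuit) solved with an inexact augmented Lagrange multiplier loop. It is not MSMF. It uses an ordinary l1 penalty instead of the structured sparsity-inducing norm, and it omits the motion trajectories and the Douglas-Rachford splitting step. The function names (`soft_threshold`, `svd_threshold`, `rpca_ialm`), the default weight `lam = 1/sqrt(max(m, n))`, and the penalty schedule (`mu`, `rho`) are illustrative assumptions rather than values taken from the article.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of tau * l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Singular value shrinkage: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca_ialm(d, lam=None, tol=1e-7, max_iter=500):
    """Split a nonzero matrix d into low-rank b plus sparse f (d ~ b + f).

    Assumed objective: min ||b||_* + lam * ||f||_1  subject to  d = b + f.
    """
    m, n = d.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    norm_d = np.linalg.norm(d, "fro")
    spec = np.linalg.norm(d, 2)  # largest singular value of d
    y = d / max(spec, np.abs(d).max() / lam)  # Lagrange multiplier estimate
    mu = 1.25 / spec  # penalty weight (assumed schedule)
    mu_max = mu * 1e7
    rho = 1.5  # growth factor for mu (assumed)
    b = np.zeros_like(d)
    f = np.zeros_like(d)
    for _ in range(max_iter):
        # Alternating direction: update b with f fixed, then f with b fixed.
        b = svd_threshold(d - f + y / mu, 1.0 / mu)
        f = soft_threshold(d - b + y / mu, lam / mu)
        r = d - b - f  # constraint residual
        y = y + mu * r  # multiplier ascent step
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(r, "fro") <= tol * norm_d:
            break
    return b, f
```

In a video setting, each column of `d` would typically hold one vectorized frame, so that `b` approximates the background and the nonzero entries of `f` mark candidate foreground pixels. That layout is the usual convention for this family of methods and is assumed here, not quoted from the article.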

Index Terms

Motion-Aware Structured Matrix Factorization for Foreground Detection in Complex Scenes
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection
        Video segmentation
      2. Image and video acquisition
        Motion capture
  2. Machine learning
    1. Machine learning approaches
      1. Factorization methods
        Principal component analysis

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 16, Issue 4

November 2020

372 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3444749

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 December 2020

Accepted: 01 June 2020

Revised: 01 March 2020

Received: 01 May 2019

Published in TOMM Volume 16, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Natural Science Foundation of China

Bibliometrics

Total Citations: 12
Total Downloads: 180
Downloads (Last 12 months): 17
Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Oct 2024
