
Motion-Aware Structured Matrix Factorization for Foreground Detection in Complex Scenes

Published: 17 December 2020

Abstract

Foreground detection is a key step in many computer vision applications. Many foreground and background models have been proposed and perform well in static scenes. However, challenges such as dynamic backgrounds, irregular movement, and noise cause most algorithms to degrade sharply in complex scenes. To address this problem, we propose a motion-aware structured matrix factorization approach (MSMF), which integrates structural and spatiotemporal motion information into a unified sparse-low-rank matrix factorization framework. Technically, it makes three main contributions. First, a variant of the structured sparsity-inducing norm is proposed to constrain both the structure and the sparsity of the foreground, which makes the model robust to the statistical variability of the underlying foreground pixels in complex scenes. Second, to capture ambiguous pixels, a spatiotemporal cube-based motion trajectory is extracted to assist the matrix factorization. Finally, to solve the structured matrix factorization optimization problem, we develop an augmented Lagrange multiplier method that combines an alternating direction strategy with the Douglas-Rachford monotone operator splitting algorithm. Experiments demonstrate that the proposed approach separates irregularly moving foreground effectively while suppressing dynamic background and noise, and that it outperforms several state-of-the-art algorithms.
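
For readers unfamiliar with the sparse-low-rank framework named in the abstract, the following is a minimal sketch of the generic baseline it extends: robust PCA, which splits a data matrix D (one vectorized frame per column) into a low-rank background L and a sparse foreground S by minimizing ||L||_* + lam * ||S||_1 subject to D = L + S, solved here with an inexact augmented Lagrange multiplier loop. This is not the authors' MSMF: it uses a plain elementwise l1 penalty rather than their structured norm, uses no motion information, and uses no Douglas-Rachford step. The function names, the default `lam`, the initial `mu`, and the growth factor `rho` are assumptions based on common heuristics, not values taken from the paper.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(x, tau):
    """Proximal operator of tau * nuclear norm (shrinks singular values)."""
    u, sv, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(sv, tau)) @ vt


def rpca_ialm(d, lam=None, mu=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split d into a low-rank part l plus a sparse part s.

    Solves  min ||l||_* + lam * ||s||_1  s.t.  d = l + s  with an inexact
    augmented Lagrange multiplier loop. Generic baseline only, not MSMF.
    """
    d = np.asarray(d, dtype=float)
    m, n = d.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))    # assumed default, a common heuristic
    if mu is None:
        mu = 1.25 / np.linalg.norm(d, 2)  # assumed initial penalty (spectral norm)
    mu_max = mu * 1e7
    y = np.zeros_like(d)                  # Lagrange multiplier for d = l + s
    s = np.zeros_like(d)
    d_norm = np.linalg.norm(d, "fro")
    for _ in range(max_iter):
        # Alternate: update l with s fixed, then s with l fixed.
        l = singular_value_threshold(d - s + y / mu, 1.0 / mu)
        s = soft_threshold(d - l + y / mu, lam / mu)
        residual = d - l - s
        y = y + mu * residual             # dual ascent on the constraint
        mu = min(mu * rho, mu_max)        # gradually tighten the penalty
        if np.linalg.norm(residual, "fro") <= tol * d_norm:
            break
    return l, s
```

Calling `l, s = rpca_ialm(d)` on a frames-as-columns matrix returns the background estimate `l` and the foreground residue `s`. A structured variant would replace `soft_threshold` with the proximal operator of a group or tree-structured norm, which is where a method such as MSMF departs from this baseline.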

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 16, Issue 4
November 2020
372 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3444749
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 December 2020
Accepted: 01 June 2020
Revised: 01 March 2020
Received: 01 May 2019
Published in TOMM Volume 16, Issue 4

Author Tags

  1. l2,q regularization
  2. Matrix factorization
  3. foreground detection
  4. structured sparse-inducing norm

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Natural Science Foundation of China

Article Metrics

  • Downloads (last 12 months): 17
  • Downloads (last 6 weeks): 0

Reflects downloads up to 03 Oct 2024

Citations

Cited By

  • (2024) SVGC-AVA: 360-Degree Video Saliency Prediction With Spherical Vector-Based Graph Convolution and Audio-Visual Attention. IEEE Transactions on Multimedia 26 (3061-3076). DOI: 10.1109/TMM.2023.3306596. Online publication date: 1-Jan-2024
  • (2024) A feature compression method based on similarity matching. Displays (102728). DOI: 10.1016/j.displa.2024.102728. Online publication date: Apr-2024
  • (2023) Detection of Moving Object Using Superpixel Fusion Network. ACM Transactions on Multimedia Computing, Communications, and Applications 19:5 (1-15). DOI: 10.1145/3579998. Online publication date: 16-Mar-2023
  • (2023) Revisiting the robustness of spatio-temporal modeling in video quality assessment. Displays (102585). DOI: 10.1016/j.displa.2023.102585. Online publication date: Nov-2023
  • (2023) Dual-graph hierarchical interaction network for referring image segmentation. Displays 80 (102575). DOI: 10.1016/j.displa.2023.102575. Online publication date: Dec-2023
  • (2023) Data heterogeneous federated learning algorithm for industrial entity extraction. Displays 80 (102504). DOI: 10.1016/j.displa.2023.102504. Online publication date: Dec-2023
  • (2023) A review of QoE research progress in metaverse. Displays 77 (102389). DOI: 10.1016/j.displa.2023.102389. Online publication date: Apr-2023
  • (2023) Visual Sentiment Analysis with a VR Sentiment Dataset on Omni-Directional Images. Advances in Brain Inspired Cognitive Systems (300-309). DOI: 10.1007/978-981-97-1417-9_28. Online publication date: 5-Aug-2023
  • (2022) Iteratively Reweighted Minimax-Concave Penalty Minimization for Accurate Low-rank Plus Sparse Matrix Decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence 44:12 (8992-9010). DOI: 10.1109/TPAMI.2021.3122259. Online publication date: 1-Dec-2022
  • (2022) ISAIR: Deep inpainted semantic aware image representation for background subtraction. Expert Systems with Applications 207 (117947). DOI: 10.1016/j.eswa.2022.117947. Online publication date: Nov-2022
