research-article

AnoPCN: Video Anomaly Detection via Deep Predictive Coding Network

Authors:

Xiaojiang Peng,

Yu Qiao

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 1805 - 1813

https://doi.org/10.1145/3343031.3350899

Published: 15 October 2019

Abstract

Video anomaly detection is a challenging problem due to the ambiguity and complexity of how anomalies are defined. Recent approaches to this task mainly rely on deep reconstruction and deep prediction methods, but their performance suffers when they cannot guarantee either higher reconstruction errors for abnormal events or lower prediction errors for normal events. Inspired by the predictive coding mechanism, which explains how brains detect events that violate regularities, we address the Anomaly detection problem with a novel deep Predictive Coding Network, termed AnoPCN, which consists of a Predictive Coding Module (PCM) and an Error Refinement Module (ERM). Specifically, PCM is designed as a convolutional recurrent neural network with feedback connections carrying frame predictions and feedforward connections carrying prediction errors. By using motion information explicitly, PCM yields better prediction results. To address the narrow regularity score gaps of deep reconstruction methods, we decompose reconstruction into prediction and refinement, introducing ERM to reconstruct the current prediction error and refine the coarse prediction. AnoPCN unifies reconstruction and prediction methods in an end-to-end framework, and it achieves state-of-the-art performance, with better prediction results and larger regularity score gaps, on three benchmark datasets: ShanghaiTech Campus, CUHK Avenue, and UCSD Ped2.
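The decomposition of reconstruction into prediction and refinement can be illustrated with a toy sketch. This is not the authors' implementation: the networks are replaced by fixed lists, the "ERM" is a hypothetical stand-in that recovers half of the true prediction error, and the PSNR-based, min-max normalised regularity score is an assumption borrowed from common practice in this literature, since the abstract does not state the scoring rule. All function names are illustrative.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized flat frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(target, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB; higher means a closer match."""
    err = mse(target, estimate)
    return float("inf") if err == 0 else 10.0 * math.log10(peak ** 2 / err)

def refine(prediction, error_estimate):
    """Refinement step: coarse prediction plus the reconstructed prediction error."""
    return [p + e for p, e in zip(prediction, error_estimate)]

def regularity_scores(psnrs):
    """Min-max normalise per-frame PSNR values to [0, 1] (assumed scoring rule)."""
    lo, hi = min(psnrs), max(psnrs)
    if hi == lo:
        return [1.0 for _ in psnrs]
    return [(p - lo) / (hi - lo) for p in psnrs]

# Toy "frame" and a coarse prediction of it (stand-in for the PCM output).
frame = [0.2, 0.4, 0.6, 0.8]
coarse = [0.25, 0.35, 0.65, 0.75]

# Prediction error, and a hypothetical ERM that recovers only half of it.
true_error = [f - c for f, c in zip(frame, coarse)]
error_hat = [0.5 * e for e in true_error]

# Refined prediction = coarse prediction + reconstructed error.
refined = refine(coarse, error_hat)

print("PSNR coarse :", psnr(frame, coarse))
print("PSNR refined:", psnr(frame, refined))
```

Under these assumptions, frames whose refined prediction matches the observation well receive high regularity scores, while frames that remain poorly predicted receive low scores and would be flagged as candidate anomalies.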

Cited By

Yang Y, Xie L, Fu Z, Yan J, Naqvi S (2025). Pose-oriented scene-adaptive matching for abnormal event detection. Neurocomputing 611 (128673). Online publication date: Jan-2025.
https://doi.org/10.1016/j.neucom.2024.128673
Fu Y, Yang B, Ye O (2024). Spatiotemporal Masked Autoencoder with Multi-Memory and Skip Connections for Video Anomaly Detection. Electronics 13:2 (353). Online publication date: 14-Jan-2024.
https://doi.org/10.3390/electronics13020353
Pi R, Wu P, He X, Peng Y (2024). EOGT: Video Anomaly Detection with Enhanced Object Information and Global Temporal Dependency. ACM Transactions on Multimedia Computing, Communications, and Applications 20:10 (1-21). Online publication date: 12-Sep-2024.
https://dl.acm.org/doi/10.1145/3662185

Index Terms

AnoPCN: Video Anomaly Detection via Deep Predictive Coding Network
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene anomaly detection

Information & Contributors

Information

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019

2794 pages

ISBN:9781450368896

DOI:10.1145/3343031

General Chairs:
Laurent Amsaleg (CNRS-IRISA, France)
Benoit Huet (EURECOM, France)
Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs:
Guillaume Gravier (CNRS-IRISA, France)
Hayley Hung (Delft University of Technology, Netherlands)
Chong-Wah Ngo (City University of Hong Kong, Hong Kong)
Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

the Joint Lab of CAS-HK
National Natural Science Foundation of China
Natural Science Foundation of China
Shenzhen Basic Research Program

Conference

MM '19

Sponsor:

SIGMM

MM '19: The 27th ACM International Conference on Multimedia

October 21 - 25, 2019

Nice, France

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;

Overall Acceptance Rate 995 of 4,171 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 133
Total Downloads: 1,283

Downloads (Last 12 months): 182
Downloads (Last 6 weeks): 15

Reflects downloads up to 06 Oct 2024

Citations

Cited By

Liu Y, Yang D, Wang Y, Liu J, Liu J, Boukerche A, Sun P, Song L (2024). Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models. ACM Computing Surveys 56:7 (1-38). Online publication date: 9-Apr-2024.
https://dl.acm.org/doi/10.1145/3645101
Veluri R, Khan S, Sankareswaran S, Shabaz M, Farouk A, Innab N (2024). Modified M‐RCNN approach for abandoned object detection in public places. Expert Systems. Online publication date: 16-Jun-2024.
https://doi.org/10.1111/exsy.13648
Wang S, Liu J, Yu G, Liu X, Zhou S, Zhu E, Yang Y, Yin J, Yang W (2024). Multiview Deep Anomaly Detection: A Systematic Exploration. IEEE Transactions on Neural Networks and Learning Systems 35:2 (1651-1665). Online publication date: Feb-2024.
https://doi.org/10.1109/TNNLS.2022.3184723
Yu H, Zhang X, Wang Y, Huang Q, Yin B (2024). Fine-Grained Accident Detection: Database and Algorithm. IEEE Transactions on Image Processing 33 (1059-1069). Online publication date: 2024.
https://doi.org/10.1109/TIP.2024.3355812
Liu Y, Liu J, Yang K, Ju B, Liu S, Wang Y, Yang D, Sun P, Song L (2024). AMP-Net: Appearance-Motion Prototype Network Assisted Automatic Video Anomaly Detection System. IEEE Transactions on Industrial Informatics 20:2 (2843-2855). Online publication date: Feb-2024.
https://doi.org/10.1109/TII.2023.3298476
Luo L, Xie S, Yin H, Peng C, Ong Y (2024). Detecting and Quantifying Crowd-Level Abnormal Behaviors in Crowd Events. IEEE Transactions on Information Forensics and Security 19 (6810-6823). Online publication date: 2024.
https://doi.org/10.1109/TIFS.2024.3423388
Li D, Nie X, Gong R, Lin X, Yu H (2024). Multi-Branch GAN-Based Abnormal Events Detection via Context Learning in Surveillance Videos. IEEE Transactions on Circuits and Systems for Video Technology 34:5 (3439-3450). Online publication date: May-2024.
https://doi.org/10.1109/TCSVT.2023.3325451