Dual-Channel Improved ShuffleNet (DCISN) for Real-time Violence Detection

Authors:

Deqiang Wang

ICIGP '23: Proceedings of the 2023 6th International Conference on Image and Graphics Processing

Pages 142 - 147

https://doi.org/10.1145/3582649.3582653

Published: 07 April 2023

Abstract

In this paper, we propose a lightweight deep learning network architecture, named dual-channel improved ShuffleNet (DCISN), for real-time violence detection in videos. The proposed network extracts space-time features using two parallel channels, as in SlowFast networks, and adopts newly designed ShuffleNet units to construct lightweight stage modules. Cross-stage connections are introduced to boost the accuracy of the DCISN network. Cascaded depth-wise convolution layers and a Squeeze-and-Excitation (SE) block are employed in the newly designed ShuffleNet units to lower computation cost while maintaining good accuracy. A DCISN model has been designed and evaluated on recognized benchmark datasets, namely Hockey-Fight, Movies-Fight and RWF-2000. The DCISN model has 0.168M parameters and requires only 0.253 GFLOPs of computation. Experimental results suggest that, in comparison with reported schemes, the DCISN model achieves competitive accuracy with much lower computation cost.
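The building blocks named in the abstract (channel shuffle from ShuffleNet, depth-wise convolution, and SE gating) can be illustrated with a minimal NumPy sketch. This is not the DCISN implementation: the (C, T, H, W) feature layout, the 3D kernels, the bias-free parameter counts, and every function name below are assumptions made only for illustration.

```python
import numpy as np

def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle for an array shaped (C, T, H, W) (assumed layout)."""
    c = x.shape[0]
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    # View channels as (groups, C // groups), swap the two axes, then flatten back,
    # so that each output group mixes channels from every input group.
    x = x.reshape(groups, c // groups, *x.shape[1:])
    x = np.swapaxes(x, 0, 1)
    return x.reshape(c, *x.shape[2:])

def squeeze_excite(x, w1, w2):
    """SE gating: rescale each channel by a weight in (0, 1).

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are hypothetical
    stand-ins for learned fully connected layers (biases omitted).
    """
    squeezed = x.mean(axis=(1, 2, 3))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)         # bottleneck + ReLU -> (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # sigmoid -> (C,)
    return x * gate[:, None, None, None]

def conv3d_params(c_in, c_out, k):
    """Weights in a standard k x k x k 3D convolution (no bias)."""
    return c_in * c_out * k ** 3

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k x k convolution followed by a 1 x 1 x 1 point-wise one."""
    return c_in * k ** 3 + c_in * c_out

if __name__ == "__main__":
    x = np.arange(6).reshape(6, 1, 1, 1)
    print(channel_shuffle(x, groups=2)[:, 0, 0, 0])   # channel order after shuffling
    print(conv3d_params(32, 32, 3), depthwise_separable_params(32, 32, 3))
```

Under these assumptions, a 32-to-32-channel layer with 3×3×3 kernels needs 27,648 weights as a standard convolution but 1,888 as a depth-wise separable one, which is the kind of saving that lightweight units of this family rely on.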

References

[1]

Batyrkhan Omarov, Sergazi Narynov, Zhandos Zhumanov, Aidana Gumar, and Mariyam Khassanova. 2022. State-of-the-art violence detection techniques in video surveillance security systems: A systematic review. PeerJ Computer Science 8 (2022). http://dx.doi.org/10.7717/PEERJ-CS.920

[2]

Ivan Laptev and Tony Lindeberg. 2003. Space-time interest points. In Proceedings Ninth IEEE International Conference on Computer Vision. 432–439 vol.1. https://doi.org/10.1109/ICCV.2003.1238378

[3]

Heng Wang and Cordelia Schmid. 2013. Action Recognition with Improved Trajectories. In 2013 IEEE International Conference on Computer Vision. 3551–3558. https://doi.org/10.1109/ICCV.2013.441

[4]

Enrique Bermejo Nievas, Oscar Deniz Suarez, Gloria Bueno García, and Rahul Sukthankar. 2011. Violence Detection in Video Using Computer Vision Techniques. In Computer Analysis of Images and Patterns, Pedro Real, Daniel Diaz-Pernil, Helena Molina-Abril, Ainhoa Berciano, and Walter Kropatsch (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 332–339.

[5]

Long Xu, Chen Gong, Jie Yang, Qiang Wu, and Lixiu Yao. 2014. Violent video detection based on MoSIFT feature and sparse coding. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 3538–3542. https://doi.org/10.1109/ICASSP.2014.6854259

[6]

Tasweer Ahmad, Junaid Rafique, Hassam Muazzam, and Tahir Rizvi. 2015. Using Discrete Cosine Transform Based Features for Human Action Recognition. Journal of Image and Graphics 3, 2 (2015), 96–101.

[7]

Piotr Bilinski and Francois Bremond. 2016. Human violence recognition and detection in surveillance videos. In 2016 13th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). 30–36. https://doi.org/10.1109/AVSS.2016.7738019

[8]

Peipei Zhou, Qinghai Ding, Haibo Luo, and Xinglin Hou. 2018. Violence detection in surveillance video using low-level features. PLoS ONE (2018).

[9]

Naresh Kumar and Nagarajan Sukavanam. 2018. Motion Trajectory for Human Action Recognition Using Fourier Temporal Features of Skeleton Joints. Journal of Image and Graphics 6, 2 (2018), 174–180.

[10]

Javad Mahmoodi and Afsane Salajeghe. 2019. A classification method based on optical flow for violence detection. Expert Systems with Applications 127 (2019), 121–127. https://doi.org/10.1016/j.eswa.2019.02.032

[11]

S. M. Mohtavipour, M. Saeidi, and A. Arabsorkhi. 2022. A multi-stream CNN for deep violence detection in video sequences using handcrafted features. The Visual Computer 38 (2022), 2057–2072. https://doi.org/10.1007/s00371-021-02266-4

[12]

Karen Simonyan and Andrew Zisserman. 2014. Two-Stream Convolutional Networks for Action Recognition in Videos. CoRR abs/1406.2199 (2014). arXiv:1406.2199 http://arxiv.org/abs/1406.2199

[13]

Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, and Trevor Darrell. 2017. Long-Term Recurrent Convolutional Networks for Visual Recognition and Description. IEEE Transactions on Pattern Analysis and Machine Intelligence 39, 4 (2017), 677–691. https://doi.org/10.1109/TPAMI.2016.2599174

[14]

Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. 2015. Learning Spatiotemporal Features with 3D Convolutional Networks. In 2015 IEEE International Conference on Computer Vision (ICCV). 4489–4497. https://doi.org/10.1109/ICCV.2015.510

[15]

João Carreira and Andrew Zisserman. 2017. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. CoRR abs/1705.07750 (2017). arXiv:1705.07750 http://arxiv.org/abs/1705.07750

[16]

Seyma Yucer and Yusuf Sinan Akgul. 2018. 3D Human Action Recognition with Siamese-LSTM Based Deep Metric Learning. Journal of Image and Graphics 6, 1 (2018), 21–26.

[17]

Rohit Halder and Rajdeep Chatterjee. 2020. CNN-BiLSTM Model for Violence Detection in Smart Surveillance. SN computer science 1, 4 (2020).

[18]

Zhihong Dong, Jie Qin, and Yunhong Wang. 2016. Multi-stream Deep Networks for Person to Person Violence Detection in Videos. In Pattern Recognition, Tieniu Tan, Xuelong Li, Xilin Chen, Jie Zhou, Jian Yang, and Hong Cheng (Eds.). Springer Singapore, Singapore, 517–531.

[19]

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770–778. https://doi.org/10.1109/CVPR.2016.90

[20]

Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, and Manohar Paluri. 2017. A Closer Look at Spatiotemporal Convolutions for Action Recognition. CoRR abs/1711.11248 (2017). arXiv:1711.11248 http://arxiv.org/abs/1711.11248

[21]

Ming Cheng, Kunjing Cai, and Ming Li. 2021. RWF-2000: An Open Large Scale Video Database for Violence Detection. In 2020 25th International Conference on Pattern Recognition (ICPR). 4183–4190. https://doi.org/10.1109/ICPR48806.2021.9412502

[22]

Wei Wang, Shuai Dong, Kun Zou, and Wensheng Li. 2022. A Lightweight Network for Violence Detection. In 2022 the 5th International Conference on Image and Graphics Processing (ICIGP) (ICIGP 2022). Association for Computing Machinery, New York, NY, USA, 15–21. https://doi.org/10.1145/3512388.3512391

[23]

Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, and Kaiming He. 2019. SlowFast Networks for Video Recognition. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 6201–6210. https://doi.org/10.1109/ICCV.2019.00630

[24]

Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. 2018. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 6848–6856.

[25]

Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. 2018. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. In Computer Vision – ECCV 2018, Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, and Yair Weiss (Eds.). Springer International Publishing, Cham, 122–138.

[26]

Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu. 2020. Squeeze-and-Excitation Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 42, 8 (2020), 2011–2023. https://doi.org/10.1109/TPAMI.2019.2913372

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision tasks

Published In

ICIGP '23: Proceedings of the 2023 6th International Conference on Image and Graphics Processing

January 2023

246 pages

ISBN: 9781450398572

DOI: 10.1145/3582649

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICIGP 2023: 2023 The 6th International Conference on Image and Graphics Processing

January 6 - 8, 2023

Chongqing, China
