Abstract
In the field of activity recognition, violence detection is one of the most challenging tasks because of the variety of action patterns and the lack of training data. Over the last decade, performance has improved through the use of local spatio-temporal features. However, the geometric relationships among these features and their transition processes have not been fully exploited. In this paper, we propose a novel framework based on spatio-temporal hypergraph transition. First, we use hypergraphs to represent the geometric relationships among spatio-temporal features in a single frame. Then, we apply a new descriptor, the Histogram of Velocity Change (HVC), which characterizes the intensity of motion change, to model hypergraph transitions between consecutive frames. Finally, we adopt Hidden Markov Models (HMMs) with the hypergraphs and the descriptors to detect and localize violence in video frames. Experimental results on the BEHAVE and UT-Interaction datasets show that the proposed framework outperforms existing methods.
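The HMM-based detection step can be illustrated with a generic sketch. This is not the authors' implementation: it assumes each frame has already been quantized into one of three hypothetical motion-change symbols (0 = low, 1 = medium, 2 = high), and that two small discrete HMMs, one for violent and one for non-violent sequences, are available. All model parameters, the names `VIOLENT_HMM` and `CALM_HMM`, and the function names are invented for illustration. The sketch scores a sequence under each model with the scaled forward algorithm and picks the model with the higher log-likelihood.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like


# Hypothetical two-state models over symbols 0 (low), 1 (medium), 2 (high).
# These numbers are illustrative assumptions, not values from the paper.
VIOLENT_HMM = {
    "start": [0.5, 0.5],
    "trans": [[0.7, 0.3], [0.3, 0.7]],
    "emit": [[0.1, 0.2, 0.7], [0.2, 0.3, 0.5]],
}
CALM_HMM = {
    "start": [0.5, 0.5],
    "trans": [[0.7, 0.3], [0.3, 0.7]],
    "emit": [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
}


def classify(obs):
    """Label a symbol sequence by comparing its likelihood under both models."""
    ll_violent = forward_log_likelihood(obs, **VIOLENT_HMM)
    ll_calm = forward_log_likelihood(obs, **CALM_HMM)
    return "violent" if ll_violent > ll_calm else "non-violent"


if __name__ == "__main__":
    print(classify([2, 2, 1, 2, 2]))
    print(classify([0, 0, 1, 0, 0]))
```

Applying `classify` to a sliding window of frames would give a rough form of temporal localization, although how the paper itself localizes violence is not specified in the abstract.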
Acknowledgments
This work was supported by the National Science Foundation of China (No. U1611461), the National Natural Science Foundation of China (No. 61602014), the Shenzhen Peacock Plan (No. 20130408-183003656), and the Science and Technology Planning Project of Guangdong Province, China (No. 2014B090910001).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Huang, J., Li, G., Li, N., Wang, R., Wang, W. (2017). A Violence Detection Approach Based on Spatio-temporal Hypergraph Transition. In: Felsberg, M., Heyden, A., Krüger, N. (eds) Computer Analysis of Images and Patterns. CAIP 2017. Lecture Notes in Computer Science, vol 10425. Springer, Cham. https://doi.org/10.1007/978-3-319-64698-5_19
DOI: https://doi.org/10.1007/978-3-319-64698-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64697-8
Online ISBN: 978-3-319-64698-5
eBook Packages: Computer Science (R0)