Abstract
This paper presents a double-channel 3D convolutional neural network for classifying the exam scenes of invigilation videos. The first channel is based on the C3D convolutional neural network, a state-of-the-art method for video scene classification; its structure is redesigned for classifying the exam-room scenes of invigilation videos. The second channel is based on the two-stream convolutional neural network and takes an optical flow image sequence as its input, using the motion information in the video's optical flow to improve classification performance. The resulting double-channel 3D convolutional neural network uses convolution and pooling kernel sizes designed for this task. Experiments show that the proposed network classifies the exam-room scenes of invigilation videos faster and more accurately than existing methods.
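As a rough illustration of the two ideas in the abstract (kernel sizing and combining two channels), the following minimal Python sketch computes how 3D convolution and pooling kernels change a clip's shape, and fuses the class scores of an RGB channel and an optical-flow channel. The abstract gives neither the layer sizes nor the fusion rule, so everything here is an assumption: the 16 x 112 x 112 clip and the 3x3x3 / 1x2x2 kernels follow the original C3D design, the weighted score averaging is a common late-fusion choice in two-stream networks, and the function names `conv3d_out` and `fuse_scores` are hypothetical.

```python
import math


def conv3d_out(shape, kernel, stride=(1, 1, 1), padding=(0, 0, 0)):
    """Output (frames, height, width) of a 3D convolution or pooling layer."""
    return tuple(
        (size + 2 * pad - k) // st + 1
        for size, k, st, pad in zip(shape, kernel, stride, padding)
    )


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(rgb_logits, flow_logits, rgb_weight=0.5):
    """Assumed late fusion: weighted average of the two channels' softmax scores."""
    p_rgb = softmax(rgb_logits)
    p_flow = softmax(flow_logits)
    return [rgb_weight * a + (1.0 - rgb_weight) * b for a, b in zip(p_rgb, p_flow)]


# Assumed input clip: 16 frames of 112 x 112 pixels (the paper's sizes may differ).
clip = (16, 112, 112)
after_conv = conv3d_out(clip, kernel=(3, 3, 3), padding=(1, 1, 1))
after_pool = conv3d_out(after_conv, kernel=(1, 2, 2), stride=(1, 2, 2))
print(after_conv, after_pool)

# Hypothetical two-class logits from each channel.
print(fuse_scores([2.0, 0.0], [0.0, 1.0]))
```

With these assumed settings, the padded 3x3x3 convolution preserves the clip's shape, and the 1x2x2 pooling halves only the spatial dimensions, so early layers keep the full temporal extent. The same shape arithmetic applies to both channels; only the number of input planes (color versus flow components) would differ.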
This study is funded by the General Program of the National Natural Science Foundation of China (No: 61977029).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Song, W., Yu, X. (2019). Double Channel 3D Convolutional Neural Network for Exam Scene Classification of Invigilation Videos. In: Lee, C., Su, Z., Sugimoto, A. (eds) Image and Video Technology. PSIVT 2019. Lecture Notes in Computer Science, vol 11854. Springer, Cham. https://doi.org/10.1007/978-3-030-34879-3_10
DOI: https://doi.org/10.1007/978-3-030-34879-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34878-6
Online ISBN: 978-3-030-34879-3
eBook Packages: Computer Science, Computer Science (R0)