Research article · DOI: 10.1145/3474880.3474891 · ICEBT Conference Proceedings

Multi-scale Mobile Phone Playing Behavior Recognition Based on Temporal Enhancement and Interaction

Published: 01 December 2021

Abstract

Analyzing students' classroom behavior is an important part of evaluating classroom teaching effectiveness, and the use of mobile phones is a significant indicator of students' learning state; the extent of mobile phone use in class therefore reflects the effect of classroom teaching to a certain degree. This paper establishes a video-based dataset of classroom student behavior and divides the recorded actions into two categories: playing with a mobile phone and other behavior. Analysis of these behaviors shows that they involve subtle movements with a distinct visual tempo, and that the videos suffer from low resolution and heavy occlusion. To address these problems, this paper proposes a multi-scale mobile phone playing behavior recognition method based on temporal information enhancement and interaction. First, a motion enhancement module amplifies the motion information between adjacent frames to improve the recognition of subtle actions; second, a temporal pyramid extracts multi-scale features of the action to capture the visual tempo of the video; finally, a temporal information interaction module strengthens information exchange along the temporal dimension to further model temporal structure. Experimental results on the self-built student action dataset StudentAction show that, compared with existing methods, the proposed algorithm significantly improves recognition accuracy and better addresses the low accuracy of subtle-action recognition. It also performs well on the public HMDB51 and UCF101 datasets, indicating strong generalization ability across different action recognition scenarios.
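The three components named in the abstract (motion enhancement between adjacent frames, a temporal pyramid for multi-scale features, and temporal information interaction) can be illustrated with a short PyTorch-style sketch. The module names, tensor layouts, and design choices below (frame differencing, multi-rate average pooling, a temporal 1-D convolution) are assumptions for illustration only and do not reproduce the authors' actual architecture.

# Hypothetical sketch of the three ideas in the abstract; module names and
# design details are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MotionEnhancement(nn.Module):
    # Amplify inter-frame motion by adding a learned transform of frame differences.
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # x: (batch, time, channels, height, width)
        diff = x[:, 1:] - x[:, :-1]                    # adjacent-frame differences
        diff = F.pad(diff, (0, 0, 0, 0, 0, 0, 0, 1))   # pad the time dimension back to T
        b, t, c, h, w = diff.shape
        enhanced = self.conv(diff.reshape(b * t, c, h, w)).reshape(b, t, c, h, w)
        return x + enhanced                            # residual motion cue


class TemporalPyramid(nn.Module):
    # Pool features at several temporal rates to capture the visual tempo of an action.
    def __init__(self, rates=(1, 2, 4)):
        super().__init__()
        self.rates = rates

    def forward(self, x):
        # x: (batch, time, features)
        levels = []
        for r in self.rates:
            pooled = F.avg_pool1d(x.transpose(1, 2), kernel_size=r, stride=r)
            restored = F.interpolate(pooled, size=x.shape[1], mode="nearest")
            levels.append(restored.transpose(1, 2))
        return torch.stack(levels, dim=0).mean(dim=0)  # fuse pyramid levels


class TemporalInteraction(nn.Module):
    # Exchange information along the temporal dimension with a 1-D convolution.
    def __init__(self, features):
        super().__init__()
        self.conv = nn.Conv1d(features, features, kernel_size=3, padding=1)

    def forward(self, x):
        # x: (batch, time, features)
        return x + self.conv(x.transpose(1, 2)).transpose(1, 2)

In such a sketch, backbone features of shape (batch, frames, channels, height, width) would pass through MotionEnhancement, be pooled spatially to (batch, frames, channels), and then go through TemporalPyramid and TemporalInteraction before a classifier; this wiring is likewise an assumption made only to show how the three ideas could fit together.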



    Published In

    ICEBT '21: Proceedings of the 2021 5th International Conference on E-Education, E-Business and E-Technology
    June 2021
    174 pages
    ISBN:9781450389600
    DOI:10.1145/3474880

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Behavior recognition
    2. Motion enhancement
    3. Play cell phone
    4. Teaching effect evaluation
    5. Temporal interaction
    6. Temporal pyramid


    Conference

    ICEBT 2021
