DOI: 10.1145/3234804.3234821

Local Feature Analysis for real-time Action Recognition

Published: 27 June 2018

Abstract

Real-time action recognition is necessary for enabling computers to automatically recognize human actions in real-world video. However, current architectures (e.g., hidden two-stream ConvNets) are relatively shallow and rely on a fully connected structure that attends to the full image. Such a network tends to fail when two videos share similar backgrounds. To address this issue, we take a close look at the effectiveness of local features for better action representation. In particular, we visualize the hidden two-stream network as an example and observe that local features indeed reduce the impact of background information. Finally, we present experimental results that demonstrate how local features affect recognition performance. We evaluate our proposed network on the standard video dataset UCF101, where it achieves a recognition accuracy of 91.6%, a 2.6% improvement over state-of-the-art real-time approaches.
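
The locality argument in the abstract can be illustrated with a small NumPy sketch. This is not the authors' network: the 8x8 "frame", the random weights, and the helper names fully_connected and local_features are all assumptions made for illustration. It only shows that a fully connected layer reacts to a change in any single pixel (for example, a background pixel), whereas a feature computed from a small patch changes only where its receptive field covers that pixel.

```python
import numpy as np

rng = np.random.default_rng(0)

def fully_connected(frame, weights):
    """Global feature: every output unit is a weighted sum over all pixels."""
    return weights @ frame.ravel()

def local_features(frame, kernel):
    """Local features: each output depends only on one kernel-sized patch
    (plain 'valid' cross-correlation, written out with explicit loops)."""
    kh, kw = kernel.shape
    h, w = frame.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical toy inputs: one 8x8 grayscale frame and random weights.
frame = rng.random((8, 8))
fc_weights = rng.standard_normal((4, 64))
kernel = rng.standard_normal((3, 3))

# Simulate a background change by perturbing a single corner pixel.
perturbed = frame.copy()
perturbed[0, 0] += 1.0

# The fully connected output shifts because every unit sees the corner pixel.
fc_changed = not np.allclose(
    fully_connected(frame, fc_weights),
    fully_connected(perturbed, fc_weights),
)

# Only local features whose patch contains the corner pixel are affected.
local_before = local_features(frame, kernel)
local_after = local_features(perturbed, kernel)
changed_mask = ~np.isclose(local_before, local_after)

print("fully connected output changed:", fc_changed)
print("local features changed:", int(changed_mask.sum()), "of", changed_mask.size)
```

In this toy setting the perturbation reaches every fully connected unit but only the single local feature whose 3x3 patch contains the corner pixel, which is the intuition behind preferring local features when backgrounds are similar.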

Published In

ICDLT '18: Proceedings of the 2018 2nd International Conference on Deep Learning Technologies
June 2018
112 pages
ISBN: 9781450364737
DOI: 10.1145/3234804
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • Chongqing University of Posts and Telecommunications
  • University of Electronic Science and Technology of China

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Action Recognition
  2. Convolutional neural network
  3. Local Feature
  4. Real-time

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Major Program of Shaanxi Province, China
  • National Natural Science Foundation of China (NSFC)
  • Advance Research Program during the 13th Five-Year Plan Period of China
  • Ministry of Education project

Conference

ICDLT '18
