DOI: 10.1145/2499788.2499795
research-article

Exploring dense trajectory feature and encoding methods for human interaction recognition

Published: 17 August 2013

Abstract

Human activity recognition has recently attracted increasing attention because of its wide range of potential applications. Much progress has been made on recognizing single actions in videos, but far less on collective and interactive activities. Recognizing human interaction is more challenging because multiple actors take part in each activity. In this paper, we use multi-scale dense trajectories and explore four advanced feature encoding methods on a human interaction dataset within a bag-of-features framework. Specifically, dense trajectories are described by trajectory shape, histograms of oriented gradients, histograms of optical flow, and motion boundary histograms, all of which are computed with integral images. Experimental results on the UT-Interaction dataset show that our approach outperforms state-of-the-art methods by 7-14%. We also analyse in detail the finding that, with dense trajectories in videos, vector quantization performs on par with or even better than more sophisticated feature encoding methods.
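
To make the encoding step of a bag-of-features pipeline concrete, here is a minimal Python sketch, not the authors' implementation. It contrasts vector quantization (each descriptor votes for its single nearest codeword) with a soft-assignment variant. Soft assignment is chosen here only for contrast, since the abstract does not name the other encoding methods it compares. The function names, the toy 2-D descriptors, and the `beta` smoothing parameter are all illustrative assumptions; real dense-trajectory descriptors are much higher-dimensional and the codebook would normally be learned, for example with k-means.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(descriptors, codebook):
    """Vector quantization: hard-assign each descriptor to its nearest
    codeword and return an L1-normalized histogram over the codebook."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def soft_encode(descriptors, codebook, beta=1.0):
    """Soft assignment (illustrative assumption): each descriptor spreads
    one unit of weight over all codewords, proportional to
    exp(-beta * squared distance); results are average-pooled."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        dists = [squared_distance(d, c) for c in codebook]
        shift = min(dists)  # subtract the minimum for numerical stability
        weights = [math.exp(-beta * (dist - shift)) for dist in dists]
        norm = sum(weights)
        for k, w in enumerate(weights):
            hist[k] += w / norm
    n = len(descriptors)
    return [h / n for h in hist] if n > 0 else hist
```

Both functions return a fixed-length vector per video regardless of how many trajectories were extracted, which is what lets a standard classifier consume the result. As `beta` grows, the soft-assignment histogram approaches the vector-quantization histogram.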

    Published In

    ACM Other conferences
    ICIMCS '13: Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service
    August 2013
    419 pages
    ISBN:9781450322522
    DOI:10.1145/2499788
    • Conference Chair: Tat-Seng Chua
    • General Chairs: Ke Lu, Tao Mei, Xindong Wu
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    • NSF of China: National Natural Science Foundation of China
    • University of Sciences & Technology, Hefei
    • Beijing ACM SIGMM Chapter

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. bag-of-features
    2. dense trajectory
    3. feature encoding
    4. human activity recognition

    Qualifiers

    • Research-article

    Acceptance Rates

    ICIMCS '13 Paper Acceptance Rate 20 of 94 submissions, 21%;
    Overall Acceptance Rate 163 of 456 submissions, 36%

    Article Metrics

    • Downloads (last 12 months): 6
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 02 Sep 2024

    Cited By

    • (2022) Machine Vision-Based Human Action Recognition Using Spatio-Temporal Motion Features (STMF) with Difference Intensity Distance Group Pattern (DIDGP). Electronics 11:15 (2363). DOI: 10.3390/electronics11152363. Online publication date: 28-Jul-2022
    • (2019) Human Interaction Recognition Based on the Co-occurring Visual Matrix Sequence. Intelligent Robotics and Applications (489-501). DOI: 10.1007/978-3-030-27541-9_40. Online publication date: 6-Aug-2019
    • (2018) Early Recognition of Suspicious Activity for Crime Prevention. Computer Vision (2139-2165). DOI: 10.4018/978-1-5225-5204-8.ch094. Online publication date: 2018
    • (2018) Deep Temporal Feature Encoding for Action Recognition. 2018 24th International Conference on Pattern Recognition (ICPR) (1109-1114). DOI: 10.1109/ICPR.2018.8546263. Online publication date: Aug-2018
    • (2017) A New Framework of Human Interaction Recognition Based on Multiple Stage Probability Fusion. Applied Sciences 7:6 (567). DOI: 10.3390/app7060567. Online publication date: 1-Jun-2017
    • (2017) Human interaction recognition fusing multiple features of depth sequences. IET Computer Vision 11:7 (560-566). DOI: 10.1049/iet-cvi.2017.0025. Online publication date: 2-Aug-2017
    • (2016) Early Recognition of Suspicious Activity for Crime Prevention. Emerging Technologies in Intelligent Applications for Image and Video Processing (205-231). DOI: 10.4018/978-1-4666-9685-3.ch009. Online publication date: 2016
    • (2016) Experiment verified physical-layer collision separation of passive UHF tags. 2016 8th International Conference on Wireless Communications & Signal Processing (WCSP) (1-5). DOI: 10.1109/WCSP.2016.7752717. Online publication date: Oct-2016
    • (2016) Exploring encoding and normalization methods on probabilistic latent semantic analysis model for action recognition. 2016 8th International Conference on Wireless Communications & Signal Processing (WCSP) (1-5). DOI: 10.1109/WCSP.2016.7752504. Online publication date: Oct-2016
    • (2015) Discriminative voting for activity prediction. Proceedings of the 7th International Conference on Internet Multimedia Computing and Service (1-4). DOI: 10.1145/2808492.2808494. Online publication date: 19-Aug-2015
