Multi-feature hierarchical topic models for human behavior recognition

Li, HePing; Zhang, Feng; Zhang, ShuWu

doi:10.1007/s11432-013-4794-9

Research Paper
Published: 07 March 2014

Volume 57, pages 1–15, (2014)
Science China Information Sciences

HePing Li¹,
Feng Zhang¹ &
ShuWu Zhang¹

Abstract

Human behavior recognition is an important task in image processing and surveillance systems. A main challenge is how to model behaviors effectively in unconstrained videos, which exhibit tremendous variation in camera motion, background clutter, object appearance, and so on. In this paper, we propose two novel Multi-Feature Hierarchical Latent Dirichlet Allocation models for human behavior recognition by extending bag-of-words topic models such as the Latent Dirichlet Allocation model and the Multi-Modal Latent Dirichlet Allocation model. The two proposed models have three hierarchies: low-level visual features, feature topics, and behavior topics. They effectively fuse two different types of features (motion and static visual features), avoid detecting or tracking moving objects, and improve recognition performance even when the extracted features contain a great amount of noise. We adopt the variational EM algorithm to learn the parameters of these models. Experiments on the YouTube dataset demonstrate the effectiveness of the proposed models.
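
The three-level hierarchy described in the abstract (behavior topics, feature topics, low-level visual words, with separate motion and static feature types) can be illustrated with a toy generative sampler. This is only a sketch under assumptions: the abstract does not give the exact factorization of the two proposed models, so the structure below (one per-video mixture over behavior topics shared by both feature types, then a feature topic, then a visual word) and all names such as `sample_video`, `feat_given_behavior`, and `word_given_feat` are hypothetical and should not be read as the authors' definition. Learning the parameters with variational EM is not shown.

```python
import random


def dirichlet(rng, alpha):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]


def categorical(rng, probs):
    """Draw an index in range(len(probs)) with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_video(rng, alpha, feat_given_behavior, word_given_feat, n_words):
    """Sample visual words for one video from a toy three-level topic model.

    alpha               : Dirichlet prior over B behavior topics
    feat_given_behavior : feature type -> B rows, each a distribution over K feature topics
    word_given_feat     : feature type -> K rows, each a distribution over V visual words
    n_words             : feature type -> number of visual words to draw
    All of these names are illustrative assumptions, not taken from the paper.
    """
    # Per-video mixture over behavior topics, shared by both feature types.
    pi = dirichlet(rng, alpha)
    words = {}
    for ftype, n in n_words.items():
        drawn = []
        for _ in range(n):
            b = categorical(rng, pi)                            # behavior topic
            k = categorical(rng, feat_given_behavior[ftype][b])  # feature topic
            drawn.append(categorical(rng, word_given_feat[ftype][k]))  # visual word
        words[ftype] = drawn
    return pi, words


rng = random.Random(0)
B, K, V = 3, 4, 10  # behavior topics, feature topics, vocabulary size (toy values)
types = ("motion", "static")
feat_given_behavior = {t: [dirichlet(rng, [1.0] * K) for _ in range(B)] for t in types}
word_given_feat = {t: [dirichlet(rng, [1.0] * V) for _ in range(K)] for t in types}
pi, words = sample_video(
    rng, [1.0] * B, feat_given_behavior, word_given_feat,
    {"motion": 20, "static": 15},
)
```

In this sketch the two feature types are fused only through the shared behavior-topic mixture `pi`; each type keeps its own feature topics and vocabulary, so no object detection or tracking step is needed to produce the bag of visual words.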

Author information

Authors and Affiliations

Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
HePing Li, Feng Zhang & ShuWu Zhang

Corresponding author

Correspondence to HePing Li.

About this article

Cite this article

Li, H., Zhang, F. & Zhang, S. Multi-feature hierarchical topic models for human behavior recognition. Sci. China Inf. Sci. 57, 1–15 (2014). https://doi.org/10.1007/s11432-013-4794-9

Received: 28 August 2013
Accepted: 15 October 2013
Published: 07 March 2014
Issue Date: September 2014
DOI: https://doi.org/10.1007/s11432-013-4794-9
