Abstract
This paper presents a new action recognition approach based on local spatio-temporal features. The main contributions of our approach are twofold. First, a new local spatio-temporal descriptor is proposed to represent the cuboids detected in video sequences. Specifically, the descriptor uses the covariance matrix to capture the self-correlation information of the low-level features within each cuboid. Since covariance matrices do not lie in a Euclidean space, the Log-Euclidean Riemannian metric is used to measure the distance between covariance matrices. Second, the Earth Mover’s Distance (EMD) is used for matching any pair of video sequences. In contrast to the widely used Euclidean distance, EMD is more robust when matching histograms or distributions of different sizes. Experimental results on two datasets demonstrate the effectiveness of the proposed approach.
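The covariance descriptor and the Log-Euclidean distance mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `eps` regularizer, and the random stand-in features are assumptions made for illustration, and the abstract does not specify which low-level features are used. The sketch relies on the standard Log-Euclidean definition, d(A, B) = ||log(A) - log(B)||_F, where log is the matrix logarithm of a symmetric positive-definite matrix.

```python
import numpy as np


def covariance_descriptor(features, eps=1e-6):
    """Covariance matrix of low-level features sampled inside one cuboid.

    features: array of shape (n_samples, d) with d >= 2, one row per pixel.
    eps: small ridge added to the diagonal (an assumption of this sketch)
         so the result is strictly positive definite and its log exists.
    """
    cov = np.cov(features, rowvar=False)
    return cov + eps * np.eye(cov.shape[0])


def spd_log(mat):
    """Matrix logarithm of a symmetric positive-definite matrix.

    Uses the eigendecomposition mat = V diag(w) V^T, so that
    log(mat) = V diag(log w) V^T.
    """
    w, v = np.linalg.eigh(mat)
    return (v * np.log(w)) @ v.T


def log_euclidean_distance(a, b):
    """Frobenius norm of the difference between the two matrix logarithms."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")


# Usage with random stand-in features (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
cuboid_a = rng.normal(size=(500, 5))
cuboid_b = 2.0 * rng.normal(size=(500, 5))
cov_a = covariance_descriptor(cuboid_a)
cov_b = covariance_descriptor(cuboid_b)
print(log_euclidean_distance(cov_a, cov_b))
```

Because the matrix logarithm maps symmetric positive-definite matrices into the vector space of symmetric matrices, ordinary Euclidean tools can be applied to the log-mapped descriptors.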
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yuan, C., Hu, W., Li, X., Maybank, S., Luo, G. (2010). Human Action Recognition under Log-Euclidean Riemannian Metric. In: Zha, H., Taniguchi, Ri., Maybank, S. (eds) Computer Vision – ACCV 2009. ACCV 2009. Lecture Notes in Computer Science, vol 5994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12307-8_32
DOI: https://doi.org/10.1007/978-3-642-12307-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12306-1
Online ISBN: 978-3-642-12307-8
eBook Packages: Computer Science (R0)