Abstract
We propose a method for recognizing actions at a distance by observing human poses and deriving descriptors that capture their motion patterns. A human pose often carries a strong visual sense (intended meaning) that describes the related action unambiguously. Identifying that intended meaning is challenging, however, because poses vary widely, and this variability leads to visual sense ambiguity. From a large vocabulary of poses (visual words), we prune ambiguous poses and extract key poses (key words) using centrality measures of graph connectivity [1]. Under this framework, finding the key poses for a given sense (i.e., action type) amounts to constructing a graph with poses as vertices and then identifying the most “important” vertices in the graph, following centrality theory. Results on four standard activity recognition datasets show the efficacy of our approach compared with the present state of the art.
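The graph-based selection of key poses can be illustrated with a minimal sketch. The abstract does not state which pose descriptor, similarity function, or centrality measure is used, so everything concrete here is an assumption made for illustration: poses are plain feature vectors, edges connect poses whose cosine similarity exceeds a hypothetical `threshold`, and vertices are ranked by a PageRank-style score computed by power iteration. The function names (`build_pose_graph`, `pagerank`, `key_poses`) are hypothetical and not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_pose_graph(poses, threshold=0.9):
    """Build a symmetric weighted adjacency matrix with poses as vertices.

    Assumption: two poses are linked when their cosine similarity is at
    least `threshold`; the similarity is used as the edge weight.
    """
    n = len(poses)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(poses[i], poses[j])
            if s >= threshold:
                weights[i][j] = weights[j][i] = s
    return weights


def pagerank(weights, damping=0.85, iterations=100):
    """PageRank-style centrality by power iteration on a weighted graph."""
    n = len(weights)
    out_weight = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Rank held by isolated vertices is spread uniformly over all vertices.
        dangling = sum(rank[i] for i in range(n) if out_weight[i] == 0.0)
        new_rank = []
        for j in range(n):
            incoming = sum(
                rank[i] * weights[i][j] / out_weight[i]
                for i in range(n)
                if out_weight[i] > 0.0
            )
            new_rank.append((1.0 - damping) / n + damping * (incoming + dangling / n))
        rank = new_rank
    return rank


def key_poses(poses, top_k=2, threshold=0.9):
    """Return indices of the `top_k` most central poses."""
    scores = pagerank(build_pose_graph(poses, threshold))
    order = sorted(range(len(poses)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


if __name__ == "__main__":
    # Three mutually similar toy "poses" and one outlier.
    toy_poses = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
    print(key_poses(toy_poses, top_k=3))
```

In this toy setting the outlier pose has no edges, receives the lowest score, and is pruned, while the three well-connected poses are kept as key pose candidates. A real pipeline would build one such graph per action type from the learned pose vocabulary.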
References
Navigli, R., Lapata, M.: An experimental study of graph connectivity for unsupervised word sense disambiguation. IEEE Trans. on PAMI 32(4), 678–692 (2010)
Dollar, P., Rabaud, V., Cottrell, G., Belongie, S.: Behavior Recognition via Sparse Spatio-Temporal Features. In: IEEE Int. Workshop on VS-PETS, pp. 65–72 (2005)
Laptev, I., Lindeberg, T.: Space-time Interest Points. In: 9th ICCV, vol. 1, pp. 432–439 (2003)
Efros, A.A., Berg, A.C., Mori, G., Malik, J.: Recognizing Action at a Distance. In: 9th ICCV, vol. 2, pp. 726–733 (2003)
Ikizler, N., Duygulu, P.: Histogram of Oriented Rectangles: A New Pose Descriptor for Human Action Recognition. Image and Vision Computing 27, 1515–1526 (2009)
Wang, Y., Mori, G.: Human Action Recognition by Semi-Latent Topic Models. IEEE Trans. on PAMI 31(10), 1762–1774 (2009)
Niebles, J.C., Wang, H., Li, F.-F.: Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words. IJCV 79(3), 299–318 (2008)
Liu, J., Luo, J., Shah, M.: Recognizing Realistic Actions from Videos “in the Wild”. In: CVPR (2009)
Niebles, J.C., Fei-Fei, L.: A hierarchical model of shape and appearance for human action classification. In: CVPR (2007)
Bissacco, A., Yang, M.H., Soatto, S.: Detecting humans with their pose. In: NIPS (2007)
Fengjun, L., Nevatia, R.: Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching. In: CVPR (2007)
Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge (1994)
Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems 30(1-7), 107–117 (1998)
Kim, G., Faloutsos, C., Hebert, M.: Unsupervised modeling of object categories using link analysis technique. In: CVPR (2008)
Lucas, B.D., Kanade, T.: An Iterative Image Registration Technique with an Application to Stereo Vision. In: 7th IJCAI, pp. 674–679 (1981)
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, Heidelberg (2006)
Pelleg, D., Moore, A.W.: X-means: Extending K-means with efficient Estimation of the Number of Clusters. In: ICML (2000)
Narayan, B.L., Murthy, C.A., Pal, S.K.: Maxdiff kd-trees for Data Condensation. PRL 27(3), 187–200 (2005)
Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms. MIT Press, Cambridge (2003)
Gemert, J.C.V., Veenman, C.J., Smeulders, A.W.M., Geusebroek, J.M.: Visual Word Ambiguity. IEEE Trans. on PAMI 32(7), 1271–1283 (2010)
Chen, C.C., Ryoo, M.S., Aggarwal, J.K.: UT-Tower Dataset: Aerial View Activity Classification Challenge (2010), http://cvrc.ece.utexas.edu/SDHA2010/Aerial_View_Activity.html
Lu, W.L., Okuma, K., Little, J.J.: Tracking and Recognizing Actions of Multiple Hockey Players Using the Boosted Particle Filter. Image and Vision Computing 27(1-2), 189–205 (2009)
Schuldt, C., Laptev, I., Caputo, B.: Recognizing Human Actions: A Local SVM Approach. In: 17th ICPR, pp. 32–36 (2004)
http://www.csie.ntu.edu.tw/~cjlin/libsvm/ (June 2010)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mukherjee, S., Biswas, S.K., Mukherjee, D.P. (2011). Modeling Sense Disambiguation of Human Pose: Recognizing Action at a Distance by Key Poses. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6492. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19315-6_19
DOI: https://doi.org/10.1007/978-3-642-19315-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19314-9
Online ISBN: 978-3-642-19315-6