research-article

3D skeleton-based human action classification

Published: 01 May 2016

Abstract

In recent years, there has been a proliferation of works on human action classification from depth sequences. These works generally present methods and/or feature representations for the classification of actions from sequences of 3D locations of human body joints and/or other sources of data, such as depth maps and RGB videos. This survey highlights the motivations and challenges of this very recent research area by presenting technologies and approaches for 3D skeleton-based action classification. The work focuses on aspects such as data pre-processing, publicly available benchmarks and commonly used accuracy measurements. Furthermore, this survey introduces a categorization of the most recent works in 3D skeleton-based action classification according to the adopted feature representation. This paper aims at being a starting point for practitioners who wish to approach the study of 3D action classification and to gather insights on the main challenges to solve in this emerging field.

Highlights

  • State-of-the-art 3D skeleton-based action classification methods are reviewed.
  • Methods are categorized based on the adopted feature representation.
  • Motivations and challenges for skeleton-based action recognition are highlighted.
  • Data pre-processing, public benchmarks and validation protocols are discussed.
  • Comparison of renowned methods, open problems and future work are presented.
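The skeleton data and pre-processing the abstract refers to can be illustrated with a minimal sketch. This is not the paper's specific method: it assumes a hypothetical `(T, J, 3)` array of joint coordinates (T frames, J joints) and applies two normalizations that are common conventions in this literature — translating each frame so a hip reference joint sits at the origin, and scaling by the average hip-to-neck distance to remove body-size variation. The joint indices are dataset-dependent assumptions.

```python
import numpy as np

def normalize_skeleton(seq, hip=0, neck=1):
    """Make a skeleton sequence invariant to global position and body size.

    seq: array of shape (T, J, 3) -- T frames, J joints, 3D coordinates.
    hip, neck: indices of the reference joints (dataset-dependent assumption).
    """
    seq = np.asarray(seq, dtype=float)
    # Translate every frame so the hip joint sits at the origin.
    centered = seq - seq[:, hip:hip + 1, :]
    # Scale by the mean hip-neck distance to remove body-size variation.
    torso = np.linalg.norm(centered[:, neck, :], axis=-1).mean()
    return centered / max(torso, 1e-8)

# Toy example: two frames of a 3-joint skeleton (hip, neck, foot)
# of a person translating along the x axis.
toy = np.array([[[1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.5, 0.0, 1.0]],
                [[2.0, 1.0, 1.0], [2.0, 2.0, 1.0], [2.5, 0.0, 1.0]]])
out = normalize_skeleton(toy)
# After normalization the hip is at the origin in every frame and the
# global translation between the two frames has been removed.
```

Classifiers built on top of such normalized sequences then operate on relative joint positions rather than absolute sensor coordinates, which is one reason skeleton-based representations transfer across viewpoints and subjects.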

References

[1]
S. Kwak, B. Han, J. Han, Scenario-based video event recognition by constraint flow, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Colorado Springs, 2011, pp. 3345-3352, http://dx.doi.org/10.1109/CVPR.2011.5995435.
[2]
U. Gaur, Y. Zhu, B. Song, A. Roy-Chowdhury, A string of feature graphs model for recognition of complex activities in natural videos, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Barcelona, Spain, 2011, pp. 2595-2602, http://dx.doi.org/10.1109/ICCV.2011.6126548.
[3]
S. Park, J. Aggarwal, Recognition of two-person interactions using a hierarchical Bayesian network, in: First ACM SIGMM International Workshop on Video surveillance, ACM, Berkeley, California, 2003, pp. 65-76, http://dx.doi.org/10.1145/982452.982461.
[4]
I. Junejo, E. Dexter, I. Laptev, P. Pérez, View-independent action recognition from temporal self-similarities, IEEE Trans. Pattern Anal. Mach. Intell., 33 (2011) 172-185.
[5]
Z. Duric, W. Gray, R. Heishman, F. Li, A. Rosenfeld, M. Schoelles, C. Schunn, H. Wechsler, Integrating perceptual and cognitive modeling for adaptive and intelligent human-computer interaction, Proc. IEEE, 90 (2002) 1272-1289.
[6]
Y.-J. Chang, S.-F. Chen, J.-D. Huang, A Kinect-based system for physical rehabilitation, Res. Dev. Disabil., 32 (2011) 2566-2570.
[7]
A. Thangali, J.P. Nash, S. Sclaroff, C. Neidle, Exploiting phonological constraints for handshape inference in ASL video, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Colorado Springs, 2011, pp. 521-528, http://dx.doi.org/10.1109/CVPR.2011.5995718.
[8]
A. Thangali Varadaraju, Exploiting phonological constraints for handshape recognition in sign language video (Ph.D. thesis), Boston University, MA, USA, 2013.
[9]
H. Cooper, R. Bowden, Large lexicon detection of sign language, in: Proceedings of International Workshop on Human-Computer Interaction (HCI), Springer, Berlin, Heidelberg, Beijing, P.R. China, 2007, pp. 88-97.
[10]
J.M. Rehg, G.D. Abowd, A. Rozga, M. Romero, M.A. Clements, S. Sclaroff, I. Essa, O.Y. Ousley, Y. Li, C. Kim, et al., Decoding children's social behavior, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Portland, Oregon, 2013, pp. 3414-3421, http://dx.doi.org/10.1109/CVPR.2013.438.
[11]
L. Lo Presti, S. Sclaroff, A. Rozga, Joint alignment and modeling of correlated behavior streams, in: Proceedings of International Conference on Computer Vision-Workshops (ICCVW), Sydney, Australia, 2013, pp. 730-737, http://dx.doi.org/10.1109/ICCVW.2013.100.
[12]
H. Moon, R. Sharma, N. Jung, Method and system for measuring shopper response to products based on behavior and facial expression, US Patent 8,219,438, July 10, 2012 {http://www.google.com/patents/US8219438}.
[13]
T.B. Moeslund, E. Granum, A survey of computer vision-based human motion capture, Comput. Vis. Image Underst., 81 (2001) 231-268.
[14]
S. Mitra, T. Acharya, Gesture recognition, a survey, IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev., 37 (2007) 311-324.
[15]
R. Poppe, A survey on vision-based human action recognition, Image Vis. Comput., 28 (2010) 976-990.
[16]
D. Weinland, R. Ronfard, E. Boyer, A survey of vision-based methods for action representation, segmentation and recognition, Comput. Vis. Image Underst., 115 (2011) 224-241.
[17]
M. Ziaeefar, R. Bergevin, Semantic human activity recognition, Pattern Recognit., 8 (2015) 2329-2345.
[18]
G. Guo, A. Lai, A survey on still image based human action recognition, Pattern Recognit., 47 (2014) 3343-3361.
[19]
C.H. Lim, E. Vats, C.S. Chan, Fuzzy human motion analysis, Pattern Recognit., 48 (2015) 1773-1796.
[20]
M. Andriluka, S. Roth, B. Schiele, Pictorial structures revisited: people detection and articulated pose estimation, in: Proceedings of Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, Miami Beach, Florida, 2009, pp. 1014-1021, http://dx.doi.org/10.1109/CVPRW.2009.5206754.
[21]
Y. Yang, D. Ramanan, Articulated pose estimation with flexible mixtures-of-parts, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Colorado Springs, 2011, pp. 1385-1392, http://dx.doi.org/10.1109/CVPR.2011.5995741.
[22]
D. Ramanan, D.A. Forsyth, A. Zisserman, Strike a pose: tracking people by finding stylized poses, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, IEEE, San Diego, CA, USA, 2005, pp. 271-278, http://dx.doi.org/10.1109/CVPR.2005.335.
[23]
L. Bourdev, J. Malik, Poselets: body part detectors trained using 3D human pose annotations, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Kyoto, Japan, 2009, pp. 1365-1372, http://dx.doi.org/10.1109/ICCV.2009.5459303.
[24]
D. Tran, D. Forsyth, Improved human parsing with a full relational model, in: Proceedings of European Conference on Computer Vision (ECCV), Springer, Crete, Greece, 2010, pp. 227-240.
[25]
N. Ikizler, D. Forsyth, Searching video for complex activities with finite state models, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Minneapolis, Minnesota, 2007, pp. 1-8, http://dx.doi.org/10.1109/CVPR.2007.383168.
[26]
F. Lv, R. Nevatia, Single view human action recognition using key pose matching and Viterbi path searching, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Minneapolis, Minnesota, 2007, pp. 1-8.
[27]
N. Ikizler, P. Duygulu, Human action recognition using distribution of oriented rectangular patches, in: Proceedings of Workshop on Human Motion Understanding, Modeling, Capture and Animation, Springer, Rio de Janeiro, Brazil, 2007, pp. 271-284.
[28]
M. Brand, N. Oliver, A. Pentland, Coupled hidden Markov models for complex action recognition, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, San Juan, Puerto Rico, 1997, pp. 994-999.
[29]
H. Wang, A. Kläser, C. Schmid, C.-L. Liu, Dense trajectories and motion boundary descriptors for action recognition, Int. J. Comput. Vis., 103 (2013) 60-79.
[30]
J.C. Niebles, H. Wang, L. Fei-Fei, Unsupervised learning of human action categories using spatial-temporal words, Int. J. Comput. Vis., 79 (2008) 299-318.
[31]
G. Johansson, Visual perception of biological motion and a model for its analysis, Percept. Psychophys., 14 (1973) 201-211.
[32]
S. Sadanand, J.J. Corso, Action bank: a high-level representation of activity in video, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Providence, Rhode Island, 2012, pp. 1234-1241, http://dx.doi.org/10.1109/CVPR.2012.6247806.
[33]
A. Ciptadi, M.S. Goodwin, J.M. Rehg, Movement pattern histogram for action recognition and retrieval, in: Proceedings of European Conference on Computer Vision (ECCV), Springer, Zurich, 2014, pp. 695-710, http://dx.doi.org/10.1007/978-3-319-10605-2_45.
[34]
R. Vemulapalli, F. Arrate, R. Chellappa, Human action recognition by representing 3D skeletons as points in a Lie Group, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Columbus, Ohio. 2014, pp. 588-595, http://dx.doi.org/10.1109/CVPR.2014.82.
[35]
L. Sigal, Human pose estimation, Comput. Vis.: A Ref. Guide (2014) 362-370.
[36]
K. Mikolajczyk, B. Leibe, B. Schiele, Multiple object class detection with a generative model, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, IEEE, New York, 2006, pp. 26-36.
[37]
P. Viola, M.J. Jones, D. Snow, Detecting pedestrians using patterns of motion and appearance, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Nice, France, 2003, pp. 734-741.
[38]
P.F. Felzenszwalb, D.P. Huttenlocher, Pictorial structures for object recognition, Int. J. Comput. Vis., 61 (2005) 55-79.
[39]
V. Ferrari, M. Marin-Jimenez, A. Zisserman, Progressive search space reduction for human pose estimation, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Anchorage, Alaska, 2008, pp. 1-8, http://dx.doi.org/10.1109/CVPR.2008.4587468.
[40]
D. Ramanan, Learning to parse images of articulated objects, in: Advances in Neural Information Processing Systems 134 (2006).
[41]
A. Klaser, M. Marszałek, C. Schmid, A spatio-temporal descriptor based on 3d-gradients, in: Proceedings of British Machine Vision Conference (BMVC), BMVA Press, Leeds, UK. 2008, p. 275:1.
[42]
L. Wang, Y. Wang, T. Jiang, D. Zhao, W. Gao, Learning discriminative features for fast frame-based action recognition, Pattern Recognit., 46 (2013) 1832-1840.
[43]
A. Gilbert, J. Illingworth, R. Bowden, Fast realistic multi-action recognition using mined dense spatio-temporal features, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Kyoto, Japan, 2009, pp. 925-931, http://dx.doi.org/10.1109/ICCV.2009.5459335.
[44]
J. Liu, J. Luo, M. Shah, Recognizing realistic actions from videos in the wild, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Miami Beach, Florida, 2009, pp. 1996-2003.
[45]
K. Soomro, A.R. Zamir, M. Shah, Ucf101: a dataset of 101 human actions classes from videos in the wild, arXiv preprint arXiv:1212.0402.
[46]
K.K. Reddy, M. Shah, Recognizing 50 human action categories of web videos, Mach. Vis. Appl., 24 (2013) 971-981.
[47]
J. Cho, M. Lee, H.J. Chang, S. Oh, Robust action recognition using local motion and group sparsity, Pattern Recognit., 47 (2014) 1813-1825.
[48]
L. Liu, L. Shao, F. Zheng, X. Li, Realistic action recognition via sparsely-constructed gaussian processes, Pattern Recognit., 47 (2014) 3819-3827.
[49]
M. Hoai, Z.-Z. Lan, F. De la Torre, Joint segmentation and classification of human actions in video, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Colorado Springs, 2011, pp. 3265-3272, http://dx.doi.org/10.1109/CVPR.2011.5995470.
[50]
C.-Y. Chen, K. Grauman, Efficient activity detection with max-subgraph search, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Providence, Rhode Island, 2012, pp. 1274-1281, http://dx.doi.org/10.1109/CVPR.2012.6247811.
[51]
A. Gaidon, Z. Harchaoui, C. Schmid, Temporal localization of actions with actoms, IEEE Trans. Pattern Anal. Mach. Intell., 35 (2013) 2782-2795.
[52]
D. Gong, G. Medioni, X. Zhao, Structured time series analysis for human action segmentation and recognition, IEEE Trans. Pattern Anal. Mach. Intell., 36 (2014) 1414-1427.
[53]
K.N. Tran, I.A. Kakadiaris, S.K. Shah, Part-based motion descriptor image for human action recognition, Pattern Recognit., 45 (2012) 2562-2572.
[54]
W. Li, Z. Zhang, Z. Liu, Action recognition based on a bag of 3D points, in: Proceedings of Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, San Francisco, CA, USA, 2010, pp. 9-14, http://dx.doi.org/10.1109/CVPRW.2010.5543273.
[55]
S.Z. Masood, C. Ellis, M.F. Tappen, J.J. LaViola, R. Sukthankar, Exploring the trade-off between accuracy and observational latency in action recognition, Int. J. Comput. Vis., 101 (2013) 420-436.
[56]
J. Shotton, T. Sharp, A. Kipman, A. Fitzgibbon, M. Finocchio, A. Blake, M. Cook, R. Moore, Real-time human pose recognition in parts from single depth images, Commun. ACM, 56 (2013) 116-124.
[57]
S. Litvak, Learning-based pose estimation from depth maps, US Patent 8,582,867, November 12, 2013.
[58]
L. Xia, C.-C. Chen, J. Aggarwal, View invariant human action recognition using histograms of 3D joints, in: Proceedings of Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, Providence, Rhode Island, 2012, pp. 20-27, http://dx.doi.org/10.1109/CVPRW.2012.6239233.
[59]
X. Yang, Y. Tian, Eigenjoints-based action recognition using Naive-Bayes-Nearest-Neighbor, in: Proceedings of Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, Providence, Rhode Island, 2012, pp. 14-19, http://dx.doi.org/10.1109/CVPRW.2012.6239232.
[60]
O. Oreifej, Z. Liu, W. Redmond, HON4D: histogram of oriented 4D normals for activity recognition from depth sequences, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, 2013, pp. 716-723, http://dx.doi.org/10.1109/CVPR.2013.98.
[61]
A. Yao, J. Gall, G. Fanelli, L.J. Van Gool, Does human action recognition benefit from pose estimation? in: Proceedings of the British Machine Vision Conference (BMVC), vol. 3, BMVA Press, Dundee, UK, 2011, pp. 67.1-67.11, http://dx.doi.org/10.5244/C.25.67.
[62]
L. Lo Presti, M. La Cascia, S. Sclaroff, O. Camps, Gesture modeling by Hanklet-based hidden Markov model, in: D. Cremers, I. Reid, H. Saito, M.-H. Yang (Eds.), Proceedings of Asian Conference on Computer Vision (ACCV 2014), Lecture Notes in Computer Science, Springer International Publishing, Singapore, 2015, pp. 529-546, http://dx.doi.org/10.1007/978-3-319-16811-1_35.
[63]
C. Wang, Y. Wang, A.L. Yuille, An approach to pose-based action recognition, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Portland, Oregon, 2013, pp. 915-922, http://dx.doi.org/10.1109/CVPR.2013.123.
[64]
F. Ofli, R. Chaudhry, G. Kurillo, R. Vidal, R. Bajcsy, Sequence of the most informative joints (SMIJ), J. Vis. Commun. Image Represent., 25 (2014) 24-38.
[65]
R. Slama, H. Wannous, M. Daoudi, A. Srivastava, Accurate 3D action recognition using learning on the Grassmann manifold, Pattern Recognit., 48 (2015) 556-567.
[66]
L. Chen, H. Wei, J. Ferryman, A survey of human motion analysis using depth imagery, Pattern Recognit. Lett., 34 (2013) 1995-2006.
[67]
J. Aggarwal, L. Xia, Human activity recognition from 3D data, Pattern Recognit. Lett., 48 (2014) 70-80.
[68]
D. Murray, J.J. Little, Using real-time stereo vision for mobile robot navigation, Auton. Robots, 8 (2000) 161-171.
[69]
I. Infantino, A. Chella, H. Dindo, I. Macaluso, Visual control of a robotic hand, in: Proceedings of International Conference on Intelligent Robots and Systems (IROS), vol. 2, IEEE, Las Vegas, CA, USA, 2003, pp. 1266-1271, http://dx.doi.org/10.1109/IROS.2003.1248819.
[70]
A. Chella, H. Dindo, I. Infantino, I. Macaluso, A posture sequence learning system for an anthropomorphic robotic hand, Robot. Auton. Syst., 47 (2004) 143-152.
[71]
P. Henry, M. Krainin, E. Herbst, X. Ren, D. Fox, RGB-D mapping: using depth cameras for dense 3D modeling of indoor environments, in: Experimental Robotics, Springer Tracts in Advanced Robotics, vol. 79, Citeseer, Springer, Berlin, Heidelberg, 2014, pp. 477-491, http://dx.doi.org/10.1007/978-3-642-28572-1_33.
[72]
J.C. Carr, R.K. Beatson, J.B. Cherrie, T.J. Mitchell, W.R. Fright, B.C. McCallum, T.R. Evans, Reconstruction and representation of 3D objects with radial basis functions, in: Proceedings of Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), ACM, Los Angeles, CA, USA, 2001, pp. 67-76, http://dx.doi.org/10.1145/383259.383266.
[73]
V. Kolmogorov, R. Zabih, Multi-camera scene reconstruction via graph cuts, in: Proceedings of European Conference on Computer Vision (ECCV), Springer, Copenhagen, Denmark, 2002, pp. 82-96.
[74]
Microsoft kinect sensor {http://www.microsoft.com/en-us/kinectforwindows/}.
[75]
E. Trucco, A. Verri, Introductory Techniques for 3-D Computer Vision, vol. 201, Prentice Hall, Englewood Cliffs, 1998.
[76]
D. Scharstein, R. Szeliski, A taxonomy and evaluation of dense two-frame stereo correspondence algorithms, Int. J. Comput. Vis., 74 (2002) 7-42.
[77]
P. Fua, A parallel stereo algorithm that produces dense depth maps and preserves image features, Mach. Vis. Appl., 6 (1993) 35-49.
[78]
S. Foix, G. Alenya, C. Torras, Lock-in time-of-flight (tof) cameras: a survey, IEEE Sens. J., 11 (2011) 1917-1926.
[79]
D. Scharstein, R. Szeliski, High-accuracy stereo depth maps using structured light, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, IEEE, Madison, Wisconsin, 2003, p. I-195.
[80]
P. Felzenszwalb, D. McAllester, D. Ramanan, A discriminatively trained, multiscale, deformable part model, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Anchorage, Alaska, 2008, pp. 1-8, http://dx.doi.org/10.1109/CVPR.2008.4587597.
[81]
J. Shen, W. Yang, Q. Liao, Part template, Pattern Recognit., 46 (2013) 1920-1932.
[82]
M. Ye, X. Wang, R. Yang, L. Ren, M. Pollefeys, Accurate 3d pose estimation from a single depth image, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Barcelona, Spain, 2011, pp. 731-738.
[83]
M.A. Fischler, R.A. Elschlager, The representation and matching of pictorial structures, IEEE Trans. Comput., 22 (1973) 67-92.
[84]
M. W. Lee, I. Cohen, Proposal maps driven MCMC for estimating human body pose in static images, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, IEEE, Washington, DC, 2004, p. II-334.
[85]
G. Mori, X. Ren, A.A. Efros, J. Malik, Recovering human body configurations: combining segmentation and recognition, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, IEEE, Washington, DC, 2004, p. II-326.
[86]
X. Ren, A. C. Berg, J. Malik, Recovering human body configurations using pairwise constraints between parts, in: Proceedings of International Conference on Computer Vision (ICCV), vol. 1, IEEE, Beijing, P.R. China, 2005, pp. 824-831.
[87]
T.-P. Tian, S. Sclaroff, Fast globally optimal 2d human detection with loopy graph models, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, San Francisco, CA, USA, 2010, pp. 81-88.
[88]
B. Sapp, A. Toshev, B. Taskar, Cascaded models for articulated pose estimation, in: Proceedings of European Conference on Computer Vision (ECCV), Springer, Crete, Greece, 2010, pp. 406-420.
[89]
Y. Wang, D. Tran, Z. Liao, Learning hierarchical poselets for human parsing, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Colorado Springs, 2011, pp. 1705-1712.
[90]
M.P. Kumar, A. Zisserman, P.H. Torr, Efficient discriminative learning of parts-based models, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Kyoto, Japan, 2009, pp. 552-559.
[91]
S.S. SDK, Openni 2, openNI 2 SDK Binaries {http://structure.io/openni}, 2014.
[92]
M. Gleicher, Retargetting motion to new characters, in: Proceedings of Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), ACM, Orlando, Florida, USA, 1998, pp. 33-42, http://dx.doi.org/10.1145/280814.280820.
[93]
C. Hecker, B. Raabe, R.W. Enslow, J. DeWeese, J. Maynard, K. van Prooijen, Real-time motion retargeting to highly varied user-created morphologies, ACM Trans. Graph., 27 (2008) 27.
[94]
M. Gleicher, Comparing constraint-based motion editing methods, Graph. Models, 63 (2001) 107-134.
[95]
R. Kulpa, F. Multon, B. Arnaldi, Morphology-independent representation of motions for interactive human-like animation, Comput. Graph. Forum, 24 (2005) 343-351.
[96]
P. Baerlocher, R. Boulic, An inverse kinematics architecture enforcing an arbitrary number of strict priority levels, Vis. Comput., 20 (2004) 402-417.
[97]
P. Wei, N. Zheng, Y. Zhao, S.-C. Zhu, Concurrent action detection with structural prediction, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Sydney, Australia, 2013, pp. 3136-3143.
[98]
D. Wu, L. Shao, Leveraging hierarchical parametric networks for skeletal joints based action segmentation and recognition, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Columbus, Ohio. 2014, pp. 724-731.
[99]
R. Chaudhry, F. Ofli, G. Kurillo, R. Bajcsy, R. Vidal, Bio-inspired dynamic 3D discriminative skeletal features for human action recognition, in: Proceedings of Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), IEEE, Portland, Oregon, 2013, pp. 471-478, http://dx.doi.org/10.1109/CVPRW.2013.153.
[100]
M.E. Hussein, M. Torki, M.A. Gowayyed, M. El-Saban, Human action recognition using a temporal hierarchy of covariance descriptors on 3D joint locations, in: Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), AAAI Press, Beijing, P.R. China, 2013, pp. 2466-2472.
[101]
M. Zanfir, M. Leordeanu, C. Sminchisescu, The moving pose: an efficient 3d kinematics descriptor for low-latency action recognition and detection, in: Proceedings of International Conference on Computer Vision (ICCV), IEEE, Sydney, Australia, 2013, pp. 2752-2759.
[102]
T. Kerola, N. Inoue, K. Shinoda, Spectral graph skeletons for 3D action recognition, in: Proceedings of Asian Conference on Computer Vision (ACCV), Springer, Singapore, 2014, pp. 1-16.
[103]
A. Eweiwi, M.S. Cheema, C. Bauckhage, J. Gall, Efficient pose-based action recognition, in: Proceedings of Asian Conference on Computer Vision (ACCV), Springer, Singapore, 2014, pp. 1-16.
[104]
A.A. Chaaraoui, J.R. Padilla-López, F. Flórez-Revuelta, Fusion of skeletal and silhouette-based features for human action recognition with RGB-D devices, in: Proceedings of International Conference on Computer Vision Workshops (ICCVW), IEEE, Sydney, Australia, 2013, pp. 91-97, http://dx.doi.org/10.1109/ICCVW.2013.19.
[105]
M. Devanne, H. Wannous, S. Berretti, P. Pala, M. Daoudi, A. Del Bimbo, Space-time pose representation for 3D human action recognition, in: Proceedings of the International Conference on Image Analysis and Processing (ICIAP), Springer, Naples, Italy, 2013, pp. 456-464, http://dx.doi.org/10.1007/978-3-642-41190-849.
[106]
D.K. Hammond, P. Vandergheynst, R. Gribonval, Wavelets on graphs via spectral graph theory, Appl. Comput. Harmon. Anal., 30 (2011) 129-150.
[107]
E.P. Ijjina, C.K. Mohan, Human action recognition based on MOCAP information using convolution neural networks, in: Proceedings of International Conference on Machine Learning and Applications (ICMLA), IEEE, Detroit Michigan, 2014, pp. 159-164, http://dx.doi.org/10.1109/ICMLA.2014.30.
[108]
M. Müller, T. Röder, M. Clausen, Efficient content-based retrieval of motion capture data, ACM Trans. Graph., 24 (2005) 677-685.
[109]
G. Evangelidis, G. Singh, R. Horaud, et al., Skeletal quads: human action recognition using joint quadruples, in: Proceedings of International Conference on Pattern Recognition (ICPR), IEEE, Stockholm, Sweden, 2014, pp. 4513-4518, http://dx.doi.org/10.1109/ICPR.2014.772.
[110]
T. Jaakkola, D. Haussler, et al., Exploiting generative models in discriminative classifiers, in: Advances in Neural Information Processing Systems, 1999, pp. 487-493.
[111]
J.E. Humphreys, Introduction to Lie Algebras and Representation Theory, vol. 9, Springer Science & Business Media, New York, 1972.
[112]
J. Wang, Z. Liu, Y. Wu, J. Yuan, Mining actionlet ensemble for action recognition with depth cameras, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Providence, Rhode Island, 2012, pp. 1290-1297, http://dx.doi.org/10.1109/CVPR.2012.6247813.
[113]
Z. Shao, Y. Li, Integral invariants for space motion trajectory matching and recognition, Pattern Recognit., 48 (2015) 2418-2432.
[114]
M. Devanne, H. Wannous, S. Berretti, P. Pala, M. Daoudi, A. Del Bimbo, 3-D human action recognition by shape analysis of motion trajectories on Riemannian manifold, IEEE Trans. Cybern., 45 (2015) 1340-1352.
[115]
M. Barnachon, S. Bouakaz, B. Boufama, E. Guillou, Ongoing human action recognition with motion capture, Pattern Recognit., 47 (2014) 238-247.
[116]
I. Lillo, A. Soto, J.C. Niebles, Discriminative hierarchical modeling of spatio-temporally composable human activities, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Columbus, Ohio. 2014, pp. 812-819.
[117]
L. Miranda, T. Vieira, D. Martínez, T. Lewiner, A.W. Vieira, M.F. Campos, Online gesture recognition from pose kernel learning and decision forests, Pattern Recognit. Lett., 39 (2014) 65-73.
[118]
M. Raptis, D. Kirovski, H. Hoppe, Real-time classification of dance gestures from skeleton animation, in: Proceedings of the 2011 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, ACM, Hong Kong, 2011, pp. 147-156.
[119]
M. Barker, W. Rayens, Partial least squares for discrimination, J. Chemom., 17 (2003) 166-173.
[120]
R. Rosipal, L.J. Trejo, Kernel partial least squares regression in reproducing kernel Hilbert space, J. Mach. Learn. Res., 2 (2002) 97-123.
[121]
P. Climent-Pérez, A.A. Chaaraoui, J.R. Padilla-López, F. Flórez-Revuelta, Optimal joint selection for skeletal data from rgb-d devices using a genetic algorithm, in: Advances in Computational Intelligence, Springer, Tenerife - Puerto de la Cruz, Spain, 2013, pp. 163-174, http://dx.doi.org/10.1007/978-3-642-37798-3_15.
[122]
G. Dong, J. Li, Efficient mining of emerging patterns: discovering trends and differences, in: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, San Diego, CA, USA, 1999, pp. 43-52.
[123]
F.R. Bach, G.R. Lanckriet, M.I. Jordan, Multiple kernel learning, conic duality, and the SMO algorithm, in: Proceedings of International Conference on Machine Learning (ICML), ACM, Alberta, Canada, 2004, p. 6.
[124]
L. Seidenari, V. Varano, S. Berretti, A. Del Bimbo, P. Pala, Recognizing actions from depth cameras as weakly aligned multi-part bag-of-poses, in: Proceedings of Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, Portland, Oregon, 2013, pp. 479-485.
[125]
L. Lo Presti, M. La Cascia, S. Sclaroff, O. Camps, Hankelet-based dynamical systems modeling for 3D action recognition, in: Image and Vision Computing, Elsevier, 44 (2015), 29-43, http://dx.doi.org/10.1016/j.imavis.2015.09.007 {http://www.sciencedirect.com/science/article/pii/S02628%85615001134}.
[126]
B. Li, O.I. Camps, M. Sznaier, Cross-view activity recognition using Hankelets, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Providence, Rhode Island, 2012, pp. 1362-1369, http://dx.doi.org/10.1109/CVPR.2012.6247822.
[127]
B. Li, M. Ayazoglu, T. Mao, O.I. Camps, M. Sznaier, Activity recognition using dynamic subspace angles, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Colorado Springs, 2011, pp. 3193-3200, http://dx.doi.org/10.1109/CVPR.2011.5995672.
[128]
A.M. Lehrmann, P.V. Gehler, S. Nowozin, Efficient nonlinear Markov models for human motion, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Columbus, Ohio. 2014, pp. 1314-1321.
[129]
C. Meek, D.M. Chickering, D. Heckerman, Autoregressive tree models for time-series analysis, in: Proceedings of the Second International SIAM Conference on Data Mining, SIAM, Toronto, Canada, 2002, pp. 229-244.
[130]
N. Raman, S.J. Maybank, Action classification using a discriminative multilevel HDP-HMM, Neurocomputing 154 (2015): 149-161
[131]
J. Sung, C. Ponce, B. Selman, A. Saxena, Unstructured human activity detection from RGBD images, in: Proceedings of International Conference on Robotics and Automation (ICRA), IEEE, St. Paul, Minnesota, 2012, pp. 842-849, http://dx.doi.org/10.1109/ICRA.2012.6224591.
[132]
J. Wang, Z. Liu, J. Chorowski, Z. Chen, Y. Wu, Robust 3D action recognition with Random Occupancy Patterns, in: Proceedings of European Conference on Computer Vision (ECCV), Springer, Florence, Italy, 2012, pp. 872-885, http://dx.doi.org/10.1007/978-3-642-33709-362.
[133]
A.W. Vieira, E.R. Nascimento, G.L. Oliveira, Z. Liu, M.F. Campos, STOP: space-time occupancy patterns for 3D action recognition from depth map sequences, Prog. Pattern Recognit. Image Anal. Comput. Vis. Appl. (2012) 252-259, http://dx.doi.org/10.1007/978-3-642-33275-331.
[134]
H. Rahmani, A. Mahmood, D.Q. Huynh, A. Mian, Hopc: histogram of oriented principal components of 3d pointclouds for action recognition, in: Proceedings of European Conference on Computer Vision (ECCV), Springer, Zurich, 2014, pp. 742-757.
[135]
E. Ohn-Bar, M.M. Trivedi, Joint angles similarities and HOG2 for action recognition, in: Proceedings of Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, Portland, Oregon, 2013, pp. 465-470, http://dx.doi.org/10.1109/CVPRW.2013.76.
[136]
L. Xia, J. Aggarwal, Spatio-temporal depth cuboid similarity feature for activity recognition using depth camera, in: Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Portland, Oregon, 2013, pp. 2834-2841.
[137]
Y. Zhu, W. Chen, G. Guo, Fusing spatiotemporal features and joints for 3D action recognition, in: Proceedings of Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, Portland, Oregon, 2013, pp. 486-491, http://dx.doi.org/10.1109/CVPRW.2013.78.
[138]
I. Laptev, On space-time interest points, Int. J. Comput. Vis., 64 (2005) 107-123.
[139]
S. Althloothi, M.H. Mahoor, X. Zhang, R.M. Voyles, Human activity recognition using multi-features and multiple kernel learning, Pattern Recognit., 47 (2014) 1800-1812.
[140]
J. Wang, Y. Wu, Learning maximum margin temporal warping for action recognition, in: 2013 IEEE International Conference on Computer Vision (ICCV), IEEE, Sydney, Australia, 2013, pp. 2688-2695.
[141]
C. Chen, R. Jafari, N. Kehtarnavaz, Improving human action recognition using fusion of depth camera and inertial sensors, IEEE Trans. Hum.-Mach. Syst., 45 (2015) 51-61.
[142]
A.F. Bobick, J.W. Davis, The recognition of human movement using temporal templates, IEEE Trans. Pattern Anal. Mach. Intell., 23 (2001) 257-267.
[143]
H.M. Hondori, M. Khademi, C.V. Lopes, Monitoring intake gestures using sensor fusion (microsoft kinect and inertial sensors) for smart home tele-rehab setting, in: 1st Annual IEEE Healthcare Innovation Conference, IEEE, Houston, TX, 2012, pp. 1-4.
[144]
B. Delachaux, J. Rebetez, A. Perez-Uribe, H.F.S. Mejia, Indoor activity recognition by combining one-vs.-all neural network classifiers exploiting wearable and depth sensors, in: Advances in Computational Intelligence. Lecture Notes in Computer Science, Springer, Tenerife - Puerto de la Cruz, Spain, 7903 (2013), pp. 216-223.
[145]
K. Liu, C. Chen, R. Jafari, N. Kehtarnavaz, Fusion of inertial and depth sensor data for robust hand gesture recognition, IEEE Sens. J., 14 (2014) 1898-1903.
[146]
S. Hadfield, R. Bowden, Hollywood 3d: recognizing actions in 3d natural scenes, in: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Portland, Oregon, 2013, pp. 3398-3405.
[147]
C. Ionescu, D. Papava, V. Olaru, C. Sminchisescu, Human3.6M: large scale datasets and predictive methods for 3D human sensing in natural environments, IEEE Trans. Pattern Anal. Mach. Intell., 36 (2014) 1325-1339.
[148]
F. Ofli, R. Chaudhry, G. Kurillo, R. Vidal, R. Bajcsy, Berkeley MHAD: a comprehensive multimodal human action database, in: Proceedings of Workshop on Applications of Computer Vision (WACV), IEEE, Clearwater Beach, Florida, 2013, pp. 53-60.
[149]
J.R. Padilla-López, A.A. Chaaraoui, F. Flórez-Revuelta, A discussion on the validation tests employed to compare human action recognition methods using the MSR Action 3D dataset, CoRR abs/1407.7390, 2014.
[150]
J. Sung, C. Ponce, B. Selman, A. Saxena, Human activity detection from RGBD images, in: AAAI Workshops on Plan, Activity, and Intent Recognition, San Francisco, CA, USA, vol. 64, 2011, pp. 1-8.
[151]
S. Fothergill, H.M. Mentis, P. Kohli, S. Nowozin, Instructing people for training gestural interactive systems, in: J.A. Konstan, E.H. Chi, K. Höök (Eds.), Proceedings of ACM Conference on Human Factors in Computing Systems (CHI), ACM, Austin, Texas, 2012, pp. 1737-1746, http://dx.doi.org/10.1145/2207676.2208303.
[152]
A. Malizia, A. Bellucci, The artificiality of natural user interfaces, Commun. ACM, 55 (2012) 36-38.

Published In

Pattern Recognition  Volume 53, Issue C
May 2016
313 pages

Publisher

Elsevier Science Inc., United States

Author Tags

  1. Action classification
  2. Action recognition
  3. Body joint
  4. Body pose representation
  5. Skeleton
