Abstract
This paper proposes a shared latent dynamical model (SLDM), which combines the Gaussian process dynamical model with shared latent structure, and applies it to tracking 3D human motion from monocular videos. For tracking in a high-dimensional space, SLDM maps both the state space and the observation space to a shared low-dimensional latent space with associated dynamics. During off-line training, three mappings are learned: the dynamical mapping within the latent space, and the mappings from the latent space to the state space and to the observation space. The model thereby splits conventional high-dimensional human motion estimation into two steps: first, the shared latent dynamical variables are estimated; second, the high-dimensional human pose is reconstructed from them. Experiments on human motion tracking from monocular videos, using both simulated and real images, demonstrate that the tracking method is efficient.
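The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian process mappings are replaced by plain linear least-squares maps, the shared latent space is obtained by PCA on the joint pose and observation data, and the data are synthetic. All names (`fit_linear_sldm`, `track`, `lam`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_linear_sldm(Y, Z, latent_dim=2):
    """Fit linear stand-ins (assumption: not GPs) for the three mappings."""
    y_mean, z_mean = Y.mean(axis=0), Z.mean(axis=0)
    joint = np.hstack([Y - y_mean, Z - z_mean])
    # Shared latent coordinates: top principal components of the joint data.
    U, S, _ = np.linalg.svd(joint, full_matrices=False)
    X = U[:, :latent_dim] * S[:latent_dim]
    # Dynamics x_{t+1} ~ x_t F, pose y ~ x W_y, observation z ~ x W_z.
    F = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0]
    W_y = np.linalg.lstsq(X, Y - y_mean, rcond=None)[0]
    W_z = np.linalg.lstsq(X, Z - z_mean, rcond=None)[0]
    return {"F": F, "W_y": W_y, "W_z": W_z,
            "y_mean": y_mean, "z_mean": z_mean, "x_last": X[-1]}

def track(model, Z_seq, x_prev, lam=0.05):
    """Step 1: estimate the latent state; step 2: reconstruct the pose."""
    F, W_y, W_z = model["F"], model["W_y"], model["W_z"]
    # Normal equations of ||z - x W_z||^2 + lam * ||x - x_pred||^2.
    A = W_z @ W_z.T + lam * np.eye(F.shape[0])
    x, poses = x_prev, []
    for z in Z_seq:
        x_pred = x @ F                                    # latent dynamics
        rhs = W_z @ (z - model["z_mean"]) + lam * x_pred
        x = np.linalg.solve(A, rhs)                       # step 1: latent estimate
        poses.append(x @ W_y + model["y_mean"])           # step 2: pose
    return np.array(poses)

# Synthetic stand-in data: a 2D cyclic latent signal drives a 30-D "pose"
# and a 10-D "image feature" vector (both hypothetical dimensions).
rng = np.random.default_rng(0)
T = 200
phase = np.linspace(0, 8 * np.pi, T)
latent_true = np.column_stack([np.cos(phase), np.sin(phase)])
Y = latent_true @ rng.normal(size=(2, 30)) + 0.01 * rng.normal(size=(T, 30))
Z = latent_true @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(T, 10))

model = fit_linear_sldm(Y[:150], Z[:150])                 # off-line training
Y_hat = track(model, Z[150:], x_prev=model["x_last"])     # on-line tracking
rmse = float(np.sqrt(np.mean((Y_hat - Y[150:]) ** 2)))
print(f"pose reconstruction RMSE: {rmse:.4f}")
```

The point of the sketch is only the division of labour: all inference over time happens in the low-dimensional latent space, and the high-dimensional pose is recovered afterwards by a single mapping. A nonlinear model such as the one described in the abstract would replace the three least-squares fits and the closed-form latent update.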
Cite this article
Tong, M., Han, H. & Zhu, W. Shared latent dynamical structure for three-dimensional human pose estimation. Sci. China Inf. Sci. 54, 1375–1382 (2011). https://doi.org/10.1007/s11432-011-4245-4
DOI: https://doi.org/10.1007/s11432-011-4245-4