Article

Learning to track 3D human motion from silhouettes

Authors:

Ankur Agarwal and Bill Triggs

ICML '04: Proceedings of the twenty-first international conference on Machine learning

July 2004

Page 2

https://doi.org/10.1145/1015330.1015343

Published: 04 July 2004

Abstract

We describe a sparse Bayesian regression method for recovering 3D human body motion directly from silhouettes extracted from monocular video sequences. No detailed body shape model is needed, and realism is ensured by training on real human motion capture data. The tracker estimates 3D body pose by using Relevance Vector Machine regression to combine a learned autoregressive dynamical model with robust shape descriptors extracted automatically from image silhouettes. We studied several different combination methods, the most effective being to learn a nonlinear observation-update correction based on joint regression with respect to the predicted state and the observations. We demonstrate the method on a 54-parameter full body pose model, both quantitatively using motion capture based test sequences, and qualitatively on a test video sequence.
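The combination described in the abstract, a learned autoregressive prediction corrected by a regression on the predicted state and the observation jointly, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a second-order linear autoregressive model, uses Gaussian-kernel ridge regression as a simple stand-in for Relevance Vector Machine regression, and runs on synthetic data in place of motion capture poses and silhouette descriptors. All function names, dimensions, and parameter values are hypothetical.

```python
import numpy as np

def fit_ar2(X, reg=1e-6):
    """Least-squares fit of x_t ~ [x_{t-1}, x_{t-2}] @ W (assumed order-2 model)."""
    past = np.hstack([X[1:-1], X[:-2]])            # rows: [x_{t-1}, x_{t-2}]
    target = X[2:]
    d = past.shape[1]
    return np.linalg.solve(past.T @ past + reg * np.eye(d), past.T @ target)

def predict_ar2(W, x_prev, x_prev2):
    """Dynamical prediction of the next state from the two previous states."""
    return np.concatenate([x_prev, x_prev2]) @ W

def rbf_features(Q, centers, gamma):
    """Gaussian kernel features of each row of Q against each centre."""
    d2 = ((Q[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_ridge(Phi, Y, lam):
    """Ridge regression weights (stand-in for the sparse Bayesian RVM fit)."""
    k = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(k), Phi.T @ Y)

rng = np.random.default_rng(0)
T, m = 300, 3
t = np.arange(T)

# Hypothetical "pose" trajectory: phase-shifted sinusoids with small jitter.
X = np.sin(0.1 * t[:, None] + np.array([0.0, 1.0, 2.0]))
X = X + 0.01 * rng.standard_normal((T, m))

# Hypothetical "silhouette descriptor": fixed nonlinear map of the pose plus noise.
M = rng.standard_normal((m, 5))
Z = np.tanh(X @ M) + 0.05 * rng.standard_normal((T, 5))

# Step 1: learn the dynamics and predict each state from its two predecessors.
W_ar = fit_ar2(X)
X_pred = np.hstack([X[1:-1], X[:-2]]) @ W_ar       # predictions for t = 2..T-1

# Step 2: regress the true state on the joint (predicted state, observation) input.
Q = np.hstack([X_pred, Z[2:]])
centers = Q[::10]                                  # every 10th sample as a kernel centre
Phi = rbf_features(Q, centers, gamma=0.5)
W_corr = fit_ridge(Phi, X[2:], lam=1e-3)
X_hat = Phi @ W_corr                               # corrected pose estimates

print("AR-only RMS error:  ", np.sqrt(np.mean((X_pred - X[2:]) ** 2)))
print("corrected RMS error:", np.sqrt(np.mean((X_hat - X[2:]) ** 2)))
```

Both errors here are measured on the training sequence, so they say nothing about generalization. An RVM would additionally prune most kernel centres, whereas this sketch simply keeps every tenth sample.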



Learning to track 3D human motion from silhouettes
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks

Published In

ICML '04: Proceedings of the twenty-first international conference on Machine learning

July 2004

934 pages

ISBN: 1581138385

DOI: 10.1145/1015330

Conference Chair:
Carla Brodley
Purdue University/Tufts University

Copyright © 2004 Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
