DOI: 10.1145/1015330.1015343

Learning to track 3D human motion from silhouettes

Published: 04 July 2004

    Abstract

    We describe a sparse Bayesian regression method for recovering 3D human body motion directly from silhouettes extracted from monocular video sequences. No detailed body shape model is needed, and realism is ensured by training on real human motion capture data. The tracker estimates 3D body pose by using Relevance Vector Machine regression to combine a learned autoregressive dynamical model with robust shape descriptors extracted automatically from image silhouettes. We studied several different combination methods, the most effective being to learn a nonlinear observation-update correction based on joint regression with respect to the predicted state and the observations. We demonstrate the method on a 54-parameter full body pose model, both quantitatively using motion capture based test sequences, and qualitatively on a test video sequence.
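    The tracking loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the dynamical model here is a fixed constant-velocity second-order autoregression rather than a learned one, and a simple Gaussian-kernel weighted average (a Nadaraya-Watson smoother) stands in for Relevance Vector Machine regression. All function names, the `width` parameter, and the toy data are hypothetical. What the sketch does preserve is the structure of the most effective combination method: the pose estimate is regressed jointly on the predicted state and the observed shape descriptor.

```python
import math


def predict_state(x_prev, x_prev2):
    """Second-order autoregressive prediction.

    Assumption: fixed constant-velocity coefficients (2, -1); the paper
    learns its dynamical model from motion capture data instead.
    """
    return [2.0 * a - b for a, b in zip(x_prev, x_prev2)]


def gaussian_kernel(u, v, width):
    """Gaussian kernel between two equal-length vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * width ** 2))


def joint_regress(x_pred, z, train_inputs, train_poses, width=1.0):
    """Estimate a pose from the joint vector (predicted state, descriptor).

    Stand-in for RVM regression: a kernel-weighted average of training
    poses. Each row of train_inputs is a predicted state concatenated
    with a shape descriptor; train_poses holds the matching true poses.
    """
    query = list(x_pred) + list(z)
    weights = [gaussian_kernel(query, inp, width) for inp in train_inputs]
    total = sum(weights)
    if total == 0.0:
        # No training example is close enough; fall back to the prediction.
        return list(x_pred)
    dim = len(train_poses[0])
    return [
        sum(w * pose[i] for w, pose in zip(weights, train_poses)) / total
        for i in range(dim)
    ]


def track(descriptors, x0, x1, train_inputs, train_poses, width=1.0):
    """Run the predict-then-correct loop over a sequence of descriptors."""
    states = [list(x0), list(x1)]
    for z in descriptors:
        x_pred = predict_state(states[-1], states[-2])
        states.append(joint_regress(x_pred, z, train_inputs, train_poses, width))
    return states


if __name__ == "__main__":
    # Toy data: 2-D "pose", 1-D "silhouette descriptor" (hypothetical values).
    train_inputs = [[3.0, 5.0, 0.5], [0.0, 0.0, 0.0]]
    train_poses = [[3.1, 4.9], [0.0, 0.0]]
    for state in track([[0.5]], [1.0, 3.0], [2.0, 4.0], train_inputs, train_poses):
        print(state)
```

    Because the regressor sees the predicted state as well as the descriptor, two training examples with similar silhouettes but different preceding motion receive different weights, which is one way to read the motivation for regressing on both inputs together.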

    Published In

    ICML '04: Proceedings of the twenty-first international conference on Machine learning
    July 2004
    934 pages
    ISBN:1581138385
    DOI:10.1145/1015330
    • Conference Chair: Carla Brodley

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate 140 of 548 submissions, 26%

    Cited By

    • (2017) Posing 3D Models from Drawings. Computers in Entertainment 15:2 (1-14). DOI: 10.1145/2729984. Online publication date: 4-Apr-2017
    • (2016) Human Pose Estimation from Monocular Images: A Comprehensive Survey. Sensors 16:12 (1966). DOI: 10.3390/s16121966. Online publication date: 25-Nov-2016
    • (2016) Artist-oriented 3D character posing from 2D strokes. Computers and Graphics 57:C (81-91). DOI: 10.1016/j.cag.2016.03.008. Online publication date: 1-Jun-2016
    • (2015) High-dimensional regression with gaussian mixtures and partially-latent response variables. Statistics and Computing 25:5 (893-911). DOI: 10.1007/s11222-014-9461-5. Online publication date: 1-Sep-2015
    • (2014) Correlation, Kalman filter and adaptive fast mean shift based heuristic approach for robust visual tracking. Signal, Image and Video Processing 9:7 (1567-1585). DOI: 10.1007/s11760-014-0612-0. Online publication date: 31-Jan-2014
    • (2013) Monocular Image 3D Human Pose Estimation under Self-Occlusion. Proceedings of the 2013 IEEE International Conference on Computer Vision (1888-1895). DOI: 10.1109/ICCV.2013.237. Online publication date: 1-Dec-2013
    • (2013) Simultaneous Bayesian clustering and feature selection using RJMCMC-based learning of finite generalized Dirichlet mixture models. Signal Processing 93:6 (1531-1546). DOI: 10.1016/j.sigpro.2012.07.037. Online publication date: 1-Jun-2013
    • (2012) Shared Kernel Information Embedding for Discriminative Inference. IEEE Transactions on Pattern Analysis and Machine Intelligence 34:4 (778-790). DOI: 10.1109/TPAMI.2011.154. Online publication date: 1-Apr-2012
    • (2012) Robust decentralized multi-model adaptive template tracking. Pattern Recognition 45:12 (4494-4509). DOI: 10.1016/j.patcog.2012.05.005. Online publication date: 1-Dec-2012
    • (2011) Real-Time Discriminative Background Subtraction. IEEE Transactions on Image Processing 20:5 (1401-1414). DOI: 10.1109/TIP.2010.2087764. Online publication date: 1-May-2011