DOI: 10.1145/2683483.2683561

A unified approach to learning depth and motion features

Published: 14 December 2014

Abstract

    We present a model for the joint estimation of disparity and motion. The model is based on learning the interrelations between images from multiple cameras, from multiple frames of a video, or from a combination of both. We show that depth and motion cues, as well as their combinations, can be learned from data within a single type of architecture and with a single type of learning algorithm, using biologically inspired, "complex cell"-like units that encode correlations between pixels across image pairs. Our experimental results show that learning depth and motion makes it possible to achieve competitive performance in 3-D activity analysis and to outperform existing hand-engineered 3-D motion features by a very large margin.
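
    The abstract refers to "complex cell"-like units that encode correlations between pixels across an image pair. As a rough illustration of that idea (not the authors' implementation), the NumPy sketch below applies a pair of filter banks to two views or frames, multiplies their responses, and pools the products; the patch size, filter counts, and random weights are assumptions standing in for quantities that the paper learns from data.

```python
import numpy as np

# Minimal sketch of a "complex cell"-like unit: two linear filter banks are
# applied to an image pair (two stereo views or two consecutive frames), their
# element-wise product encodes correlations between the images, and pooling
# the products yields features tuned to disparity or motion.

rng = np.random.default_rng(0)

patch_dim = 13 * 13      # flattened patch size (assumed)
num_filters = 64         # number of filter pairs (assumed)
num_pools = 16           # number of pooled "complex cell" outputs (assumed)

# In the paper these weights are learned from data; random values stand in here.
W_x = rng.standard_normal((num_filters, patch_dim)) * 0.1  # filters for view/frame 1
W_y = rng.standard_normal((num_filters, patch_dim)) * 0.1  # filters for view/frame 2
P = np.abs(rng.standard_normal((num_pools, num_filters)))  # non-negative pooling weights

def complex_cell_features(x, y):
    """Encode correlations between flattened image patches x and y.

    x comes from the first view (or frame), y from the second. Returns the
    pooled products of filter responses, i.e. the "complex cell" outputs.
    """
    fx = W_x @ x            # simple-cell responses on the first image
    fy = W_y @ y            # simple-cell responses on the second image
    products = fx * fy      # multiplicative interaction acts as a correlation detector
    return P @ products     # pooling over filter pairs gives the complex-cell outputs

# Toy usage: a patch and a horizontally shifted copy of it (a crude stand-in
# for disparity between two views).
x = rng.standard_normal(patch_dim)
y = np.roll(x.reshape(13, 13), shift=1, axis=1).ravel()
print(complex_cell_features(x, y).shape)   # -> (16,)
```

    In the paper the filter and pooling weights are learned jointly from stereo and video data; the random values here only make the sketch runnable end to end.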


    Cited By

    • (2020) Modelling binocular disparity processing from statistics in natural scenes. Vision Research, 176:27-39. DOI: 10.1016/j.visres.2020.07.009. Online publication date: Nov 2020.

    Information & Contributors

    Published In

    ICVGIP '14: Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing
    December 2014
    692 pages
    ISBN:9781450330619
    DOI:10.1145/2683483
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 December 2014

    Permissions

    Request permissions for this article.

    Author Tags

    1. activity recognition
    2. machine learning
    3. motion and depth estimation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICVGIP '14

    Acceptance Rates

    Overall acceptance rate: 95 of 286 submissions (33%)

    Bibliometrics & Citations

    Article Metrics

    • Downloads (last 12 months): 0
    • Downloads (last 6 weeks): 0

    Reflects downloads up to 09 Aug 2024
