DOI: 10.1145/1553374.1553469
research-article

Deep learning from temporal coherence in video

Published: 14 June 2009

Abstract

This work proposes a learning method for deep architectures that exploits sequential data, in particular the temporal coherence that naturally exists in unlabeled video recordings: two successive frames are likely to contain the same object or objects. This coherence serves as a supervisory signal over the unlabeled data and is used to improve performance on a supervised task of interest. We demonstrate the effectiveness of the method on some pose-invariant object and face recognition tasks.
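The abstract does not spell out the training objective, so the following is only a minimal sketch of one way a temporal coherence signal could be expressed, assuming a contrastive (siamese-style) penalty on pairs of frame embeddings. The function names, the choice of L1 distance, and the `margin` parameter are illustrative assumptions, not details taken from this page.

```python
from typing import Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences between two embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def coherence_loss(
    z1: Sequence[float],
    z2: Sequence[float],
    consecutive: bool,
    margin: float = 1.0,  # assumed hyperparameter, not from the source
) -> float:
    """Contrastive penalty on a pair of frame embeddings.

    consecutive=True : the frames are adjacent in the video, so the
                       penalty is their distance (pulls them together).
    consecutive=False: the frames are unrelated, so the penalty is a
                       hinge that is zero once they are `margin` apart.
    """
    d = l1_distance(z1, z2)
    if consecutive:
        return d
    return max(0.0, margin - d)


if __name__ == "__main__":
    frame_t = [0.5, 0.25]    # hypothetical embedding of frame t
    frame_t1 = [0.5, 0.5]    # hypothetical embedding of frame t + 1
    other = [2.0, 2.0]       # hypothetical embedding of an unrelated frame
    print(coherence_loss(frame_t, frame_t1, consecutive=True))
    print(coherence_loss(frame_t, other, consecutive=False))
```

In a semi-supervised setup of the kind the abstract describes, a term like this would be minimized on unlabeled frame pairs alongside the ordinary supervised loss on labeled examples; how the two are weighted or interleaved is not stated here.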

Published In

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009
1331 pages
ISBN: 9781605585161
DOI: 10.1145/1553374

Sponsors

  • NSF
  • Microsoft Research
  • MITACS

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
