Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
article
Free access

Large Scale Online Learning of Image Similarity Through Ranking

Published: 01 March 2010 Publication History

Abstract

Learning a measure of similarity between pairs of objects is an important generic problem in machine learning. It is particularly useful in large scale applications like searching for an image that is similar to a given image or finding videos that are relevant to a given video. In these tasks, users look for objects that are not only visually similar but also semantically related to a given object. Unfortunately, the approaches that exist today for learning such semantic similarity do not scale to large data sets. This is both because typically their CPU and storage requirements grow quadratically with the sample size, and because many methods impose complex positivity constraints on the space of learned similarity functions.
The current paper presents OASIS, an Online Algorithm for Scalable Image Similarity learning that learns a bilinear similarity measure over sparse representations. OASIS is an online dual approach using the passive-aggressive family of learning algorithms with a large margin criterion and an efficient hinge loss cost. Our experiments show that OASIS is both fast and accurate at a wide range of scales: for a data set with thousands of images, it achieves better results than existing state-of-the-art methods, while being an order of magnitude faster. For large, web scale, data sets, OASIS can be trained on more than two million images from 150K text queries within 3 days on a single CPU. On this large scale data set, human evaluations showed that 35% of the ten nearest neighbors of a given test image, as found by OASIS, were semantically relevant to that image. This suggests that query independent similarity could be accurately learned even for large scale data sets that could not be handled before.

References

[1]
A. Bar-Hillel, T. Hertz, N. Shental, and D.Weinshall. Learning distance functions using equivalence relations. In Proc. of 20th International Conference on Machine Learning (ICML), page 11, 2003.
[2]
L. Bottou. Large-scale machine learning and stochastic algorithms. In NIPS 2008 Workshop on Optimization for Machine Learning, 2008.
[3]
Y. Chen, E.K. Garcia, M.R. Gupta, A. Rahimi, and L. Cazzanti. Similarity-based classification: Concepts and algorithms. The Journal of Machine Learning Research, 10:747-776, 2009.
[4]
K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive-aggressive algorithms. Journal of Machine Learning Research (JMLR), 7:551-585, 2006.
[5]
J.V. Davis, B. Kulis, P. Jain, S. Sra, and I.S. Dhillon. Information-theoretic metric learning. In Proceedings of the 24th international conference on Machine learning, pages 209-216. ACM Press New York, NY, USA, 2007.
[6]
P. Duygulu, K. Barnard, N. de Freitas, and D. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In European Conference on Computer Vision (ECCV), pages 97-112, 2002.
[7]
S.L. Feng, R.Manmatha, and V. Lavrenko. Multiple Bernoulli relevance models for image and video annotation. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
[8]
A. Frome, Y. Singer, F. Sha, and J. Malik. Learning globally-consistent local distance functions for shape-based image retrieval and classification. In International Conference on Computer Vision, pages 1-8, 2007.
[9]
A. Globerson and S. Roweis. Metric learning by collapsing classes. Advances in Neural Information Processing Systems, 18:451, 2006.
[10]
D. Grangier and S. Bengio. A discriminative kernel-based model to rank images from text queries. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 30(8):1371-1384, 2008.
[11]
D. Grangier, F.Monay, and S. Bengio. Learning to retrieve images from text queries with a discriminative model. In International Conference on Adaptive Multimedia Retrieval (AMR), 2006.
[12]
G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology, 2007.
[13]
R. Hadsell, S. Chopra, and Y. LeCun. Dimensionality reduction by learning an invariant mapping. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, 2006.
[14]
P. Jain, B. Kulis, I. Dhillon, and K. Grauman. Online metric learning and fast similarity search. In Advances in Neural Information Processing Systems, volume 22, 2008a.
[15]
P. Jain, B. Kulis, and K. Grauman. Fast image search for learned metrics. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages 1-8, 2008b.
[16]
J. Jeon and R. Manmatha. Using maximum entropy for automatic image annotation. In International Conference on Image and Video Retrieval, pages 24-32, 2004.
[17]
B. Kulis,M.A. Sustik, and I.S. Dhillon. Low-rank kernel learning with bregman matrix divergences. Journal of Machine Learning Research, 10:341-376, 2009.
[18]
G.R.G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M.I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research (JMLR), 5:27- 72, 2004.
[19]
D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV), 60(2):91-110, 2004.
[20]
W.S. Noble. Multi-kernel learning for biology. In NIPS 2008 workshop on kernel learning, 2008.
[21]
T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 24(7):971-987, 2002.
[22]
P. Quelhas, F. Monay, J. M. Odobez, D. Gatica-Perez, T. Tuytelaars, and L. J. Van Gool. Modeling scenes with local descriptors and latent aspects. In International Conference on Computer Vision, pages 883-890, 2005.
[23]
N. Rasiwasia and N. Vasconcelos. A study of query by semantic example. In 3rd International Workshop on Semantic Learning and Applications in Multimedia, 2008.
[24]
R. Rosales and G. Fung. Learning sparse metrics via linear programming. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 367-373. ACM New York, NY, USA, 2006.
[25]
M. Schultz and T. Joachims. Learning a distance metric from relative comparisons. In Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference. Bradford Book, 2004.
[26]
V. Takala, T. Ahonen, and M. Pietikainen. Block-based methods for image retrieval using local binary patterns. In Scandinavian Conference on Image Analysis (SCIA), 2005.
[27]
K. Tieu and P. Viola. Boosting image retrieval. International Journal of Computer Vision (IJCV), 56(1):17 - 36, 2004.
[28]
A. Torralba, R. Fergus, and W. T. Freeman. Tiny images. Technical Report MIT-CSAIL-TR-2007- 024, Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 2007. URL http://dspace.mit.edu/handle/1721.1/37291.
[29]
A. Tversky. Features of similarity. Psychological Review, 84(4):327-352, 1977.
[30]
K. Weinberger, J. Blitzer, and L. Saul. Distance metric learning for large margin nearest neighbor classification. Advances in Neural Information Processing Systems, 18:1473, 2006.
[31]
K.Q. Weinberger and L.K. Saul. Fast solvers and efficient implementations for distance metric learning. In ICML25, pages 1160-1167, 2008.
[32]
E.P. Xing, A.Y. Ng, M.I. Jordan, and S. Russell. Distance metric learning with application to clustering with side-information. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 521-528, Cambridge, MA, 2003. MIT Press.
[33]
L. Yang. Distance metric learning: A comprehensive survey. Technical report, Michigan State University, 2006.

Cited By

View all
  • (2024)Adversarial Attack and Defense in Deep RankingIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.336569946:8(5306-5324)Online publication date: 1-Aug-2024
  • (2024)Semantic Hierarchy-Aware SegmentationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2023.333243546:4(2123-2138)Online publication date: 1-Apr-2024
  • (2024)I2F: A Unified Image-to-Feature Approach for Domain Adaptive Semantic SegmentationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2022.322920746:3(1695-1710)Online publication date: 1-Mar-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image The Journal of Machine Learning Research
The Journal of Machine Learning Research  Volume 11, Issue
3/1/2010
3637 pages
ISSN:1532-4435
EISSN:1533-7928
Issue’s Table of Contents

Publisher

JMLR.org

Publication History

Published: 01 March 2010
Published in JMLR Volume 11

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)101
  • Downloads (Last 6 weeks)15
Reflects downloads up to 28 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Adversarial Attack and Defense in Deep RankingIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.336569946:8(5306-5324)Online publication date: 1-Aug-2024
  • (2024)Semantic Hierarchy-Aware SegmentationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2023.333243546:4(2123-2138)Online publication date: 1-Apr-2024
  • (2024)I2F: A Unified Image-to-Feature Approach for Domain Adaptive Semantic SegmentationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2022.322920746:3(1695-1710)Online publication date: 1-Mar-2024
  • (2024)Continual Learning With Siamese Neural Networks for Sustainable Network ManagementIEEE Transactions on Network and Service Management10.1109/TNSM.2024.336892821:3(2664-2674)Online publication date: 1-Jun-2024
  • (2024)Weakly-Supervised Contrastive Learning for Unsupervised Object DiscoveryIEEE Transactions on Image Processing10.1109/TIP.2024.338024333(2689-2702)Online publication date: 27-Mar-2024
  • (2023)Exploring model dynamics for accumulative poisoning discoveryProceedings of the 40th International Conference on Machine Learning10.5555/3618408.3620220(42983-43004)Online publication date: 23-Jul-2023
  • (2023)Cross Modal Data Discovery over Structured and Unstructured Data LakesProceedings of the VLDB Endowment10.14778/3611479.361153316:11(3377-3390)Online publication date: 24-Aug-2023
  • (2023)Transformation-Based Fuzzy Rule Interpolation With Mahalanobis Distance Measures Supported by Choquet IntegralIEEE Transactions on Fuzzy Systems10.1109/TFUZZ.2022.319436831:4(1083-1097)Online publication date: 1-Apr-2023
  • (2023)Deep Siamese Neural Networks for Facial Expression Recognition in the WildIEEE Transactions on Affective Computing10.1109/TAFFC.2021.307724814:2(1148-1158)Online publication date: 1-Apr-2023
  • (2023)A cross-view geo-localization method guided by relation-aware global attentionMultimedia Systems10.1007/s00530-023-01101-129:4(2205-2216)Online publication date: 9-May-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media