research-article

Learning image-to-class distance metric for image classification

Published: 03 April 2013

Abstract

Image-To-Class (I2C) distance is a novel distance for image classification that has successfully handled datasets with large intra-class variance. However, it uses the Euclidean distance to compare local features across classes, which may not be the optimal metric for real image classification problems. In this article, we propose a distance metric learning method that improves the performance of I2C distance by learning per-class Mahalanobis metrics in a large-margin framework. Combining the I2C distance with the metric learned for each class makes it adaptive to different classes. The per-class metrics are learned simultaneously by formulating a convex optimization problem whose constraints require the I2C distance from each training image to its own class to be smaller, by a large margin, than its distances to the other classes. A subgradient descent method solves this optimization problem efficiently. For efficiency and scalability to large-scale problems, we also show how to simplify the method to learn a diagonal matrix for each class. Experiments show that our learned Mahalanobis I2C distance significantly outperforms the original Euclidean I2C distance, as well as other distance metric learning methods, on several prevalent image datasets, and that the simplified diagonal matrices preserve this performance while significantly speeding up metric learning on large-scale datasets. Experiments also show that our method can correct the class imbalance problem, which usually biases NN-based methods toward classes containing more training images.
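
To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the I2C distance sums, over the local features of an image, the squared distance from each feature to its nearest neighbor among the features of a class, and that the large-margin constraint is scored with a hinge on the gap between the own-class distance and each other-class distance. The function names, the toy features, the margin value, and the choice to search for nearest neighbors under the learned metric are all illustrative assumptions; the abstract does not specify them.

```python
from typing import Dict, Optional, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def mahalanobis_sq(x: Vector, y: Vector, M: Optional[Matrix] = None) -> float:
    """Squared distance (x - y)^T M (x - y); M=None means identity (Euclidean)."""
    diff = [a - b for a, b in zip(x, y)]
    if M is None:
        return sum(d * d for d in diff)
    n = len(diff)
    return sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))


def i2c_distance(image_feats: Sequence[Vector],
                 class_feats: Sequence[Vector],
                 M: Optional[Matrix] = None) -> float:
    """Sum over image features of the distance to the nearest class feature.

    Assumption: the nearest neighbor is searched under the same metric M.
    """
    return sum(min(mahalanobis_sq(f, g, M) for g in class_feats)
               for f in image_feats)


def classify(image_feats: Sequence[Vector],
             class_feats_by_label: Dict[str, Sequence[Vector]],
             metrics: Dict[str, Matrix]) -> str:
    """Predict the class with the smallest I2C distance under its own metric."""
    return min(class_feats_by_label,
               key=lambda c: i2c_distance(image_feats,
                                          class_feats_by_label[c],
                                          metrics.get(c)))


def hinge_violations(image_feats: Sequence[Vector],
                     label: str,
                     class_feats_by_label: Dict[str, Sequence[Vector]],
                     metrics: Dict[str, Matrix],
                     margin: float = 1.0) -> Dict[str, float]:
    """Hinge penalty per other class: max(0, margin + d(own) - d(other))."""
    own = i2c_distance(image_feats, class_feats_by_label[label],
                       metrics.get(label))
    return {c: max(0.0, margin + own
                   - i2c_distance(image_feats, feats, metrics.get(c)))
            for c, feats in class_feats_by_label.items() if c != label}


# Toy 2-D local features (hypothetical data, for illustration only).
image = [(0.0, 0.0), (1.0, 0.0)]
classes = {"A": [(0.0, 0.0), (1.0, 1.0)], "B": [(3.0, 0.0), (0.0, 2.0)]}

euclid_a = i2c_distance(image, classes["A"])           # Euclidean I2C to class A
euclid_b = i2c_distance(image, classes["B"])           # Euclidean I2C to class B
diag_b = [[1.0, 0.0], [0.0, 0.25]]                     # a diagonal per-class metric
learned_b = i2c_distance(image, classes["B"], diag_b)  # I2C to B under that metric
predicted = classify(image, classes, {"B": diag_b})    # class A falls back to identity
penalties = hinge_violations(image, "B", classes, {})  # image treated as class B
```

A diagonal matrix, as in the simplified variant mentioned in the abstract, reduces each distance to a weighted sum of squared coordinate differences, which is why it is cheaper to learn and to evaluate than a full matrix.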

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 4, Issue 2
Special section on agent communication, trust in multiagent systems, intelligent tutoring and coaching systems
March 2013
339 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/2438653
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 April 2013
Accepted: 01 March 2012
Revised: 01 March 2012
Received: 01 October 2010
Published in TIST Volume 4, Issue 2

Author Tags

  1. Image-to-class distance
  2. distance metric learning
  3. image classification
  4. nearest-neighbor classification

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 12
  • Downloads (last 6 weeks): 0

Reflects downloads up to 30 Aug 2024.

Cited By

  • (2020) Deep Neighborhood Component Analysis for Visual Similarity Modeling. ACM Transactions on Intelligent Systems and Technology 11:3 (1-15). DOI: 10.1145/3375787. Online publication date: 18-Apr-2020
  • (2020) DSR: A Deep Learning Framework Towards Modulation Signal Retrieval. 2020 7th International Conference on Dependable Systems and Their Applications (DSA) (234-239). DOI: 10.1109/DSA51864.2020.00041. Online publication date: Nov-2020
  • (2019) Content-Based Image Retrieval: A Comprehensive Study. International Journal of Scientific Research in Computer Science, Engineering and Information Technology (1073-1081). DOI: 10.32628/CSEIT1952275. Online publication date: 20-Mar-2019
  • (2019) Predictive, Personalized, Preventive and Participatory (4P) Medicine Applied to Telemedicine and eHealth in the Literature. Journal of Medical Systems 43:5 (1-10). DOI: 10.1007/s10916-019-1279-4. Online publication date: 1-May-2019
  • (2017) Visual Classification of Furniture Styles. ACM Transactions on Intelligent Systems and Technology 8:5 (1-20). DOI: 10.1145/3065951. Online publication date: 30-Jun-2017
  • (2015) Nonnegative Multiresolution Representation-Based Texture Image Classification. ACM Transactions on Intelligent Systems and Technology 7:1 (1-21). DOI: 10.1145/2738050. Online publication date: 7-Oct-2015
  • (2014) Deep Learning for Content-Based Image Retrieval. Proceedings of the 22nd ACM international conference on Multimedia (157-166). DOI: 10.1145/2647868.2654948. Online publication date: 3-Nov-2014
