DOI: 10.1145/2072298.2072334
research-article

Towards multi-semantic image annotation with graph regularized exclusive group lasso

Published: 28 November 2011

Abstract

To bridge the semantic gap between low-level features and human perception, most existing algorithms annotate images with concepts drawn from a single semantic space, e.g. cognitive or affective. Naively combining the outputs from these spaces implicitly assumes conditional independence and ignores the correlations among the spaces. In this paper, to exploit the comprehensive semantics of images, we propose a general framework for harmoniously integrating these multiple semantics, and investigate the problem of learning to annotate images from training images labeled in two or more correlated semantic spaces, such as fascinating nighttime or exciting cat. This kind of semantic annotation is better suited to real-world search scenarios. Our proposed approach outperforms the baseline algorithms and makes the following contributions. 1) Unlike previous methods that annotate images within only one semantic space, our multi-semantic annotation associates each image with labels from multiple semantic spaces. 2) We develop a multi-task linear discriminative model that learns a linear mapping from features to labels. The tasks are correlated by imposing exclusive group lasso regularization for competitive feature selection, and graph Laplacian regularization to deal with insufficient training samples. 3) We present a Nesterov-type smoothing approximation algorithm for efficient optimization of the model. Extensive experiments on the NUS-WIDE-Emotive dataset (56k images) with 8×81 emotive cognitive concepts and on the Object&Scene datasets from NUS-WIDE validate the effectiveness of the proposed approach.
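
The abstract names the ingredients of the model but does not state the objective, so the following is only a rough sketch of how a multi-task linear model with an exclusive group lasso penalty and a graph Laplacian penalty could be scored. Everything specific here is an assumption made for illustration: the squared loss, the grouping of tasks (columns of `W`) into `groups`, the use of a task-similarity graph for the Laplacian, and the names `lam` and `gamma`. It is not taken from the paper and should not be read as the authors' formulation.

```python
import numpy as np


def exclusive_group_lasso(W, groups):
    """Assumed penalty: for each feature (row of W) and each task group,
    take the l1 norm over the tasks in the group, square it, and sum.
    The l1 norm inside a group makes tasks compete for a feature."""
    penalty = 0.0
    for g in groups:
        l1_per_feature = np.sum(np.abs(W[:, g]), axis=1)
        penalty += np.sum(l1_per_feature ** 2)
    return penalty


def graph_laplacian(S):
    """Unnormalized Laplacian L = D - S of a symmetric similarity matrix S."""
    return np.diag(S.sum(axis=1)) - S


def objective(W, X, Y, groups, L, lam, gamma):
    """Squared loss + lam * exclusive group lasso + gamma * tr(W L W^T).

    X: (n, d) features, Y: (n, T) labels, W: (d, T) linear mapping,
    L: (T, T) Laplacian over tasks (an assumption of this sketch).
    """
    loss = 0.5 * np.sum((X @ W - Y) ** 2)
    smoothness = np.trace(W @ L @ W.T)
    return loss + lam * exclusive_group_lasso(W, groups) + gamma * smoothness
```

With a symmetric `S`, the term `tr(W L W^T)` equals half the sum over task pairs of `S[s, t] * ||W[:, s] - W[:, t]||^2`, so tasks that the graph marks as similar are pulled toward similar weight vectors. The group penalty is non-smooth, which is the kind of term a smoothing approximation such as the Nesterov-type scheme mentioned above is typically applied to.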

      Published In

      MM '11: Proceedings of the 19th ACM international conference on Multimedia
      November 2011
      944 pages
      ISBN:9781450306164
      DOI:10.1145/2072298

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. exclusive group lasso
      2. multi-semantic image annotation

      Qualifiers

      • Research-article

      Conference

      MM '11
      MM '11: ACM Multimedia Conference
      November 28 - December 1, 2011
      Scottsdale, Arizona, USA

      Acceptance Rates

      Overall Acceptance Rate 995 of 4,171 submissions, 24%

      Cited By

      • MeFiNet: Modeling multi-semantic convolution-based feature interactions for CTR prediction. Intelligent Data Analysis, 28(1):261--278, 2024. DOI: 10.3233/IDA-227113. Online publication date: 3-Feb-2024.
      • A review on visual content-based and users’ tags-based image annotation: methods and techniques. Multimedia Tools and Applications, 2020. DOI: 10.1007/s11042-020-08862-1. Online publication date: 9-May-2020.
      • Visual object tracking via coefficients constrained exclusive group LASSO. Machine Vision and Applications, 29(5):749--763, 2018. DOI: 10.1007/s00138-018-0930-2. Online publication date: 1-Jul-2018.
      • Computational Methods for Integrating Vision and Language. Synthesis Lectures on Computer Vision, 6(1):1--227, 2016. DOI: 10.2200/S00705ED1V01Y201602COV007. Online publication date: 20-Apr-2016.
      • Graph feature selection for dementia diagnosis. Neurocomputing, 195(C):19--22, 2016. DOI: 10.1016/j.neucom.2015.09.126. Online publication date: 26-Jun-2016.
      • Learning multi-task local metrics for image annotation. Multimedia Tools and Applications, 75(4):2203--2231, 2016. DOI: 10.1007/s11042-014-2402-7. Online publication date: 1-Feb-2016.
      • Multiple ocular diseases detection based on joint sparse multi-task learning. In 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 5260--5263, 2015. DOI: 10.1109/EMBC.2015.7319578. Online publication date: Aug-2015.
      • Semantic-Guided Feature Selection for Industrial Automation Systems. In The Semantic Web - ISWC 2015, pages 225--240, 2015. DOI: 10.1007/978-3-319-25010-6_13. Online publication date: 24-Oct-2015.
      • Discriminative Feature Selection for Multiple Ocular Diseases Classification by Sparse Induced Graph Regularized Group Lasso. In Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2015, pages 11--19, 2015. DOI: 10.1007/978-3-319-24571-3_2. Online publication date: 20-Nov-2015.
      • EMERGSEM. In Proceedings of the 2014 Tenth International Conference on Signal-Image Technology and Internet-Based Systems, pages 256--263, 2014. DOI: 10.1109/SITIS.2014.117. Online publication date: 23-Nov-2014.
