DOI: 10.1145/3269206.3269296
short-paper

Joint Dictionary Learning and Semantic Constrained Latent Subspace Projection for Cross-Modal Retrieval

Published: 17 October 2018

Abstract

With the growth of multi-modal data on the internet, cross-modal retrieval has received considerable attention in recent years. It aims to use one type of data as the query and retrieve results of another type. For data from different modalities, two main challenges are reducing their heterogeneity and preserving their local relationships. In this paper, we present a novel joint dictionary learning and semantic constrained latent subspace learning method for cross-modal retrieval (JDSLC) to address these two issues. In this unified framework, samples from different modalities are encoded by their corresponding dictionaries to reduce the semantic gap. At the same time, we learn modality-specific projection matrices that map the sparse coefficients into a shared latent subspace. We also impose a novel cross-modal similarity constraint so that the representations of samples that belong to the same class but come from different modalities are as close as possible in the latent subspace. An efficient algorithm is proposed to jointly optimize the model and learn the optimal dictionary, coefficients, and projection matrix for each modality. Extensive experimental results on multiple benchmark datasets show that our method outperforms state-of-the-art approaches.
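
The retrieval pipeline described in the abstract (encode each sample with its modality's dictionary, project the sparse coefficients into a shared subspace, then rank by similarity) can be illustrated with a small sketch. This is not the JDSLC algorithm: the dictionaries and projection matrices below are hand-set toy values rather than jointly learned ones, the sparse coder is plain ISTA used as a stand-in, and every name (`sparse_code`, `embed`, `retrieve`, `D_img`, `P_txt`, and so on) is hypothetical.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def soft_threshold(v, t):
    """Shrink each entry toward zero by t (proximal step for the L1 penalty)."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def sparse_code(D, x, lam=0.05, step=0.1, iters=200):
    """ISTA sketch for min_a 0.5*||x - D a||^2 + lam*||a||_1.
    Assumes step <= 1 / (largest eigenvalue of D^T D)."""
    Dt = [list(col) for col in zip(*D)]
    a = [0.0] * len(D[0])
    for _ in range(iters):
        residual = [r - xi for r, xi in zip(matvec(D, a), x)]
        grad = matvec(Dt, residual)
        a = soft_threshold([ai - step * g for ai, g in zip(a, grad)], step * lam)
    return a

def embed(D, P, x):
    """Encode x with dictionary D, then project the code with matrix P."""
    return matvec(P, sparse_code(D, x))

def cosine(u, v):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def retrieve(query, gallery):
    """Return gallery indices sorted from most to least similar to query."""
    scores = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -scores[i])

# Toy, hand-set parameters (assumptions for illustration, not learned values).
D_img = [[1.0, 0.0], [0.0, 1.0]]              # 2-d image features, 2 atoms
P_img = [[1.0, 0.0], [0.0, 1.0]]              # image codes -> latent space
D_txt = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]  # 3-d text features, 2 atoms
P_txt = [[0.0, 1.0], [1.0, 0.0]]              # text codes -> latent space

texts = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]    # one text sample per class
gallery = [embed(D_txt, P_txt, t) for t in texts]
query = embed(D_img, P_img, [1.0, 0.0])       # image query of the first class
ranking = retrieve(query, gallery)
print(ranking)
```

In this toy setup the text atoms are stored in a different order from the image atoms, so `P_txt` is a permutation that realigns them in the latent space. In the method the abstract describes, that alignment would instead come from the learned projections and the cross-modal similarity constraint.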

      Published In

      CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
      October 2018
      2362 pages
      ISBN: 9781450360142
      DOI: 10.1145/3269206
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. cross-modal retrieval
      2. dictionary learning
      3. semantic constraints

      Qualifiers

      • Short-paper

      Conference

      CIKM '18
      Acceptance Rates

      CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;
      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

      Article Metrics

      • Downloads (Last 12 months): 3
      • Downloads (Last 6 weeks): 0
      Reflects downloads up to 03 Feb 2025

      Cited By

      • (2021) Learning Feature Representation and Partial Correlation for Multimodal Multi-Label Data. IEEE Transactions on Multimedia 23 (1882-1894). DOI: 10.1109/TMM.2020.3004963. Online publication date: 2021
      • (2021) Fine-Grained Image-Text Retrieval via Discriminative Latent Space Learning. IEEE Signal Processing Letters 28 (643-647). DOI: 10.1109/LSP.2021.3065595. Online publication date: 2021
      • (2020) Unsupervised Deep Imputed Hashing for Partial Cross-modal Retrieval. 2020 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN48605.2020.9206611. Online publication date: Jul-2020
      • (2020) Semantic Consistency Cross-Modal Retrieval With Semi-Supervised Graph Regularization. IEEE Access 8 (14278-14288). DOI: 10.1109/ACCESS.2020.2966220. Online publication date: 2020
      • (2019) Interpretable Multiple-Kernel Prototype Learning for Discriminative Representation and Feature Selection. Proceedings of the 28th ACM International Conference on Information and Knowledge Management (1863-1872). DOI: 10.1145/3357384.3357865. Online publication date: 3-Nov-2019
