
Graph Convolutional Multi-modal Hashing for Flexible Multimedia Retrieval

Authors:

Xu Lu,

Lei Zhu,

Li Liu,

Liqiang Nie,

Huaxiang Zhang

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

Pages 1414 - 1422

https://doi.org/10.1145/3474085.3475598

Published: 17 October 2021


Abstract

Multi-modal hashing plays an important role in multimedia retrieval, where a key challenge is to encode heterogeneous modalities into compact hash codes. To address this challenge, graph-based multi-modal hashing methods generally define an individual affinity matrix for each modality and apply a linear algorithm for heterogeneous modality fusion and compact hash learning. Several other methods construct a graph Laplacian matrix from semantic information to help learn discriminative hash codes. However, these conventional methods largely ignore the structural similarity of the training set and the complex relations among multi-modal samples, which leads to unsatisfactory complementarity in the fused hash codes. More notably, they face two other important problems: the large computation and storage costs of graph construction, and the loss of partial modality features when an incomplete query sample arrives. In this paper, we propose a Flexible Graph Convolutional Multi-modal Hashing (FGCMH) method that adopts GCNs with linear complexity to preserve both the modality-individual and modality-fused structural similarity for discriminative hash learning. As a result, accurate multimedia retrieval can be performed on both complete and incomplete datasets with our method. Specifically, multiple modality-individual GCNs under semantic guidance act on each modality independently to preserve intra-modality similarity, and their output representations are then fused into a fusion graph with an adaptive weighting scheme. A hash GCN and a semantic GCN, which share parameters in the first two layers, propagate the fused information and generate hash codes under high-level label space supervision. In the query stage, our method adaptively captures various multi-modal contents in a flexible and robust way, even if partial modality features are lost. Experimental results on three public datasets show the flexibility and effectiveness of the proposed method.
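To make the pipeline described above more concrete, the following is a minimal sketch of its three generic ingredients: a per-modality graph convolution, a weighted fusion that tolerates a missing modality, and sign binarization into hash codes. It is not the FGCMH implementation. The function names, the fixed fusion weights (the abstract describes them as adaptive), the renormalization over available modalities, the random untrained weight matrices, and the toy dimensions are all assumptions made for illustration. The propagation rule is the standard symmetric-normalized GCN layer, which may differ from the one used in the paper.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops, so every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, h, w):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ h @ w, 0.0)


def fuse(representations, weights):
    """Weighted sum over the available modalities.

    A representation of None marks a missing modality. The weights are
    renormalized over the modalities that are present (an assumption of
    this sketch, not a detail taken from the paper).
    """
    available = [(r, w) for r, w in zip(representations, weights) if r is not None]
    total = sum(w for _, w in available)
    return sum((w / total) * r for r, w in available)


def hash_codes(h):
    """Binarize real-valued outputs into codes in {-1, +1}."""
    return np.where(h >= 0, 1, -1)


# Toy data: 6 samples, two modalities, 16-bit codes (all values hypothetical).
rng = np.random.default_rng(0)
n, d_img, d_txt, d_hid, n_bits = 6, 8, 5, 4, 16

adj = (rng.random((n, n)) > 0.5).astype(float)
adj = np.maximum(adj, adj.T)             # make the graph undirected
np.fill_diagonal(adj, 0.0)
a_norm = normalize_adjacency(adj)

# Modality-individual GCNs (random, untrained weights).
img = gcn_layer(a_norm, rng.normal(size=(n, d_img)), rng.normal(size=(d_img, d_hid)))
txt = gcn_layer(a_norm, rng.normal(size=(n, d_txt)), rng.normal(size=(d_txt, d_hid)))

# Fusion followed by a final linear graph convolution and binarization.
w_hash = rng.normal(size=(d_hid, n_bits))
codes_full = hash_codes(a_norm @ fuse([img, txt], [0.6, 0.4]) @ w_hash)

# Incomplete query: the text modality is missing, so only the image branch is used.
codes_img_only = hash_codes(a_norm @ fuse([img, None], [0.6, 0.4]) @ w_hash)
```

In this sketch a missing modality simply drops out of the weighted sum, which is one simple way to obtain codes of the same length for complete and incomplete inputs. A trained model would learn the layer weights and fusion weights under label supervision rather than sampling them at random.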

