research-article

Supervised Robust Discrete Multimodal Hashing for Cross-Media Retrieval

Authors:

Xiao-Lin Wang

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

Pages 1271 - 1280

https://doi.org/10.1145/2983323.2983743

Published: 24 October 2016

Abstract

Multimodal hashing techniques have recently received considerable attention because of their low storage cost and fast query speed in multimodal data retrieval. Many methods have been proposed, but several problems remain. Some methods learn hash functions from a similarity matrix alone, which discards useful information contained in the original data. Some relax the binary constraints, or split the learning of hash functions and binary codes into two independent stages, to sidestep the discrete constraints on the binary codes during optimization; this may introduce large quantization error. Some are not robust to noise. Any of these problems may degrade a model's performance. To address them, we propose a novel supervised hashing framework for cross-modal retrieval, Supervised Robust Discrete Multimodal Hashing (SRDMH). Specifically, SRDMH tries to make the final binary codes preserve the same label information as the original data, so that it can leverage more label information to supervise binary code learning. In addition, it learns the hash functions and binary codes directly instead of relaxing the binary constraints, which avoids large quantization error. Moreover, to make the model robust and easy to solve, we integrate a flexible l_{2,p} loss with nonlinear kernel embedding and an intermediate representation of each instance. Finally, an alternating algorithm is proposed to solve the SRDMH optimization problem. Extensive experiments are conducted on three benchmark data sets. The results demonstrate that SRDMH outperforms or is comparable to several state-of-the-art methods on the cross-modal retrieval task.
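
Two ideas named in the abstract can be illustrated with a small sketch: an l_{2,p} loss (the sum over rows of each row's Euclidean norm raised to the power p) and retrieval by Hamming distance over binary codes. This is not the SRDMH objective or algorithm, which the abstract does not spell out. The function names, the toy residuals, and the toy codes below are all assumptions made for illustration, and p is assumed to lie in (0, 2].

```python
import math


def l2p_loss(residuals, p):
    """Sum over rows of (Euclidean norm of the row) ** p.

    Generic l_{2,p}-style loss, assumed here for illustration only;
    with p = 2 it reduces to the ordinary squared (Frobenius) loss.
    """
    if not 0 < p <= 2:
        raise ValueError("p is assumed to lie in (0, 2]")
    return sum(math.sqrt(sum(v * v for v in row)) ** p for row in residuals)


def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(query_code, database_codes):
    """Database indices sorted by increasing Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Hypothetical residual rows: two small errors and one outlier row.
residuals = [[0.1, 0.0], [0.0, 0.2], [3.0, 4.0]]
squared_loss = l2p_loss(residuals, 2)  # outlier row contributes 25 of about 25.05
robust_loss = l2p_loss(residuals, 1)   # outlier row contributes 5 of about 5.3

# Hypothetical 4-bit codes in {-1, +1}: a query and three database items.
query = [1, -1, 1, 1]
database = [[1, -1, 1, 1], [-1, 1, -1, -1], [1, -1, -1, 1]]
ranking = rank_by_hamming(query, database)
```

In this toy case the outlier row accounts for nearly all of the loss at p = 2 and for a smaller share at p = 1, which is the usual motivation for choosing p below 2 when noise is a concern. The ranking step shows why binary codes make queries cheap: comparing codes needs only a count of differing bits.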

Index Terms

Supervised Robust Discrete Multimodal Hashing for Cross-Media Retrieval
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Visual content-based indexing and retrieval
  2. Machine learning
    1. Learning paradigms
      1. Supervised learning

Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

October 2016

2566 pages

ISBN:9781450340731

DOI:10.1145/2983323

General Chairs:
Snehasis Mukhopadhyay (Indiana University Purdue University Indianapolis, USA)
ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA)

Program Chairs:
Elisa Bertino (Purdue University)
Fabio Crestani (University of Lugano)
Javed Mostafa (University of North Carolina)
Jie Tang (Tsinghua University)
Luo Si (Alibaba Group Inc & Purdue University)
Xiaofang Zhou (University of Queensland)
Yi Chang (Yahoo Research)
Yunyao Li (IBM Research - Almaden)
Parikshit Sondhi (WalmartLabs)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CIKM'16

CIKM'16: ACM Conference on Information and Knowledge Management

October 24 - 28, 2016

Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
