Open access

HyP² Loss: Beyond Hypersphere Metric Space for Multi-label Image Retrieval

Authors:

Jue Wang

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

Pages 3173 - 3184

https://doi.org/10.1145/3503161.3548032

Published: 10 October 2022

Abstract

Image retrieval has become an increasingly appealing technique with broad multimedia application prospects, where deep hashing serves as the dominant branch towards low storage and efficient retrieval. In this paper, we carry out in-depth investigations of metric learning in deep hashing to establish a powerful metric space in multi-label scenarios, where the pair loss suffers from high computational overhead and convergence difficulty, while the proxy loss is theoretically incapable of expressing profound label dependencies and exhibits conflicts in the constructed hypersphere space. To address these problems, we propose a novel metric learning framework with Hybrid Proxy-Pair Loss (HyP$^2$ Loss) that constructs an expressive metric space with efficient training complexity w.r.t. the whole dataset. The proposed HyP$^2$ Loss optimizes the hypersphere space through learnable proxies and excavates data-to-data correlations of irrelevant pairs, thereby integrating the sufficient data correspondence of pair-based methods with the high efficiency of proxy-based methods. Extensive experiments on four standard multi-label benchmarks demonstrate that the proposed method outperforms the state of the art, is robust across different hash bit lengths, and achieves significant performance gains with faster, more stable convergence. Our code is available at https://github.com/JerryXu0129/HyP2-Loss.

Supplementary Material

MP4 File (MM22-fp1188.mp4)

Compared with common single-label image retrieval, multi-label image retrieval is more challenging: the image features are more complex, and the demands on the image embedding in the metric space are higher. Pair-based methods are the most commonly used in image retrieval tasks. However, such approaches suffer from high computational cost and convergence difficulty, problems that become more serious and unavoidable in multi-label scenarios. Proxy-based methods were proposed to improve model robustness with efficient training complexity in single-label scenarios, but they also fall short in multi-label tasks. In this paper, we theoretically analyze the primary reasons why proxy-based methods are unsuitable for multi-label retrieval. We then propose the novel HyP$^2$ Loss, which adds a crucial constraint term on irrelevant samples to the proxy loss. This term compensates for the limitation of the hypersphere metric space while preserving the efficient training complexity.
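
The idea of combining a proxy term with a constraint on irrelevant samples can be illustrated with a small sketch. This is not the authors' formulation, which is not given on this page; the function name `hybrid_proxy_pair_loss`, the weight `beta`, the cosine-based penalties, and the definition of an "irrelevant pair" as two samples sharing no label are all assumptions made for illustration. The released code at the repository above is the authoritative reference.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # small epsilon avoids division by zero


def hybrid_proxy_pair_loss(embeddings, labels, proxies, beta=1.0):
    """Illustrative hybrid loss; an assumed form, not the paper's equations.

    embeddings: list of N embedding vectors
    labels:     list of N multi-hot label lists over C classes
    proxies:    list of C proxy vectors (learnable in a real model)
    beta:       hypothetical weight on the irrelevant-pair term
    """
    # Proxy term: pull each sample towards the proxies of its own labels
    # and push it away from the proxies of labels it does not carry.
    proxy_terms = []
    for x, y in zip(embeddings, labels):
        for c, p in enumerate(proxies):
            sim = cosine(x, p)
            proxy_terms.append(1.0 - sim if y[c] else max(0.0, sim))
    proxy_loss = sum(proxy_terms) / len(proxy_terms)

    # Pair term: only pairs that share no label ("irrelevant" pairs, by
    # assumption) are penalized for having positive similarity.
    pair_terms = []
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            shares_label = any(a and b for a, b in zip(labels[i], labels[j]))
            if not shares_label:
                pair_terms.append(max(0.0, cosine(embeddings[i], embeddings[j])))
    pair_loss = sum(pair_terms) / len(pair_terms) if pair_terms else 0.0

    return proxy_loss + beta * pair_loss
```

In this sketch the pair term only visits sample pairs within the batch passed in, and pairs that share a label contribute nothing, so relevant samples are organized by the proxies alone. How the actual method keeps the overall training complexity efficient is described in the paper, not here.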
