research-article

Efficient Query-based Black-box Attack against Cross-modal Hashing Retrieval

Authors:

Xinhua Wang

ACM Transactions on Information Systems, Volume 41, Issue 3

Article No.: 54, Pages 1 - 25

https://doi.org/10.1145/3559758

Published: 07 February 2023

Abstract

Deep cross-modal hashing retrieval models inherit the vulnerability of deep neural networks to adversarial attacks, especially attacks that take the form of subtle perturbations to the inputs. Although many adversarial attack methods have been proposed to probe the robustness of hashing retrieval models, they still suffer from two problems: (1) most of them assume a white-box setting, which is usually unrealistic in practical applications; and (2) they generate adversarial examples by iterative optimization, which is computationally expensive. To address these problems, we propose an Efficient Query-based Black-Box Attack (EQB²A) against deep cross-modal hashing retrieval, which efficiently generates adversarial examples in the black-box setting. Specifically, EQB²A sends a few query requests to the attacked retrieval system and steals the cross-modal retrieval model based on the neighbor relationship between each query and its retrieved results, obtaining knockoffs that substitute for the attacked system. A multi-modal knockoffs-driven adversarial generation scheme then produces adversarial examples efficiently. Once training of the entire network has converged, EQB²A generates adversarial examples by forward propagation alone, given only benign images. Experiments show that EQB²A achieves superior attack performance under the black-box setting.
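
To make the query-based stealing step more concrete, the following is a minimal sketch of how neighbor relationships might be collected from a black-box retrieval system. It is an illustration under stated assumptions, not the authors' implementation: the `ToyBlackBox` class, its random hidden codes, the `build_triplets` helper, and the choice of (query, neighbor, non-neighbor) triplets as supervision for a knockoff model are all hypothetical. The only thing the simulated attacker observes is the ranked list of returned ids.

```python
import random


def hamming(a, b):
    """Count the positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def _random_code(rng, n_bits):
    """Draw a random binary code (stands in for a learned hash code)."""
    return [rng.choice((0, 1)) for _ in range(n_bits)]


class ToyBlackBox:
    """Hypothetical attacked system: hash codes are hidden, only top-k ids leak."""

    def __init__(self, n_queries=5, n_items=12, n_bits=16, seed=0):
        rng = random.Random(seed)
        # Hidden codes for queries (e.g. images) and database items (e.g. texts).
        self._query_codes = [_random_code(rng, n_bits) for _ in range(n_queries)]
        self._db_codes = [_random_code(rng, n_bits) for _ in range(n_items)]
        self.n_items = n_items

    def retrieve(self, query_id, k):
        """Return the ids of the k database items nearest in Hamming distance."""
        q = self._query_codes[query_id]
        ranked = sorted(
            range(self.n_items),
            key=lambda i: (hamming(q, self._db_codes[i]), i),  # ties broken by id
        )
        return ranked[:k]


def build_triplets(system, query_ids, k):
    """Turn top-k responses into (query, neighbor, non-neighbor) triplets.

    Assumption: retrieved items count as neighbors of the query, and every
    other candidate item counts as a non-neighbor.
    """
    triplets = []
    for q in query_ids:
        neighbors = system.retrieve(q, k)
        neighbor_set = set(neighbors)
        non_neighbors = [i for i in range(system.n_items) if i not in neighbor_set]
        for pos in neighbors:
            for neg in non_neighbors:
                triplets.append((q, pos, neg))
    return triplets


system = ToyBlackBox()
triplets = build_triplets(system, query_ids=range(5), k=3)
print(len(triplets))  # 5 queries x 3 neighbors x 9 non-neighbors = 135
```

In such a setup, triplets like these could serve as training signal for a substitute model that ranks neighbors ahead of non-neighbors, without any access to the attacked system's parameters or hash codes.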

Index Terms

1. Information systems
  1. Information retrieval
    1. Specialized information retrieval
      1. Multimedia and multimodal retrieval

Published In

ACM Transactions on Information Systems Volume 41, Issue 3

July 2023

890 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/3582880

Editor:
Min Zhang
Tsinghua University, China

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 February 2023

Online AM: 03 September 2022

Accepted: 20 August 2022

Revised: 05 August 2022

Received: 27 May 2022

Published in TOIS Volume 41, Issue 3

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
Natural Science Foundation of Shandong, China
Youth Innovation Project of Shandong Universities, China

Article Metrics

Total Citations: 13
Total Downloads: 1,204
Downloads (Last 12 months): 618
Downloads (Last 6 weeks): 36

Reflects downloads up to 18 Aug 2024

Cited By

Han D, Babaei R, Zhao S, Cheng S (2024). Exploring the Efficacy of Learning Techniques in Model Extraction Attacks on Image Classifiers: A Comparative Study. Applied Sciences 14:9 (3785). Online publication date: 29-Apr-2024
https://doi.org/10.3390/app14093785
Han L, Wang R, Chen C, Zhang H, Zhang Y, Zhang W (2024). Deep Self-Supervised Hashing With Fine-Grained Similarity Mining for Cross-Modal Retrieval. IEEE Access 12 (31756-31770). Online publication date: 2024
https://doi.org/10.1109/ACCESS.2024.3371173
Li Z, Yao T, Wang L, Li Y, Wang G (2024). Supervised Contrastive Discrete Hashing for cross-modal retrieval. Knowledge-Based Systems 295:C. Online publication date: 18-Jul-2024
https://dl.acm.org/doi/10.1016/j.knosys.2024.111837
Zheng C, Zhu L, Zhang Z, Duan W, Lu W (2024). LCEMH. Information Sciences: an International Journal 659:C. Online publication date: 12-Apr-2024
https://dl.acm.org/doi/10.1016/j.ins.2023.120064
Zhang Z, Zhang Z (2024). Deep Collaborative Graph Hashing. Binary Representation Learning on Visual Images (143-167). Online publication date: 7-Mar-2024
https://doi.org/10.1007/978-981-97-2112-2_6
Nie X, Shi Y, Meng Z, Huang J, Guan W, Yin Y (2023). Complex Scenario Image Retrieval via Deep Similarity-aware Hashing. ACM Transactions on Multimedia Computing, Communications, and Applications 20:4 (1-24). Online publication date: 11-Dec-2023
https://dl.acm.org/doi/10.1145/3624016
Zhao W, Song J, Yuan S, Gao L, Yang Y, Shen H, El Saddik A, Mei T, Cucchiara R, Bertini M, Tobon Vallejo D, Atrey P, Hossain M (2023). Precise Target-Oriented Attack against Deep Hashing-based Retrieval. Proceedings of the 31st ACM International Conference on Multimedia (6379-6389). Online publication date: 26-Oct-2023
https://dl.acm.org/doi/10.1145/3581783.3612364
Yin H, Sun Y, Xu G, Kanoulas E (2023). Trustworthy Recommendation and Search: Introduction to the Special Issue - Part 1. ACM Transactions on Information Systems 41:3 (1-5). Online publication date: 7-Feb-2023
https://dl.acm.org/doi/10.1145/3579995
Wang Y, Hu W, Hong R (2023). Iterative Adversarial Attack on Image-Guided Story Ending Generation. IEEE Transactions on Multimedia 26 (6117-6130). Online publication date: 20-Dec-2023
https://dl.acm.org/doi/10.1109/TMM.2023.3345167
Cao M, Bai Y, Cao Z, Nie L, Zhang M (2023). Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening. IEEE Transactions on Circuits and Systems for Video Technology 34:6 (5132-5145). Online publication date: 5-Dec-2023
https://dl.acm.org/doi/10.1109/TCSVT.2023.3339489
