research-article

Efficient Query-based Black-box Attack against Cross-modal Hashing Retrieval

Published: 07 February 2023

Abstract

Deep cross-modal hashing retrieval models inherit the vulnerability of deep neural networks to adversarial attacks, particularly attacks that add subtle perturbations to the inputs. Although many adversarial attack methods have been proposed to probe the robustness of hashing retrieval models, they still suffer from two problems: (1) most of them assume a white-box setting, which is usually unrealistic in practical applications; (2) they generate adversarial examples by iterative optimization, which is computationally expensive. To address these problems, we propose an Efficient Query-based Black-Box Attack (EQB2A) against deep cross-modal hashing retrieval, which efficiently generates adversarial examples for the black-box attack. Specifically, EQB2A sends a few query requests to the attacked retrieval system and performs cross-modal retrieval model stealing based on the neighbor relationship between the query and the retrieved results, thus obtaining knockoffs that substitute for the attacked system. A multi-modal knockoffs-driven adversarial generation is then proposed to achieve efficient adversarial example generation. Once the entire network has been trained to convergence, EQB2A can efficiently generate adversarial examples by forward propagation, given only benign images. Experiments show that EQB2A achieves superior attacking performance under the black-box setting.
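
As a rough illustration of the query step described in the abstract, the sketch below shows how ranked results from a black-box hashing retrieval system could be turned into neighbor-relationship supervision for training a knockoff. It is not the authors' implementation: the `ToyBlackBox` class, the `collect_triplets` helper, the item ids, and the choice of (query, retrieved, non-retrieved) triplets are all assumptions made for this example, and the abstract does not state which loss or sampling scheme EQB2A actually uses.

```python
import random


def hamming(a, b):
    """Count the positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


class ToyBlackBox:
    """Hypothetical stand-in for the attacked system.

    The hash codes are hidden; a caller only sees ranked item ids.
    """

    def __init__(self, codes):
        self._codes = codes  # item id -> bit tuple, not visible to the attacker

    def query(self, item_id, k):
        """Return the k ids closest to item_id in Hamming distance."""
        q = self._codes[item_id]
        others = (i for i in self._codes if i != item_id)
        ranked = sorted(others, key=lambda i: (hamming(q, self._codes[i]), i))
        return ranked[:k]


def collect_triplets(system, query_ids, all_ids, k, rng):
    """Build (query, retrieved, non-retrieved) triplets from ranked results.

    Assumption: retrieved items are treated as neighbors of the query and
    everything outside the top-k as non-neighbors.
    """
    triplets = []
    for q in query_ids:
        top = system.query(q, k)
        top_set = set(top)
        rest = [i for i in all_ids if i != q and i not in top_set]
        for pos in top:
            triplets.append((q, pos, rng.choice(rest)))
    return triplets


# Toy data: one image query and four text items with 4-bit hidden codes.
hidden_codes = {
    "img0": (0, 0, 0, 0),
    "txt0": (0, 0, 0, 1),
    "txt1": (0, 0, 1, 1),
    "txt2": (1, 1, 1, 1),
    "txt3": (1, 1, 1, 0),
}
system = ToyBlackBox(hidden_codes)
triplets = collect_triplets(
    system, ["img0"], list(hidden_codes), k=2, rng=random.Random(0)
)
print(triplets)
```

Under these assumptions, each triplet says only that the second item ranked closer to the query than the third, which is the kind of relative information a substitute model could be trained to reproduce without access to the attacked model's parameters or hash codes.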

    Published In

    ACM Transactions on Information Systems, Volume 41, Issue 3
    July 2023
    890 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/3582880

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 February 2023
    Online AM: 03 September 2022
    Accepted: 20 August 2022
    Revised: 05 August 2022
    Received: 27 May 2022
    Published in TOIS Volume 41, Issue 3

    Author Tags

    1. Adversarial attack
    2. cross-modal hashing retrieval
    3. black-box attack
    4. adversarial generation

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • National Natural Science Foundation of China
    • Natural Science Foundation of Shandong, China
    • Youth Innovation Project of Shandong Universities, China
