DOI: 10.1145/3581783.3612009
research-article

Pedestrian-specific Bipartite-aware Similarity Learning for Text-based Person Retrieval

Published: 27 October 2023 Publication History

Abstract

Text-based person retrieval is a challenging task that aims to search for pedestrian images of the same identity given language descriptions. Current methods usually measure the similarity between text and image indiscriminately, by matching global visual-textual features and matched local region-word features. However, these methods underestimate the role of mismatched region-word pairs as key cues and ignore the problem of low similarity between matched region-word pairs. To alleviate these issues, we propose a novel Pedestrian-specific Bipartite-aware Similarity Learning (PBSL) framework that efficiently reveals plausible and credible levels of contribution of pedestrian-specific mismatched and matched region-word pairs to the overall similarity. Specifically, to focus on mismatched region-word pairs, we first develop a new co-interactive attention that uses cross-modal information to guide the extraction of pedestrian-specific information in a single modality. We then design a negative similarity regularization mechanism that uses the negative similarity score as a bias to correct the overall similarity. Additionally, to enhance the contribution of matched region-word pairs, we introduce graph networks to aggregate and propagate pedestrian-specific local information, using the overall visual-textual similarity to evaluate locally matched region-word pairs for weight refinement. Finally, extensive experiments on the CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets demonstrate the competitive performance of the proposed PBSL in the text-based person retrieval task.
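
To make the idea of a negative similarity acting as a bias more concrete, the following is a minimal toy sketch. It is not the PBSL formulation: the abstract gives no equations, so the cosine scoring, the best-region rule per word, the zero threshold separating matched from mismatched pairs, and every name (`cosine`, `overall_similarity`, `neg_weight`) are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def overall_similarity(global_img, global_txt, regions, words, neg_weight=1.0):
    """Toy score: global similarity + matched local evidence + negative bias.

    Hypothetical aggregation rule for illustration only; it is an assumption,
    not the method described in the paper.
    """
    s_global = cosine(global_img, global_txt)
    # For each word, keep the similarity to its best-matching image region.
    best = [max(cosine(w, r) for r in regions) for w in words]
    matched = [s for s in best if s > 0]
    mismatched = [s for s in best if s <= 0]
    s_pos = sum(matched) / len(matched) if matched else 0.0
    # Mismatched words contribute a non-positive bias that lowers the score.
    s_neg = sum(mismatched) / len(mismatched) if mismatched else 0.0
    return s_global + s_pos + neg_weight * s_neg


# Two toy image regions and two descriptions with identical global features.
regions = [[1.0, 0.0], [0.0, 1.0]]
words_consistent = [[1.0, 0.0], [0.0, 1.0]]       # every word has a supporting region
words_contradicting = [[1.0, 0.0], [-1.0, -1.0]]  # second word opposes all regions

score_consistent = overall_similarity([1.0, 1.0], [1.0, 1.0], regions, words_consistent)
score_contradicting = overall_similarity([1.0, 1.0], [1.0, 1.0], regions, words_contradicting)
```

In this toy setting both descriptions have the same global similarity, so only the mismatched word separates them: its negative score pulls `score_contradicting` below `score_consistent`. This is the kind of cue that matching on global and matched local features alone would miss.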

    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cross-modal
    2. person retrieval
    3. similarity learning

    Qualifiers

    • Research-article

    Funding Sources

    • Natural Science Foundation of Jiangsu Province
    • 2021 Jiangsu Shuangchuang (Mass Innovation and Entrepreneurship) Talent Program
    • National Natural Science Foundation of China

    Conference

    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 316
    • Downloads (last 6 weeks): 40
    Reflects downloads up to 31 Jan 2025

    Cited By

    • (2025) Explicitly diverse visual question generation. Neural Networks 184 (107002). DOI: 10.1016/j.neunet.2024.107002. Online publication date: Apr-2025
    • (2025) Flare-aware cross-modal enhancement network for multi-spectral vehicle Re-identification. Information Fusion 116 (102800). DOI: 10.1016/j.inffus.2024.102800. Online publication date: Apr-2025
    • (2024) Cross-modal generation and alignment via attribute-guided prompt for unsupervised text-based person retrieval. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (1047-1055). DOI: 10.24963/ijcai.2024/116. Online publication date: 3-Aug-2024
    • (2024) Triplet Contrastive Representation Learning for Unsupervised Vehicle Re-identification. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3695255. Online publication date: 6-Sep-2024
    • (2024) Causal-driven Large Language Models with Faithful Reasoning for Knowledge Question Answering. Proceedings of the 32nd ACM International Conference on Multimedia (4331-4340). DOI: 10.1145/3664647.3681263. Online publication date: 28-Oct-2024
    • (2024) Prototypical Prompting for Text-to-image Person Re-identification. Proceedings of the 32nd ACM International Conference on Multimedia (2331-2340). DOI: 10.1145/3664647.3681165. Online publication date: 28-Oct-2024
    • (2024) CB-YOLO: Dense Object Detection of YOLO for Crowded Wheat Head Identification and Localization. Journal of Circuits, Systems and Computers. DOI: 10.1142/S0218126625500793. Online publication date: 27-Nov-2024
    • (2024) CITNet: Convolution Interaction Transformer Network for Hyperspectral and LiDAR Image Classification. IEEE Transactions on Geoscience and Remote Sensing 62 (1-18). DOI: 10.1109/TGRS.2024.3477965. Online publication date: 2024
    • (2024) Enhancing Aerial Object Detection With Selective Frequency Interaction Network. IEEE Transactions on Artificial Intelligence 5:12 (6109-6120). DOI: 10.1109/TAI.2024.3381096. Online publication date: Dec-2024
    • (2024) Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 17 (8189-8202). DOI: 10.1109/JSTARS.2024.3384545. Online publication date: 2024
