DOI: 10.1145/3664647.3681684

Information Fusion with Knowledge Distillation for Fine-grained Remote Sensing Object Detection

Published: 28 October 2024

Abstract

Fine-grained remote sensing object detection aims to locate and identify specific targets of variable scale and orientation against complex backgrounds in high-resolution, wide-swath images, a task that requires high precision and real-time processing at the same time. Although traditional knowledge distillation has proven effective for model compression and accuracy preservation on natural images, the heavy background noise and intra-class similarity of remote sensing images limit both the knowledge quality of the teacher model and the learning ability of the student model. To address these issues, we propose the Information Fusion with Knowledge Distillation (IFKD) method, which enhances student model performance by integrating information from external images, the frequency domain, and hyperbolic space. IFKD comprises three key modules: 1) External Disturbance Enhancement (EDE), which uses MobileSAM to enrich the teacher's knowledge and reduce the student's dependency on the teacher; 2) Frequency Domain Reconstruction (FDR), which amplifies key feature representations and reduces background noise interference by resampling low-frequency information; and 3) Hyperbolic Similarity Mask (HSM), which increases intra-class differences, guides the student in analyzing and utilizing the teacher's knowledge, and leverages the exponential capabilities of hyperbolic space to improve performance. Experimental results verify that IFKD significantly outperforms existing distillation techniques on fine-grained recognition tasks. Specifically, our method achieves 65.8% and 81.4% AP_50 on the optical ShipRSImageNet and SAR Aircraft-1.0 datasets, respectively, which is even 0.4% and 4.7% higher than the teacher.
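
The abstract names two generic ingredients that can be illustrated in isolation: extracting low-frequency information from a feature map, and measuring similarity in hyperbolic space. The following is a minimal sketch of both, not the IFKD implementation. The function names, the circular low-pass mask, the `radius` parameter, and the choice of the Poincaré ball model are all assumptions made for illustration; the abstract does not specify how FDR resamples frequencies or which hyperbolic model HSM uses.

```python
import numpy as np


def low_frequency_component(feature: np.ndarray, radius: int) -> np.ndarray:
    """Keep only the spatial frequencies within `radius` of the DC term.

    Hypothetical helper: a plain 2-D low-pass filter, used here only to
    show what "low-frequency information" of a feature map means.
    """
    h, w = feature.shape
    # Shift the spectrum so the zero-frequency (DC) term sits at (h//2, w//2).
    spectrum = np.fft.fftshift(np.fft.fft2(feature))
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    # Undo the shift and return to the spatial domain.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real


def poincare_distance(u: np.ndarray, v: np.ndarray, eps: float = 1e-9) -> float:
    """Geodesic distance between two points inside the unit Poincaré ball.

    Assumption: the Poincaré ball is one common model of hyperbolic space;
    the abstract does not say which model the HSM module relies on.
    """
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / max(denom, eps)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.normal(size=(16, 16))
    smooth = low_frequency_component(feature_map, radius=3)
    print("feature map shape:", smooth.shape)

    origin = np.zeros(2)
    # Equal Euclidean steps cost more hyperbolic distance near the boundary.
    for r in (0.5, 0.9, 0.99):
        print(r, poincare_distance(origin, np.array([r, 0.0])))
```

In this sketch, distances from the origin grow without bound as a point approaches the edge of the unit ball, which is one way to read the "exponential capabilities" the abstract attributes to hyperbolic space: embeddings that are close in Euclidean terms can still be well separated.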

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fine-grained object detection
    2. information fusion
    3. knowledge distillation
    4. remote sensing images

    Qualifiers

    • Research-article

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
