DOI: 10.1145/3696409.3700195
Research article

CFRL: Coarse-Fine Decoupled Representation Learning For Long-Tailed Recognition

Published: 28 December 2024

Abstract

Real-world data often exhibits severe class imbalance: the number of instances per class varies greatly, following a long-tailed distribution. In this setting, directly applying supervised learning yields poor performance. Existing long-tailed recognition (LTR) methods typically address the imbalance with image-level, end-to-end resampling strategies that rely heavily on label information, enhancing tail-class accuracy at the expense of head classes. Moreover, they neglect label bias, which can severely degrade an LTR model's accuracy. In this paper, we propose a novel approach, Coarse-Fine Decoupled Representation Learning (CFRL), for LTR. Our core idea is to decouple representation learning from the classifier and to decompose representation learning into two stages: image-level and patch-level. Specifically, in the image-level stage, we leverage unsupervised learning on image-level information to reduce the label bias caused by imbalanced datasets. In the patch-level stage, we introduce patch-level rotation augmentation as negative samples, forcing the model to acquire more comprehensive information. Our theoretical and empirical analyses demonstrate that the approach significantly reduces overfitting on tail classes without sacrificing head-class accuracy, improving both. We achieve state-of-the-art results on the CIFAR, ImageNet, and iNaturalist datasets. Furthermore, we show that this training methodology can be combined with various existing LTR methods to further enhance their performance.
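To make the patch-level stage concrete, the sketch below shows one plausible way to generate a patch-level rotated negative: split an image into non-overlapping patches and rotate each by a random multiple of 90°, so the global layout survives but local structure is scrambled. This is an illustrative assumption about the augmentation, not the paper's exact implementation; the function name, patch size, and NumPy-based image layout are all hypothetical.

```python
import numpy as np

def patch_rotate_negative(img, patch=4, rng=None):
    """Build a patch-level rotated view of `img` (H, W, C).

    Each non-overlapping patch x patch block is rotated by a random
    multiple of 90 degrees (never the identity), producing an image
    whose per-patch pixel content is preserved but whose local
    structure is disrupted -- usable as a negative sample in
    contrastive training.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly into patches"
    out = np.empty_like(img)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            k = rng.integers(1, 4)  # rotate by 90, 180, or 270 degrees
            # Read from the untouched input and write to `out`, so no
            # aliasing occurs between the rotated view and its target.
            out[i:i + patch, j:j + patch] = np.rot90(img[i:i + patch, j:j + patch], k)
    return out
```

Because each patch is only permuted spatially, the multiset of pixel values is unchanged; only their arrangement differs, which is what makes the view a plausible "same statistics, wrong structure" negative.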

Supplemental Material

PDF file: supplementary material of paper 35



Published In

MMAsia '24: Proceedings of the 6th ACM International Conference on Multimedia in Asia
December 2024, 939 pages
ISBN: 9798400712739
DOI: 10.1145/3696409

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. Long Tail Recognition
    2. Representation learning

    Qualifiers

    • Research-article

Conference

MMAsia '24: ACM Multimedia Asia
December 3–6, 2024, Auckland, New Zealand

    Acceptance Rates

    Overall Acceptance Rate 59 of 204 submissions, 29%
