Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

SWRM: Similarity Window Reweighting and Margin for Long-Tailed Recognition

Published: 08 March 2024 Publication History

Abstract

Real-world data usually obeys a long-tailed distribution, where a few classes have higher number of samples compared to the other classes. Recent studies have been proposed to alleviate the extreme data imbalance from different perspectives. In this article, we experimentally find that due to the easily confusing visual features between some head- and tail classes, the cross-entropy model is prone to misclassify tail samples to similar head classes. Therefore, to alleviate the influence of the confusion on model performance and improve the classification of tail classes, we propose a Similarity Window Reweighting and Margin (SWRM) algorithm, where the SWRM consists of Similarity Window Reweighting (SWR) and Similarity Window Margin (SWM) algorithms. For the confusable head- and tail classes, SWR assigns larger weights to tail classes and smaller weights to head classes. Therefore, the model can enlarge the importance of tail classes and effectively improve their classification. Moreover, SWR considers the difference in label frequency and the impact of category similarity simultaneously, so that the weight coefficients are more reasonable and efficacious. SWM generates adaptive margins that are proportional to the ratio of the classifier’s weight norm, thus promoting the learning of tail classifier with small weight norm. Our SWRM effectively eliminates the confusion between head- and tail classes and alleviates the misclassification issues. Extensive experiments on three long-tailed datasets, i.e., CIFAR100-LT, ImageNet-LT, and Places-LT, verify our proposed method’s effectiveness and superiority over comparative methods.

References

[1]
Shaden Alshammari, Yu-Xiong Wang, Deva Ramanan, and Shu Kong. 2022. Long-tailed recognition via weight balancing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6897–6907.
[2]
Jiarui Cai, Yizhou Wang, and Jenq-Neng Hwang. 2021. Ace: Ally complementary experts for solving long-tailed recognition in one-shot. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 112–121.
[3]
Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, and Tengyu Ma. 2019. Learning imbalanced datasets with label-distribution-aware margin loss. Advances in Neural Information Processing Systems 32 (2019), 1565–1576.
[4]
Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16 (2002), 321–357.
[5]
Qiong Chen, Tianlin Huang, Geren Zhu, and Enlu Lin. 2023. A dual-branch model with inter-and intra-branch contrastive loss for long-tailed recognition. Neural Networks 168 (2023), 214–222.
[6]
Zhineng Chen, Shanshan Ai, and Caiyan Jia. 2019. Structure-aware deep learning for product image classification. ACM Transactions on Multimedia Computing, Communications, and Applications 15, 1s (2019), 1–20.
[7]
Hsin-Ping Chou, Shih-Chieh Chang, Jia-Yu Pan, Wei Wei, and Da-Cheng Juan. 2020. Remix: Rebalanced mixup. In Proceedings of the Computer Vision–ECCV 2020 Workshops: Glasgow, UK, August 23–28, 2020, Part VI 16. Springer, 95–110.
[8]
Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V. Le. 2019. Autoaugment: Learning augmentation strategies from data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 113–123.
[9]
Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, and Serge Belongie. 2019. Class-balanced loss based on effective number of samples. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9268–9277.
[10]
Terrance DeVries and Graham W. Taylor. 2017. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017).
[11]
Chris Drummond and Robert C. Holte. 2003. C4. 5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In Proceedings of the Workshop on Learning from Imbalanced Datasets II. 1–8.
[12]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770–778.
[13]
Youngkyu Hong, Seungju Han, Kwanghee Choi, Seokjun Seo, Beomsu Kim, and Buru Chang. 2021. Disentangling label distribution for long-tailed visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6626–6636.
[14]
Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, and Sanjiv Kumar. 2022. Elm: Embedding and logit margins for long-tail learning. arXivpreprint arXiv:2204.13208 (2022).
[15]
Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, and Yannis Kalantidis. 2020. Decoupling representation and classifier for long-tailed recognition. In Proceedings of the 8th International Conference on Learning Representations. OpenReview.net.
[16]
Jian Li, Ziyao Meng, Daqian Shi, Rui Song, Xiaolei Diao, Jingwen Wang, and Hao Xu. 2023. FCC: Feature clusters compression for long-tailed visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 24080–24089.
[17]
Jun Li, Zichang Tan, Jun Wan, Zhen Lei, and Guodong Guo. 2022. Nested collaborative learning for long-tailed visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6949–6958.
[18]
Mengke Li, Yiu-Ming Cheung, and Juyong Jiang. 2022. Feature-balanced loss for long-tailed visual recognition. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 1–6.
[19]
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. 2017. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision. 2980–2988.
[20]
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Part V 13. Springer, 740–755.
[21]
Xiangbin Liu, Jiesheng He, Liping Song, Shuai Liu, and Gautam Srivastava. 2021. Medical image classification based on an adaptive size deep learning model. ACM Transactions on Multimedia Computing, Communications, and Applications 17, 3s (2021), 1–18.
[22]
Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, and Stella X. Yu. 2019. Large-scale long-tailed recognition in an open world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2537–2546.
[23]
Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Himanshu Jain, Andreas Veit, and Sanjiv Kumar. 2021. Long-tail learning via logit adjustment. In Proceedings of the 9th International Conference on Learning Representations, 2021.
[24]
Seulki Park, Youngkyu Hong, Byeongho Heo, Sangdoo Yun, and Jin Young Choi. 2022. The majority can help the minority: Context-rich minority oversampling for long-tailed classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6887–6896.
[25]
Seulki Park, Jongin Lim, Younghan Jeon, and Jin Young Choi. 2021. Influence-balanced loss for imbalanced visual classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 735–744.
[26]
Jiawei Ren, Cunjun Yu, Shunan Sheng, Xiao Ma, Haiyu Zhao, Shuai Yi, and Hongsheng Li. 2020. Balanced meta-softmax for long-tailed visual recognition. Advances in Neural Information Processing Systems 33 (2020), 4175–4186.
[27]
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, and Li Feifei. 2015. Imagenet large scale visual recognition challenge. International Journal of Computer Vision 115 (2015), 211–252.
[28]
Jun Shu, Qi Xie, Lixuan Yi, Qian Zhao, Sanping Zhou, Zongben Xu, and Deyu Meng. 2019. Meta-weight-net: Learning an explicit mapping for sample weighting. Advances in Neural Information Processing Systems 32 (2019), 1917–1928.
[29]
Saptarshi Sinha and Hiroki Ohashi. 2023. Difficulty-net: Learning to predict difficulty for long-tailed recognition. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 6444–6453.
[30]
Saptarshi Sinha, Hiroki Ohashi, and Katsuyuki Nakamura. 2022. Class-difficulty based methods for long-tailed visual recognition. International Journal of Computer Vision 130, 10 (2022), 2517–2531.
[31]
Leslie N. Smith. 2022. Cyclical focal loss. arXiv preprint arXiv:2202.08978 (2022).
[32]
Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems 30 (2017), 4077–4087.
[33]
Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, and Junjie Yan. 2020. Equalization loss for long-tailed object recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 11662–11671.
[34]
Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9, 11 (2008).
[35]
Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, and Serge Belongie. 2018. The inaturalist species classification and detection dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8769–8778.
[36]
Jianfeng Wang, Thomas Lukasiewicz, Xiaolin Hu, Jianfei Cai, and Zhenghua Xu. 2021. Rsg: A simple but effective module for learning imbalanced datasets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 3784–3793.
[37]
Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, and Dahua Lin. 2021. Seesaw loss for long-tailed instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9695–9704.
[38]
Peng Wang, Kai Han, Xiu-Shen Wei, Lei Zhang, and Lei Wang. 2021. Contrastive learning based hybrid networks for long-tailed image classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 943–952.
[39]
Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Jinqiao Wang, and Ming Tang. 2021. Adaptive class suppression loss for long-tail object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 3103–3112.
[40]
Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, and Stella X. Yu. 2020. Long-tailed recognition by routing diverse distribution-aware experts. arXiv preprint arXiv:2010.01809 (2020).
[41]
Yiru Wang, Weihao Gan, Jie Yang, Wei Wu, and Junjie Yan. 2019. Dynamic curriculum learning for imbalanced data classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5017–5026.
[42]
Yu-Xiong Wang, Deva Ramanan, and Martial Hebert. 2017. Learning to model the tail. Advances in Neural Information Processing Systems 30 (2017), 7029–7039.
[43]
Yong Yang, Qiong Chen, Yuan Feng, and Tianlin Huang. 2023. MIANet: Aggregating unbiased instance and general information for few-shot semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7131–7140.
[44]
Renhui Zhang, Tiancheng Lin, Rui Zhang, and Yi Xu. 2022. Solving the long-tailed problem via intra-and inter-category balance. In Proceedings of the ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2355–2359.
[45]
Zhisheng Zhong, Jiequan Cui, Shu Liu, and Jiaya Jia. 2021. Improving calibration for long-tailed recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 16489–16498.
[46]
Boyan Zhou, Quan Cui, Xiu-Shen Wei, and Zhao-Min Chen. 2020. Bbn: Bilateral-branch network with cumulative learning for long-tailed visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9719–9728.
[47]
Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. 2017. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 6 (2017), 1452–1464.

Cited By

View all
  • (2024)Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image RetrievalProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657727(229-239)Online publication date: 10-Jul-2024

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Multimedia Computing, Communications, and Applications
ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 20, Issue 6
June 2024
715 pages
EISSN:1551-6865
DOI:10.1145/3613638
  • Editor:
  • Abdulmotaleb El Saddik
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 March 2024
Online AM: 07 February 2024
Accepted: 25 January 2024
Revised: 29 October 2023
Received: 31 July 2023
Published in TOMM Volume 20, Issue 6

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Long-tailed recognition
  2. class re-balancing
  3. reweighting
  4. logit adjustment

Qualifiers

  • Research-article

Funding Sources

  • Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application
  • National Natural Science Foundation of China

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)124
  • Downloads (Last 6 weeks)6
Reflects downloads up to 06 Oct 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image RetrievalProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657727(229-239)Online publication date: 10-Jul-2024

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Full Text

View this article in Full Text.

Full Text

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media