Abstract
Existing unsupervised person re-identification approaches fail to fully capture the fine-grained features of local regions, so people with similar appearances but different identities can be assigned the same label after clustering. In addition, the identity-independent information contained in different local regions introduces different levels of local noise. To address these challenges, this study proposes joint training with local soft attention and dual cross-neighbor label smoothing (DCLS). First, the joint training is divided into global and local parts, and a soft attention mechanism is introduced in the local branch to capture subtle differences in local regions, which improves the re-identification model’s ability to identify a person’s salient local features. Second, DCLS is designed to progressively mitigate label noise in different local regions. DCLS uses global and local similarity metrics to semantically align the global and local regions of the person, and further determines the proximity association between local regions through the cross information of neighboring regions, thereby smoothing the labels of the global and local regions throughout training. In extensive experiments, the proposed method outperformed existing methods under unsupervised settings on several standard person re-identification datasets.
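The idea of smoothing pseudo labels according to how well global and local neighborhoods agree can be illustrated with a minimal sketch. This is not the DCLS formulation from the article, which the abstract does not spell out. It assumes cosine similarity, a Jaccard overlap between the k-nearest-neighbor sets of a global feature and one local-region feature, and a simple blend of the one-hot pseudo label with a uniform distribution. The function names `knn_indices`, `cross_agreement`, and `smooth_labels` and the parameter `k` are hypothetical.

```python
import numpy as np


def knn_indices(features, k):
    """Indices of the k nearest neighbours under cosine similarity (self excluded)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own neighbour
    return np.argsort(-sim, axis=1)[:, :k]


def cross_agreement(global_feats, local_feats, k):
    """Jaccard overlap of global and local neighbour sets, one score per sample.

    Assumption: this overlap stands in for the 'cross information of
    neighboring regions' mentioned in the abstract.
    """
    g_nn = knn_indices(global_feats, k)
    l_nn = knn_indices(local_feats, k)
    scores = np.empty(len(global_feats))
    for i, (g, l) in enumerate(zip(g_nn, l_nn)):
        g_set, l_set = set(g.tolist()), set(l.tolist())
        scores[i] = len(g_set & l_set) / len(g_set | l_set)
    return scores


def smooth_labels(pseudo_labels, num_classes, agreement):
    """Blend one-hot pseudo labels with a uniform distribution.

    Higher agreement keeps more weight on the hard cluster label;
    lower agreement spreads more probability mass over all classes.
    """
    one_hot = np.eye(num_classes)[pseudo_labels]
    uniform = np.full((len(pseudo_labels), num_classes), 1.0 / num_classes)
    w = agreement[:, None]
    return w * one_hot + (1.0 - w) * uniform


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    global_feats = rng.normal(size=(8, 16))  # stand-in global embeddings
    local_feats = rng.normal(size=(8, 16))   # stand-in embeddings of one local region
    labels = np.array([0, 1, 2, 0, 1, 2, 0, 1])  # stand-in cluster pseudo labels
    agreement = cross_agreement(global_feats, local_feats, k=3)
    print(smooth_labels(labels, num_classes=3, agreement=agreement))
```

In this sketch a local region whose neighbors match the global neighbors keeps a label close to the hard cluster assignment, while a region with little overlap receives a flatter target, so that noisy local regions would contribute a weaker training signal.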
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant Nos. 62076117 and 62166026, the Jiangxi Key Laboratory of Smart City under Grant No. 20192BCD40002, and the Jiangxi Provincial Natural Science Foundation under Grant No. 20224BAB212011.
Ethics declarations
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Qing Han received her B.E. and M.E. degrees in computer application from Tianjin Polytechnic University, China in 1997 and 2006, respectively. She is currently an associate professor at the School of Mathematics and Computer Sciences, Nanchang University, China. Her current research interests include image and video processing, and network management.
Longfei Li received his B.E. degree in software engineering from Jiangxi Normal University, China in 2021. He is currently pursuing the M.E. degree at Nanchang University, China. His current research interests include computer vision.
Weidong Min received his B.E., M.E., and Ph.D. degrees in computer application from Tsinghua University, China in 1989, 1991, and 1995, respectively. He is currently a professor and the dean of the Institute of Metaverse, Nanchang University, China. He is the executive director of the China Society of Image and Graphics. His current research interests include image and video processing, artificial intelligence, big data, distributed systems, and smart city information technology.
Qi Wang received his M.E. degree in computer science and technology and his Ph.D. degree in information management and information system from Nanchang University, China in 2018 and 2021, respectively. He is currently a lecturer at the School of Mathematics and Computer Sciences, Nanchang University, China. His current research interests include computer vision, deep learning, and vehicle re-identification.
Shimiao Cui received his B.E. degree in computer science and technology from Nanchang University, China in 2021. He is currently a postgraduate student at Nanchang University, China. His current research interests include computer vision and deep learning.
Jiongjin Chen received his B.E. degree in network engineering from Huizhou University, China in 2021. He is currently pursuing the M.E. degree at Nanchang University, China. His current research interests include computer vision.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
About this article
Cite this article
Han, Q., Li, L., Min, W. et al. Joint training with local soft attention and dual cross-neighbor label smoothing for unsupervised person re-identification. Comp. Visual Media 10, 543–558 (2024). https://doi.org/10.1007/s41095-023-0354-4
DOI: https://doi.org/10.1007/s41095-023-0354-4