Self-Supervised Consistency Based on Joint Learning for Unsupervised Person Re-identification

Published: 18 September 2023

Abstract

Unsupervised domain adaptive person re-identification (Re-ID) methods have recently been studied extensively because they require no annotations, and they have achieved excellent performance. Most existing methods train the Re-ID model to learn a discriminative feature representation. However, they usually train the model to learn only a global feature of a pedestrian image and neglect local features, which restricts further improvement of model performance. To address this problem, two local branches are added to the network so that the model can focus on local features containing identity information. Furthermore, we propose a self-supervised consistency constraint to further improve the robustness of the model. Specifically, the constraint uses only basic data augmentation operations, without auxiliary networks, and effectively improves model performance. Then, a learnable memory matrix is designed to store the mapping vectors that map person features into probability distributions. Finally, extensive experiments are conducted on multiple commonly used person Re-ID datasets to verify the effectiveness of the proposed generative adversarial networks fusing global and local features. Experimental results show that our method achieves results comparable to state-of-the-art methods.
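
The abstract does not give the form of the consistency constraint or of the memory matrix, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each row of the memory matrix is one mapping vector, that a feature is mapped to a probability distribution by a temperature-scaled softmax over dot products, and that consistency between two augmented views of the same image is measured with a symmetric KL divergence. The function names, the temperature value, and the choice of KL divergence are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_distribution(feature, memory, temperature=0.05):
    """Map a feature vector to a distribution over the rows of `memory`.

    Assumption: each row of `memory` is one mapping vector, and the score
    for a row is its dot product with the feature, divided by a temperature.
    """
    scores = [
        sum(f * w for f, w in zip(feature, row)) / temperature
        for row in memory
    ]
    return softmax(scores)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(feat_a, feat_b, memory, temperature=0.05):
    """Symmetric KL between the distributions of two augmented views.

    Assumption: the two features come from differently augmented copies
    of the same image, so their distributions should agree.
    """
    p = to_distribution(feat_a, memory, temperature)
    q = to_distribution(feat_b, memory, temperature)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Toy example: three mapping vectors and two views of one image.
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
view_a = [0.9, 0.1]
view_b = [0.8, 0.2]
loss = consistency_loss(view_a, view_b, memory)
```

In this sketch the loss is zero when both views map to the same distribution and grows as they diverge. In a real training setup the memory rows would be learnable parameters updated together with the network, which this toy example does not show.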

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 1
    January 2024
    639 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/3613542
    Editor: Abdulmotaleb El Saddik

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 September 2023
    Online AM: 19 August 2023
    Accepted: 27 July 2023
    Revised: 13 July 2023
    Received: 13 November 2022
    Published in TOMM Volume 20, Issue 1

    Author Tags

    1. Person re-identification
    2. unsupervised domain adaptive
    3. self-supervised
    4. joint learning

    Qualifiers

    • Research-article

    Funding Sources

    • Science and Technology Program of Guangdong Province
