
Person Reidentification using 3D inception based Spatio-temporal features learning, attribute recognition, and Reranking

Published: 11 May 2023

Abstract

Video-based person re-identification is the task of identifying pedestrians in video sequences captured by multiple non-overlapping cameras. Successive frames in a video clip capture pedestrians' motion patterns and show a person's appearance from varying angles and in different body poses, and thus provide critical features for countering occlusion, pose variation, and viewpoint change. This article proposes a novel person re-identification method built on a 3D Inception-based re-identification model comprising four three-dimensional (3D) Inception modules with 3D convolution and pooling layers. The 3D Inception modules expand the receptive fields of neurons in both the temporal and spatial dimensions. As a result, the model learns discriminative appearance features along with pedestrians' long-term and short-term motion patterns without any separate motion approximation module. The model is trained with a unified loss function that combines center loss with the usual identification loss to reduce intra-class differences while increasing inter-class differences. In addition, the method incorporates an attribute recognition model to identify discriminative attributes in the video frames. The spatio-temporal and attribute features are then used by a reranking method, which returns the k most similar video clips for a given input. The effectiveness of the proposed method is validated through extensive experiments on three realistic surveillance video datasets: MARS, DukeMTMC-VideoReID, and iLIDS.
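
The unified loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which the identification loss is softmax cross-entropy over identity logits and the center loss is half the squared distance between a clip's feature and the center of its identity. The abstract does not state the exact formula or weighting, so the function names and the weighting factor `lam` are hypothetical, not the authors' implementation.

```python
import math

def softmax_cross_entropy(logits, label):
    """Identification loss for one clip: -log softmax(logits)[label]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def center_loss(feature, center):
    """Center loss for one clip: half the squared distance to its identity's center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def unified_loss(features, logits, labels, centers, lam=0.5):
    """Batch mean of identification loss + lam * center loss.

    lam is a hypothetical weighting factor; the abstract does not give one.
    """
    total = 0.0
    for feat, z, y in zip(features, logits, labels):
        total += softmax_cross_entropy(z, y) + lam * center_loss(feat, centers[y])
    return total / len(labels)

# Toy batch: two clips, two identities, 2-D features (all values are made up).
features = [[1.0, 0.0], [0.0, 1.0]]
logits = [[2.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
centers = {0: [1.0, 0.0], 1: [0.0, 1.0]}

loss_at_centers = unified_loss(features, logits, labels, centers)

# Moving a feature away from its identity center increases the loss.
shifted = [[2.0, 0.0], [0.0, 1.0]]
loss_shifted = unified_loss(shifted, logits, labels, centers)
```

In this toy batch the features coincide with their centers, so the center term vanishes and only the identification term remains. Shifting a feature away from its center adds a penalty, which is how the center term pulls features of the same identity together.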


Published In

Multimedia Tools and Applications, Volume 83, Issue 1
Jan 2024
3275 pages

Publisher

Kluwer Academic Publishers
United States

Publication History

Published: 11 May 2023
Accepted: 18 April 2023
Revision received: 18 October 2022
Received: 11 July 2022

Author Tags

1. Person reidentification
2. 3D inception
3. Pedestrians identification
4. Intelligent surveillance
5. Video surveillance

Qualifiers

• Research-article
