Context Sensing Attention Network for Video-based Person Re-identification

Published: 27 February 2023
  • Abstract

    Video-based person re-identification (ReID) is challenging due to the presence of various interferences in video frames. Recent approaches handle this problem using temporal aggregation strategies. In this work, we propose a novel Context Sensing Attention Network (CSA-Net), which improves both the frame feature extraction and temporal aggregation steps. First, we introduce the Context Sensing Channel Attention (CSCA) module, which emphasizes responses from informative channels for each frame. These informative channels are identified with reference not only to each individual frame, but also to the content of the entire sequence. Therefore, CSCA explores both the individuality of each frame and the global context of the sequence. Second, we propose the Contrastive Feature Aggregation (CFA) module, which predicts frame weights for temporal aggregation. Here, the weight for each frame is determined in a contrastive manner: i.e., not only by the quality of each individual frame, but also by the average quality of the other frames in a sequence. Therefore, it effectively promotes the contribution of relatively good frames. Extensive experimental results on four datasets show that CSA-Net consistently achieves state-of-the-art performance.
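
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the gating form (a sigmoid over a frame term plus a sequence-context term), the sigmoid-then-normalize weighting, the random projection matrices `w_frame` and `w_context`, and the hand-picked `quality` scores are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def context_sensing_channel_gate(frames, w_frame, w_context):
    """frames: (T, C, H, W). Returns gated frames of the same shape.

    Assumed form: an SE-style gate that sees both the frame and the sequence.
    """
    # Per-frame channel descriptor: global average pooling over H and W.
    frame_desc = frames.mean(axis=(2, 3))                   # (T, C)
    # Sequence-level context: average of the frame descriptors.
    context_desc = frame_desc.mean(axis=0, keepdims=True)   # (1, C)
    # The gate depends on the individual frame and on the whole sequence.
    gate = sigmoid(frame_desc @ w_frame + context_desc @ w_context)  # (T, C)
    return frames * gate[:, :, None, None]


def contrastive_frame_weights(quality):
    """quality: (T,) per-frame quality scores. Returns weights summing to 1."""
    t = quality.shape[0]
    # Average quality of the other frames, computed for each frame.
    others_mean = (quality.sum() - quality) / (t - 1)
    # Frames that are better than their peers get a larger raw score.
    raw = sigmoid(quality - others_mean)
    return raw / raw.sum()


def aggregate(frame_features, weights):
    """frame_features: (T, D); weights: (T,). Returns a (D,) sequence feature."""
    return weights @ frame_features


# Toy usage with random data (all values are placeholders).
rng = np.random.default_rng(0)
T, C, H, W = 4, 8, 6, 3
frames = rng.standard_normal((T, C, H, W))
w_frame = rng.standard_normal((C, C)) * 0.1      # hypothetical learned weights
w_context = rng.standard_normal((C, C)) * 0.1    # hypothetical learned weights

gated = context_sensing_channel_gate(frames, w_frame, w_context)
frame_features = gated.mean(axis=(2, 3))         # (T, C) pooled frame features
quality = np.array([0.2, 1.5, 0.1, 0.3])         # hypothetical quality scores
weights = contrastive_frame_weights(quality)
sequence_feature = aggregate(frame_features, weights)
```

In this sketch the second frame has the highest quality relative to the others, so it receives the largest aggregation weight. In a trained network the quality scores and projection matrices would be learned rather than fixed by hand.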

      Published In

      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 19, Issue 4
      July 2023
      263 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/3582888
      Editor: Abdulmotaleb El Saddik

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 27 February 2023
      Online AM: 01 December 2022
      Accepted: 25 November 2022
      Revised: 25 October 2022
      Received: 14 June 2022
      Published in TOMM Volume 19, Issue 4

      Author Tags

      1. Video-based person re-identification
      2. channel attention
      3. feature aggregation

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Guangdong Provincial Key Laboratory of Human Digital Twin
      • Key-Area Research and Development Program of Guangdong Province, China
      • Program of Guangdong Provincial Key Laboratory of Robot Localization and Navigation Technology
      • Natural Science Foundation of China
      • Shenzhen Technology Project
      • CAS Key Technology Talent Program
      • Guangdong Technology Project

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 175
      • Downloads (Last 6 weeks): 1

      Cited By

      • A Multi-Attention Feature Distillation Neural Network for Lightweight Single Image Super-Resolution. International Journal of Intelligent Systems 2024 (2024). DOI: 10.1155/2024/3255233. Online publication date: 15-Feb-2024.
      • Spatiotemporal Inconsistency Learning and Interactive Fusion for Deepfake Video Detection. ACM Transactions on Multimedia Computing, Communications, and Applications (2024). DOI: 10.1145/3664654. Online publication date: 13-May-2024.
      • Universal Relocalizer for Weakly Supervised Referring Expression Grounding. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 7 (2024), 1–23. DOI: 10.1145/3656045. Online publication date: 16-May-2024.
      • Pseudo Content Hallucination for Unpaired Image Captioning. Proceedings of the 2024 International Conference on Multimedia Retrieval (2024), 320–329. DOI: 10.1145/3652583.3658080. Online publication date: 30-May-2024.
      • MF2ShrT: Multimodal Feature Fusion Using Shared Layered Transformer for Face Anti-spoofing. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 6 (2024), 1–21. DOI: 10.1145/3640817. Online publication date: 8-Mar-2024.
      • Dynamic Weighted Adversarial Learning for Semi-Supervised Classification under Intersectional Class Mismatch. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 4 (2024), 1–24. DOI: 10.1145/3635310. Online publication date: 11-Jan-2024.
      • Deep Modular Co-Attention Shifting Network for Multimodal Sentiment Analysis. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 4 (2024), 1–23. DOI: 10.1145/3634706. Online publication date: 11-Jan-2024.
      • Efficient Video Transformers via Spatial-temporal Token Merging for Action Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 4 (2024), 1–21. DOI: 10.1145/3633781. Online publication date: 11-Jan-2024.
      • Learning a Novel Ensemble Tracker for Robust Visual Tracking. IEEE Transactions on Multimedia 26 (2024), 3194–3206. DOI: 10.1109/TMM.2023.3307939. Online publication date: 1-Jan-2024.
      • HASI: Hierarchical Attention-Aware Spatio–Temporal Interaction for Video-Based Person Re-Identification. IEEE Transactions on Circuits and Systems for Video Technology 34, 6 (2024), 4973–4988. DOI: 10.1109/TCSVT.2023.3340428. Online publication date: Jun-2024.
