
Dual-Stream Structured Graph Convolution Network for Skeleton-Based Action Recognition

Published: 12 November 2021

Abstract

In this work, we propose a dual-stream structured graph convolution network (DS-SGCN) to solve the skeleton-based action recognition problem. The spatio-temporal coordinates and appearance contexts of the skeletal joints are jointly integrated into the graph convolution learning process on both the video and skeleton modalities. To effectively represent the skeletal graph of discrete joints, we create a structured graph convolution module specifically designed to encode partitioned body parts along with their dynamic interactions in the spatio-temporal sequence. In more detail, we build a set of structured intra-part graphs, each of which represents a distinctive body part (e.g., left arm, right leg, head). The inter-part graph is then constructed to model the dynamic interactions across different body parts; here each node corresponds to an intra-part graph built above, while an edge between two nodes expresses the internal relationships of human movement. We implement graph convolution learning on both intra- and inter-part graphs to capture the inherent characteristics and the dynamic interactions of human action, respectively. After integrating the intra- and inter-level spatial context/coordinate cues, a convolution filtering process is conducted over time slices to capture the temporal dynamics of human motion. Finally, we fuse the two streams of graph convolution responses to predict the category of human action in an end-to-end fashion. Comprehensive experiments on five single/multi-modal benchmark datasets (including NTU RGB+D 60, NTU RGB+D 120, MSR-Daily 3D, N-UCLA, and HDM05) demonstrate that the proposed DS-SGCN framework achieves encouraging performance on the skeleton-based action recognition task.
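The two-level scheme described in the abstract — graph convolution within each body-part graph, followed by graph convolution over a coarser graph whose nodes are the parts themselves — can be sketched as follows. This is an illustrative toy example, not the authors' implementation: the part definitions, feature sizes, mean-pooling step, and fully connected inter-part adjacency are all assumptions made for the sketch.

```python
import numpy as np

def normalize_adj(A):
    # Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def graph_conv(X, A, W):
    # One graph-convolution layer: ReLU(A_norm X W)
    return np.maximum(0.0, normalize_adj(A) @ X @ W)

rng = np.random.default_rng(0)

# Two toy "body parts", each an intra-part graph of 3 joints in a chain
# (e.g., shoulder-elbow-wrist); 4 coordinate/context features per joint.
parts = {
    "left_arm":  rng.normal(size=(3, 4)),
    "right_leg": rng.normal(size=(3, 4)),
}
A_chain = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]], dtype=float)

# Intra-part convolution, then mean-pool each part into one node feature.
W_intra = rng.normal(size=(4, 8))
part_feats = np.stack([
    graph_conv(X, A_chain, W_intra).mean(axis=0) for X in parts.values()
])  # shape (2, 8): one node per body part

# Inter-part graph: parts connected to each other to model interactions.
A_inter = np.ones((2, 2)) - np.eye(2)
W_inter = rng.normal(size=(8, 8))
out = graph_conv(part_feats, A_inter, W_inter)
print(out.shape)  # (2, 8)
```

In the full model this spatial step would be applied per frame and followed by temporal convolution over the frame axis; the sketch shows only the intra-to-inter spatial aggregation on a single frame.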




    Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 4
November 2021, 529 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3492437

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 November 2021
    Accepted: 01 February 2021
    Revised: 01 January 2021
    Received: 01 June 2020
    Published in TOMM Volume 17, Issue 4


    Author Tags

    1. Graph convolution network
    2. dual-stream structured graph convolution
    3. action recognition

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • National Natural Science Foundation of China
    • Natural Science Foundation of Jiangsu Province
    • CCF-Tencent Open Research Fund

