
Dual-Stream Structured Graph Convolution Network for Skeleton-Based Action Recognition

Published: 12 November 2021

Abstract

In this work, we propose a dual-stream structured graph convolution network (DS-SGCN) to solve the skeleton-based action recognition problem. The spatio-temporal coordinates and appearance contexts of the skeletal joints are jointly integrated into the graph convolution learning process on both the video and skeleton modalities. To effectively represent the skeletal graph of discrete joints, we create a structured graph convolution module specifically designed to encode partitioned body parts along with their dynamic interactions in the spatio-temporal sequence. In more detail, we build a set of structured intra-part graphs, each of which represents a distinctive body part (e.g., left arm, right leg, head). The inter-part graph is then constructed to model the dynamic interactions across different body parts; here each node corresponds to an intra-part graph built above, while an edge between two nodes expresses the internal relationships of human movement. We implement graph convolution learning on both intra- and inter-part graphs to capture the inherent characteristics and the dynamic interactions of human action, respectively. After integrating the intra- and inter-level spatial context/coordinate cues, a convolution filtering process is conducted over time slices to capture the temporal dynamics of human motion. Finally, we fuse the two streams of graph convolution responses to predict the category of human action in an end-to-end fashion. Comprehensive experiments on five single/multi-modal benchmark datasets (including NTU RGB+D 60, NTU RGB+D 120, MSR-Daily 3D, N-UCLA, and HDM05) demonstrate that the proposed DS-SGCN framework achieves encouraging performance on the skeleton-based action recognition task.
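The two-level scheme described in the abstract — graph convolution within each body-part graph, followed by graph convolution over a coarser graph whose nodes are the parts themselves — can be sketched as follows. This is an illustrative toy example, not the authors' implementation: the part definitions, feature sizes, mean-pooling step, and fully connected inter-part adjacency are all assumptions made for the sketch.

```python
import numpy as np

def normalize_adj(A):
    # Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def graph_conv(X, A, W):
    # One graph-convolution layer: ReLU(A_norm X W)
    return np.maximum(0.0, normalize_adj(A) @ X @ W)

rng = np.random.default_rng(0)

# Two toy "body parts", each an intra-part graph of 3 joints in a chain
# (e.g., shoulder-elbow-wrist); 4 coordinate/context features per joint.
parts = {
    "left_arm":  rng.normal(size=(3, 4)),
    "right_leg": rng.normal(size=(3, 4)),
}
A_chain = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]], dtype=float)

# Intra-part convolution, then mean-pool each part into one node feature.
W_intra = rng.normal(size=(4, 8))
part_feats = np.stack([
    graph_conv(X, A_chain, W_intra).mean(axis=0) for X in parts.values()
])  # shape (2, 8): one node per body part

# Inter-part graph: parts connected to each other to model interactions.
A_inter = np.ones((2, 2)) - np.eye(2)
W_inter = rng.normal(size=(8, 8))
out = graph_conv(part_feats, A_inter, W_inter)
print(out.shape)  # (2, 8)
```

In the full model this spatial step would be applied per frame and followed by temporal convolution over the frame axis; the sketch shows only the intra-to-inter spatial aggregation on a single frame.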




    Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 4
November 2021, 529 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3492437

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 November 2021
    Accepted: 01 February 2021
    Revised: 01 January 2021
    Received: 01 June 2020
    Published in TOMM Volume 17, Issue 4


    Author Tags

    1. Graph convolution network
    2. dual-stream structured graph convolution
    3. action recognition

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • National Natural Science Foundation of China
    • Natural Science Foundation of Jiangsu Province
    • CCF-Tencent Open Research Fund

