Generalized Graph Convolutional Networks for Action Recognition with Occluded Skeletons

Authors:

Wai Chen

ICCPR '20: Proceedings of the 2020 9th International Conference on Computing and Pattern Recognition

Pages 43 - 49

https://doi.org/10.1145/3436369.3436470

Published: 11 January 2021

Abstract

Current methods for skeleton-based action recognition usually assume that the observed skeletons are complete (i.e., free of occlusion). In practice, however, it is almost impossible to guarantee that perfect skeleton samples can be extracted. In this work, we propose a generalized graph convolutional network for action recognition with occluded skeletons. The key insight of our approach is to look beyond the physical joint connectivity when extracting discriminative features, so that the richer semantic features discovered this way improve the robustness of the model. We conducted comprehensive experiments on both the normal skeleton dataset and the synthetic occlusion dataset. The experimental results demonstrate that our model significantly alleviates the performance deterioration caused by occlusion.
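
The abstract does not spell out how the network looks beyond physical joint connectivity or how the synthetic occlusions are generated. As a minimal sketch of the general idea, and not the authors' method, the toy example below assumes a hypothetical 5-joint chain skeleton, simulates occlusion by zeroing one joint, and compares a neighbor average over the physical bones with one over an assumed 2-hop neighborhood. All names (`EDGES`, `k_hop`, `occlude`, `aggregate`) are illustrative and do not come from the paper.

```python
# Toy 5-joint chain skeleton 0-1-2-3-4 (hypothetical; not a real dataset layout).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
NUM_JOINTS = 5


def adjacency(edges, n):
    """Symmetric adjacency matrix of the physical bones, with self-loops."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return a


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def k_hop(a, k):
    """Binary matrix marking joints reachable within k bones (assumed extension)."""
    reach = a
    for _ in range(k - 1):
        reach = matmul(reach, a)
    return [[1.0 if v > 0 else 0.0 for v in row] for row in reach]


def row_normalize(a):
    """Scale each row to sum to 1 so aggregation becomes a neighbor average."""
    return [[v / sum(row) for v in row] for row in a]


def occlude(frame, hidden):
    """Synthetic occlusion (assumed scheme): zero the coordinates of hidden joints."""
    return [[0.0] * len(p) if j in hidden else list(p) for j, p in enumerate(frame)]


def aggregate(a_norm, frame):
    """One parameter-free graph aggregation step over the joint features."""
    return matmul(a_norm, frame)


# One frame with joints spaced along the x-axis; joint 2 is then occluded.
frame = [[float(j), 0.0, 0.0] for j in range(NUM_JOINTS)]
occluded = occlude(frame, {2})

physical_adj = adjacency(EDGES, NUM_JOINTS)
extended_adj = k_hop(physical_adj, 2)

physical = aggregate(row_normalize(physical_adj), occluded)
extended = aggregate(row_normalize(extended_adj), occluded)

print("joint 2, physical neighbors:", physical[2])
print("joint 2, 2-hop neighbors:  ", extended[2])
```

In this sketch the extended neighborhood lets the occluded joint draw on joints that are not directly attached to it, which is one simple way to read "beyond the physical joint connectivity". A trained model would additionally apply learned weights after the aggregation step.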

Cited By

Peng K, Roitberg A, Yang K, Zhang J, Stiefelhagen R (2023). Delving Deep Into One-Shot Skeleton-Based Action Recognition With Diverse Occlusions. IEEE Transactions on Multimedia 25, 1489-1504. DOI: 10.1109/TMM.2023.3235300. Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1109/TMM.2023.3235300

Index Terms

Generalized Graph Convolutional Networks for Action Recognition with Occluded Skeletons
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Activity recognition and understanding

Published In

ICCPR '20: Proceedings of the 2020 9th International Conference on Computing and Pattern Recognition

October 2020

552 pages

ISBN:9781450387835

DOI:10.1145/3436369

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

Beijing University of Technology

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICCPR 2020

ICCPR 2020: 2020 9th International Conference on Computing and Pattern Recognition

October 30 - November 1, 2020

Xiamen, China
