DOI: 10.1145/3436369.3436470
research-article

Generalized Graph Convolutional Networks for Action Recognition with Occluded Skeletons

Published: 11 January 2021
  • Abstract

    Current methods for skeleton-based action recognition usually assume that the observed skeletons are complete (i.e., free of occlusion). In practical applications, however, it is almost impossible to guarantee that perfect skeleton samples can be extracted. In this work, we propose a generalized graph convolutional network for action recognition with occluded skeletons. The key insight of our approach is to look beyond the physical joint connectivity when extracting discriminative features, so that the richer set of discovered semantic features improves the robustness of the model. We have conducted comprehensive experiments on both the normal skeleton dataset and the synthetic occlusion dataset. The experimental results demonstrate that our model can significantly alleviate the performance deterioration caused by occlusion.
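The abstract does not describe the architecture, so the following is only a minimal illustrative sketch of the two ideas it names, not the authors' model. It assumes that "looking beyond the physical joint connectivity" can be approximated by linking joints within a few hops of each other, and that synthetic occlusion can be simulated by zeroing the coordinates of chosen joints. The function names, the `hops` parameter, and the toy 5-joint chain are all hypothetical.

```python
import numpy as np


def normalized_adjacency(edges, num_joints, hops=1):
    """Build D^-1/2 (A + I) D^-1/2, linking joints at most `hops` bones apart.

    hops=1 keeps only the physical bones; hops>1 is an assumed stand-in for
    connectivity beyond the physical skeleton.
    """
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    # (A + I)^hops is non-zero exactly where two joints are within `hops` steps.
    a = (np.linalg.matrix_power(a, hops) > 0).astype(float)
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def occlude(skeleton, joints):
    """Return a copy of a (frames, joints, channels) array with `joints` zeroed."""
    out = skeleton.copy()
    out[:, joints, :] = 0.0
    return out


def graph_conv(x, a_hat, weight):
    """One graph convolution per frame: ReLU(A_hat @ X @ W)."""
    return np.maximum(np.einsum("ij,tjc,cd->tid", a_hat, x, weight), 0.0)


if __name__ == "__main__":
    # Toy chain skeleton: 0 - 1 - 2 - 3 - 4 (hypothetical, not a real dataset layout).
    edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 5, 3))          # 4 frames, 5 joints, 3D coordinates
    w = rng.normal(size=(3, 8))             # 3 input channels -> 8 features
    x_occ = occlude(x, [1])                 # joint 1 is missing in every frame
    a_wide = normalized_adjacency(edges, 5, hops=2)
    print(graph_conv(x_occ, a_wide, w).shape)
```

With `hops=2`, joint 0 still aggregates information from joint 2 even though their shared neighbour, joint 1, is occluded; with `hops=1` it would see only itself and the zeroed joint.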

    Cited By

    • (2023) Delving Deep Into One-Shot Skeleton-Based Action Recognition With Diverse Occlusions. IEEE Transactions on Multimedia 25, 1489-1504. DOI: 10.1109/TMM.2023.3235300. Online publication date: 1-Jan-2023

      Published In

      ICCPR '20: Proceedings of the 2020 9th International Conference on Computing and Pattern Recognition
      October 2020
      552 pages
      ISBN:9781450387835
      DOI:10.1145/3436369

      In-Cooperation

      • Beijing University of Technology

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. action recognition
      2. generalized network
      3. graph convolution
      4. occluded skeletons

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ICCPR 2020

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 28
      • Downloads (last 6 weeks): 0
      Reflects downloads up to 29 Jul 2024
