DOI: 10.1145/3468691.3468733
research-article

Class-level Aware Network for Human Parsing

Published: 07 August 2021

Abstract

Although convolutional neural networks (CNNs) have shown strong performance in human parsing, they come at a high computational cost. In this paper, we present a novel class-level aware network (CANet), which employs an asymmetric encoder-decoder architecture to achieve reliable human parsing results in a memory-friendly way. To balance speed and accuracy in human parsing, we design a group-split-bottleneck (GS-bt) block, in which group convolution and channel split are used within the residual block. In the decoder network, an attention pyramid pooling module (APPM) is proposed to recover the details of human parsing. Moreover, a multi-class classification branch is developed to extract class-level information and revise the human parsing results. Compared with current models, our model has fewer parameters, and experiments demonstrate that the proposed CANet reaches state-of-the-art results on the PASCAL-Person-Part dataset.
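
The abstract attributes the smaller parameter count partly to group convolution and channel split inside the residual block. The arithmetic behind that kind of saving can be sketched as follows. The abstract does not give the exact GS-bt layout, so the block structure here (split the channels in half, then apply two grouped 3x3 convolutions to each half) is an assumption made only for illustration, and the function names are hypothetical.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 2D convolution with a square kernel and no bias."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * kernel * kernel


def plain_residual_params(channels, kernel=3):
    """Baseline: two full kxk convolutions over all channels."""
    return 2 * conv_params(channels, channels, kernel)


def split_group_residual_params(channels, kernel=3, groups=4):
    """Assumed layout (not taken from the paper): channel split into two
    halves, each half processed by two grouped kxk convolutions."""
    if channels % 2:
        raise ValueError("channels must be even to split in half")
    half = channels // 2
    per_branch = 2 * conv_params(half, half, kernel, groups=groups)
    return 2 * per_branch


if __name__ == "__main__":
    channels = 64
    plain = plain_residual_params(channels)
    light = split_group_residual_params(channels, groups=4)
    print(f"plain residual block: {plain} weights")
    print(f"split + grouped block: {light} weights")
    print(f"reduction factor: {plain / light:.1f}x")
```

Under these assumptions the saving is a factor of 2 x groups: the channel split halves the weights of each convolution, and grouping divides them again by the number of groups. The actual CANet block may combine these operations differently.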

Cited By

  • (2022) Edge-Aware Graph Matching Network for Part-Based Semantic Segmentation. International Journal of Computer Vision 130:11 (2797-2821). DOI: 10.1007/s11263-022-01671-z. Online publication date: 1-Nov-2022

Published In

CNIOT '21: Proceedings of the 2021 2nd International Conference on Computing, Networks and Internet of Things
May 2021
270 pages
ISBN: 9781450389693
DOI: 10.1145/3468691

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. CNN
2. encoder-decoder networks
3. human parsing

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

CNIOT2021

Acceptance Rates

Overall Acceptance Rate: 39 of 82 submissions, 48%
