DOI: 10.1145/3467691.3467692
research-article

A Novel Human Parsing Method Driven by Multi-Scale Feature Blend Network

Published: 09 September 2021

Abstract

In recent years, human parsing has advanced considerably because of its practical value. However, existing methods have not fully resolved semantic errors and incomplete semantic predictions. To address these problems, a Multi-Scale Feature Blend Network (MFBNet) is proposed that approaches them from the perspective of fusing multi-scale features. Specifically, we introduce a Context Embedding module that uses a feature pyramid as its main structure to blend multi-scale feature information. In addition, ResNet-101 serves as the backbone network: it trains and optimizes shared weights and passes the generated feature maps to the Context Embedding module. Experimental results on several widely used datasets show that the proposed method outperforms state-of-the-art methods in human parsing.
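
The abstract does not specify how the Context Embedding module blends features, so the following is only a minimal sketch of a generic feature-pyramid top-down fusion, not the authors' implementation. It assumes single-channel feature maps stored as nested lists, each level at half the resolution of the previous one, nearest-neighbour upsampling, and element-wise addition; the function names (`upsample_nearest`, `add_maps`, `blend_pyramid`) and the toy inputs are hypothetical. A real network would also apply learned convolutions to the backbone outputs, which are omitted here.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (assumption for brevity)


def upsample_nearest(fmap: Grid, factor: int = 2) -> Grid:
    """Nearest-neighbour upsampling: repeat every value `factor` times per axis."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def add_maps(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two feature maps of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def blend_pyramid(levels: List[Grid]) -> List[Grid]:
    """Blend a pyramid ordered fine -> coarse with a top-down pathway.

    The coarsest map is kept as is; every finer map receives the upsampled,
    already-blended map from the level above it.
    """
    blended = [levels[-1]]
    for fmap in reversed(levels[:-1]):
        top_down = upsample_nearest(blended[0], 2)
        blended.insert(0, add_maps(fmap, top_down))
    return blended


# Toy stand-ins for backbone outputs at three scales (hypothetical values).
c3 = [[1.0] * 4 for _ in range(4)]   # fine, 4x4
c4 = [[10.0] * 2 for _ in range(2)]  # medium, 2x2
c5 = [[100.0]]                       # coarse, 1x1

p3, p4, p5 = blend_pyramid([c3, c4, c5])
```

In this toy run the finest output `p3` carries contributions from all three scales, which is the sense in which a pyramid "blends" coarse semantic context into high-resolution maps.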

    Published In
    ICRSA '21: Proceedings of the 2021 4th International Conference on Robot Systems and Applications
    April 2021
    82 pages
    ISBN:9781450384940
    DOI:10.1145/3467691

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Deep Learning
    2. Human Parsing
    3. Semantic Segmentation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Anhui Natural Science Foundation
    • Provincial Key Research and Development Program of Anhui
    • Youth Fund of National Natural Science Foundation of China

    Conference

    ICRSA 2021
