A Novel Human Parsing Method Driven by Multi-Scale Feature Blend Network

Authors:

Gaofeng Zhang

ICRSA '21: Proceedings of the 2021 4th International Conference on Robot Systems and Applications

Pages 30 - 38

https://doi.org/10.1145/3467691.3467692

Published: 09 September 2021

Abstract

In recent years, human parsing has advanced considerably because of its practical value. However, existing methods have not fully resolved semantic errors and incomplete semantic predictions. To address these problems, a Multi-Scale Feature Blend Network (MFBNet) is proposed that tackles them from the perspective of fusing multi-scale features. Specifically, we introduce a Context Embedding module that uses a feature pyramid as its main structure to blend multi-scale feature information. In addition, ResNet-101 is used as the backbone network: it trains and optimizes shared weights and passes the generated feature maps to the Context Embedding module. Experimental results on several widely used datasets show that the proposed method outperforms state-of-the-art methods in human parsing.
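The abstract does not say how the Context Embedding module combines its pyramid levels, so the following is only a minimal sketch of one common feature-pyramid pattern (top-down upsampling followed by element-wise addition), not the authors' implementation. The function names, the single-channel list-of-lists feature maps, and the fixed factor of 2 between levels are all assumptions made for illustration; a real model would operate on multi-channel tensors produced by the ResNet-101 backbone.

```python
# Hypothetical sketch of top-down feature-pyramid blending (NOT the MFBNet code).
# Feature maps are single-channel 2D lists of floats to stay dependency-free.

def upsample_nearest(fmap, factor):
    """Enlarge a 2D map by repeating each value `factor` times along both axes."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def blend_pyramid(levels):
    """Blend maps ordered fine -> coarse, each level half the size of the previous.

    Starts from the coarsest map, then repeatedly upsamples by 2 and adds the
    next finer map. Returns the blended map at the finest resolution.
    """
    blended = levels[-1]
    for finer in reversed(levels[:-1]):
        blended = add_maps(upsample_nearest(blended, 2), finer)
    return blended


# Toy stand-ins for backbone outputs at three scales (assumed values).
fine = [[1.0] * 4 for _ in range(4)]
mid = [[10.0] * 2 for _ in range(2)]
coarse = [[100.0]]

fused = blend_pyramid([fine, mid, coarse])
print(len(fused), len(fused[0]), fused[0][0])
```

In this toy setup every position of the fused map receives a contribution from all three scales, which is the general idea behind blending coarse context with fine detail. Learned convolutions, channel alignment, and the paper's actual fusion rule are deliberately left out.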

Published In

ICRSA '21: Proceedings of the 2021 4th International Conference on Robot Systems and Applications

April 2021

82 pages

ISBN: 9781450384940

DOI: 10.1145/3467691

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Anhui Natural Science Foundation
Provincial Key Research and Development Program of Anhui
Youth Fund of National Natural Science Foundation of China

Conference

ICRSA 2021

ICRSA 2021: 2021 4th International Conference on Robot Systems and Applications

April 9 - 11, 2021

Chengdu, China
