Abstract
Clothing parsing provides significant cues for analyzing outfit coordination and dressing occasion. In this paper, we propose a novel clothing parsing framework, a deep end-to-end conditional feature coupling network (CFCN), for photographs of multiple persons in fashion scenes, and we annotate a multi-persons clothing dataset to demonstrate its effectiveness. The framework consists of three sub-networks: the coarse parsing network (CPN), the multi-pose feature network (MFN), and the coupling residual network (CRN). CPN generates a coarse intermediate segmentation, and MFN generates 28 pose-indicating heat maps. CRN receives this auxiliary information and produces the refined clothing parsing result. To verify the generality and effectiveness of the framework, we compare it with state-of-the-art parsing and segmentation methods such as Deeplab [2] and Co-CNN [7] on our multi-persons clothing dataset and on several fashion clothing benchmarks. Experimental evaluations on these datasets demonstrate that our framework achieves superior parsing performance. In particular, CFCN achieves 88.74% accuracy on the multi-persons clothing dataset, 2.24 percentage points higher than the 86.50% achieved by Deeplab. The project is available at https://github.com/suzhuoi/CFCNet.
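The data flow among the three sub-networks can be sketched at the level of tensor shapes. The following is a minimal illustration only, not the authors' implementation: the abstract does not say how CRN couples its inputs, so channel concatenation followed by a residual correction is an assumption, as are the function names, the label count `NUM_CLASSES`, the image size, and the random stand-ins used in place of trained networks. Only the count of 28 heat maps comes from the abstract.

```python
import numpy as np

NUM_HEATMAPS = 28  # number of pose heat maps stated in the abstract
NUM_CLASSES = 5    # hypothetical number of clothing labels (not from the paper)


def coarse_parsing_network(image, num_classes=NUM_CLASSES, seed=0):
    """Stand-in for CPN: random coarse class scores, shape (C, H, W)."""
    h, w, _ = image.shape
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_classes, h, w))


def multi_pose_feature_network(image, seed=1):
    """Stand-in for MFN: random pose heat maps, shape (28, H, W)."""
    h, w, _ = image.shape
    rng = np.random.default_rng(seed)
    return rng.random((NUM_HEATMAPS, h, w))


def coupling_residual_network(coarse, heatmaps, weights):
    """Stand-in for CRN under an assumed design.

    Concatenates both inputs along the channel axis, maps them to a
    per-class residual with a per-pixel linear map (a 1x1 convolution),
    and adds that residual to the coarse scores.
    """
    coupled = np.concatenate([coarse, heatmaps], axis=0)    # (C + 28, H, W)
    residual = np.einsum("kc,chw->khw", weights, coupled)   # (C, H, W)
    refined = coarse + residual
    return refined.argmax(axis=0)                           # (H, W) label map


image = np.zeros((64, 48, 3))                 # dummy RGB input
coarse = coarse_parsing_network(image)
heatmaps = multi_pose_feature_network(image)
# Zero weights give a zero residual, so the output equals the coarse labels.
weights = np.zeros((NUM_CLASSES, NUM_CLASSES + NUM_HEATMAPS))
labels = coupling_residual_network(coarse, heatmaps, weights)
```

With zero weights the refined label map coincides with the coarse one, which makes the residual role of the third stage explicit: in this sketch, any improvement over the coarse result would have to come from the learned correction applied to the coupled features.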
This research is supported by the National Natural Science Foundation of China (61502541, 61772140, 61502546), the Natural Science Foundation of Guangdong Province (2016A030310202), the Science and Technology Planning Project of Zhongshan (2016A1044), and the Fundamental Research Funds for the Central Universities (Sun Yat-sen University, 16lgpy39).
References
Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2D pose estimation using part affinity fields. In: IEEE International Conference on Computer Vision, vol. 1, p. 7 (2017)
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2016)
Chen, L.C., Yang, Y., Wang, J., Xu, W., Yuille, A.L.: Attention to scale: scale-aware semantic image segmentation. In: IEEE Computer Vision and Pattern Recognition, pp. 3640–3649 (2016)
Dong, J., Chen, Q., Huang, Z., Yang, J., Yan, S.: Parsing based on parselets: a unified deformable mixture model for human parsing. IEEE Trans. Pattern Anal. Mach. Intell. 38(1), 88–101 (2016)
Dong, J., Chen, Q., Shen, X., Yang, J., Yan, S.: Towards unified human parsing and pose estimation. In: IEEE Computer Vision and Pattern Recognition, pp. 843–850 (2014)
Liang, X., et al.: Deep human parsing with active template regression. IEEE Trans. Pattern Anal. Mach. Intell. 37(12), 2402–2414 (2015)
Liang, X., et al.: Human parsing with contextualized convolutional neural network. In: IEEE International Conference on Computer Vision, pp. 1386–1394 (2015)
Liu, S., et al.: Fashion parsing with weak color-category labels. IEEE Trans. Multimedia 16(1), 253–265 (2014)
Liu, S., et al.: Fashion parsing with video context. IEEE Trans. Multimedia 17(8), 1347–1358 (2015)
Liu, S., et al.: Matching-CNN meets KNN: quasi-parametric human parsing. In: IEEE Computer Vision and Pattern Recognition, pp. 1419–1427 (2015)
Wu, Q., Boulanger, P.: Enhanced reweighted MRFs for efficient fashion image parsing. ACM Trans. Multimedia Comput. Commun. Appl. 12(3), 42 (2016)
Xia, F., Wang, P., Chen, X., Yuille, A.L.: Joint multi-person pose estimation and semantic part segmentation. In: IEEE Computer Vision and Pattern Recognition, pp. 6769–6778 (2017)
Xia, F., Zhu, J., Wang, P., Yuille, A.L.: Pose-guided human parsing by an and/or graph using pose-context features. In: AAAI, pp. 3632–3640 (2016)
Yamaguchi, K., Kiapour, M.H., Ortiz, L.E., Berg, T.L.: Parsing clothing in fashion photographs. In: IEEE International Conference on Computer Vision, pp. 3570–3577 (2012)
Yamaguchi, K., Kiapour, M.H., Ortiz, L.E., Berg, T.L.: Retrieving similar styles to parse clothing. IEEE Trans. Pattern Anal. Mach. Intell. 37(5), 1028–1040 (2015)
Yang, W., Luo, P., Lin, L.: Clothing co-parsing by joint image segmentation and labeling. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3182–3189 (2014)
Zheng, S., et al.: Conditional random fields as recurrent neural networks. In: IEEE International Conference on Computer Vision, pp. 1529–1537 (2015)
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Guo, J., Su, Z., Luo, X., Zhang, G., Liang, X. (2018). Conditional Feature Coupling Network for Multi-persons Clothing Parsing. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11164. Springer, Cham. https://doi.org/10.1007/978-3-030-00776-8_18
DOI: https://doi.org/10.1007/978-3-030-00776-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00775-1
Online ISBN: 978-3-030-00776-8
eBook Packages: Computer Science, Computer Science (R0)