research-article

PAINT: Photo-realistic Fashion Design Synthesis

Authors:

Mohan S. Kankanhalli

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 2

Article No.: 48, Pages 1 - 23

https://doi.org/10.1145/3545610

Published: 26 September 2023

Abstract

In this article, we investigate a new problem: generating a variety of multi-view fashion designs conditioned on a human pose and texture examples of arbitrary sizes, which can relieve fashion designers of repetitive, low-level design work. To solve this challenging multi-modal image translation problem, we propose a novel Photo-reAlistic fashIon desigN synThesis (PAINT) framework, which decomposes the task into three manageable stages. In the first stage, we employ a Layout Generative Network (LGN) to transform an input human pose into a series of person semantic layouts. In the second stage, we propose a Texture Synthesis Network (TSN) to synthesize textures on all transformed semantic layouts. Specifically, we design a novel attentive texture transfer mechanism that precisely expands texture patches to the irregular clothing regions of the target fashion designs. In the third stage, we leverage an Appearance Flow Network (AFN) to generate fashion design images from other viewpoints, given a single-view observation, by learning 2D multi-scale appearance flow fields. Experimental results demonstrate that our method can generate diverse, photo-realistic multi-view fashion design images with fine-grained appearance details, conditioned on the multiple provided inputs. The source code and trained models are available at https://github.com/gxl-groups/PAINT.
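To make the third stage more concrete, the following is a minimal sketch of how a 2D appearance flow field can be used to warp a single-view image, written in NumPy. It is an illustration of the general appearance-flow idea only, not the AFN from the article: the function names `warp_with_appearance_flow` and `identity_flow` are hypothetical, the flow is assumed to store absolute source-pixel coordinates as `(x, y)`, and only a single scale is shown, whereas the article describes multi-scale flow fields that are learned by a network.

```python
import numpy as np


def warp_with_appearance_flow(source, flow):
    """Bilinearly sample `source` (H, W, C) at the locations given by `flow`.

    Assumed convention (not taken from the article): flow[y, x] = (src_x, src_y),
    the absolute source coordinates whose appearance target pixel (x, y) copies.
    """
    h, w = source.shape[:2]
    # Clamp sampling locations so that they stay inside the source image.
    sx = np.clip(flow[..., 0], 0, w - 1)
    sy = np.clip(flow[..., 1], 0, h - 1)
    # Integer corners of the cell that surrounds each sampling location.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    # Fractional offsets, with a trailing axis to broadcast over channels.
    wx = (sx - x0)[..., None]
    wy = (sy - y0)[..., None]
    # Interpolate along x on the top and bottom rows, then along y.
    top = (1 - wx) * source[y0, x0] + wx * source[y0, x1]
    bottom = (1 - wx) * source[y1, x0] + wx * source[y1, x1]
    return (1 - wy) * top + wy * bottom


def identity_flow(h, w):
    """Flow field that maps every target pixel to the same source pixel."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([xs, ys], axis=-1).astype(float)


if __name__ == "__main__":
    image = np.random.rand(4, 6, 3)      # stand-in for a single-view image
    flow = identity_flow(4, 6)
    flow[..., 0] += 1.5                  # sample 1.5 pixels to the right
    warped = warp_with_appearance_flow(image, flow)
    print(warped.shape)
```

In a learned setting, a network would predict `flow` from the source view and the target viewpoint, and the warp would typically be applied at several resolutions; here the flow is constructed by hand purely to show the sampling step.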

Cited By

Li B, Xie D, Wu Y, Zheng L, Xu C, Zhou Y, Fu Y, Wang C, Liu B, Zuo X (2024). Synthesis and Detection Algorithms for Oblique Stripe Noise of Space-Borne Remote Sensing Images. IEEE Transactions on Geoscience and Remote Sensing 62 (1-14). Online publication date: 2024.
https://doi.org/10.1109/TGRS.2024.3360268
Shen C, Liu Z, Gao X, Feng Z, Song M (2023). Self-Adaptive Clothing Mapping Based Virtual Try-on. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-26). Online publication date: 23-Oct-2023.
https://dl.acm.org/doi/10.1145/3613453

Index Terms

PAINT: Photo-realistic Fashion Design Synthesis
1. Applied computing
  1. Arts and humanities
    1. Fine arts
    2. Media arts
2. Human-centered computing
  1. Interaction design
    1. Empirical studies in interaction design

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 20, Issue 2

February 2024

548 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3613570

Editor:
Abdulmotaleb El Saddik
Mohamed Bin Zayed University of Artificial Intelligence, UAE and University of Ottawa, Canada

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 September 2023

Online AM: 30 June 2022

Accepted: 23 June 2022

Revised: 18 May 2022

Received: 14 December 2021

Published in TOMM Volume 20, Issue 2

Qualifiers

Research-article

Funding Sources

Zhejiang Provincial Natural Science Foundation of China
National Science Foundation of China
National Research Foundation, Singapore
