research-article

PAINT: Photo-realistic Fashion Design Synthesis

Published: 26 September 2023

    Abstract

    In this article, we investigate a new problem: generating a variety of multi-view fashion designs conditioned on a human pose and texture examples of arbitrary sizes, which can relieve fashion designers of repetitive, low-level design work. To solve this challenging multi-modal image translation problem, we propose a novel Photo-reAlistic fashIon desigN synThesis (PAINT) framework, which decomposes the task into three manageable stages. In the first stage, we employ a Layout Generative Network (LGN) to transform an input human pose into a series of person semantic layouts. In the second stage, we propose a Texture Synthesis Network (TSN) to synthesize textures on all transformed semantic layouts. Specifically, we design a novel attentive texture transfer mechanism that precisely expands texture patches to the irregular clothing regions of the target fashion designs. In the third stage, we leverage an Appearance Flow Network (AFN) to generate fashion design images from other viewpoints given a single-view observation by learning 2D multi-scale appearance flow fields. Experimental results demonstrate that our method is capable of generating diverse photo-realistic multi-view fashion design images with fine-grained appearance details conditioned on the multiple provided inputs. The source code and trained models are available at https://github.com/gxl-groups/PAINT.
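
    The third stage relies on appearance flow, in which each target pixel is filled by sampling the source image at a displaced location. The sketch below illustrates only that warping step, at a single scale and with a hand-written flow field. It is not the authors' AFN, which learns multi-scale flow fields with a network. The function names, the grayscale list-of-rows image format, and the convention that the flow stores (dx, dy) offsets from target to source are all assumptions made for illustration.

```python
import math


def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at real-valued (x, y).

    Coordinates outside the image are clamped to the border.
    """
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp_with_flow(src, flow):
    """Warp `src` with a dense flow field (assumed convention).

    flow[y][x] = (dx, dy) means target pixel (x, y) takes its value
    from source location (x + dx, y + dy).
    """
    return [
        [bilinear_sample(src, x + dx, y + dy) for x, (dx, dy) in enumerate(row)]
        for y, row in enumerate(flow)
    ]


# Toy usage: shift sampling half a pixel to the right.
src = [[0.0, 1.0, 2.0],
       [3.0, 4.0, 5.0],
       [6.0, 7.0, 8.0]]
half_right = [[(0.5, 0.0)] * 3 for _ in range(3)]
warped = warp_with_flow(src, half_right)
print(warped[0])  # [0.5, 1.5, 2.0] (last pixel clamps to the border)
```

    In a learned setting the flow field would be predicted by a network and the sampling would need to be differentiable; this pure-Python version only makes the sampling arithmetic explicit.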


    Cited By

    • (2024) Synthesis and Detection Algorithms for Oblique Stripe Noise of Space-Borne Remote Sensing Images. IEEE Transactions on Geoscience and Remote Sensing 62 (1-14). DOI: 10.1109/TGRS.2024.3360268. Online publication date: 2024
    • (2023) Self-Adaptive Clothing Mapping Based Virtual Try-on. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-26). DOI: 10.1145/3613453. Online publication date: 23-Oct-2023

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 2
    February 2024
    548 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/3613570
    Editor: Abdulmotaleb El Saddik

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 26 September 2023
    Online AM: 30 June 2022
    Accepted: 23 June 2022
    Revised: 18 May 2022
    Received: 14 December 2021
    Published in TOMM Volume 20, Issue 2

    Author Tags

    1. Generative adversarial network
    2. fashion image synthesis
    3. AI-assisted fashion design

    Qualifiers

    • Research-article

    Funding Sources

    • Zhejiang Provincial Natural Science Foundation of China
    • National Science Foundation of China
    • National Research Foundation, Singapore

