RAST: Restorable Arbitrary Style Transfer

Published: 22 January 2024
  Abstract

    The objective of arbitrary style transfer is to apply a given artistic or photo-realistic style to a target image. Although current methods have shown some success in transferring style, arbitrary style transfer still suffers from several issues, including content leakage: embedding an artistic style can result in unintended changes to the image content. This article proposes an iterative framework called Restorable Arbitrary Style Transfer (RAST) to effectively preserve content and mitigate potential alterations to the content information. RAST transmits both content and style information through multi-restorations and balances the content-style tradeoff in stylized images using image restoration accuracy. To ensure RAST's effectiveness, we introduce two novel loss functions: a multi-restoration loss and a style difference loss. We also propose a new quantitative evaluation method to assess content preservation and style embedding performance. Experimental results show that RAST outperforms state-of-the-art methods in generating stylized images that preserve content and embed style accurately.
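
    To make the abstract's two objectives concrete, below is a minimal PyTorch sketch of a RAST-style training objective: the stylized output must survive repeated restoration back to the content image (a multi-restoration term), while its style statistics move toward the reference style (a style-difference term). This is an illustrative assumption, not the authors' implementation; the networks (ToyEncoderDecoder), the concatenation-based conditioning, the pixel-space Gram matrices, and the weights w_restore and w_style are all hypothetical stand-ins.

# Hypothetical sketch of a RAST-style objective. NOT the paper's code:
# every module name, conditioning choice, and weight here is assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

def gram_matrix(x: torch.Tensor) -> torch.Tensor:
    # Channel-wise Gram matrix, a standard proxy for style statistics
    # (real systems usually compute it on VGG features, not raw pixels).
    b, c, h, w = x.shape
    f = x.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

class ToyEncoderDecoder(nn.Module):
    # Stand-in network; the paper's stylization/restoration networks
    # would be far deeper and encoder-decoder based.
    def __init__(self, in_ch: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

def rast_like_losses(stylizer, restorer, content, style, n_restore=2,
                     w_restore=1.0, w_style=1.0):
    # Stylize the content conditioned on the style image (here by naive
    # channel concatenation -- an assumption, not the paper's design).
    stylized = stylizer(torch.cat([content, style], dim=1))

    # Multi-restoration loss: restoring the stylized image several times
    # should keep reproducing the original content, penalizing leakage.
    restored, restore_loss = stylized, stylized.new_zeros(())
    for _ in range(n_restore):
        restored = restorer(restored)
        restore_loss = restore_loss + F.mse_loss(restored, content)

    # Style difference loss: pull the output's style statistics toward
    # the reference style and push them away from the content's own
    # style (a hinge keeps the repulsion term bounded).
    g_out, g_sty, g_cnt = map(gram_matrix, (stylized, style, content))
    style_loss = F.mse_loss(g_out, g_sty) + F.relu(0.1 - F.mse_loss(g_out, g_cnt))

    return stylized, w_restore * restore_loss + w_style * style_loss

if __name__ == "__main__":
    stylizer, restorer = ToyEncoderDecoder(6), ToyEncoderDecoder(3)
    content, style = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    stylized, loss = rast_like_losses(stylizer, restorer, content, style)
    loss.backward()
    print(stylized.shape, float(loss))

    In this reading, image restoration accuracy (how well the restorer recovers the content) is what balances the content-style tradeoff: a higher w_restore favors content preservation, a higher w_style favors style embedding.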



      Published In

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 5
      May 2024, 650 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3613634
      Editor: Abdulmotaleb El Saddik

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 22 January 2024
      Online AM: 30 December 2023
      Accepted: 21 December 2023
      Revised: 21 November 2023
      Received: 05 June 2023
      Published in TOMM Volume 20, Issue 5


      Author Tags

      1. Neural style transfer
      2. multi-restorations
      3. style difference

      Qualifiers

      • Research-article

