U-Attention to Textures: Hierarchical Hourglass Vision Transformer for Universal Texture Synthesis

Authors:

Valentin Deschaintre, Arthur Roullier

CVMP '22: Proceedings of the 19th ACM SIGGRAPH European Conference on Visual Media Production

Article No.: 9, Pages 1 - 10

https://doi.org/10.1145/3565516.3565525

Published: 01 December 2022

Abstract

We present a novel U-Attention vision Transformer for universal texture synthesis. We exploit the natural long-range dependencies enabled by the attention mechanism to allow our approach to synthesize diverse textures while preserving their structures in a single inference. We propose a hierarchical hourglass backbone that attends to the global structure and performs patch mapping at varying scales in a coarse-to-fine-to-coarse stream. Complemented by skip connections and convolution designs that propagate and fuse information at different scales, our hierarchical U-Attention architecture unifies attention to features from macro structures to micro details and progressively refines synthesis results at successive stages. Our method achieves stronger 2× synthesis than previous work on both stochastic and structured textures while generalizing to unseen textures without fine-tuning. Ablation studies demonstrate the effectiveness of each component of our architecture.
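
The coarse-to-fine-to-coarse idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the patch sizes `(8, 4, 2, 4, 8)`, the identity query/key/value projections, the averaging used to fuse skip connections, and every function name below are assumptions made for illustration. The sketch also keeps the output the same size as the input, so it does not model the 2× synthesis step or the convolution designs.

```python
import numpy as np

def patchify(x, p):
    """Split an (H, W, C) array into non-overlapping p x p patches, one flat token each."""
    H, W, C = x.shape
    x = x.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape((H // p) * (W // p), p * p * C)

def unpatchify(tokens, p, H, W, C):
    """Inverse of patchify: reassemble flat patch tokens into an (H, W, C) array."""
    x = tokens.reshape(H // p, W // p, p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def self_attention(tokens):
    """Single-head scaled dot-product attention with identity projections (assumption)."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ tokens

def hourglass_attention(x, patch_sizes=(8, 4, 2, 4, 8)):
    """Attend over patches at each scale; fuse matching scales via skip connections."""
    H, W, C = x.shape
    half = len(patch_sizes) // 2
    skips = {}
    out = x
    for i, p in enumerate(patch_sizes):
        attended = unpatchify(self_attention(patchify(out, p)), p, H, W, C)
        out = out + attended                  # residual connection
        if i < half:
            skips[p] = out                    # remember coarse-to-fine features
        elif i > half:
            out = 0.5 * (out + skips[p])      # hypothetical fusion by averaging
    return out

texture = np.random.default_rng(0).normal(size=(16, 16, 3))
result = hourglass_attention(texture)
```

Large patches let each token cover macro structure, while small patches in the middle of the stream attend to fine detail; the skip connections carry the earlier features at each scale across to the matching later stage.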

Supplementary Material

Supplemental material (cvmp22-9-supp.pdf, 9.19 MB)

Index Terms

U-Attention to Textures: Hierarchical Hourglass Vision Transformer for Universal Texture Synthesis
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Matching
      2. Computer vision representations
        Appearance and texture representations
        Hierarchical representations
        Image representations

Information & Contributors

Information

Published In

CVMP '22: Proceedings of the 19th ACM SIGGRAPH European Conference on Visual Media Production

December 2022

97 pages

ISBN:9781450399395

DOI:10.1145/3565516

Editors:
Marco Volino (University of Surrey, UK)
Rafał Mantiuk (University of Cambridge, UK)
Armin Mustafa (University of Surrey, UK)
Yulia Gryaditskaya (University of Surrey, UK)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

CVMP '22

Sponsor:

SIGGRAPH

CVMP '22: European Conference on Visual Media Production

December 1 - 2, 2022

London, United Kingdom

Acceptance Rates

Overall Acceptance Rate 40 of 67 submissions, 60%

Bibliometrics

Total Citations: 4
Total Downloads: 99
Downloads (Last 12 months): 34
Downloads (Last 6 weeks): 2

Reflects downloads up to 13 Jan 2025

Citations

Cited By

Sartor S, Peers P. (2024). Content-aware Tile Generation using Exterior Boundary Inpainting. ACM Transactions on Graphics 43:6 (1-12). Online publication date: 19-Dec-2024.
https://dl.acm.org/doi/10.1145/3687981
Rodriguez-Pardo C, Casas D, Garces E, Lopez-Moreno J. (2024). TexTile: A Differentiable Metric for Texture Tileability. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (4439-4449). Online publication date: 16-Jun-2024.
https://doi.org/10.1109/CVPR52733.2024.00425
Liang X, Mo H, Gao C. (2023). Controllable Garment Image Synthesis Integrated with Frequency Domain Features. Computer Graphics Forum 42:7. Online publication date: 5-Nov-2023.
https://doi.org/10.1111/cgf.14938
Chatillon P, Gousseau Y, Lefebvre S. (2023). A Geometrically Aware Auto-Encoder for Multi-texture Synthesis. Scale Space and Variational Methods in Computer Vision (263-275). Online publication date: 10-May-2023.
https://doi.org/10.1007/978-3-031-31975-4_20
