Paint2Pix: Interactive Painting Based Progressive Image Synthesis and Editing

Singh, Jaskirat; Zheng, Liang; Smith, Cameron; Echevarria, Jose

doi:10.1007/978-3-031-19781-9_39

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13674))

Included in the following conference series:

European Conference on Computer Vision

2622 Accesses
3 Citations

Abstract

Controllable image synthesis with user scribbles is a topic of keen interest in the computer vision community. In this paper, for the first time we study the problem of photorealistic image synthesis from incomplete and primitive human paintings. In particular, we propose a novel approach paint2pix, which learns to predict (and adapt) “what a user wants to draw” from rudimentary brushstroke inputs, by learning a mapping from the manifold of incomplete human paintings to their realistic renderings. When used in conjunction with recent works in autonomous painting agents, we show that paint2pix can be used for progressive image synthesis from scratch. During this process, paint2pix allows a novice user to progressively synthesize the desired image output, while requiring just few coarse user scribbles to accurately steer the trajectory of the synthesis process. Furthermore, we find that our approach also forms a surprisingly convenient approach for real image editing, and allows the user to perform a diverse range of custom fine-grained edits through the addition of only a few well-placed brushstrokes. Source code and demo is available at https://github.com/1jsingh/paint2pix.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

EUR 32.99 /Month

Get 10 units per month
Download Article/Chapter or Ebook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Subscribe now

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 89.00; Price excludes VAT (USA)

Softcover Book: USD 119.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents

Intelligent-paint: a Chinese painting process generation method based on vision transformer

Article 03 April 2024

Image palette: painting style transfer via brushstroke control synthesis

Article 22 March 2016

References

Abdal, R., Qin, Y., Wonka, P.: Image2styleGAN: how to embed images into the styleGAN latent space? In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4432–4441 (2019)
Google Scholar
Abdal, R., Qin, Y., Wonka, P.: Image2styleGAN++: how to edit the embedded images? In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8296–8305 (2020)
Google Scholar
Abdal, R., Zhu, P., Mitra, N.J., Wonka, P.: StyleFlow: attribute-conditioned exploration of styleGAN-generated images using conditional continuous normalizing flows. ACM Trans. Graph. (TOG) 40(3), 1–21 (2021)
Article Google Scholar
Alaluf, Y., Patashnik, O., Cohen-Or, D.: ReStyle: a residual-based StyleGAN encoder via iterative refinement. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2021
Google Scholar
Chen, T., Cheng, M.M., Tan, P., Shamir, A., Hu, S.M.: Sketch2Photo: internet image montage. ACM Trans. Graph. (TOG) 28(5), 1–10 (2009)
Google Scholar
Chen, W., Hays, J.: SketchyGAN: towards diverse and realistic sketch to image synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9416–9425 (2018)
Google Scholar
Chen, X., Fan, H., Girshick, R., He, K.: Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297 (2020)
Choi, Y., Choi, M., Kim, M., Ha, J.W., Kim, S., Choo, J.: StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8789–8797 (2018)
Google Scholar
Choi, Y., Uh, Y., Yoo, J., Ha, J.W.: StarGAN v2: diverse image synthesis for multiple domains. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8188–8197 (2020)
Google Scholar
Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: additive angular margin loss for deep face recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4690–4699 (2019)
Google Scholar
Esser, P., Rombach, R., Ommer, B.: Taming transformers for high-resolution image synthesis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12873–12883 (2021)
Google Scholar
Ghosh, A., et al.: Interactive sketch & fill: Multiclass sketch-to-image translation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1171–1180 (2019)
Google Scholar
Härkönen, E., Hertzmann, A., Lehtinen, J., Paris, S.: GANSpace: discovering interpretable GAN controls. In: Advances in Neural Information Processing Systems, vol. 33, pp. 9841–9850 (2020)
Google Scholar
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local nash equilibrium. In: Advances in Neural Information Processing Systems 30 (2017)
Google Scholar
Huang, Z., Heng, W., Zhou, S.: Learning to paint with model-based deep reinforcement learning. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 8709–8718 (2019)
Google Scholar
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134 (2017)
Google Scholar
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4401–4410 (2019)
Google Scholar
Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., Aila, T.: Analyzing and improving the image quality of StyleGAN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8110–8119 (2020)
Google Scholar
Kotovenko, D., Wright, M., Heimbrecht, A., Ommer, B.: Rethinking style transfer: from pixels to parameterized brushstrokes. arXiv preprint arXiv:2103.17185 (2021)
Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3D object representations for fine-grained categorization. In: 4th International IEEE Workshop on 3D Representation and Recognition (3dRR-13), Sydney, Australia (2013)
Google Scholar
Lee, C.H., Liu, Z., Wu, L., Luo, P.: MaskGAN: towards diverse and interactive facial image manipulation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5549–5558 (2020)
Google Scholar
Lee, J., Kim, E., Lee, Y., Kim, D., Chang, J., Choo, J.: Reference-based sketch image colorization using augmented-self reference and dense semantic correspondence. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5801–5810 (2020)
Google Scholar
Li, X., Zhang, B., Liao, J., Sander, P.V.: Deep sketch-guided cartoon video synthesis. CoRR (2020)
Google Scholar
Liu, R., Yu, Q., Yu, S.X.: Unsupervised sketch to photo synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12348, pp. 36–52. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58580-8_3
Chapter Google Scholar
Liu, S., et al.: Paint transformer: feed forward neural painting with stroke prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6598–6607 (2021)
Google Scholar
Liu, X., Yin, G., Shao, J., Wang, X., et al.: Learning to predict layout-to-image conditional convolutions for semantic image synthesis. In: Advances in Neural Information Processing Systems 32 (2019)
Google Scholar
Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2337–2346 (2019)
Google Scholar
Patashnik, O., Wu, Z., Shechtman, E., Cohen-Or, D., Lischinski, D.: Styleclip: text-driven manipulation of StyleGAN imagery. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2085–2094 (2021)
Google Scholar
Richardson, E., et al.: Encoding in style: a StyleGAN encoder for image-to-image translation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2021
Google Scholar
Shen, Y., Zhou, B.: Closed-form factorization of latent semantics in GANs. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1532–1540 (2021)
Google Scholar
Singh, J., Smith, C., Echevarria, J., Zheng, L.: Intelli-paint: towards developing human-like painting agents. In: European Conference on Computer Vision. Springer (2022)
Google Scholar
Singh, J., Zheng, L.: Combining semantic guidance and deep reinforcement learning for generating human level paintings. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (2021)
Google Scholar
Sushko, V., Schönfeld, E., Zhang, D., Gall, J., Schiele, B., Khoreva, A.: You only need adversarial supervision for semantic image synthesis. arXiv preprint arXiv:2012.04781 (2020)
Tov, O., Alaluf, Y., Nitzan, Y., Patashnik, O., Cohen-Or, D.: Designing an encoder for stylegan image manipulation. arXiv preprint arXiv:2102.02766 (2021)
Wang, Q., Guo, C., Dai, H.N., Li, P.: Self-stylized neural painter. In: SIGGRAPH Asia 2021 Posters, pp. 1–2 (2021)
Google Scholar
Xiang, X., Liu, D., Yang, X., Zhu, Y., Shen, X., Allebach, J.P.: Adversarial open domain adaptation for sketch-to-photo synthesis. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1434–1444 (2022)
Google Scholar
Yang, S., Wang, Z., Liu, J., Guo, Z.: Controllable sketch-to-image translation for robust face synthesis. IEEE Trans. Image Process. 30, 8797–8810 (2021)
Article Google Scholar
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR (2018)
Google Scholar
Zhu, J.-Y., Krähenbühl, P., Shechtman, E., Efros, A.A.: Generative visual manipulation on the natural image manifold. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 597–613. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_36
Chapter Google Scholar
Zhu, P., Abdal, R., Qin, Y., Wonka, P.: Sean: Image synthesis with semantic region-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5104–5113 (2020)
Google Scholar
Zou, Z., Shi, T., Qiu, S., Yuan, Y., Shi, Z.: Stylized neural painting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 15689–15698 (2021)
Google Scholar

Download references

Author information

Authors and Affiliations

Australian National University, Canberra, Australia
Jaskirat Singh & Liang Zheng
Adobe Research, San Jose, USA
Jaskirat Singh, Cameron Smith & Jose Echevarria

Authors

Jaskirat Singh
View author publications
You can also search for this author in PubMed Google Scholar
Liang Zheng
View author publications
You can also search for this author in PubMed Google Scholar
Cameron Smith
View author publications
You can also search for this author in PubMed Google Scholar
Jose Echevarria
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jaskirat Singh .

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2898 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Singh, J., Zheng, L., Smith, C., Echevarria, J. (2022). Paint2Pix: Interactive Painting Based Progressive Image Synthesis and Editing. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13674. Springer, Cham. https://doi.org/10.1007/978-3-031-19781-9_39

Download citation

DOI: https://doi.org/10.1007/978-3-031-19781-9_39
Published: 23 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19780-2
Online ISBN: 978-3-031-19781-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Paint2Pix: Interactive Painting Based Progressive Image Synthesis and Editing

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents

Intelligent-paint: a Chinese painting process generation method based on vision transformer

Image palette: painting style transfer via brushstroke control synthesis

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 2898 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Paint2Pix: Interactive Painting Based Progressive Image Synthesis and Editing

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents

Intelligent-paint: a Chinese painting process generation method based on vision transformer

Image palette: painting style transfer via brushstroke control synthesis

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 2898 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation