DOI: 10.1145/3615522.3615557
Research article, VINCI '23 conference proceedings

Naturality: A Natural Reflection of Chinese Calligraphy

Published: 20 October 2023

Abstract

We present a machine-learning-based interactive video installation, powered by CLIP and diffusion models, inspired by the concept of naturality in traditional Chinese calligraphy. The artwork explores contemporary interpretations of this traditional concept through practical methods in Artificial Intelligence Generated Content (AIGC). Technically, the algorithms build on state-of-the-art perceptual and generative models, incorporating multi-dimensional controls over text-to-image and image-to-image translation; conceptually, this real-time art installation extends the discussion opened by Xu Bing's pieces Book from the Sky and Square Word Calligraphy. The project explores the potential of AIGC to bridge human creativity and natural randomness, as well as a shifting creative paradigm enhanced by AI knowledge, perception, and association.
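The image-to-image control the abstract mentions is commonly built on the diffusion forward process of Ho et al. [17]: a source image is partially noised to some timestep t and the denoiser then runs from there, so a "strength" parameter trades fidelity to the source against freedom of generation. The sketch below is an illustration of that mechanism under the standard DDPM linear beta schedule, not the authors' code; all function names are ours.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule from DDPM; alpha_bar[t] = prod_{s<=t} (1 - beta_s)."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def noise_to_timestep(x0, t, alpha_bar, rng):
    """Forward process q(x_t | x_0) = N(sqrt(ab_t) x_0, (1 - ab_t) I):
    partially noise an image to step t in a single jump."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def img2img_start_step(strength, T=1000):
    """Map an image-to-image 'strength' in [0, 1] to the starting timestep:
    0 keeps the input essentially untouched, 1 starts from pure noise."""
    return min(int(strength * T), T - 1)

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.standard_normal((64, 64, 3))          # stand-in for a captured calligraphy frame
t = img2img_start_step(0.6)                    # moderate departure from the source image
xt = noise_to_timestep(x0, t, alpha_bar, rng)  # reverse denoising would begin from x_t
```

In an interactive installation, the same mapping lets a live input (e.g. a viewer's silhouette or a brushstroke) steer how far the generated frame may drift from what the camera saw.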

References

[1]
Mohd A. Ansari and Dushyant K. Singh. 2021. Human detection techniques for real time surveillance: a comprehensive survey. Multimedia tools and applications 80, 6 (2021), 8759–8808.
[2]
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, and Brian Curless. 2022. FILM: Frame Interpolation for Large Motion. In Computer Vision – ECCV 2022 (LNCS, Vol. 13667). Springer, Cham, 250–266.
[3]
Thomas Hayes, Songyang Zhang, Xi Yin, Guan Pang, Sasha Sheng, Harry Yang, Songwei Ge, Qiyuan Hu, and Devi Parikh. 2022. MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration. In Computer Vision – ECCV 2022 (LNCS, Vol. 13668). Springer, Cham, 431–449.
[4]
Dmitry Baranchuk, Ivan Rubachev, Andrey Voynov, Valentin Khrulkov, and Artem Babenko. 2022. Label-Efficient Semantic Segmentation with Diffusion Models. arxiv:2112.03126 [cs.CV]
[5]
Xu Bing. 1988. Book from the Sky. https://www.xubing.com/en/work/details/206?type=project#206
[6]
Xu Bing. 1994. Square Word Calligraphy. https://www.xubing.com/en/work/details/206?type=project#206
[7]
Xu Bing. 2022. Living Word. https://www.xubing.com/en/work/details/695?year=2022&type=year#695
[8]
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 1877–1901. https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
[9]
Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, and Lichao Sun. 2023. A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT. arxiv:2303.04226 [cs.AI]
[10]
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arxiv:2010.11929 [cs.CV]
[11]
Zhida Feng, Zhenyu Zhang, Xintong Yu, Yewei Fang, Lanxin Li, Xuyi Chen, Yuxiang Lu, Jiaxiang Liu, Weichong Yin, Shikun Feng, Yu Sun, Li Chen, Hao Tian, Hua Wu, and Haifeng Wang. 2023. ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts. arxiv:2210.15257 [cs.CV]
[12]
Stanislav Frolov, Tobias Hinz, Federico Raue, Jörn Hees, and Andreas Dengel. 2021. Adversarial text-to-image synthesis: A review. Neural Networks 144 (2021), 187–209. https://doi.org/10.1016/j.neunet.2021.07.019
[13]
Yaroslav Ganin, Tejas Kulkarni, Igor Babuschkin, S. M. Ali Eslami, and Oriol Vinyals. 2018. Synthesizing Programs for Images using Reinforced Adversarial Learning. In Proceedings of the 35th International Conference on Machine Learning(Proceedings of Machine Learning Research, Vol. 80), Jennifer Dy and Andreas Krause (Eds.). PMLR, 1666–1675. https://proceedings.mlr.press/v80/ganin18a.html
[14]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. Generative Adversarial Networks. Commun. ACM 63, 11 (oct 2020), 139–144. https://doi.org/10.1145/3422622
[15]
Significant Gravitas. 2023. Auto-GPT. https://github.com/Significant-Gravitas/Auto-GPT
[16]
David Ha and Douglas Eck. 2017. A Neural Representation of Sketch Drawings. arxiv:1704.03477 [cs.NE]
[17]
Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 6840–6851. https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf
[18]
Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, and David J. Fleet. 2022. Video Diffusion Models. arxiv:2204.03458 [cs.CV]
[19]
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2016. Image-to-Image Translation with Conditional Adversarial Networks. arxiv:1611.07004 [cs.CV]
[20]
Diederik P Kingma and Max Welling. 2022. Auto-Encoding Variational Bayes. arxiv:1312.6114 [stat.ML]
[21]
Midjourney. 2022. Midjourney. https://www.midjourney.com/
[22]
Ben Mildenhall, Peter Hedman, Ricardo Martin-Brualla, Pratul P. Srinivasan, and Jonathan T. Barron. 2022. NeRF in the Dark: High Dynamic Range View Synthesis From Noisy Raw Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 16190–16199.
[23]
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2021. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Commun. ACM 65, 1 (dec 2021), 99–106. https://doi.org/10.1145/3503250
[24]
Mehdi Mirza and Simon Osindero. 2014. Conditional Generative Adversarial Nets. arxiv:1411.1784 [cs.LG]
[25]
Simon Niklaus and Feng Liu. 2020. Softmax Splatting for Video Frame Interpolation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26]
Junheum Park, Chul Lee, and Chang-Su Kim. 2021. Asymmetric Bilateral Motion Estimation for Video Frame Interpolation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 14539–14548.
[27]
Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. 2022. DreamFusion: Text-to-3D using 2D Diffusion. arxiv:2209.14988 [cs.CV]
[28]
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning (ICML). PMLR, 8748–8763.
[29]
Alec Radford, Luke Metz, and Soumith Chintala. 2016. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arxiv:1511.06434 [cs.LG]
[30]
Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. 2022. Hierarchical Text-Conditional Image Generation with CLIP Latents. arxiv:2204.06125 [cs.CV]
[31]
Daniel Rebain, Mark Matthews, Kwang Moo Yi, Dmitry Lagun, and Andrea Tagliasacchi. 2022. LOLNerf: Learn From One Look. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 1558–1567.
[32]
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33]
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. High-Resolution Image Synthesis With Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 10684–10695.
[34]
Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Salimans, Jonathan Ho, David J Fleet, and Mohammad Norouzi. 2022. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. In Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (Eds.). Vol. 35. Curran Associates, Inc., 36479–36494. https://proceedings.neurips.cc/paper_files/paper/2022/file/ec795aeadae0b7d230fa35cbaf04c041-Paper-Conference.pdf
[35]
Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, and Yaniv Taigman. 2022. Make-A-Video: Text-to-Video Generation without Text-Video Data. arxiv:2209.14792 [cs.CV]
[36]
Xu Tan, Tao Qin, Jiang Bian, Tie-Yan Liu, and Yoshua Bengio. 2023. Regeneration Learning: A Learning Paradigm for Data Generation. (2023).
[37]
Aaron van den Oord, Oriol Vinyals, and koray kavukcuoglu. 2017. Neural Discrete Representation Learning. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Vol. 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/7a98af17e63a0ac09ce2e96d03992fbc-Paper.pdf
[38]
Zeyu Wang, Weiqi Shi, Kiraz Akoglu, Eleni Kotoula, Ying Yang, and Holly Rushmeier. 2018. CHER-Ob: A Tool for Shared Analysis and Video Dissemination. J. Comput. Cult. Herit. 11, 4, Article 18 (nov 2018), 22 pages. https://doi.org/10.1145/3230673
[39]
Chung-Yi Weng, Brian Curless, and Ira Kemelmacher-Shlizerman. 2019. Photo Wake-Up: 3D Character Animation From a Single Photo. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40]
Shan-Jean Wu, Chih-Yuan Yang, and Jane Yung-jen Hsu. 2020. CalliGAN: Style and Structure-aware Chinese Calligraphy Character Generator. arxiv:2005.12500 [cs.CV]
[41]
Shaozu Yuan, Ruixue Liu, Meng Chen, Baoyang Chen, Zhijie Qiu, and Xiaodong He. 2021. Learning to Compose Stylistic Calligraphy Artwork with Emotions. In Proceedings of the 29th ACM International Conference on Multimedia (Virtual Event, China) (MM ’21). Association for Computing Machinery, New York, NY, USA, 3701–3709. https://doi.org/10.1145/3474085.3475711
[42]
Jiaxing Zhang, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Lin Zhang, Ping Yang, Xinyu Gao, Ziwei Wu, Xiaoqun Dong, Junqing He, Jianheng Zhuo, Qi Yang, Yongfeng Huang, Xiayu Li, Yanghan Wu, Junyu Lu, Xinyu Zhu, Weifeng Chen, Ting Han, Kunhao Pan, Rui Wang, Hao Wang, Xiaojun Wu, Zhongshen Zeng, and Chongpei Chen. 2022. Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence. CoRR abs/2209.02970 (2022).
[43]
Kai Zhang, Nick Kolkin, Sai Bi, Fujun Luan, Zexiang Xu, Eli Shechtman, and Noah Snavely. 2022. ARF: Artistic Radiance Fields. In Computer Vision – ECCV 2022, Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, and Tal Hassner (Eds.). Springer Nature Switzerland, Cham, 717–733.
[44]
Lvmin Zhang and Maneesh Agrawala. 2023. Adding Conditional Control to Text-to-Image Diffusion Models. arxiv:2302.05543 [cs.CV]
[45]
Linqi Zhou, Yilun Du, and Jiajun Wu. 2021. 3D Shape Generation and Completion Through Point-Voxel Diffusion. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 5826–5835.

Cited By

View all
  • (2024) A Study of the Intersection of Traditional Calligraphy and Digital Media Art. Applied Mathematics and Nonlinear Sciences 9, 1. https://doi.org/10.2478/amns-2024-2148. Online publication date: 5 Aug 2024.
  • (2024) Popular Chinese Fonts: The Role of Minimalism, the Influence of Zen and the Bauhaus School. Contemporary Buddhism 24, 1–2, 110–133. https://doi.org/10.1080/14639947.2024.2342422. Online publication date: 5 Jun 2024.

Published In

VINCI '23: Proceedings of the 16th International Symposium on Visual Information Communication and Interaction
September 2023
308 pages
ISBN:9798400707513
DOI:10.1145/3615522

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. AI art
  2. Chinese calligraphy
  3. naturality

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

VINCI 2023

Acceptance Rates

Overall Acceptance Rate 71 of 193 submissions, 37%

Article Metrics

  • Downloads (last 12 months): 137
  • Downloads (last 6 weeks): 18
Reflects downloads up to 12 Sep 2024
