DOI: 10.1145/3615522.3615557
Research article, VINCI '23 conference proceedings

Naturality: A Natural Reflection of Chinese Calligraphy

Published: 20 October 2023

Abstract

We present a machine-learning-based interactive video installation, powered by CLIP and diffusion models, inspired by the concept of naturality in traditional Chinese calligraphy. The artwork explores contemporary interpretations of this traditional concept through practical methods in Artificial Intelligence Generated Content (AIGC). Technically, the algorithms build on state-of-the-art perceptual and generative models, incorporating multi-dimensional controls over text-to-image and image-to-image translation; conceptually, this real-time art installation extends the discussion opened by Xu Bing's pieces Book from the Sky and Square Word Calligraphy. The project explores the potential of AIGC to bridge human creativity and natural randomness, as well as a shifting creative paradigm enhanced by AI knowledge, perception, and association.
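The image-to-image control the abstract mentions is commonly built on the diffusion forward process of Ho et al. [17]: a source image is partially noised to some timestep t and the denoiser then runs from there, so a "strength" parameter trades fidelity to the source against freedom of generation. The sketch below is an illustration of that mechanism under the standard DDPM linear beta schedule, not the authors' code; all function names are ours.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule from DDPM; alpha_bar[t] = prod_{s<=t} (1 - beta_s)."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def noise_to_timestep(x0, t, alpha_bar, rng):
    """Forward process q(x_t | x_0) = N(sqrt(ab_t) x_0, (1 - ab_t) I):
    partially noise an image to step t in a single jump."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def img2img_start_step(strength, T=1000):
    """Map an image-to-image 'strength' in [0, 1] to the starting timestep:
    0 keeps the input essentially untouched, 1 starts from pure noise."""
    return min(int(strength * T), T - 1)

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.standard_normal((64, 64, 3))          # stand-in for a captured calligraphy frame
t = img2img_start_step(0.6)                    # moderate departure from the source image
xt = noise_to_timestep(x0, t, alpha_bar, rng)  # reverse denoising would begin from x_t
```

In an interactive installation, the same mapping lets a live input (e.g. a viewer's silhouette or a brushstroke) steer how far the generated frame may drift from what the camera saw.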

References

[1]
Mohd A. Ansari and Dushyant K. Singh. 2021. Human detection techniques for real time surveillance: a comprehensive survey. Multimedia tools and applications 80, 6 (2021), 8759–8808.
[2]
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, and Brian Curless. 2022. FILM: Frame Interpolation for Large Motion. In Computer Vision – ECCV 2022 (LNCS, Vol. 13667). Springer, Cham, 250–266.
[3]
Thomas Hayes, Songyang Zhang, Xi Yin, Guan Pang, Sasha Sheng, Harry Yang, Songwei Ge, Qiyuan Hu, and Devi Parikh. 2022. MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration. In Computer Vision – ECCV 2022 (LNCS, Vol. 13668). Springer, Cham, 431–449.
[4]
Dmitry Baranchuk, Ivan Rubachev, Andrey Voynov, Valentin Khrulkov, and Artem Babenko. 2022. Label-Efficient Semantic Segmentation with Diffusion Models. arxiv:2112.03126 [cs.CV]
[5]
Xu Bing. 1988. Book from the Sky. https://www.xubing.com/en/work/details/206?type=project#206
[6]
Xu Bing. 1994. Square Word Calligraphy. https://www.xubing.com/en/work/details/206?type=project#206
[7]
Xu Bing. 2022. Living Word. https://www.xubing.com/en/work/details/695?year=2022&type=year#695
[8]
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 1877–1901. https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
[9]
Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, and Lichao Sun. 2023. A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT. arxiv:2303.04226 [cs.AI]
[10]
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arxiv:2010.11929 [cs.CV]
[11]
Zhida Feng, Zhenyu Zhang, Xintong Yu, Yewei Fang, Lanxin Li, Xuyi Chen, Yuxiang Lu, Jiaxiang Liu, Weichong Yin, Shikun Feng, Yu Sun, Li Chen, Hao Tian, Hua Wu, and Haifeng Wang. 2023. ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts. arxiv:2210.15257 [cs.CV]
[12]
Stanislav Frolov, Tobias Hinz, Federico Raue, Jörn Hees, and Andreas Dengel. 2021. Adversarial text-to-image synthesis: A review. Neural Networks 144 (2021), 187–209. https://doi.org/10.1016/j.neunet.2021.07.019
[13]
Yaroslav Ganin, Tejas Kulkarni, Igor Babuschkin, S. M. Ali Eslami, and Oriol Vinyals. 2018. Synthesizing Programs for Images using Reinforced Adversarial Learning. In Proceedings of the 35th International Conference on Machine Learning(Proceedings of Machine Learning Research, Vol. 80), Jennifer Dy and Andreas Krause (Eds.). PMLR, 1666–1675. https://proceedings.mlr.press/v80/ganin18a.html
[14]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. Generative Adversarial Networks. Commun. ACM 63, 11 (oct 2020), 139–144. https://doi.org/10.1145/3422622
[15]
Significant Gravitas. 2023. Auto-GPT. https://github.com/Significant-Gravitas/Auto-GPT
[16]
David Ha and Douglas Eck. 2017. A Neural Representation of Sketch Drawings. arxiv:1704.03477 [cs.NE]
[17]
Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 6840–6851. https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf
[18]
Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, and David J. Fleet. 2022. Video Diffusion Models. arxiv:2204.03458 [cs.CV]
[19]
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2016. Image-to-Image Translation with Conditional Adversarial Networks. arxiv:1611.07004 [cs.CV]
[20]
Diederik P Kingma and Max Welling. 2022. Auto-Encoding Variational Bayes. arxiv:1312.6114 [stat.ML]
[21]
Midjourney. 2022. Midjourney. https://www.midjourney.com/
[22]
Ben Mildenhall, Peter Hedman, Ricardo Martin-Brualla, Pratul P. Srinivasan, and Jonathan T. Barron. 2022. NeRF in the Dark: High Dynamic Range View Synthesis From Noisy Raw Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 16190–16199.
[23]
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2021. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Commun. ACM 65, 1 (dec 2021), 99–106. https://doi.org/10.1145/3503250
[24]
Mehdi Mirza and Simon Osindero. 2014. Conditional Generative Adversarial Nets. arxiv:1411.1784 [cs.LG]
[25]
Simon Niklaus and Feng Liu. 2020. Softmax Splatting for Video Frame Interpolation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26]
Junheum Park, Chul Lee, and Chang-Su Kim. 2021. Asymmetric Bilateral Motion Estimation for Video Frame Interpolation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 14539–14548.
[27]
Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. 2022. DreamFusion: Text-to-3D using 2D Diffusion. arxiv:2209.14988 [cs.CV]
[28]
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning (ICML). PMLR, 8748–8763.
[29]
Alec Radford, Luke Metz, and Soumith Chintala. 2016. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arxiv:1511.06434 [cs.LG]
[30]
Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. 2022. Hierarchical Text-Conditional Image Generation with CLIP Latents. arxiv:2204.06125 [cs.CV]
[31]
Daniel Rebain, Mark Matthews, Kwang Moo Yi, Dmitry Lagun, and Andrea Tagliasacchi. 2022. LOLNerf: Learn From One Look. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 1558–1567.
[32]
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33]
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. High-Resolution Image Synthesis With Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 10684–10695.
[34]
Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Salimans, Jonathan Ho, David J Fleet, and Mohammad Norouzi. 2022. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. In Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (Eds.). Vol. 35. Curran Associates, Inc., 36479–36494. https://proceedings.neurips.cc/paper_files/paper/2022/file/ec795aeadae0b7d230fa35cbaf04c041-Paper-Conference.pdf
[35]
Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, and Yaniv Taigman. 2022. Make-A-Video: Text-to-Video Generation without Text-Video Data. arxiv:2209.14792 [cs.CV]
[36]
Xu Tan, Tao Qin, Jiang Bian, Tie-Yan Liu, and Yoshua Bengio. 2023. Regeneration Learning: A Learning Paradigm for Data Generation. (2023).
[37]
Aaron van den Oord, Oriol Vinyals, and koray kavukcuoglu. 2017. Neural Discrete Representation Learning. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Vol. 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/7a98af17e63a0ac09ce2e96d03992fbc-Paper.pdf
[38]
Zeyu Wang, Weiqi Shi, Kiraz Akoglu, Eleni Kotoula, Ying Yang, and Holly Rushmeier. 2018. CHER-Ob: A Tool for Shared Analysis and Video Dissemination. J. Comput. Cult. Herit. 11, 4, Article 18 (nov 2018), 22 pages. https://doi.org/10.1145/3230673
[39]
Chung-Yi Weng, Brian Curless, and Ira Kemelmacher-Shlizerman. 2019. Photo Wake-Up: 3D Character Animation From a Single Photo. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40]
Shan-Jean Wu, Chih-Yuan Yang, and Jane Yung-jen Hsu. 2020. CalliGAN: Style and Structure-aware Chinese Calligraphy Character Generator. arxiv:2005.12500 [cs.CV]
[41]
Shaozu Yuan, Ruixue Liu, Meng Chen, Baoyang Chen, Zhijie Qiu, and Xiaodong He. 2021. Learning to Compose Stylistic Calligraphy Artwork with Emotions. In Proceedings of the 29th ACM International Conference on Multimedia (Virtual Event, China) (MM ’21). Association for Computing Machinery, New York, NY, USA, 3701–3709. https://doi.org/10.1145/3474085.3475711
[42]
Jiaxing Zhang, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Lin Zhang, Ping Yang, Xinyu Gao, Ziwei Wu, Xiaoqun Dong, Junqing He, Jianheng Zhuo, Qi Yang, Yongfeng Huang, Xiayu Li, Yanghan Wu, Junyu Lu, Xinyu Zhu, Weifeng Chen, Ting Han, Kunhao Pan, Rui Wang, Hao Wang, Xiaojun Wu, Zhongshen Zeng, and Chongpei Chen. 2022. Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence. CoRR abs/2209.02970 (2022).
[43]
Kai Zhang, Nick Kolkin, Sai Bi, Fujun Luan, Zexiang Xu, Eli Shechtman, and Noah Snavely. 2022. ARF: Artistic Radiance Fields. In Computer Vision – ECCV 2022, Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, and Tal Hassner (Eds.). Springer Nature Switzerland, Cham, 717–733.
[44]
Lvmin Zhang and Maneesh Agrawala. 2023. Adding Conditional Control to Text-to-Image Diffusion Models. arxiv:2302.05543 [cs.CV]
[45]
Linqi Zhou, Yilun Du, and Jiajun Wu. 2021. 3D Shape Generation and Completion Through Point-Voxel Diffusion. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 5826–5835.

Cited By

View all
  • (2024) A Study of the Intersection of Traditional Calligraphy and Digital Media Art. Applied Mathematics and Nonlinear Sciences 9, 1. https://doi.org/10.2478/amns-2024-2148. Online publication date: 5 Aug 2024.
  • (2024) Popular Chinese Fonts: The Role of Minimalism, the Influence of Zen and the Bauhaus School. Contemporary Buddhism 24, 1–2, 110–133. https://doi.org/10.1080/14639947.2024.2342422. Online publication date: 5 Jun 2024.

Published In

VINCI '23: Proceedings of the 16th International Symposium on Visual Information Communication and Interaction
September 2023
308 pages
ISBN:9798400707513
DOI:10.1145/3615522

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. AI art
  2. Chinese calligraphy
  3. naturality

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

VINCI 2023

Acceptance Rates

Overall Acceptance Rate 71 of 193 submissions, 37%

Article Metrics

  • Downloads (last 12 months): 137
  • Downloads (last 6 weeks): 18
Reflects downloads up to 12 Sep 2024
