Language Guidance Generation Using Aesthetic Attribute Comparison for Human Photography and AIGC
Pages 127 - 135
Abstract
With the proliferation of mobile photography, leading mobile phone manufacturers are racing to improve the shooting capabilities of their devices and the photo beautification algorithms in their software. However, advances in intelligent devices and algorithms cannot replace the photographer's own subjective technique. At the same time, with the rapid advancement of AIGC technology, AI-simulated shooting has become part of people's daily lives. Assisting both human photography and AIGC with language guidance would therefore be a significant step toward improving the subjective aesthetic quality of photographic images. In this paper, we propose Aesthetic Language Guidance of Image (ALG) and present a series of language guidance rules (ALG Rules). ALG is divided into ALG-T and ALG-I, depending on whether the guidance rules are derived from photography templates or from reference images, respectively. Both ALG-T and ALG-I guide photography along three image attributes: color, light, and composition. Both support two types of input images: landscapes and portraits. We conduct confirmatory experiments in two settings: human photography and AIGC imitation shooting. Comparing the aesthetic scores of the original and modified images shows that the proposed guidance scheme significantly improves the aesthetic quality of photos in terms of color, light, and composition.
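The attribute-comparison idea behind reference-based guidance can be illustrated with a minimal sketch. It is not the paper's ALG Rules, which the abstract does not specify. It assumes each image already has a score in [0, 1] for color, light, and composition from some aesthetic attribute model. The function names, the threshold, and the wording of the guidance sentences are all hypothetical.

```python
from typing import Dict, List

# The three attributes named in the abstract.
ATTRIBUTES = ("color", "light", "composition")


def generate_guidance(
    input_scores: Dict[str, float],
    reference_scores: Dict[str, float],
    threshold: float = 0.1,  # hypothetical cutoff, not from the paper
) -> List[str]:
    """Return one guidance sentence per attribute where the reference
    image scores higher than the input image by more than `threshold`."""
    guidance = []
    for attr in ATTRIBUTES:
        gap = reference_scores[attr] - input_scores[attr]
        if gap > threshold:
            guidance.append(
                f"Improve {attr}: the reference scores {gap:.2f} higher on this attribute."
            )
    return guidance


def mean_improvement(
    original_scores: Dict[str, float],
    modified_scores: Dict[str, float],
) -> float:
    """Average per-attribute score change between the original photo and
    the photo retaken (or regenerated) after following the guidance."""
    total = sum(modified_scores[a] - original_scores[a] for a in ATTRIBUTES)
    return total / len(ATTRIBUTES)


if __name__ == "__main__":
    # Illustrative scores only; they are not experimental data.
    original = {"color": 0.5, "light": 0.7, "composition": 0.4}
    reference = {"color": 0.8, "light": 0.72, "composition": 0.9}
    for sentence in generate_guidance(original, reference):
        print(sentence)
```

In this sketch, the same per-attribute scores serve both steps: the gap to a reference image triggers guidance, and the gap between the original and modified images measures whether following the guidance helped.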
Information
Published In
October 2023
151 pages
Copyright © 2023 ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 29 October 2023
Qualifiers
- Research-article
Funding Sources
- the Fundamental Research Funds for the Central Universities
- the Natural Science Foundation of China
- the Project of Philosophy and Social Science Research, Ministry of Education of China
- the Science and Technology Project of the State Archives Administrator
Conference
MM '23