Language Guidance Generation Using Aesthetic Attribute Comparison for Human Photography and AIGC

Authors:

Hao Lou

McGE '23: Proceedings of the 1st International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice

Pages 127 - 135

https://doi.org/10.1145/3607541.3616819

Published: 29 October 2023

Abstract

With the proliferation of mobile photography technology, leading mobile phone manufacturers are racing to enhance the shooting capabilities of their equipment and the photo beautification algorithms of their software. However, the development of intelligent equipment and algorithms cannot supplant subjective human photography techniques. At the same time, with the rapid advancement of AIGC technology, AI simulation shooting has become an integral part of people's daily lives. Assisting human photography and AIGC with language guidance would therefore be a significant step toward improving the subjective aesthetic quality of photographic images. In this paper, we propose Aesthetic Language Guidance of Image (ALG) and present a series of language guidance rules (ALG Rules). ALG is divided into ALG-T and ALG-I, depending on whether the guiding rules are derived from photography templates or from reference images, respectively. Both ALG-T and ALG-I provide photography guidance based on three image attributes (color, light, and composition) and both handle two types of input images: landscapes and portraits. We conduct confirmatory experiments using two methods: human photography and AIGC imitation shooting. Comparing the aesthetic scores of the original and modified images shows that the proposed guidance scheme significantly improves the aesthetic quality of photos in terms of the color, light, and composition attributes.
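The abstract does not spell out the ALG Rules themselves, so the following is only a minimal sketch of the general idea of attribute comparison: per-attribute scores of an input image are compared with those of a reference (or template), and a guidance sentence is produced for each attribute that lags behind. The names `AttributeScores` and `generate_guidance`, the 0 to 1 score range, the `margin` threshold, and the wording of the guidance are all assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass

# The three attributes named in the abstract.
ATTRIBUTES = ("color", "light", "composition")


@dataclass(frozen=True)
class AttributeScores:
    """Hypothetical per-attribute aesthetic scores, assumed to lie in [0, 1]."""
    color: float
    light: float
    composition: float


def generate_guidance(image: AttributeScores,
                      reference: AttributeScores,
                      margin: float = 0.1) -> list[str]:
    """Return one guidance sentence per attribute where the reference
    outscores the input image by more than `margin` (an assumed threshold)."""
    guidance = []
    for name in ATTRIBUTES:
        gap = getattr(reference, name) - getattr(image, name)
        if gap > margin:
            guidance.append(
                f"Improve {name}: the reference scores {gap:.2f} higher on this attribute."
            )
    return guidance


def mean_score(scores: AttributeScores) -> float:
    """Unweighted mean of the three attribute scores (an assumed aggregate)."""
    return sum(getattr(scores, name) for name in ATTRIBUTES) / len(ATTRIBUTES)


if __name__ == "__main__":
    original = AttributeScores(color=0.4, light=0.7, composition=0.5)
    reference = AttributeScores(color=0.8, light=0.75, composition=0.9)
    for sentence in generate_guidance(original, reference):
        print(sentence)
```

In an evaluation like the one the abstract describes, an aggregate such as `mean_score` could be computed for the original photo and again for the photo retaken after following the guidance; how the paper actually scores and aggregates attributes is not stated here.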

Index Terms

1. Applied computing
  1. Arts and humanities
    1. Media arts

Published In

McGE '23: Proceedings of the 1st International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice

October 2023

151 pages

ISBN:9798400702785

DOI:10.1145/3607541

General Chairs:
Cheng Jin, Professor, Fudan University, China
Liang He, Professor, East China Normal University, China
Mingli Song, Professor, Zhejiang University, China
Rui Wang, Professor, IIE, Chinese Academy of Sciences, China

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

the Fundamental Research Funds for the Central Universities
the Natural Science Foundation of China
the Project of Philosophy and Social Science Research, Ministry of Education of China
the Science and Technology Project of the State Archives Administrator

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29, 2023

Ottawa ON, Canada
