DOI: 10.1145/3607541.3616819

Language Guidance Generation Using Aesthetic Attribute Comparison for Human Photography and AIGC

Published: 29 October 2023

Abstract

With the proliferation of mobile photography technology, leading mobile phone manufacturers are racing to enhance the shooting capabilities of their devices and the photo-beautification algorithms in their software. However, advances in intelligent hardware and algorithms cannot supplant human subjective photography techniques. At the same time, with the rapid advancement of AIGC technology, AI-simulated shooting has become part of people's daily lives. If language guidance could assist both human photography and AIGC, it would be a significant step toward subjectively improving the aesthetic quality of photographic images. In this paper, we propose Aesthetic Language Guidance of Image (ALG) and present a series of language guidance rules (ALG Rules). ALG is divided into ALG-T and ALG-I according to whether the guiding rules are derived from photography templates or from reference images, respectively. Both ALG-T and ALG-I provide guidance based on three image attributes: color, light, and composition, and both handle two types of input images, landscape and portrait. We conduct confirmatory experiments using two methods: human photography and AIGC imitation shooting. By comparing the aesthetic scores of the original and modified images, we show that our proposed guidance scheme significantly improves the aesthetic quality of photos in terms of color, composition, and lighting.
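
The abstract describes an attribute-comparison pipeline: an input photo's color, light, and composition are scored and compared against a photography template (ALG-T) or a reference image (ALG-I), and language guidance is generated for the attributes that fall short. The sketch below illustrates only that comparison-then-advise structure; the scores, margin threshold, and guidance phrases are hypothetical placeholders, not the paper's actual rules or implementation.

```python
# Minimal sketch of the attribute-comparison idea behind ALG-T / ALG-I.
# All names, scores, thresholds, and guidance phrases are hypothetical
# placeholders for illustration; they are not the authors' actual rules.
from dataclasses import dataclass

ATTRIBUTES = ("color", "light", "composition")

@dataclass
class AttributeScores:
    """Per-attribute aesthetic sub-scores, e.g. on a 0-10 scale."""
    color: float
    light: float
    composition: float

# Hypothetical guidance phrases, keyed by the attribute that needs improvement.
GUIDANCE = {
    "color": "Adjust white balance or shoot in warmer light for richer tones.",
    "light": "Reposition so the main subject is front-lit rather than back-lit.",
    "composition": "Recompose using the rule of thirds and remove edge clutter.",
}

def language_guidance(photo: AttributeScores,
                      reference: AttributeScores,
                      margin: float = 0.5) -> list[str]:
    """Compare the photo against a template (ALG-T) or reference image (ALG-I)
    and return guidance sentences for attributes where the photo clearly lags."""
    tips = []
    for attr in ATTRIBUTES:
        gap = getattr(reference, attr) - getattr(photo, attr)
        if gap > margin:  # only advise on attributes with a clear deficit
            tips.append(GUIDANCE[attr])
    return tips

if __name__ == "__main__":
    shot = AttributeScores(color=5.1, light=4.2, composition=6.8)
    ref = AttributeScores(color=6.0, light=6.5, composition=6.9)
    for tip in language_guidance(shot, ref):
        print(tip)
```

In the paper itself, the attribute scores would come from an aesthetic attribute assessment model and the rules are further specialized for landscape and portrait inputs; this snippet is only a minimal illustration of the comparison logic.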

    Published In

    McGE '23: Proceedings of the 1st International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice
    October 2023
    151 pages
    ISBN: 9798400702785
    DOI: 10.1145/3607541
    • General Chairs:
    • Cheng Jin,
    • Liang He,
    • Mingli Song,
    • Rui Wang


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. AIGC
    2. ALG-I
    3. ALG-T
    4. computational aesthetics
    5. image aesthetic assessment
    6. language guidance

    Qualifiers

    • Research-article

    Funding Sources

    • the Fundamental Research Funds for the Central Universities
    • the Natural Science Foundation of China
    • the Project of Philosophy and Social Science Research, Ministry of Education of China
    • the Science and Technology Project of the State Archives Administrator

    Conference

    MM '23

