Aesthetic Attributes Assessment of Images

Authors:

Xinghui Zhou

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 311 - 319

https://doi.org/10.1145/3343031.3350970

Published: 15 October 2019

Abstract

Image aesthetic quality assessment has been an active research topic over the last decade. Recently, comment-type assessment (aesthetic captioning) has been proposed to describe the general aesthetic impression of an image in text. In this paper, we propose Aesthetic Attributes Assessment of Images, that is, captioning of aesthetic attributes. This is a new formulation of image aesthetic assessment that predicts a caption for each aesthetic attribute together with an aesthetic score for that attribute. We introduce a new dataset, DPC-Captions, which contains comments on up to 5 aesthetic attributes per image, obtained through knowledge transfer from a fully annotated small-scale dataset. We then propose the Aesthetic Multi-Attribute Network (AMAN), which is trained on a mixture of the fully annotated small-scale PCCD dataset and the weakly annotated large-scale DPC-Captions dataset. AMAN combines transfer learning and an attention model in a single framework. Experimental results on the DPC-Captions and PCCD datasets show that our method can predict captions for 5 aesthetic attributes together with a numerical score for each attribute. Using the evaluation criteria standard in image captioning, we show that our specially designed AMAN model outperforms the traditional CNN-LSTM model and the more recent SCA-CNN model for image captioning.
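As a rough illustration of what scoring per-attribute captions with a captioning-style metric could look like, the sketch below computes a clipped unigram precision with a brevity penalty (in the spirit of BLEU-1, single reference) for each attribute. The abstract does not say which metrics were used, so the choice of metric is an assumption. The attribute names, example captions, and the helper names `unigram_precision` and `score_per_attribute` are likewise hypothetical and are not taken from the paper or its datasets.

```python
import math
from collections import Counter


def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision times a brevity penalty (BLEU-1 style).

    Assumption: one reference caption per candidate and whitespace
    tokenization; real captioning toolkits are more elaborate.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    cand_counts = Counter(cand)
    # Each candidate word is credited at most as often as it occurs in the reference.
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = clipped / len(cand)
    # Penalize candidates that are shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision


def score_per_attribute(predicted: dict, reference: dict) -> dict:
    """Score each predicted attribute caption against its reference caption.

    `predicted` maps attribute name -> (caption, numeric score);
    `reference` maps attribute name -> reference caption.
    """
    return {
        attr: unigram_precision(caption, reference[attr])
        for attr, (caption, _score) in predicted.items()
        if attr in reference
    }


# Hypothetical model output: one caption and one numeric score per attribute.
predicted = {
    "composition": ("the subject is well centered", 7.5),
    "color": ("nice colors", 6.0),
}
reference = {
    "composition": "the subject is well centered",
    "color": "nice colors and light",
}
metric_by_attribute = score_per_attribute(predicted, reference)
```

The dictionary layout mirrors the task as the abstract states it: each attribute carries both a caption and a numeric score, and the caption metric is computed separately per attribute.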

Index Terms

Aesthetic Attributes Assessment of Images
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019

2794 pages

ISBN:9781450368896

DOI:10.1145/3343031

General Chairs: Laurent Amsaleg (CNRS-IRISA, France), Benoit Huet (EURECOM, France), Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs: Guillaume Gravier (CNRS-IRISA, France), Hayley Hung (Delft University of Technology, Netherlands), Chong-Wah Ngo (City University of Hong Kong, Hong Kong), Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

the Open Project Program of State Key Laboratory of Virtual Reality Technology and Systems, Beihang University
National Natural Science Foundation of China
the Open Funds of CETC Big Data Research Institute Co., Ltd.
Fundamental Research Funds for the Central Universities

Conference

MM '19

Sponsor:

SIGMM

MM '19: The 27th ACM International Conference on Multimedia

October 21 - 25, 2019

Nice, France

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;

Overall Acceptance Rate 995 of 4,171 submissions, 24%
