DOI: 10.1145/3343031.3350970
research-article

Aesthetic Attributes Assessment of Images

Published: 15 October 2019

Abstract

Image aesthetic quality assessment has been an active research topic over the last decade. Most recently, comment-type assessment (aesthetic captions) has been proposed to describe the general aesthetic impression of an image in text. In this paper, we propose Aesthetic Attributes Assessment of Images, that is, aesthetic attribute captioning. This is a new formulation of image aesthetic assessment that predicts a caption for each aesthetic attribute together with an aesthetic score for that attribute. We introduce a new dataset, DPC-Captions, which contains comments on up to 5 aesthetic attributes per image, obtained through knowledge transfer from a small-scale, fully annotated dataset. We then propose the Aesthetic Multi-Attribute Network (AMAN), which is trained on a mixture of the small-scale, fully annotated PCCD dataset and the large-scale, weakly annotated DPC-Captions dataset. AMAN combines transfer learning and an attention model in a single framework. Experimental results on the DPC-Captions and PCCD datasets show that our method can predict captions for 5 aesthetic attributes together with a numerical score for each attribute. Using standard image-captioning evaluation criteria, we show that our specially designed AMAN model outperforms the traditional CNN-LSTM model and the more recent SCA-CNN model for image captioning.
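
The abstract does not say how the two datasets are mixed during training or how images with fewer than 5 annotated attributes are handled. As a rough illustration only, the sketch below shows one common way to set this up: draw each batch partly from a fully annotated pool and partly from a weakly annotated pool, and average the per-attribute loss only over attributes that actually have a comment. The names `Sample`, `mixed_batch`, `masked_mean_loss`, and the `full_fraction` parameter are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

NUM_ATTRIBUTES = 5  # the abstract states up to 5 attributes per image


@dataclass
class Sample:
    """One image with one optional caption and score slot per attribute."""
    image_id: str
    captions: List[Optional[str]]   # None = no comment for this attribute
    scores: List[Optional[float]]   # None = no score for this attribute


def mixed_batch(full: List[Sample], weak: List[Sample], batch_size: int,
                full_fraction: float, rng: random.Random) -> List[Sample]:
    """Sample a batch with a fixed share of fully annotated images.

    Assumption: sampling with replacement at a fixed ratio; the paper
    may mix the datasets differently.
    """
    n_full = round(batch_size * full_fraction)
    batch = [rng.choice(full) for _ in range(n_full)]
    batch += [rng.choice(weak) for _ in range(batch_size - n_full)]
    rng.shuffle(batch)
    return batch


def masked_mean_loss(per_attribute_loss: List[float], sample: Sample) -> float:
    """Average the loss over attributes that have a caption; skip the rest."""
    kept = [loss for loss, caption in zip(per_attribute_loss, sample.captions)
            if caption is not None]
    return sum(kept) / len(kept) if kept else 0.0
```

With this layout, an image that has comments for only two attributes contributes the mean of those two losses, so missing attributes neither penalize nor reward the model.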

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia
October 2019
2794 pages
ISBN: 9781450368896
DOI: 10.1145/3343031
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. aesthetic assessment
  2. image captioning
  3. semi-supervised learning

Qualifiers

  • Research-article

Conference

MM '19
Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;
Overall Acceptance Rate 995 of 4,171 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 68
  • Downloads (last 6 weeks): 2
Reflects downloads up to 26 Sep 2024.

Cited By

  • (2024) MACFAN: A multi-channel fusion network for subjective aesthetic attributes with automated comments labeling pipeline. 2024 IEEE International Conference on Multimedia and Expo (ICME), 1-6. DOI: 10.1109/ICME57554.2024.10687485. Online publication date: 15-Jul-2024
  • (2024) Confidence-based dynamic cross-modal memory network for image aesthetic assessment. Pattern Recognition, Vol. 149, 110227. DOI: 10.1016/j.patcog.2023.110227. Online publication date: May-2024
  • (2024) Combining Image Caption and Aesthetic Description Using Siamese Network. Proceedings of the 13th International Conference on Computer Engineering and Networks, 41-51. DOI: 10.1007/978-981-99-9239-3_4. Online publication date: 4-Jan-2024
  • (2024) Aesthetic Multi-attributes Captioning Network for Photos. Artificial Intelligence and Robotics, 121-130. DOI: 10.1007/978-981-99-9109-9_12. Online publication date: 4-Jan-2024
  • (2023) Research Progress on the Aesthetic Quality Assessment of Complex Layout Images Based on Deep Learning. Applied Sciences, Vol. 13, 17, 9763. DOI: 10.3390/app13179763. Online publication date: 29-Aug-2023
  • (2023) An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music Recommendation. Proceedings of the 31st ACM International Conference on Multimedia, 6938-6947. DOI: 10.1145/3581783.3612140. Online publication date: 26-Oct-2023
  • (2023) AesCLIP: Multi-Attribute Contrastive Learning for Image Aesthetics Assessment. Proceedings of the 31st ACM International Conference on Multimedia, 1117-1126. DOI: 10.1145/3581783.3611969. Online publication date: 26-Oct-2023
  • (2023) Aesthetic Visual Question Answering of Photographs. 2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), 359-364. DOI: 10.1109/ICMEW59549.2023.00068. Online publication date: Jul-2023
  • (2023) Towards an Exhaustive Evaluation of Vision-Language Foundation Models. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 339-352. DOI: 10.1109/ICCVW60793.2023.00041. Online publication date: 2-Oct-2023
  • (2023) Thinking Image Color Aesthetics Assessment: Models, Datasets and Benchmarks. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 21781-21790. DOI: 10.1109/ICCV51070.2023.01996. Online publication date: 1-Oct-2023
