Style Transformation Method of Stage Background Images by Emotion Words of Lyrics
Abstract
1. Introduction
- The proposed method uses emotion words contained in song lyrics to transform the style of stage background images. Representing the emotions expressed in the lyrics of a performed song through the stage background image can increase audience immersion, and it allows emotions that are complex to represent computationally to be depicted.
- Emotions that are difficult for humans to determine intuitively can also be represented, because the proposed method transfers style from an image that is highly correlated with the emotion expressed in the lyrics.
2. Related Work
2.1. Stage Background Image Recommendation System
2.2. Emotion Classification
2.3. Style Transfer
2.4. Comparison of Methods for Image Style Transformation
3. Method of Transferring Image Style Based on Song Lyrics
3.1. Overview
3.2. Step 1: Lyric Preprocessing
3.3. Step 2: Emotion Image Processing
4. Experiments
4.1. Dataset and Experimental Environment
4.2. Experiment Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Li, S.; Jang, S.; Sung, Y. Melody Extraction and Encoding Method for Generating Healthcare Music Automatically. Electronics 2019, 8, 1250.
- Li, S.; Jang, S.; Sung, Y. Automatic Melody Composition Using Enhanced GAN. Mathematics 2019, 7, 883.
- Wen, J.; She, J.; Li, X.; Mao, H. Visual Background Recommendation for Dance Performances Using Deep Matrix Factorization. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2018, 14, 1–19.
- Xu, T.; Zhang, P.; Huang, Q.; Zhang, H.; Gan, Z.; Huang, X.; He, X. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018.
- Zhang, H.; Xu, T.; Li, H.; Zhang, S.; Wang, X.; Huang, X.; Metaxas, D.N. StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 1947–1962.
- Hu, X.; Downie, J.S.; Ehmann, A.F. Lyric Text Mining in Music Mood Classification. In Proceedings of the 10th International Society for Music Information Retrieval (ISMIR), Kobe, Japan, 26–30 October 2009.
- Karras, T.; Laine, S.; Aila, T. A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019.
- Hori, G. Color Extraction from Lyrics. In Proceedings of the 2019 4th International Conference on Automation, Control and Robotics Engineering (CACRE), Shenzhen, China, 19–21 July 2019.
- Isola, P.; Zhu, J.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
- Sung, Y.; Jin, Y.; Kwak, J.; Lee, S.; Cho, K. Advanced Camera Image Cropping Approach for CNN-Based End-to-End Controls on Sustainable Computing. Sustainability 2018, 10, 816.
- Gatys, L.A.; Ecker, A.S.; Bethge, M. Image Style Transfer Using Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016.
- Zhao, S.; Gao, Y.; Jiang, X.; Yao, H.; Chua, T.; Sun, X. Exploring Principles-of-Art Features for Image Emotion Recognition. In Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, FL, USA, 3–7 November 2014.
- Machajdik, J.; Hanbury, A. Affective Image Classification Using Features Inspired by Psychology and Art Theory. In Proceedings of the 18th ACM International Conference on Multimedia, Florence, Italy, 25–29 October 2010.
- Han, E.; Cha, H. Extraction of Critical Low-Level Image Features for Effective Emotion Analysis. Inst. Control. Robot. Syst. 2019, 25, 319–326.
- Wei, Z.; Zhang, J.; Lin, Z.; Lee, J.; Balasubramanian, N.; Hoai, M.; Samaras, D. Learning Visual Emotion Representations from Web Data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–18 June 2020.
- Yang, D.; Lee, W. Music Emotion Identification from Lyrics. In Proceedings of the IEEE International Symposium on Multimedia (ISM), San Diego, CA, USA, 14–16 December 2009.
- Zhao, S.; Yao, H.; Gao, Y.; Ding, G.; Chua, T. Predicting Personalized Image Emotion Perceptions in Social Networks. IEEE Trans. Affect. Comput. 2018, 9, 526–540.
- Lee, J.; Lim, H.; Kim, H. Similarity Evaluation of Popular Music Based on Emotion and Structure of Lyrics. KIISE Trans. Comput. Pract. 2016, 22, 479–487.
- NRC Word-Emotion Association Lexicon. Available online: http://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm (accessed on 31 December 2020).
- Mohammad, S.M.; Turney, P.D. Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon. In Proceedings of the Computational Approaches to Analysis and Generation of Emotion in Text (CAAGET), Los Angeles, CA, USA, 13–19 June 2010.
- Gao, W.; Zhang, X.; Yang, L.; Liu, H. An Improved Sobel Edge Detection. In Proceedings of the 2010 3rd International Conference on Computer Science and Information Technology, Chengdu, China, 9–11 July 2010; Volume 5, pp. 67–71.
- Zhu, C.; Byrd, R.H.; Lu, P.; Nocedal, J. Algorithm 778: L-BFGS-B: Fortran Subroutines for Large-Scale Bound-Constrained Optimization. ACM Trans. Math. Softw. (TOMS) 1997, 23, 550–560.
- Ogul, H.; Celik, N. A Web Application for Content Based Geographic Image Retrieval. In Proceedings of the 2017 25th Signal Processing and Communications Applications Conference (SIU), Antalya, Turkey, 15–18 May 2017; pp. 1–4.
- Choi, E.; Lee, C. Feature Extraction Based on the Bhattacharyya Distance. In Pattern Recognition; Elsevier: Amsterdam, The Netherlands, 2003; Volume 36, pp. 1703–1709.
Criterion | Zhao et al. [12] | Machajdik et al. [13] | Zhao et al. [17] | The Proposed Method |
---|---|---|---|---|
Training data | IAPS, Art photo, Abstract painting | IAPS, Art photo, Abstract painting | User’s metadata, IAPS, Abstract painting, Flickr | Song lyrics, Flickr |
Encoding | Image-based | Image-based | User’s metadata-based | One-hot encoding, Texture-based |
Model | SVM | Waterfall segmentation algorithm | Multi-Task Hypergraph learning | CNN |
Stage | Description |
---|---|
Lyric preprocessing | The lyrics selected by the user are divided into verses and choruses, and a probability distribution of emotion words is extracted for each verse and chorus. |
Emotion image processing | From the tagged emotion images, an appropriate image is selected for each verse and chorus, and the style of each selected image is transferred to the stage background image. |
Emotion | Anger | Disgust | Fear | Sadness | Anticipation | Joy | Surprise | Trust |
---|---|---|---|---|---|---|---|---|
Quantity | 3428 (16%) | 3414 (15%) | 3572 (17%) | 3449 (15%) | 2312 (6%) | 2325 (10%) | 2625 (10%) | 2692 (11%) |
Verse 1 | Chorus 1 |
---|---|
Anticipation: 0.2, Fear: 0.067, Joy: 0.2, Sadness: 0.133, Surprise: 0.2, Trust: 0.2 | Anticipation: 0.077, Fear: 0.154, Joy: 0.231, Sadness: 0.077, Surprise: 0.231, Trust: 0.154 |
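A per-section emotion distribution like the ones above can be obtained by counting emotion tags of lyric words and normalizing the counts. The following is a minimal sketch under stated assumptions, not the authors' implementation: `TOY_LEXICON` is a hypothetical, hand-written stand-in for a word-emotion lexicon, and the function names and tokenization rule are illustrative choices.

```python
import re
from collections import Counter

# Hypothetical miniature lexicon (assumption): each word maps to the
# emotions it is associated with. A real lexicon would be far larger.
TOY_LEXICON = {
    "hope": ["anticipation", "joy", "trust"],
    "dark": ["fear", "sadness"],
    "smile": ["joy", "surprise", "trust"],
    "alone": ["fear", "sadness"],
    "dream": ["anticipation", "joy"],
}


def emotion_distribution(section_text, lexicon=TOY_LEXICON):
    """Return {emotion: probability} for one verse or chorus.

    Every lexicon hit adds one count per associated emotion; the counts
    are then divided by their total so the values sum to 1.
    """
    counts = Counter()
    for word in re.findall(r"[a-z']+", section_text.lower()):
        counts.update(lexicon.get(word, []))
    total = sum(counts.values())
    if total == 0:
        return {}  # no emotion words found in this section
    return {emotion: n / total for emotion, n in sorted(counts.items())}


def distributions_by_section(sections, lexicon=TOY_LEXICON):
    """Apply emotion_distribution to a {section name: text} mapping."""
    return {
        name: emotion_distribution(text, lexicon)
        for name, text in sections.items()
    }


if __name__ == "__main__":
    # Made-up lyric fragments, used only to exercise the functions.
    lyrics = {
        "Verse 1": "I hope to dream, alone in the dark",
        "Chorus 1": "Smile and dream, smile and hope",
    }
    for name, dist in distributions_by_section(lyrics).items():
        print(name, dist)
```

Sections without any lexicon word yield an empty distribution here; how such sections should be handled is a design choice this sketch leaves open.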
Measure | Target image | Image with a similar distribution (a) | Image with a similar distribution (b) | Image with a different distribution (c) |
---|---|---|---|---|
Emotion probability distribution | Amusement: 0.14, Awe: 0.14, Content: 0.29, Excitement: 0.29, Sad: 0.14 | Amusement: 0.14, Awe: 0.29, Content: 0.29, Excitement: 0.14, Sad: 0.14 | Amusement: 0.29, Awe: 0.14, Content: 0.29, Excitement: 0.29 | Anger: 1.00 |
HISTCMP_CORREL | 1.00 | 0.12 | 0.22 | 0.1 |
HISTCMP_CHISQR | 0.00 | 7665.83 | 7443.83 | 17,372.30 |
HISTCMP_INTERSECT | 1.00 | 0.38 | 0.28 | 0.05 |
HISTCMP_BHATTACHARYYA | 0.00 | 0.62 | 0.62 | 0.92 |
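The four measure names in the table match the comparison flags of OpenCV's `compareHist`. As a hedged illustration, the sketch below re-implements the corresponding formulas in plain Python. It assumes the table reports these standard definitions, and it uses the emotion probability vectors from the table only as small illustrative histograms (the bin order is an arbitrary choice). It is not expected to reproduce the scores in the table, whose input histograms are not given in this section.

```python
import math


def correlation(h1, h2):
    """Pearson correlation of bin values; 1.0 means identical shape."""
    n = len(h1)
    m1, m2 = sum(h1) / n, sum(h2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(
        sum((a - m1) ** 2 for a in h1) * sum((b - m2) ** 2 for b in h2)
    )
    # Assumption: treat a zero denominator (flat histogram) as a match.
    return num / den if den else 1.0


def chi_square(h1, h2):
    """Chi-square distance; 0.0 means identical. Empty h1 bins are skipped."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a > 0)


def intersection(h1, h2):
    """Sum of bin-wise minima; larger means more overlap."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def bhattacharyya(h1, h2):
    """Bhattacharyya-based distance in [0, 1]; 0.0 means identical."""
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 1.0
    coeff = sum(math.sqrt(a * b) for a, b in zip(h1, h2)) / math.sqrt(s1 * s2)
    return math.sqrt(max(0.0, 1.0 - coeff))


# Bin order (assumed): Amusement, Awe, Content, Excitement, Sad, Anger.
TARGET = [0.14, 0.14, 0.29, 0.29, 0.14, 0.00]
SIMILAR = [0.14, 0.29, 0.29, 0.14, 0.14, 0.00]
DIFFERENT = [0.00, 0.00, 0.00, 0.00, 0.00, 1.00]

if __name__ == "__main__":
    candidates = (("similar", SIMILAR), ("different", DIFFERENT))
    for name, hist in candidates:
        print(
            name,
            round(correlation(TARGET, hist), 3),
            round(chi_square(TARGET, hist), 3),
            round(intersection(TARGET, hist), 3),
            round(bhattacharyya(TARGET, hist), 3),
        )
```

Correlation and intersection grow with similarity, whereas chi-square and the Bhattacharyya distance shrink, so a selection step built on these measures has to rank candidates in the direction appropriate to the chosen measure.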
[Figure panels: Target image; Image with a similar distribution (a); Image with a similar distribution (b); Image with a different distribution (c)]
[Table: images per lyric section; image content not recoverable. Lyrics and section columns: Forgotten hero, Meant to be this way, Sax is my cardio (Verse 1, Chorus, Verse 2); Heart of a lion (Leo) (Verse 1, Chorus, Verse 2, Verse 3)]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yoon, H.; Li, S.; Sung, Y. Style Transformation Method of Stage Background Images by Emotion Words of Lyrics. Mathematics 2021, 9, 1831. https://doi.org/10.3390/math9151831