Abstract
Music streaming services are increasingly popular among younger generations, who seek social experiences through personal expression and the sharing of subjective feelings in comments. However, such emotional aspects are often ignored by current platforms, which limits listeners’ ability to find music that triggers specific personal feelings. To address this gap, this study proposes a novel approach that leverages deep learning methods to capture contextual keywords, sentiments, and emotion-induction mechanisms from song comments. The study augments an existing music app with two features: the presentation of tags that best represent a song’s comments, and a novel map metaphor that reorganizes song comments by chronological order, content, and sentiment. The effectiveness of the proposed approach is validated through a usage scenario and a user study, which demonstrate its capability to improve the user experience of exploring songs and browsing comments of interest. This study contributes to the advancement of music streaming services by providing a more personalized and emotionally rich listening experience for younger generations.
Notes
EM (Exact Match) is the percentage of predicted answers that exactly match the ground-truth answers.
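For illustration, a minimal sketch of how such an EM score could be computed; the function name and example inputs below are hypothetical and not taken from the paper.

```python
def exact_match(predictions, ground_truths):
    """Return the percentage of predictions that exactly match their ground truths."""
    if not ground_truths:
        return 0.0
    matches = sum(
        1 for pred, truth in zip(predictions, ground_truths)
        if pred.strip() == truth.strip()
    )
    return 100.0 * matches / len(ground_truths)

# Example: two of the three predictions match exactly, so EM is about 66.7
print(exact_match(["starry night", "longing", "mother"],
                  ["starry night", "longing", "mom"]))
```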
Acknowledgements
We would like to express our gratitude to our domain experts and the anonymous reviewers for their insightful comments. This work is funded by grants from the National Natural Science Foundation of China (No. 62372298), the Shanghai Frontiers Science Center of Human-centered Artificial Intelligence (ShangHAI), and the Key Laboratory of Intelligent Perception and Human–Machine Collaboration (ShanghaiTech University), Ministry of Education.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file 1 (mp4 51609 KB)
A The definition of comment style
There are 13 different kinds of comments in our observational study. Their definitions are as follows; a schematic encoding of the taxonomy is sketched after the list.
Contextual Background: This type of comment provides relevant background information related to the discussed topic, offering context and historical perspective to enhance the understanding of the discussion.
Expert Analysis: These comments are written by professionals or experts and aim to provide in-depth assessments and insights regarding specific topics, products, works, or services. They often include professional opinions and ratings.
Shared Emotions: This comment expresses the commenter’s emotions or experiences that resonate with the discussed topic or work, emphasizing the emotional connection and shared feelings.
Trending Highlights: These comments highlight the current popularity and trends surrounding a particular topic, product, service, or work, often based on social media or internet trends.
Creative Team Insights: This comment type offers detailed insights into the creative team or authors behind a work, including their background, previous works, artistic style, and other relevant information.
Literary Assessment: These comments pertain to literary works such as novels, poetry, or plays, providing evaluations of the work’s structure, themes, language, or style.
Creative Excellence: These comments focus on the creative and artistic aspects of the content, emphasizing its uniqueness and creative qualities.
Latest Updates: These comments provide information about the most recent developments or news regarding a specific topic, product, or event, offering insights into current events.
Personal Experiences: This comment includes the commenter’s personal experiences or stories related to the topic or work, using personal narratives to support or explain their viewpoints.
Real-time Commentary: These comments are related to ongoing events, live broadcasts, or on-site activities, offering real-time commentary and viewpoints on current happenings.
Social Media Trends: These comments relate to trends, news, or hot topics on social media platforms, often including commentary and analysis of social media events.
Fan Sentiments: This comment type encompasses opinions and emotional expressions from both fans and critics regarding specific celebrities, works, teams, or products, highlighting the sentiments and reasons for their support or criticism.
Concise Remarks: These comments are brief and to the point, providing a succinct opinion or comment without detailed analysis or descriptions.
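As an illustration only, the 13 comment styles above could be encoded as a simple lookup table when labeling comments; this is a minimal sketch assuming a Python implementation, and the identifiers and glosses are hypothetical rather than part of the study’s coding scheme.

```python
# Hypothetical encoding of the 13 comment styles defined above,
# e.g. for use when manually labeling comments during coding.
COMMENT_STYLES = {
    "contextual_background": "Background and historical context for the discussed topic",
    "expert_analysis": "In-depth professional assessment, opinions, and ratings",
    "shared_emotions": "Emotions or experiences that resonate with the topic or work",
    "trending_highlights": "Current popularity and trends around the topic",
    "creative_team_insights": "Details on the creators, their background and artistic style",
    "literary_assessment": "Evaluation of a work's structure, themes, language, or style",
    "creative_excellence": "Focus on the content's uniqueness and creative qualities",
    "latest_updates": "Most recent developments or news on the topic",
    "personal_experiences": "Personal stories that support or explain a viewpoint",
    "real_time_commentary": "Live commentary on ongoing events or broadcasts",
    "social_media_trends": "Trends, news, or hot topics on social media platforms",
    "fan_sentiments": "Fans' and critics' support or criticism and their reasons",
    "concise_remarks": "Brief opinions without detailed analysis",
}

def describe(style: str) -> str:
    """Return the one-line definition of a comment style, if it exists."""
    return COMMENT_STYLES.get(style, "unknown style")

print(describe("shared_emotions"))
```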
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, L., Liu, Q., Zhang, C. et al. Amplifying the music listening experience through song comments on music streaming platforms. J Vis 27, 401–419 (2024). https://doi.org/10.1007/s12650-024-00966-2