DOI: 10.1145/3539813.3545147
research-article

Two-sided Rank Consistent Ordinal Regression for Interpretable Music Key Recommendation

Published: 25 August 2022

Abstract

Model interpretability has attracted increasing attention in the IR community, since end-users (decision-makers) need to understand correctly, and consequently trust, how a model behaves. Ordinal regression is widely used in ranking and prediction tasks, but it does not guarantee rank-consistent predictions over the output labels, which makes its results hard to explain. Take music key recommendation in karaoke as an example: a user can select a key ranging from -7 to +7 so that the song matches the user's vocal competence and can be sung better. If the best key for a user to sing a song is -3, the keys smaller than -3 should be ranked in decreasing order as they move further from -3. Similarly, the keys on the positive side should also be ranked in decreasing order. To address this challenge, we propose a novel Two-sided Rank Consistent Ordinal Regression model. We show that the model not only predicts the key for the target song given the user's singing history, but also comes with theoretical guarantees of two-sided rank-monotonicity. We train the model with a history encoder built from recurrent units and a key decoder built on the Transformer. Experimental results on a real-world karaoke dataset demonstrate the effectiveness of the proposed model.
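
The two-sided ordering described in the abstract can be made concrete with a small check. The sketch below is not the paper's model; it only illustrates one reading of the property, namely that per-key scores never increase as the key moves away from the best key on either side. The names `KEYS`, `is_two_sided_rank_consistent`, and `two_sided_scores` are hypothetical, and the construction from non-negative drops is an assumption chosen for illustration.

```python
from typing import List, Sequence

# The 15 selectable keys, -7 .. +7, as stated in the abstract.
KEYS: List[int] = list(range(-7, 8))


def is_two_sided_rank_consistent(scores: Sequence[float], best_key: int) -> bool:
    """Return True if scores never increase when moving away from best_key.

    Assumed reading of the abstract: scores are non-decreasing up to the
    best key and non-increasing after it.
    """
    if len(scores) != len(KEYS):
        raise ValueError("expected one score per key")
    peak = KEYS.index(best_key)
    left_ok = all(scores[i] <= scores[i + 1] for i in range(peak))
    right_ok = all(scores[i] >= scores[i + 1] for i in range(peak, len(scores) - 1))
    return left_ok and right_ok


def two_sided_scores(best_key: int, drops: Sequence[float]) -> List[float]:
    """Hypothetical construction that satisfies the property by design.

    Starting from score 0 at the best key, subtract a non-negative drop
    at every step taken away from it (abs() enforces non-negativity).
    """
    if len(drops) != len(KEYS):
        raise ValueError("expected one drop per key")
    peak = KEYS.index(best_key)
    scores = [0.0] * len(KEYS)
    for i in range(peak - 1, -1, -1):      # walk left from the peak
        scores[i] = scores[i + 1] - abs(drops[i])
    for i in range(peak + 1, len(KEYS)):   # walk right from the peak
        scores[i] = scores[i - 1] - abs(drops[i])
    return scores
```

With a best key of -3, scores such as `[-abs(k + 3) for k in KEYS]` pass the check, while raising the score of key +5 above that of key +4 breaks the ordering on the positive side and fails it.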



    Published In

    ICTIR '22: Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval
    August 2022
    289 pages
    ISBN:9781450394123
    DOI:10.1145/3539813

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. model interpretability
    2. music information retrieval
    3. ordinal regression

    Qualifiers

    • Research-article

    Conference

    ICTIR '22

    Acceptance Rates

    ICTIR '22 Paper Acceptance Rate 32 of 80 submissions, 40%;
    Overall Acceptance Rate 235 of 527 submissions, 45%
