Two-sided Rank Consistent Ordinal Regression for Interpretable Music Key Recommendation

Authors: Shigeki Tanaka, Keita Yokoyama, Yi Fang

ICTIR '22: Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval

Pages 223 - 231

https://doi.org/10.1145/3539813.3545147

Published: 25 August 2022

Abstract

Model interpretability has attracted increasing attention in the IR community because end-users (decision-makers) need to correctly understand, and consequently trust, how a model functions. Ordinal regression, meanwhile, is widely used in ranking and prediction tasks, but it does not guarantee rank-consistent predictions over the output labels, which makes its results hard to explain. Take music key recommendation in karaoke as an example: a user can select a key ranging from -7 to +7 so that the song matches the user's vocal competence and can be performed better. If the best key for a user to sing a song is -3, the keys smaller than -3 should be ranked in decreasing order (that is, the further a key lies from -3, the lower it ranks). Similarly, the keys on the positive side should be ranked in decreasing order. To address this challenge, we propose a novel Two-sided Rank Consistent Ordinal Regression model. We show that the model not only predicts the key for the target song given the user's singing history, but also carries theoretical guarantees of two-sided rank-monotonicity. We train the model with a history encoder built from recurrent units and a key decoder built on the Transformer. Experimental results on a real-world karaoke dataset demonstrate the effectiveness of the proposed model.
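To make the two-sided rank-monotonicity property concrete, the following Python sketch builds key scores that are guaranteed to fall off on both sides of a chosen best key. It is an illustration only and not the model proposed in the paper: the function names (`two_sided_scores`, `is_two_sided_monotone`), the `raw_steps` input, and the softplus-based construction are all assumptions introduced here.

```python
import math

KEYS = list(range(-7, 8))  # the 15 selectable karaoke keys, -7 .. +7


def softplus(x: float) -> float:
    """Numerically stable softplus; positive for inputs of moderate size."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def two_sided_scores(best_key: int, raw_steps: list[float]) -> dict[int, float]:
    """Hypothetical construction (not the paper's model).

    raw_steps holds one unconstrained real number per key. Each is mapped
    to a positive penalty, and penalties are accumulated while walking
    away from best_key, so scores can only decrease on either side.
    """
    scores = {best_key: 0.0}
    running = 0.0
    # Walk left from the peak: best_key - 1, best_key - 2, ..., -7.
    for key in range(best_key - 1, KEYS[0] - 1, -1):
        running -= softplus(raw_steps[key - KEYS[0]])
        scores[key] = running
    running = 0.0
    # Walk right from the peak: best_key + 1, ..., +7.
    for key in range(best_key + 1, KEYS[-1] + 1):
        running -= softplus(raw_steps[key - KEYS[0]])
        scores[key] = running
    return scores


def is_two_sided_monotone(scores: dict[int, float], best_key: int) -> bool:
    """Check that scores rise up to best_key and fall after it."""
    left = [scores[k] for k in range(KEYS[0], best_key + 1)]
    right = [scores[k] for k in range(best_key, KEYS[-1] + 1)]
    rising = all(a < b for a, b in zip(left, left[1:]))
    falling = all(a > b for a, b in zip(right, right[1:]))
    return rising and falling


# Example: best key -3 with arbitrary unconstrained inputs.
example = two_sided_scores(-3, [0.5 * i - 2.0 for i in range(len(KEYS))])
print(is_two_sided_monotone(example, -3))
```

Because monotonicity comes from the construction rather than from training, any values of `raw_steps` in a moderate range give a ranking that is consistent on both sides of the peak. This is the kind of guarantee the abstract refers to, obtained here by a much simpler mechanism.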

Index Terms

Two-sided Rank Consistent Ordinal Regression for Interpretable Music Key Recommendation
1. Information systems
  1. Information retrieval
    1. Specialized information retrieval
      1. Multimedia and multimodal retrieval
        Music retrieval

Published In

ICTIR '22: Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval

August 2022

289 pages

ISBN: 9781450394123

DOI: 10.1145/3539813

Program Chairs: Fabio Crestani (Università della Svizzera Italiana - USI, Switzerland), Gabriella Pasi (Univ. Milano-Bicocca, Italy), Eric Gaussier (Univ. Grenoble-Alpes, France)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICTIR '22: The 2022 ACM SIGIR International Conference on the Theory of Information Retrieval

Sponsor: SIGIR

July 11 - 12, 2022, Madrid, Spain

Acceptance Rates

ICTIR '22 Paper Acceptance Rate 32 of 80 submissions, 40%;

Overall Acceptance Rate 235 of 527 submissions, 45%
