A large-scale evaluation of acoustic and subjective music-similarity measures
A valuable goal in the field of Music Information Retrieval (MIR) is to devise an automatic measure of the similarity between two musical recordings based only on an analysis of their audio content. Such a tool, a quantitative measure of similarity, can be used to build classification, retrieval, browsing, and recommendation systems. Developing such a measure, however, presupposes some ground truth: a single underlying similarity that constitutes the desired output of the measure. Music similarity is an elusive concept, wholly subjective, multifaceted, and a moving target, but one that must be pursued in support of applications that provide automatic organization of large music collections.
In this article, we explore music-similarity measures in several ways, motivated by different types of questions. We are first motivated by the desire to improve automatic, acoustic-based similarity measures. Researchers from several groups have recently tried many variations of a few basic ideas, but it remains unclear which are best suited for a given application. Few authors perform comparisons across multiple techniques, and it is impossible to compare results from different authors, because they do not share the required common ground: a common database and a common evaluation method.