DOI: 10.1145/3126686.3126757

Hierarchical Representation Based on Bayesian Nonparametric Tree-Structured Mixture Model for Playing Technique Classification

Published: 23 October 2017 Publication History

Abstract

This work develops a topic model-based hierarchical representation for identifying the latent characteristics behind frame-level musical features. Frame-level features and music clips are regarded as acoustic words and acoustic documents, respectively. A Gaussian hierarchical latent Dirichlet allocation (G-hLDA) is proposed to find the latent topics behind an acoustic document. G-hLDA handles the continuous features directly instead of transforming them into discrete words, which reduces the information loss caused by discretization through vector quantization. Specifically, each latent topic identified by G-hLDA is represented as a node in an infinitely deep, infinitely branching tree. For a music clip, the number of its acoustic words at each node is computed to form the hierarchical representation. The proposed representation hierarchically captures both the components shared among music clips and the components unique to each clip, resulting in improved performance. Experimental results on a guitar playing technique database demonstrate that the proposed method outperforms the baselines.
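
The counting step described in the abstract can be illustrated with a minimal sketch. It assumes that inference has already assigned every frame (acoustic word) of a clip to one node of the topic tree; the node names, the fixed node ordering, and the function `hierarchical_representation` are hypothetical and are not taken from the paper. The G-hLDA inference itself is not shown.

```python
from collections import Counter


def hierarchical_representation(frame_nodes, node_order):
    """Count how many frames of one clip are assigned to each tree node.

    frame_nodes: one node label per frame (assumed output of inference).
    node_order:  fixed ordering of tree nodes, so clips are comparable.
    """
    counts = Counter(frame_nodes)
    return [counts.get(node, 0) for node in node_order]


# Hypothetical tree: a root shared by all clips, two children, one leaf each.
node_order = ["root", "root/a", "root/b", "root/a/x", "root/b/y"]

# Hypothetical per-frame node assignments for two clips.
clip_1 = ["root", "root", "root/a", "root/a/x", "root/a/x"]
clip_2 = ["root", "root/b", "root/b", "root/b/y"]

rep_1 = hierarchical_representation(clip_1, node_order)
rep_2 = hierarchical_representation(clip_2, node_order)
print(rep_1)  # [2, 1, 0, 2, 0]
print(rep_2)  # [1, 0, 2, 0, 1]
```

In this toy example both clips place frames on the root node (the shared component), while the deeper nodes separate them (the unique components). A count vector of this kind could then be passed to a classifier.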

Cited By

  • (2021) Instrument Playing Technique Recognition: A Greek Music Use Case. Proceedings of the Worldwide Music Conference 2021, 124-136. DOI: 10.1007/978-3-030-74039-9_13. Online publication date: 13-Apr-2021
  • (2018) Learning a Hierarchical Latent Semantic Model for Multimedia Data. 2018 24th International Conference on Pattern Recognition (ICPR), 2995-3000. DOI: 10.1109/ICPR.2018.8545305. Online publication date: Aug-2018

    Published In

    Thematic Workshops '17: Proceedings of the on Thematic Workshops of ACM Multimedia 2017
    October 2017
    558 pages
    ISBN:9781450354165
    DOI:10.1145/3126686

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. hierarchical representation
    2. playing technique classification

    Qualifiers

    • Research-article

    Conference

    MM '17: ACM Multimedia Conference
    October 23 - 27, 2017
    Mountain View, California, USA
