
Towards Music Structural Segmentation across Genres: Features, Structural Hypotheses, and Annotation Principles

Published: 21 October 2016

Abstract

This article addresses the question of how different audio features and segmentation methods perform across different music genres. We present a new annotated corpus of Chinese traditional Jingju music and combine it with two existing music datasets from the literature in an integrated retrieval system, in order to evaluate existing features, structural hypotheses, and segmentation algorithms outside a Western bias. A harmonic-percussive source separation technique is introduced into the feature extraction process and brings significant improvement to the segmentation. Results show that different features capture the structural patterns of different music genres in different ways. For the structure analysis of Jingju, novelty- or homogeneity-based segmentation algorithms and timbre features can surpass the investigated alternatives, owing to the music's lack of harmonic repetition patterns. The findings indicate that an effective segmentation system should consider the design of audio features and segmentation algorithms jointly with contextual information about the music corpora.
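As a rough illustration of what a novelty-based segmentation algorithm does, the sketch below builds a cosine self-similarity matrix from per-frame feature vectors, slides a checkerboard kernel along its main diagonal, and reports local peaks of the resulting novelty curve as candidate boundaries. It is a minimal sketch and not the article's implementation: the function names, the unweighted kernel, the `half_width` and `threshold` parameters, and the toy two-dimensional "timbre" frames are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def self_similarity(frames):
    """Square matrix of pairwise cosine similarities between frames."""
    n = len(frames)
    return [[cosine(frames[i], frames[j]) for j in range(n)] for i in range(n)]


def novelty_curve(ssm, half_width):
    """Correlate an unweighted checkerboard kernel along the SSM diagonal."""
    n = len(ssm)
    curve = [0.0] * n
    for t in range(half_width, n - half_width):
        score = 0.0
        for di in range(-half_width, half_width):
            for dj in range(-half_width, half_width):
                # +1 for past/past and future/future, -1 for past/future.
                sign = 1.0 if (di < 0) == (dj < 0) else -1.0
                score += sign * ssm[t + di][t + dj]
        curve[t] = score
    return curve


def pick_boundaries(curve, threshold):
    """Indices of local maxima of the novelty curve above a threshold."""
    return [
        t
        for t in range(1, len(curve) - 1)
        if curve[t] > threshold
        and curve[t] >= curve[t - 1]
        and curve[t] > curve[t + 1]
    ]


# Hypothetical toy input: 8 frames of one "timbre" followed by 8 of another.
frames = [[1.0, 0.0]] * 8 + [[0.0, 1.0]] * 8
curve = novelty_curve(self_similarity(frames), half_width=3)
boundaries = pick_boundaries(curve, threshold=10.0)
print(boundaries)  # [8]
```

In this toy case the only peak falls at frame 8, where the homogeneous block of the first timbre gives way to the second. On real audio the frames would be timbre or harmonic features computed per analysis window, and the kernel size and threshold would need tuning per corpus.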

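Cited By

  • (2020) Pattern analysis based acoustic signal processing: a survey of the state-of-art. International Journal of Speech Technology. DOI: 10.1007/s10772-020-09681-3. Online publication date: 3-Feb-2020
  • (2019) Content-Based Music Classification by Advanced Features and Progressive Learning. Intelligent Information and Database Systems, 117-130. DOI: 10.1007/978-3-030-14802-7_10. Online publication date: 7-Mar-2019
  • (2018) Rethinking Summarization and Storytelling for Modern Social Multimedia. MultiMedia Modeling, 632-644. DOI: 10.1007/978-3-319-73603-7_51. Online publication date: 13-Jan-2018
  • (2017) Creating an A Cappella Singing Audio Dataset for Automatic Jingju Singing Evaluation Research. Proceedings of the 4th International Workshop on Digital Libraries for Musicology, 37-40. DOI: 10.1145/3144749.3144757. Online publication date: 28-Oct-2017
  • (2017) Exploiting Continuity/Discontinuity of Basis Vectors in Spectrogram Decomposition for Harmonic-Percussive Sound Separation. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) 25:5, 1061-1074. DOI: 10.1109/TASLP.2017.2681742. Online publication date: 1-May-2017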


      Published In

      ACM Transactions on Intelligent Systems and Technology, Volume 8, Issue 2
      Survey Paper, Special Issue: Intelligent Music Systems and Applications and Regular Papers
      March 2017
      407 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/3004291
      Editor: Yu Zheng

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 21 October 2016
      Accepted: 01 May 2016
      Revised: 01 May 2016
      Received: 01 October 2015
      Published in TIST Volume 8, Issue 2

      Author Tags

      1. Music information retrieval
      2. data collection
      3. evaluation
      4. harmonic-percussive source separation
      5. music structural segmentation
      6. non-western music

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • China Scholarship Council (CSC) and EPSRC
      • Fusing Semantic and Audio Technologies for Intelligent Music Production and Consumption (FAST-IMPACt)
      • Royal Society as a recipient of a Wolfson Research Merit Award
