research-article

Audio thumbnailing of popular music using chroma-based representations

Published: 01 February 2005

Abstract

With the growing prevalence of large databases of multimedia content, methods for facilitating rapid browsing of such databases or the results of a database search are becoming increasingly important. However, these methods are necessarily media dependent. We present a system for producing short, representative samples (or "audio thumbnails") of selections of popular music. The system searches for structural redundancy within a given song with the aim of identifying something like a chorus or refrain. To isolate a useful class of features for performing such structure-based pattern recognition, we present a development of the chromagram, a variation on traditional time-frequency distributions that seeks to represent the cyclic attribute of pitch perception, known as chroma. The pattern recognition system itself employs a quantized chromagram that represents the spectral energy at each of the 12 pitch classes. We evaluate the system on a database of popular music and score its performance against a set of "ideal" thumbnail locations. Overall performance is found to be quite good, with the majority of errors resulting from songs that do not meet our structural assumptions.
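To make the idea of chroma concrete, the following is a minimal sketch of folding spectral energy into the 12 pitch classes. It is an illustration only, not the authors' chromagram or quantization scheme: the 440 Hz tuning reference, the nearest-semitone rounding, and the names `pitch_class` and `chroma_vector` are all assumptions introduced here.

```python
import math

A4_HZ = 440.0  # assumed tuning reference; the abstract does not specify one


def pitch_class(freq_hz):
    """Map a frequency to one of 12 pitch classes (0 = C, ..., 9 = A)."""
    midi = 69 + 12 * math.log2(freq_hz / A4_HZ)  # MIDI note number, A4 = 69
    return int(round(midi)) % 12  # discard the octave, keep the chroma


def chroma_vector(freqs_hz, energies):
    """Fold the energy of spectral bins into a 12-element chroma vector."""
    chroma = [0.0] * 12
    for f, e in zip(freqs_hz, energies):
        if f > 0:  # skip the DC bin, which has no pitch
            chroma[pitch_class(f)] += e
    return chroma


# Hypothetical input: three octaves of A plus one C near 261.63 Hz.
freqs = [220.0, 440.0, 880.0, 261.63]
energy = [1.0, 2.0, 0.5, 4.0]
chroma = chroma_vector(freqs, energy)
```

In this toy input the three A components land in the same bin regardless of octave, which is the cyclic property the abstract refers to. A sequence of such vectors over time could then be compared frame against frame to look for repeated sections.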


Published In

IEEE Transactions on Multimedia, Volume 7, Issue 1
February 2005
182 pages

Publisher

IEEE Press
