DOI: 10.1145/1873951.1874130
short-paper

A novel audio fingerprinting method robust to time scale modification and pitch shifting

Published: 25 October 2010
    Abstract

    A novel audio fingerprinting method that is highly robust to Time Scale Modification (TSM) and pitch shifting is proposed. Instead of simply employing spectral or tempo-related features, our system is based on computer-vision techniques. We transform each 1-D audio signal into a 2-D image and treat TSM and pitch shifting of the audio signal as stretch and translation of the corresponding image. Robust local descriptors are extracted from the image and matched against those of the reference audio signals. Experimental results show that our system is highly robust to various audio distortions, including the challenging TSM and pitch shifting.
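
The mapping described above, from audio distortions to image transformations, can be illustrated with a minimal sketch. It assumes the 2-D image is a magnitude spectrogram resampled onto a logarithmic frequency axis; the abstract does not state how the image is built, so that choice, the helper names (`log_freq_image`, `tone`, `peak_row`), and every parameter value are assumptions made for illustration. Pure tones stand in for real pitch-shifted and time-scaled audio. Under these assumptions, raising the pitch by a fixed number of semitones moves the energy up by a fixed number of rows (a translation), and lengthening the signal adds columns (a stretch along the time axis).

```python
import numpy as np


def log_freq_image(signal, sample_rate, frame_len=1024, hop=256,
                   f_min=110.0, bins_per_octave=12, n_bins=48):
    """Turn a 1-D signal into a 2-D array (frequency rows x time columns).

    Illustrative helper, not the paper's method: all parameter values are assumed.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Windowed frames, one per column.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)], axis=1)
    # Short-time Fourier transform magnitudes on a linear frequency axis.
    magnitude = np.abs(np.fft.rfft(frames, axis=0))
    fft_freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    # Geometrically spaced centre frequencies: one row per semitone here.
    centres = f_min * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    # Resample every frame onto the log-spaced axis by linear interpolation.
    return np.stack([np.interp(centres, fft_freqs, magnitude[:, t])
                     for t in range(n_frames)], axis=1)


def tone(freq_hz, seconds, sample_rate):
    """Generate a pure sine tone (stand-in for real audio)."""
    t = np.arange(int(seconds * sample_rate)) / sample_rate
    return np.sin(2.0 * np.pi * freq_hz * t)


def peak_row(image):
    """Index of the frequency row holding the most energy on average."""
    return int(image.mean(axis=1).argmax())


SAMPLE_RATE = 8000
original = log_freq_image(tone(440.0, 1.0, SAMPLE_RATE), SAMPLE_RATE)
# Simulated pitch shift: same duration, frequency raised by 3 semitones.
shifted = log_freq_image(tone(440.0 * 2.0 ** (3 / 12), 1.0, SAMPLE_RATE), SAMPLE_RATE)
# Simulated time scale modification: same frequency, twice the duration.
stretched = log_freq_image(tone(440.0, 2.0, SAMPLE_RATE), SAMPLE_RATE)

print("row offset after pitch shift:", peak_row(shifted) - peak_row(original))
print("columns before and after stretch:", original.shape[1], stretched.shape[1])
```

A local descriptor that tolerates translation and stretch of an image patch would, in this picture, tolerate the corresponding pitch shift and time scaling of the audio; which descriptor the authors use is not stated in the abstract.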

    Published In

    MM '10: Proceedings of the 18th ACM international conference on Multimedia
    October 2010
    1836 pages
    ISBN: 9781605589336
    DOI: 10.1145/1873951
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. audio fingerprinting
    2. pitch shifting
    3. robustness
    4. time scale modification

    Qualifiers

    • Short-paper

    Conference

    MM '10: ACM Multimedia Conference
    October 25 - 29, 2010
    Firenze, Italy

    Acceptance Rates

    Overall Acceptance Rate 995 of 4,171 submissions, 24%

    Article Metrics

    • Downloads (last 12 months): 8
    • Downloads (last 6 weeks): 0

    Cited By

    • (2024) Enhancing Insect Sound Classification Using Dual-Tower Network: A Fusion of Temporal and Spectral Feature Perception. Applied Sciences 14:7 (3116). DOI: 10.3390/app14073116. Online publication date: 8-Apr-2024
    • (2024) Audio Fingerprinting Method for Byzantine Hymn Recognition. 2024 Panhellenic Conference on Electronics & Telecommunications (PACET) (1-4). DOI: 10.1109/PACET60398.2024.10497021. Online publication date: 28-Mar-2024
    • (2023) SkipStreaming: Pinpointing User-Perceived Redundancy in Correlated Web Video Streaming through the Lens of Scenes. Proceedings of the 31st ACM International Conference on Multimedia (3944-3953). DOI: 10.1145/3581783.3611845. Online publication date: 26-Oct-2023
    • (2018) Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 40:2 (352-364). DOI: 10.1109/TPAMI.2017.2670560. Online publication date: 1-Feb-2018
    • (2018) A Hierarchical Method of Forming Fingerprints of a Sound Signal. 2018 International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon) (1-4). DOI: 10.1109/FarEastCon.2018.8602783. Online publication date: Oct-2018
    • (2017) SNORAP: A Device for the Correction of Impaired Sleep Health by Using Tactile Stimulation for Individuals with Mild and Moderate Sleep Disordered Breathing. Sensors 17:9 (2006). DOI: 10.3390/s17092006. Online publication date: 1-Sep-2017
    • (2017) Applications of duplicate detection. Proceedings of the 4th International Workshop on Digital Libraries for Musicology (45-48). DOI: 10.1145/3144749.3144759. Online publication date: 28-Oct-2017
    • (2017) Applications of Duplicate Detection in Music Archives: From Metadata Comparison to Storage Optimisation. Digital Libraries and Multimedia Archives (101-113). DOI: 10.1007/978-3-319-73165-0_10. Online publication date: 21-Dec-2017
    • (2016) A spectrogram-based audio fingerprinting system for content-based copy detection. Multimedia Tools and Applications 75:15 (9145-9165). DOI: 10.1007/s11042-015-3081-8. Online publication date: 1-Aug-2016
    • (2016) Landmark-based music recognition system optimisation using genetic algorithms. Multimedia Tools and Applications 75:24 (16905-16922). DOI: 10.1007/s11042-015-2963-0. Online publication date: 1-Dec-2016
