DOI: 10.1145/3015166.3015208
research-article

Overlapping Speech Detection with Cluster-based HMM Framework

Published: 21 November 2016
    Abstract

    Overlapping speech is known to be a major source of error in various speech processing algorithms. Many previous studies on overlapping speech detection focus on exploring feature sets for representing the characteristics of speech and overlapping speech within a hidden Markov model (HMM) framework. In this study, however, we hypothesize that a single HMM does not have enough capacity to cover the full distribution of speech and overlapping speech. We therefore propose a simple cluster-based HMM framework that constructs multiple speech models and multiple overlapping speech models. Experimental results on the GRID corpus show significant improvements compared to a conventional overlap detection system.
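
The multiple-models-per-class idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and several pieces are assumptions made to keep it short and runnable: plain k-means stands in for the clustering step, one frame-wise diagonal Gaussian per cluster stands in for each HMM (so there is no temporal modeling), and synthetic 2-D points stand in for acoustic features from the GRID corpus. All function and variable names are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means; returns a cluster index for every row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fit_cluster_models(X, k):
    """Cluster one class's frames and fit a diagonal Gaussian per cluster.

    Assumption: each Gaussian is a stand-in for a per-cluster HMM.
    """
    labels = kmeans(X, k)
    models = []
    for j in range(k):
        Xj = X[labels == j]
        if len(Xj) > 0:
            models.append((Xj.mean(axis=0), Xj.var(axis=0) + 1e-3))
    return models

def log_likelihood(X, model):
    """Per-frame log-likelihood under a diagonal Gaussian (mean, variance)."""
    mean, var = model
    return -0.5 * (np.log(2 * np.pi * var) + (X - mean) ** 2 / var).sum(axis=1)

def classify(X, speech_models, overlap_models):
    """Label a frame 1 (overlap) if the best overlap model beats the best speech model."""
    best_speech = np.max([log_likelihood(X, m) for m in speech_models], axis=0)
    best_overlap = np.max([log_likelihood(X, m) for m in overlap_models], axis=0)
    return (best_overlap > best_speech).astype(int)

def make_frames(rng, modes, n_per_mode):
    """Synthetic 2-D 'features': unit-variance blobs around each mode."""
    return np.vstack([rng.normal(m, 1.0, size=(n_per_mode, 2)) for m in modes])

def accuracy(k, train_speech, train_overlap, test_X, test_y):
    """Train k models per class and score frame-level accuracy on the test set."""
    speech_models = fit_cluster_models(train_speech, k)
    overlap_models = fit_cluster_models(train_overlap, k)
    predictions = classify(test_X, speech_models, overlap_models)
    return float((predictions == test_y).mean())

# Toy data: each class has two modes, interleaved so one model per class cannot fit.
rng = np.random.default_rng(42)
speech_modes = [(0.0, 0.0), (10.0, 10.0)]
overlap_modes = [(0.0, 10.0), (10.0, 0.0)]
train_speech = make_frames(rng, speech_modes, 200)
train_overlap = make_frames(rng, overlap_modes, 200)
test_speech = make_frames(rng, speech_modes, 100)
test_overlap = make_frames(rng, overlap_modes, 100)
test_X = np.vstack([test_speech, test_overlap])
test_y = np.array([0] * len(test_speech) + [1] * len(test_overlap))

single_acc = accuracy(1, train_speech, train_overlap, test_X, test_y)
cluster_acc = accuracy(2, train_speech, train_overlap, test_X, test_y)
print(f"single model per class: {single_acc:.2f}")
print(f"two models per class:   {cluster_acc:.2f}")
```

On this deliberately multimodal toy data, one model per class cannot separate the two classes, while two cluster-specific models per class can. The sketch only illustrates the capacity argument; it says nothing about the size of the improvement reported on GRID.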

    Published In

    ICSPS 2016: Proceedings of the 8th International Conference on Signal Processing Systems
    November 2016
    235 pages
    ISBN: 9781450347907
    DOI: 10.1145/3015166

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Expectation-maximization clustering
    2. Hidden Markov model
    3. Overlapping speech
    4. Overlapping speech detection

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICSPS 2016

    Acceptance Rates

    ICSPS 2016 Paper Acceptance Rate 46 of 83 submissions, 55%;
    Overall Acceptance Rate 46 of 83 submissions, 55%
