research-article

Toward Improving the Performance of Epoch Extraction from Telephonic Speech

Published: 01 April 2021

Abstract

An epoch is the abrupt glottal closure event within a glottal cycle at which significant excitation of the vocal-tract system occurs during the production of voiced speech. The state-of-the-art zero frequency filtering technique is a simple and efficient method that is robust in extracting epochs from clean speech. However, it performs poorly on telephonic-quality speech because spurious zero crossings in the epoch evidence lead to a high false alarm rate. Recently, the zero-phase zero frequency resonator (ZP-ZFR) was proposed as an alternative to the zero frequency filter that gives a stable implementation of the zero frequency filtering technique. In this study, a higher-order ZP-ZFR is investigated to improve the performance of zero frequency filtering for epoch extraction from telephonic speech. The proposed ZP-ZFR method is evaluated quantitatively on telephonic speech simulated from six standard databases that have simultaneous electroglottograph recordings as ground truth. Experimental results suggest that the proposed method performs significantly better than state-of-the-art methods in terms of identification rate and false alarm rate.
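
To make the terms in the abstract concrete, the following is a minimal sketch of the conventional zero frequency filtering baseline as it is commonly described in the literature, together with the larynx-cycle definitions usually used for identification, miss, and false alarm rates. It is not the higher-order ZP-ZFR method proposed in the article. The function names, the `half_window` and `passes` parameters, and the assumption that glottal closures appear as negative excitation peaks (so that epochs are read from positive-going zero crossings) are illustrative choices, not taken from the article.

```python
def remove_local_mean(y, half_window):
    """Subtract the mean over [i - half_window, i + half_window] from each sample."""
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
    out = []
    for i in range(len(y)):
        lo = max(0, i - half_window)           # window is clipped at the edges
        hi = min(len(y), i + half_window + 1)
        out.append(y[i] - (prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def zero_frequency_filter(signal, half_window, passes=3):
    """Trend-removed output of a cascade of two zero frequency resonators."""
    # Difference the signal first to suppress any DC offset.
    y = [signal[0]] + [signal[n] - signal[n - 1] for n in range(1, len(signal))]
    # Each resonator 1 / (1 - z^-1)^2 equals two running sums, so the cascade is four.
    for _ in range(4):
        acc, summed = 0.0, []
        for v in y:
            acc += v
            summed.append(acc)
        y = summed
    # The output grows polynomially; repeated local-mean subtraction removes the trend.
    for _ in range(passes):
        y = remove_local_mean(y, half_window)
    return y


def positive_zero_crossings(z):
    """Sample indices where z goes from negative to non-negative."""
    return [n for n in range(1, len(z)) if z[n - 1] < 0.0 <= z[n]]


def epoch_rates(reference, detected):
    """Fractions of larynx cycles with exactly one, zero, or several detections."""
    counts = {"identified": 0, "missed": 0, "false_alarm": 0}
    # A larynx cycle spans the midpoints between neighbouring reference epochs.
    for prev, cur, nxt in zip(reference, reference[1:], reference[2:]):
        lo, hi = (prev + cur) / 2, (cur + nxt) / 2
        hits = sum(1 for d in detected if lo <= d < hi)
        key = "identified" if hits == 1 else "missed" if hits == 0 else "false_alarm"
        counts[key] += 1
    cycles = max(1, len(reference) - 2)
    return {k: v / cycles for k, v in counts.items()}
```

In this sketch `half_window` would typically be chosen so that the full window covers roughly one to two average pitch periods, and samples near the start and end of the signal should be ignored because the clipped windows there do not remove the trend. The cumulative sums grow quickly, so long recordings would need block-wise processing or a more stable formulation, which is the kind of issue the ZP-ZFR work addresses.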


Published In

Circuits, Systems, and Signal Processing, Volume 40, Issue 4
Apr 2021
522 pages

Publisher

Birkhauser Boston Inc.
United States

Publication History

Published: 01 April 2021
Accepted: 13 September 2020
Revision received: 10 September 2020
Received: 07 February 2020

Author Tags

1. Epoch extraction
2. Telephonic speech
3. Zero-phase zero frequency filtering