Abstract
In this paper, we present a recent pitch detection algorithm based on an implicit circular autocorrelation of the glottal excitation signal. The algorithm operates in real time and requires no post-processing. We focus on correcting the estimated pitch contours and on reducing classification errors in speech signals by means of simple voicing decision techniques. To evaluate our algorithms, we used the Bagshaw and Keele databases. For the male Bagshaw corpus, the sum of the percentage of unvoiced errors and the percentage of voiced errors reaches 14.67 %, a very good score. For the female corpus, our results are also competitive with those of other algorithms evaluated on the same database. On the Keele database, we obtain gross pitch error, voicing decision error and F0 frame error rates of 0.44, 0.65 and 1.55 %, respectively, over the whole corpus.
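
To make the three error rates named above concrete, the following minimal sketch computes them under commonly used definitions, which may differ in detail from the ones applied in the paper. Gross pitch error (GPE) is taken as the share of frames voiced in both tracks whose estimate deviates from the reference by more than a tolerance. Voicing decision error (VDE) is the share of all frames whose voiced/unvoiced decision is wrong. F0 frame error (FFE) is the share of all frames affected by either kind of error. The function name `f0_error_rates`, the 20 % tolerance, the convention that an F0 of 0 marks an unvoiced frame, and the toy data are all assumptions made for illustration.

```python
def f0_error_rates(ref_f0, est_f0, tolerance=0.2):
    """Return (GPE, VDE, FFE) in percent for two frame-aligned F0 tracks.

    A frame with F0 == 0 is treated as unvoiced (an assumed convention).
    """
    if len(ref_f0) != len(est_f0):
        raise ValueError("tracks must have the same number of frames")
    n_frames = len(ref_f0)
    if n_frames == 0:
        raise ValueError("tracks must not be empty")

    both_voiced = 0  # frames voiced in both reference and estimate
    gross = 0        # both voiced, estimate off by more than the tolerance
    voicing = 0      # voiced/unvoiced decisions disagree
    for ref, est in zip(ref_f0, est_f0):
        ref_voiced, est_voiced = ref > 0, est > 0
        if ref_voiced != est_voiced:
            voicing += 1
        elif ref_voiced:
            both_voiced += 1
            if abs(est - ref) > tolerance * ref:
                gross += 1

    # GPE is normalised by the frames voiced in both tracks,
    # VDE and FFE by the total number of frames.
    gpe = 100.0 * gross / both_voiced if both_voiced else 0.0
    vde = 100.0 * voicing / n_frames
    ffe = 100.0 * (gross + voicing) / n_frames
    return gpe, vde, ffe


# Toy tracks in Hz (invented values, not taken from any database).
reference = [0, 100, 100, 120, 0, 0, 200, 210]
estimate = [0, 102, 200, 118, 90, 0, 0, 205]
gpe, vde, ffe = f0_error_rates(reference, estimate)
print(f"GPE={gpe:.2f}%  VDE={vde:.2f}%  FFE={ffe:.2f}%")
```

On the toy tracks, one of the four frames voiced in both tracks is an octave error and two of the eight frames carry a wrong voicing decision, so the sketch reports a GPE of 25 %, a VDE of 25 % and an FFE of 37.5 %.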









Cite this article
Bahja, F., Di Martino, J., Ibn Elhaj, E. et al. An overview of the CATE algorithms for real-time pitch determination. SIViP 9, 589–599 (2015). https://doi.org/10.1007/s11760-013-0488-4
DOI: https://doi.org/10.1007/s11760-013-0488-4