Evaluation of EMD-Based Speaker Recognition Using ISCSLP2006 Chinese Speaker Recognition Evaluation Corpus

  • Conference paper
Chinese Spoken Language Processing (ISCSLP 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

In this paper, we present evaluation results for our proposed text-independent speaker recognition method based on the Earth Mover’s Distance (EMD), obtained on the ISCSLP2006 Chinese speaker recognition evaluation corpus developed by the Chinese Corpus Consortium (CCC). EMD-based speaker recognition (EMD-SR) was originally designed for a distributed speaker identification system, in which feature vectors are compressed by vector quantization at a terminal and sent to a server that performs the pattern matching. In this architecture, speaker models have to be trained from quantized data, so we adopted a non-parametric speaker model together with the EMD. In experiments on a Japanese speech corpus, EMD-SR was more robust to quantized data than the conventional Gaussian mixture model (GMM) technique; moreover, it achieved higher accuracy than the GMM even when the data were not quantized. We therefore took up the ISCSLP2006 speaker recognition evaluation using EMD-SR. Because the identification tasks defined in the evaluation are open-set, we introduce a new speaker verification module in this paper. The evaluation results show that EMD-SR achieves a 99.3% Identification Correctness Rate in a closed-channel speaker identification task.
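
To make the distance concrete, the following is a minimal sketch of an EMD computation between two vector-quantized signatures, followed by a simple open-set decision. It is an illustration under stated assumptions, not the authors' implementation: each signature is assumed to be a list of (centroid, frame count) pairs with positive integer counts, the ground distance is assumed to be Euclidean, and the names `emd`, `identify_open_set`, and `threshold` are hypothetical. The plain distance threshold stands in for the verification module, whose actual design is not described in this abstract.

```python
import math


def euclidean(a, b):
    """Ground distance between two centroids (an assumption of this sketch)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def emd(sig_p, sig_q):
    """Earth Mover's Distance between two signatures.

    Each signature is a list of (centroid, weight) pairs with positive
    integer weights. The transportation problem is solved as a min-cost
    flow using successive shortest paths (Bellman-Ford).
    """
    m, n = len(sig_p), len(sig_q)
    source, sink, size = m + n, m + n + 1, m + n + 2
    edges = []                      # [to, capacity, cost]; edge i pairs with i ^ 1
    graph = [[] for _ in range(size)]

    def add_edge(u, v, cap, cost):
        graph[u].append(len(edges))
        edges.append([v, cap, cost])
        graph[v].append(len(edges))
        edges.append([u, 0, -cost])  # residual (backward) edge

    total_p = sum(w for _, w in sig_p)
    for i, (_, w) in enumerate(sig_p):
        add_edge(source, i, w, 0.0)
    for j, (_, w) in enumerate(sig_q):
        add_edge(m + j, sink, w, 0.0)
    for i, (cp, _) in enumerate(sig_p):
        for j, (cq, _) in enumerate(sig_q):
            add_edge(i, m + j, total_p, euclidean(cp, cq))

    total_flow, total_cost = 0, 0.0
    while True:
        dist = [math.inf] * size
        prev_edge = [-1] * size
        dist[source] = 0.0
        for _ in range(size - 1):   # bounded Bellman-Ford passes
            updated = False
            for u in range(size):
                if dist[u] == math.inf:
                    continue
                for eid in graph[u]:
                    v, cap, cost = edges[eid]
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev_edge[v] = eid
                        updated = True
            if not updated:
                break
        if dist[sink] == math.inf:  # no augmenting path left
            break
        # Find the bottleneck capacity along the shortest path.
        push, v = math.inf, sink
        while v != source:
            eid = prev_edge[v]
            push = min(push, edges[eid][1])
            v = edges[eid ^ 1][0]
        # Push flow along the path and update residual capacities.
        v = sink
        while v != source:
            eid = prev_edge[v]
            edges[eid][1] -= push
            edges[eid ^ 1][1] += push
            v = edges[eid ^ 1][0]
        total_flow += push
        total_cost += push * dist[sink]
    return total_cost / total_flow  # normalize by the total flow moved


def identify_open_set(test_sig, speaker_models, threshold):
    """Return (speaker_id, distance); speaker_id is None if rejected.

    The threshold test is a hypothetical stand-in for a verification step.
    """
    best_id, best_dist = None, math.inf
    for speaker_id, model_sig in speaker_models.items():
        d = emd(test_sig, model_sig)
        if d < best_dist:
            best_id, best_dist = speaker_id, d
    if best_dist > threshold:
        return None, best_dist
    return best_id, best_dist
```

As a usage example, with `test = [((0.0, 0.0), 2), ((1.0, 0.0), 2)]` and a model `[((0.0, 0.0), 1), ((1.0, 0.0), 3)]`, one unit of mass has to move a distance of 1 out of a total mass of 4, so `emd` returns 0.25. Because the model is just a weighted set of centroids, it can be built directly from quantized data without fitting a parametric density.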


© 2006 Springer-Verlag Berlin Heidelberg

Cite this paper

Kuroiwa, S., Tsuge, S., Kita, M., Ren, F. (2006). Evaluation of EMD-Based Speaker Recognition Using ISCSLP2006 Chinese Speaker Recognition Evaluation Corpus. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_56

  • DOI: https://doi.org/10.1007/11939993_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49665-6

  • Online ISBN: 978-3-540-49666-3

  • eBook Packages: Computer Science, Computer Science (R0)
