Evaluation of EMD-Based Speaker Recognition Using ISCSLP2006 Chinese Speaker Recognition Evaluation Corpus

Kuroiwa, Shingo; Tsuge, Satoru; Kita, Masahiko; Ren, Fuji

doi:10.1007/11939993_56

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

In this paper, we present evaluation results for our text-independent speaker recognition method based on the Earth Mover’s Distance (EMD), obtained on the ISCSLP2006 Chinese speaker recognition evaluation corpus developed by the Chinese Corpus Consortium (CCC). EMD-based speaker recognition (EMD-SR) was originally designed for a distributed speaker identification system, in which feature vectors are compressed by vector quantization at a terminal and sent to a server that performs the pattern matching. In this architecture, speaker models have to be trained on quantized data, so we adopted a non-parametric speaker model and the EMD. In experiments on a Japanese speech corpus, EMD-SR was more robust to quantized data than the conventional GMM technique. Moreover, it achieved higher accuracy than the GMM even when the data were not quantized. We therefore took part in the ISCSLP2006 speaker recognition evaluation using EMD-SR. Since the identification tasks defined in the evaluation are open-set, we introduce a new speaker verification module in this paper. The evaluation results show that EMD-SR achieves a 99.3% Identification Correctness Rate in the closed-channel speaker identification task.
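To make the idea of matching non-parametric speaker models with the EMD concrete, here is a minimal sketch. It is not the authors' implementation: it assumes each speaker model and each test utterance is a "signature", a list of (centroid vector, occurrence count) pairs such as vector quantization might produce, and that the ground distance is Euclidean. The function names `emd` and `identify`, the integer counts, and the simple distance threshold used for the open-set decision are all assumptions made for illustration; the abstract does not describe how the paper's verification module works.

```python
import math


def emd(sig_a, sig_b):
    """Earth Mover's Distance between two signatures.

    Each signature is a list of (vector, count) pairs with positive
    integer counts (an assumption of this sketch). The transportation
    problem is solved as a min-cost flow by successive shortest paths.
    """
    if not sig_a or not sig_b:
        raise ValueError("signatures must be non-empty")
    n, m = len(sig_a), len(sig_b)
    src, snk = 0, n + m + 1
    # graph[u] holds edges as [to, capacity, cost, index of reverse edge]
    graph = [[] for _ in range(n + m + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i, (_, w) in enumerate(sig_a):
        add_edge(src, 1 + i, w, 0.0)
    for j, (_, w) in enumerate(sig_b):
        add_edge(1 + n + j, snk, w, 0.0)
    for i, (x, wa) in enumerate(sig_a):
        for j, (y, wb) in enumerate(sig_b):
            # Euclidean ground distance between the two centroids
            add_edge(1 + i, 1 + n + j, min(wa, wb), math.dist(x, y))

    total_flow, total_cost = 0, 0.0
    while True:
        # Bellman-Ford shortest path in the residual graph
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        if prev[snk] is None:
            break  # no augmenting path left
        # Bottleneck capacity along the path, then push that much flow
        push, v = math.inf, snk
        while v != src:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = snk
        while v != src:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total_flow += push
        total_cost += push * dist[snk]
    # Normalize by the amount of mass actually moved
    return total_cost / total_flow


def identify(test_sig, models, threshold):
    """Return the closest enrolled speaker, or None if even the best
    match is farther than `threshold` (hypothetical open-set rule)."""
    name, best = min(
        ((spk, emd(test_sig, sig)) for spk, sig in models.items()),
        key=lambda pair: pair[1],
    )
    return name if best <= threshold else None
```

For example, with `models = {"alice": [((0.0, 0.0), 2)], "bob": [((10.0, 0.0), 2)]}` and a test signature `[((1.0, 0.0), 2)]`, `identify` picks "alice" when the threshold is generous and returns `None` when the threshold is tighter than the best distance. Successive shortest paths is chosen here only because it is short and dependency-free; a production system would more likely use a dedicated transportation or linear-programming solver.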

Author information

Authors and Affiliations

Faculty of Engineering, The University of Tokushima, Tokushimashi, 770-8506, Japan
Shingo Kuroiwa, Satoru Tsuge, Masahiko Kita & Fuji Ren
School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing, 100876
Fuji Ren


Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

Cite this paper

Kuroiwa, S., Tsuge, S., Kita, M., Ren, F. (2006). Evaluation of EMD-Based Speaker Recognition Using ISCSLP2006 Chinese Speaker Recognition Evaluation Corpus. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_56

DOI: https://doi.org/10.1007/11939993_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
