Abstract
Automatic speaker recognition is an important technology for intelligence gathering, law enforcement, and audio mining. Conventional speaker recognition systems, which are based on independent short-term spectral samples, suffer from a lack of noise robustness and are unable to model a speaker’s idiosyncratic stylistic features. This paper describes “TalkPrinting”, a program of research aimed at adding such stylistic features to conventional systems. Results on three preliminary systems based on stylistic features demonstrate that (1) the new features alone carry significant speaker information; (2) they also carry significant complementary information compared to the conventional features; and (3) they provide increasing improvements in performance with increasing test durations.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kajarekar, S. et al. (2003). “TalkPrinting”: Improving Speaker Recognition by Modeling Stylistic Features. In: Chen, H., Miranda, R., Zeng, D.D., Demchak, C., Schroeder, J., Madhusudan, T. (eds) Intelligence and Security Informatics. ISI 2003. Lecture Notes in Computer Science, vol 2665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44853-5_28
DOI: https://doi.org/10.1007/3-540-44853-5_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40189-6
Online ISBN: 978-3-540-44853-2
eBook Packages: Springer Book Archive