Abstract
Many adjectives have been used to describe voice characteristics, yet it remains challenging to define voice style precisely using quantitative measures. In this paper, we tackle the voice style classification problem using techniques designed for speaker recognition. Specifically, we employ the i-vector, a widely adopted feature in speaker identification, together with a support vector machine (SVM) for style classification. To verify the reliability of the i-vector, we conduct a pilot study covering noise sensitivity, minimum voice duration, and a mimicry style test. We define eight voice styles and collect appropriate voice data to test our hypothesis experimentally. The results indicate that i-vectors can indeed be used to classify voice styles that are commonly perceived in daily life.
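As a rough illustration of the classification stage only, the sketch below trains a one-vs-rest linear SVM on synthetic fixed-length vectors that stand in for i-vectors. Everything specific here is an assumption made for illustration and is not taken from the paper: the vectors are randomly generated rather than extracted from speech, the dimensionality (16), the helper names, and the Pegasos-style subgradient training are all hypothetical choices. The paper itself relies on LIBSVM.

```python
import random

NUM_STYLES = 8   # the abstract states that eight voice styles are defined
DIM = 16         # assumed toy dimensionality; real i-vectors are usually larger


def make_fake_ivectors(n_per_style, rng):
    """Synthetic stand-ins for i-vectors: one noisy cluster per style."""
    data = []
    for style in range(NUM_STYLES):
        for _ in range(n_per_style):
            vec = [rng.gauss(0.0, 0.3) for _ in range(DIM)]
            vec[style] += 3.0  # shift one coordinate so styles are separable
            data.append((vec, style))
    return data


def train_binary_svm(samples, rng, epochs=20, lam=0.01):
    """Pegasos-style subgradient descent on the hinge loss (labels +1/-1)."""
    w = [0.0] * (DIM + 1)  # the last weight acts as a bias term
    samples = list(samples)
    t = 0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            xb = x + [1.0]
            margin = y * sum(wi * xi for wi, xi in zip(w, xb))
            # Shrink weights (L2 regularisation), then correct margin violations.
            w = [wi * (1.0 - eta * lam) for wi in w]
            if margin < 1.0:
                w = [wi + eta * y * xi for wi, xi in zip(w, xb)]
    return w


def train_one_vs_rest(train, rng):
    """Train one binary SVM per style: that style against all others."""
    return [
        train_binary_svm(
            [(x, 1.0 if s == style else -1.0) for x, s in train], rng
        )
        for style in range(NUM_STYLES)
    ]


def predict(models, x):
    """Return the style whose binary SVM gives the highest score."""
    xb = x + [1.0]
    scores = [sum(wi * xi for wi, xi in zip(w, xb)) for w in models]
    return max(range(NUM_STYLES), key=lambda k: scores[k])


def accuracy(models, data):
    correct = sum(1 for x, s in data if predict(models, x) == s)
    return correct / len(data)


rng = random.Random(0)
train_set = make_fake_ivectors(30, rng)
test_set = make_fake_ivectors(10, rng)
models = train_one_vs_rest(train_set, rng)
print(f"toy accuracy: {accuracy(models, test_set):.2f}")
```

Because the clusters are constructed to be well separated, the accuracy printed here says nothing about real voice data. In practice the synthetic generator would be replaced by i-vectors extracted from recordings, and the hand-written trainer by an established SVM library.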
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liao, WH., Kao, WT., Wu, YC. (2019). Analysis of Voice Styles Using i-Vector Features. In: Chang, CY., Lin, CC., Lin, HH. (eds) New Trends in Computer Technologies and Applications. ICS 2018. Communications in Computer and Information Science, vol 1013. Springer, Singapore. https://doi.org/10.1007/978-981-13-9190-3_70
DOI: https://doi.org/10.1007/978-981-13-9190-3_70
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9189-7
Online ISBN: 978-981-13-9190-3
eBook Packages: Computer Science, Computer Science (R0)