
Spectral Flatness Analysis for Emotional Speech Synthesis and Transformation

  • Conference paper
Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5641)


Abstract

According to psychological research on emotional speech, different emotions are accompanied by different amounts of spectral noise. We control this amount by means of spectral flatness, according to which high-frequency noise is mixed into voiced frames during cepstral speech synthesis. Our experiments are aimed at a statistical analysis of spectral flatness in three emotions (joy, sadness, anger) and, for comparison, a neutral state. The calculated histograms of the spectral flatness distribution are compared visually and modelled by the Gamma probability distribution. The obtained statistical parameters and the emotional-to-neutral ratios of their mean values show good correlation for both male and female voices and all three emotions.
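
The spectral flatness measure used here to control the amount of mixed-in noise is conventionally defined as the ratio of the geometric mean to the arithmetic mean of the short-time power spectrum, and the abstract states that the per-frame values are modelled by a Gamma distribution. The following Python sketch is only an illustration of that general procedure, not the authors' implementation; it assumes NumPy/SciPy and that voiced, windowed frames have already been extracted elsewhere.

import numpy as np
from scipy import stats

def spectral_flatness(frame, eps=1e-12):
    # Ratio of the geometric to the arithmetic mean of the frame's power
    # spectrum; values near 1 indicate noise-like frames, near 0 tonal ones.
    power = np.abs(np.fft.rfft(frame)) ** 2 + eps
    return np.exp(np.mean(np.log(power))) / np.mean(power)

def fit_gamma(flatness_values):
    # Fit a Gamma distribution (location fixed at zero) to the collected
    # per-frame flatness values of one emotion; returns shape and scale.
    shape, _, scale = stats.gamma.fit(flatness_values, floc=0)
    return shape, scale

# Hypothetical comparison of an emotional and a neutral recording via the
# emotional-to-neutral ratio of mean flatness (arrays of per-frame values):
# ratio = np.mean(sfm_emotional) / np.mean(sfm_neutral)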




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Přibil, J., Přibilová, A. (2009). Spectral Flatness Analysis for Emotional Speech Synthesis and Transformation. In: Esposito, A., Vích, R. (eds) Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions. Lecture Notes in Computer Science (LNAI), vol 5641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03320-9_11


  • DOI: https://doi.org/10.1007/978-3-642-03320-9_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03319-3

  • Online ISBN: 978-3-642-03320-9

  • eBook Packages: Computer Science (R0)
