Mandarin Voice Conversion Using Tone Codebook Mapping

Zuo, Guoyu; Chen, Yao; Ruan, Xiaogang; Liu, Wenju

doi:10.1007/11739685_101

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3930)

Abstract

A tone codebook mapping method is proposed to improve voice conversion of Mandarin speech over the conventional conversion method, which deals mainly with short-time spectral envelopes. The pitch contour of a whole Mandarin syllable is used as the unit for pitch conversion. Syllable pitch contours are first extracted from the source and target utterances, then time-normalized and smoothed with a moving average filter. The preprocessed pitch contours are classified to generate the source and target tone codebooks, and a Mandarin tone mapping codebook is obtained by associating the two codebooks according to the speech alignment. Experimental results show that the proposed method delivers satisfactory voice conversion performance for Mandarin speech.
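
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the classifier, the codebook size, the filter width, or the association rule, so the sketch assumes plain k-means with farthest-point initialization, a 16-point normalized contour, a 3-point moving average, and a co-occurrence-weighted mapping in the spirit of vector-quantization codebook mapping. It also assumes that source and target syllables are already aligned one-to-one. All function names are hypothetical.

```python
import numpy as np

def normalize_contour(f0, n_points=16, window=3):
    """Time-normalize a syllable pitch contour, then smooth it.

    n_points and window are assumed values; window is assumed odd.
    """
    f0 = np.asarray(f0, dtype=float)
    # Time normalization: linear interpolation onto a fixed-length grid.
    src_t = np.linspace(0.0, 1.0, len(f0))
    dst_t = np.linspace(0.0, 1.0, n_points)
    resampled = np.interp(dst_t, src_t, f0)
    # Moving average filtering; edge padding keeps the length unchanged.
    pad = window // 2
    padded = np.pad(resampled, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

def _nearest(X, codebook):
    """Index of the nearest codeword (Euclidean) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def build_codebook(contours, n_codes, n_iter=20):
    """Classify contours with k-means (an assumed choice of classifier)."""
    X = np.asarray(contours, dtype=float)
    # Deterministic farthest-point initialization.
    codebook = [X[0]]
    while len(codebook) < n_codes:
        d = np.min([np.linalg.norm(X - c, axis=1) for c in codebook], axis=0)
        codebook.append(X[d.argmax()])
    codebook = np.array(codebook)
    for _ in range(n_iter):
        labels = _nearest(X, codebook)
        for k in range(n_codes):
            if np.any(labels == k):
                codebook[k] = X[labels == k].mean(axis=0)
    return codebook, _nearest(X, codebook)

def build_mapping_codebook(src_labels, tgt_labels, tgt_codebook, n_src_codes):
    """Associate source codes with target codewords via aligned syllables.

    Each mapped contour is the co-occurrence-weighted average of the
    target codewords (an assumed association rule).
    """
    hist = np.zeros((n_src_codes, len(tgt_codebook)))
    for s, t in zip(src_labels, tgt_labels):
        hist[s, t] += 1.0
    totals = hist.sum(axis=1, keepdims=True)
    weights = np.divide(hist, totals, out=np.zeros_like(hist), where=totals > 0)
    return weights @ tgt_codebook

def convert_contour(f0, src_codebook, mapping_codebook, n_points=16):
    """Replace a source contour by the mapped contour of its nearest code."""
    x = normalize_contour(f0, n_points)
    code = np.linalg.norm(src_codebook - x, axis=1).argmin()
    return mapping_codebook[code]
```

In use, every aligned training syllable pair would be passed through `normalize_contour`, the two sets clustered with `build_codebook`, and the resulting labels combined in `build_mapping_codebook`. The converted contour returned by `convert_contour` is in normalized time and would still need to be stretched back to the syllable's duration before resynthesis.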

Author information

Authors and Affiliations

Institute of Artificial Intelligence and Robotics, Beijing University of Technology, Beijing, 100022, China
Guoyu Zuo & Xiaogang Ruan
School of Computer Sciences, Beijing University of Technology, Beijing, 100022, China
Yao Chen
Institute of Automation, Chinese Academy of Sciences, Beijing, 100080, China
Guoyu Zuo & Wenju Liu

Editor information

Editors and Affiliations

Department of Computing, Hong Kong Polytechnic University, P.O. Box, Hong Kong, China
Daniel S. Yeung
School of Creative Media, City University of Hong Kong, China
Zhi-Qiang Liu
Department of Mathematics and Computer Science, Hebei University, 071002, Baoding, Hebei, P.R. China
Xi-Zhao Wang
School of Electrical and Information Engineering, University of Sydney, 2006, NSW, Australia
Hong Yan

About this paper

Cite this paper

Zuo, G., Chen, Y., Ruan, X., Liu, W. (2006). Mandarin Voice Conversion Using Tone Codebook Mapping. In: Yeung, D.S., Liu, ZQ., Wang, XZ., Yan, H. (eds) Advances in Machine Learning and Cybernetics. Lecture Notes in Computer Science, vol 3930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11739685_101

DOI: https://doi.org/10.1007/11739685_101
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33584-9
Online ISBN: 978-3-540-33585-6
eBook Packages: Computer Science, Computer Science (R0)
