Remote-based text-to-speech modules’ evaluation framework: the RES framework

Rojc, Matej; Höge, Harald; Kačič, Zdravko

doi:10.1007/s10579-009-9110-3

Published: 29 November 2009

Volume 44, pages 371–386, (2010)

Language Resources and Evaluation

Matej Rojc¹,
Harald Höge² &
Zdravko Kačič¹

Abstract

The ECESS consortium (European Center of Excellence in Speech Synthesis) aims to speed up progress in speech synthesis technology by providing an appropriate evaluation framework. The key element of the evaluation framework is the partition of a text-to-speech (TTS) synthesis system into distributed TTS modules. Three modules have been specified so far: a text processing module, a prosody generation module, and an acoustic synthesis module. Splitting the system into modules has the advantage that the developers at an institution active in ECESS can concentrate their efforts on a single module and test its performance in a complete system, using the missing modules from developers at other institutions. In this way, complete TTS systems can be built using high-performance modules from different institutions. In order to evaluate the modules and to connect them efficiently, a remote evaluation platform, the Remote Evaluation System (RES), based on the existing internet infrastructure, has been developed within ECESS. The RES is based on a client–server architecture. It consists of RES module servers, which encapsulate the developers’ modules; a RES client, which sends data to and receives data from the RES module servers; and a RES server, which connects the RES module servers and organizes the flow of information. Developers can use the RES to select, over the internet, a RES module server containing a missing TTS module that they need in order to test and improve the performance of their own modules. Finally, the RES allows for the evaluation of TTS modules running at different institutions worldwide. Using the RES client, the institution performing the evaluation can set up and perform various evaluation tasks by sending test data via the RES client and receiving results from the RES module servers. Currently, ELDA (www.elda.org) is setting up an evaluation using the RES client, which will then be extended to an evaluation client specializing in the envisaged evaluation tasks.
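
The division of roles described in the abstract can be illustrated with a minimal in-process sketch. It is not the RES implementation: the abstract does not specify the protocol, message format, or API, so every name here (`ModuleServer`, `ResServer`, `res_client`, the institution labels, and the toy module functions) is a hypothetical stand-in, and ordinary function calls take the place of the network. The sketch only shows how a client could hand text to a central server that chains a text processing, a prosody generation, and an acoustic synthesis module chosen from different institutions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical; the real RES runs over the internet,
# whereas this sketch replaces network transport with direct calls.


@dataclass
class ModuleServer:
    """Stand-in for a RES module server wrapping one institution's TTS module."""
    institution: str
    module_type: str  # "text", "prosody", or "acoustic"
    process: Callable[[str], str]


@dataclass
class ResServer:
    """Stand-in for the RES server: registers module servers and routes data."""
    registry: Dict[str, List[ModuleServer]] = field(default_factory=dict)

    def register(self, server: ModuleServer) -> None:
        self.registry.setdefault(server.module_type, []).append(server)

    def select(self, module_type: str, institution: str) -> ModuleServer:
        # Pick the module of the requested type offered by the given institution.
        for server in self.registry.get(module_type, []):
            if server.institution == institution:
                return server
        raise LookupError(f"no {module_type} module from {institution}")

    def run_chain(self, text: str, choice: Dict[str, str]) -> str:
        # Pass the data through the three module types in pipeline order.
        data = text
        for module_type in ("text", "prosody", "acoustic"):
            data = self.select(module_type, choice[module_type]).process(data)
        return data


def res_client(server: ResServer, text: str, choice: Dict[str, str]) -> str:
    """Stand-in for the RES client: sends test data and receives the result."""
    return server.run_chain(text, choice)


# Toy modules from three made-up institutions (placeholders, not real TTS).
res = ResServer()
res.register(ModuleServer("inst_a", "text", lambda s: s.lower()))
res.register(ModuleServer("inst_b", "prosody", lambda s: f"<prosody>{s}</prosody>"))
res.register(ModuleServer("inst_c", "acoustic", lambda s: f"audio({s})"))

result = res_client(
    res,
    "Hello World",
    {"text": "inst_a", "prosody": "inst_b", "acoustic": "inst_c"},
)
print(result)
```

In this toy setup, a developer who owns only the prosody module could register it under their own institution label and borrow the text and acoustic modules from the others by changing the `choice` mapping, which mirrors the use case the abstract describes.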

Notes

EU project TC-STAR (Technology and Corpora for Speech to Speech Translation) www.tc-star.org.
The Blizzard challenge: http://festvox.org/blizzard/.
www.ecess.eu. The ECESS consortium has been, from its beginning, an open, non-funded consortium for institutions active in speech synthesis and related topics.

Author information

Authors and Affiliations

Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia
Matej Rojc & Zdravko Kačič
IC 5, Siemens AG, Corporate Technology, München, Germany
Harald Höge

Corresponding author

Correspondence to Matej Rojc.

About this article

Cite this article

Rojc, M., Höge, H. & Kačič, Z. Remote-based text-to-speech modules’ evaluation framework: the RES framework. Lang Resources & Evaluation 44, 371–386 (2010). https://doi.org/10.1007/s10579-009-9110-3

Published: 29 November 2009
Issue Date: December 2010
DOI: https://doi.org/10.1007/s10579-009-9110-3
