
Combining Text-to-Speech Services with Conventional Voiceover for News Oralization

  • Conference paper
Applications and Usability of Interactive TV (jAUTI 2022)

Abstract

The surge in digital content consumption has, in many cases, posed challenges for media companies, resulting in reduced revenue and the need to reinvent business models. The digitalization of content has introduced new consumption formats, and news podcasts are already an established part of this landscape. Although the format is relatively recent in journalism, its growing popularity makes it an appealing addition to the field. However, producing podcasts can be demanding in terms of time, resources, and technical expertise. This paper therefore focuses on facilitating the creation of news podcasts. To this end, we propose employing Text-to-Speech (TTS) technologies for the oralization of journalistic texts in European Portuguese. We conducted tests using TTS services from Amazon Polly and Google Speech Cloud, with the Google Speech Cloud WaveNet services yielding superior results among potential users. Additionally, we developed three podcast models incorporating human voiceover and/or TTS to assess users' acceptance of those models: one using only human voices, one using only TTS voices, and a hybrid integrating both types of voice. The presence of a human voice positively influenced the results, with the human-voice and hybrid models outperforming the TTS-only model. However, the differences between the models were not pronounced, and the results demonstrate an acceptance of Text-to-Speech technology in the context of news podcasts. Nonetheless, continued technological advancement is still needed for TTS to converge with human discourse.
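
The three podcast models described in the abstract can be pictured as different voice assignments over the same episode script. The following is only a minimal sketch, not the authors' implementation: the names `Voice`, `Segment`, `build_episode`, and `tts_queue` are hypothetical, the intro/news/outro structure is assumed, and the hybrid split (human voice for intro and outro, TTS for the news items) is an assumption, since the abstract does not say how the hybrid model divides the voices. No TTS service is called here.

```python
from dataclasses import dataclass
from enum import Enum


class Voice(Enum):
    HUMAN = "human"  # recorded by a person
    TTS = "tts"      # synthesized by a TTS service


@dataclass(frozen=True)
class Segment:
    role: str    # "intro", "news", or "outro" (hypothetical roles)
    text: str
    voice: Voice


def build_episode(intro, news_items, outro, model):
    """Return the ordered segments of one episode for a given model.

    model is "human", "tts", or "hybrid". In this sketch the hybrid model
    assumes a human voice for the intro and outro and TTS for the news items.
    """
    if model not in ("human", "tts", "hybrid"):
        raise ValueError(f"unknown model: {model}")
    framing = Voice.TTS if model == "tts" else Voice.HUMAN
    body = Voice.HUMAN if model == "human" else Voice.TTS
    segments = [Segment("intro", intro, framing)]
    segments += [Segment("news", item, body) for item in news_items]
    segments.append(Segment("outro", outro, framing))
    return segments


def tts_queue(segments):
    """Texts that would still need to be sent to a TTS service."""
    return [s.text for s in segments if s.voice is Voice.TTS]


episode = build_episode("Bom dia.", ["Notícia 1.", "Notícia 2."], "Até amanhã.", "hybrid")
print([s.voice.value for s in episode])  # ['human', 'tts', 'tts', 'human']
```

Under these assumptions, `tts_queue(episode)` lists the texts that would be passed to a synthesis service with a European Portuguese voice, while the remaining segments would be recorded by a person.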

Author information

Corresponding author

Correspondence to Pedro Almeida.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Afonso, M., Almeida, P. (2023). Combining Text-to-Speech Services with Conventional Voiceover for News Oralization. In: Abásolo, M.J., de Castro Lozano, C., Olmedo Cifuentes, G.F. (eds) Applications and Usability of Interactive TV. jAUTI 2022. Communications in Computer and Information Science, vol 1820. Springer, Cham. https://doi.org/10.1007/978-3-031-45611-4_5

  • DOI: https://doi.org/10.1007/978-3-031-45611-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-45610-7

  • Online ISBN: 978-3-031-45611-4

  • eBook Packages: Computer Science, Computer Science (R0)
