Abstract
Research on intelligent agents has demonstrated that the degree to which an artificial entity resembles a human correlates with the likelihood that the entity will evoke social and psychological processes in humans. Language-attitude studies based on natural voices have provided evidence that human listeners socially assess and evaluate their communication partners according to the language variety they use. Taking the two findings together, we hypothesize that synthetically generated language varieties have social effects similar to those reported in language-attitude studies on natural speech. We present the design of a set of synthetic voices representing standard and dialectal varieties of Austrian German. These voices were built into an existing cultural-heritage application so that virtual tourist guides speak in different varieties. With this setup, we performed a language-attitude study assessing the social evaluation of the characters represented by the synthetic voices. Our results are in accordance with previous findings from natural speech, but they also show that the specific context is a major criterion for the preference or rejection of certain language varieties. In addition, we show that not only the particular variety but also features relating to the voice quality of the synthesized speech give rise to attributions of different social aspects and stereotypes. Together, these factors strongly influence the attitudes of listeners towards the artificial speakers, which shows the importance of accurate voice design, including features related to particular language varieties, for the development of artificial agents.
Notes
Project Creative Histories: The Josefsplatz Experience http://ofai.at/research/nlu/projects/project_josefsplatz.html, http://www.youtube.com/watch?v=K1QQRb9gno8 (project video in German).
VSDS—Viennese Sociolect and Dialect Synthesis, 2007–2009; http://dialect-tts.ftw.at.
Acknowledgments
The projects ‘Creative Histories: The Josefsplatz Experience’ and ‘Viennese Sociolect and Dialect Synthesis (VSDS)’ were both funded by the Vienna Science and Technology Fund (WWTF). In addition, the study presented in this paper was in part funded by the Austrian Federal Ministry for Transport, Innovation and Technology (BMVIT) under the research programme ‘FEMtech women in research and technology’ within the project ‘Companions für Userinnen’ (C4U).
The authors wish to thank the anonymous reviewers for their valuable and insightful comments.
Cite this article
Krenn, B., Schreitter, S. & Neubarth, F. Speak to me and I tell you who you are! A language-attitude study in a cultural-heritage application. AI & Soc 32, 65–77 (2017). https://doi.org/10.1007/s00146-014-0569-0
DOI: https://doi.org/10.1007/s00146-014-0569-0