DOI: 10.1145/3448018.3458013

Combining Oculo-motor Indices to Measure Cognitive Load of Synthetic Speech in Noisy Listening Conditions.

Published: 25 May 2021

Abstract

Gaze-based assistive technologies (ATs) that feature speech have the potential to improve the lives of people with communication disorders. However, due to a limited understanding of how different speech types affect users' cognitive load, evaluating ATs remains a challenge. Expanding on previous work, we combined temporal changes in pupil size with ocular movements (saccades and fixation differentials) to evaluate the cognitive workload imposed by two types of speech (natural and synthetic) mixed with noise in a listening test. While observed pupil sizes were significantly larger at lower signal-to-noise levels as participants listened to and memorised speech stimuli, saccadic eye movements were significantly more frequent for synthetic speech. In the synthetic condition, there was a strong negative correlation between pupil dilation and fixation differentials, indicating a higher strain on participants' cognitive resources. These results suggest that combining oculo-motor indices can aid our understanding of the cognitive implications of different speech types.
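The correlation finding described above can be illustrated with a short sketch. The Python code below is not the authors' implementation; it is a minimal example using simulated, hypothetical per-trial summary values (the names pupil_dilation and fixation_differential are illustrative), showing how a Pearson correlation between the two oculo-motor indices could be computed for one listening condition.

import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-trial summaries for one condition (e.g. synthetic speech in noise).
# Values are simulated here purely for illustration.
rng = np.random.default_rng(0)
pupil_dilation = rng.normal(loc=0.3, scale=0.1, size=40)  # change from baseline (mm)
fixation_differential = 0.5 - 1.5 * pupil_dilation + rng.normal(scale=0.05, size=40)

# A strongly negative r for the synthetic condition would mirror the
# relationship reported in the abstract.
r, p = pearsonr(pupil_dilation, fixation_differential)
print(f"Pearson r = {r:.2f}, p = {p:.3g}")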



Published In

ETRA '21 Short Papers: ACM Symposium on Eye Tracking Research and Applications
May 2021
232 pages
ISBN: 9781450383455
DOI: 10.1145/3448018

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Evaluation study
  2. Eye movement
  3. Pupil response
  4. Speech Perception

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

ETRA '21

Acceptance Rates

Overall acceptance rate: 69 of 137 submissions (50%)
