DOI: 10.1145/3526113.3545628
Research Article

Detecting Input Recognition Errors and User Errors using Gaze Dynamics in Virtual Reality

Published: 28 October 2022

Abstract

Gesture-based recognition systems are susceptible to input recognition errors and user errors, both of which negatively affect user experiences and can be frustrating to correct. Prior work has suggested that user gaze patterns following an input event could be used to detect input recognition errors and subsequently improve interaction. However, to be useful, error detection systems would need to detect various types of high-cost errors. Furthermore, to build a reliable detection model for errors, gaze behaviour following these errors must be manifested consistently across different tasks. Using data analysis and machine learning models, this research examined gaze dynamics following input events in virtual reality (VR). Across three distinct point-and-select tasks, we found differences in user gaze patterns following three input events: correctly recognized input actions, input recognition errors, and user errors. These differences were consistent across tasks, selection versus deselection actions, and naturally occurring versus experimentally injected input recognition errors. A multi-class deep neural network successfully discriminated between these three input events using only gaze dynamics, achieving an AUC-ROC-OVR score of 0.78. Together, these results demonstrate the utility of gaze in detecting interaction errors and have implications for the design of intelligent systems that can assist with adaptive error recovery.
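The abstract's headline result is a three-way classification of post-input gaze dynamics (correct recognition, input recognition error, user error) evaluated with a one-vs-rest AUC-ROC of 0.78. The sketch below only illustrates how such a multi-class metric is typically computed; the synthetic data, feature window, and the small scikit-learn MLP stand-in are assumptions for illustration, not the authors' deep network, features, or dataset.

```python
# Minimal sketch, NOT the authors' implementation: the paper uses a deep neural
# network over gaze dynamics; a small scikit-learn MLP and synthetic data stand
# in here purely to illustrate the three-class setup and the AUC-ROC-OVR metric.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# Hypothetical data: one row per input event, each row a fixed-length window of
# post-event gaze-dynamics features (e.g., gaze velocity, fixation statistics).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 40))      # 600 events x 40 assumed gaze features
y = rng.integers(0, 3, size=600)    # 0 = correct input, 1 = recognition error, 2 = user error

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

# One-vs-rest AUC-ROC over the three event classes, the metric the abstract
# reports as 0.78 for the paper's model.
proba = clf.predict_proba(X_test)
print(f"AUC-ROC-OVR: {roc_auc_score(y_test, proba, multi_class='ovr'):.2f}")
```

On this random synthetic data the score will sit near chance (about 0.5); the 0.78 reported in the paper reflects genuine structure in post-event gaze behaviour that a toy example cannot reproduce.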




Published In

UIST '22: Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology
October 2022
1363 pages
ISBN: 9781450393201
DOI: 10.1145/3526113
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Recognizer error
  2. adaptive user interfaces
  3. eye tracking
  4. gaze behavior
  5. gaze dynamics
  6. input recognition errors

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

UIST '22

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%


Article Metrics

  • Downloads (last 12 months): 274
  • Downloads (last 6 weeks): 19
Reflects downloads up to 31 Dec 2024

Cited By
  • (2024) Interactive Mediation Techniques for Error-Aware Gesture Input Systems. Proceedings of the 50th Graphics Interface Conference, 1-12. DOI: 10.1145/3670947.3670964. Online publication date: 3-Jun-2024.
  • (2024) Understanding How Blind Users Handle Object Recognition Errors: Strategies and Challenges. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-15. DOI: 10.1145/3663548.3675635. Online publication date: 27-Oct-2024.
  • (2024) Real-World Scanpaths Exhibit Long-Term Temporal Dependencies: Considerations for Contextual AI for AR Applications. Proceedings of the 2024 Symposium on Eye Tracking Research and Applications, 1-7. DOI: 10.1145/3649902.3656352. Online publication date: 4-Jun-2024.
  • (2024) Communication breakdown: Gaze-based prediction of system error for AI-assisted robotic arm simulated in VR. Proceedings of the 2024 Symposium on Eye Tracking Research and Applications, 1-7. DOI: 10.1145/3649902.3653339. Online publication date: 4-Jun-2024.
  • (2024) GEARS: Generalizable Multi-Purpose Embeddings for Gaze and Hand Data in VR Interactions. Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization, 279-289. DOI: 10.1145/3627043.3659551. Online publication date: 22-Jun-2024.
  • (2024) Exploring Visualizations for Precisely Guiding Bare Hand Gestures in Virtual Reality. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-19. DOI: 10.1145/3613904.3642935. Online publication date: 11-May-2024.
  • (2024) FocusFlow: 3D Gaze-Depth Interaction in Virtual Reality Leveraging Active Visual Depth Manipulation. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642589. Online publication date: 11-May-2024.
  • (2024) A Stereohaptics Accessory for Spatial Computing Platforms. HCI International 2024 – Late Breaking Papers, 325-340. DOI: 10.1007/978-3-031-76803-3_19. Online publication date: 6-Dec-2024.
  • (2023) Neural Network Implementation of Gaze-Target Prediction for Human-Robot Interaction. 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2238-2244. DOI: 10.1109/RO-MAN57019.2023.10309483. Online publication date: 28-Aug-2023.
  • (2023) XR Input Error Mediation for Hand-Based Input: Task and Context Influences a User’s Preference. 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 1006-1015. DOI: 10.1109/ISMAR59233.2023.00117. Online publication date: 16-Oct-2023.
