DOI: 10.1145/3663548.3675605
Research Article

Enabling Uniform Computer Interaction Experience for Blind Users through Large Language Models

Published: 27 October 2024

Abstract

Blind individuals, who by necessity depend on screen readers to interact with computers, face considerable challenges in navigating the diverse and complex graphical user interfaces of different computer applications. The heterogeneity of application interfaces often forces blind users to remember different keyboard combinations and navigation methods for each application. To alleviate this significant interaction burden, we present Savant, a novel assistive technology powered by large language models (LLMs) that allows blind screen reader users to interact uniformly with any application interface through natural language. Notably, Savant can automate a series of tedious screen reader actions on the control elements of an application when prompted by a natural language command from the user. These commands are flexible: the user is not strictly required to specify the exact names of the control elements. A user study evaluation of Savant with 11 blind participants demonstrated significant improvements in interaction efficiency and usability compared to current practices.
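The abstract describes a pipeline in which a free-form user command is mapped to a sequence of actions on an application's control elements, without requiring exact control names. The sketch below illustrates that idea only in miniature: a toy keyword matcher stands in for the LLM, and all names (`Control`, `plan_actions`) and the matching logic are hypothetical illustrations, not the paper's actual method.

```python
from dataclasses import dataclass

@dataclass
class Control:
    """A UI control element exposed by an accessibility tree."""
    name: str   # accessible name, e.g. "Send Message"
    role: str   # e.g. "button", "edit", "checkbox"

def plan_actions(command: str, controls: list) -> list:
    """Map a free-form command to (action, control-name) steps.

    A simple word-overlap match stands in for the LLM described in the
    abstract, so the command need not quote a control's exact name.
    """
    words = set(command.lower().split())
    return [("invoke", c.name)
            for c in controls
            if words & set(c.name.lower().split())]

controls = [Control("Send Message", "button"),
            Control("Attach File", "button")]
print(plan_actions("please send my message", controls))
# [('invoke', 'Send Message')]
```

In the paper's setting, the real system would read controls from the platform accessibility API and use an LLM to resolve fuzzy references; the keyword overlap here is only a placeholder for that resolution step.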


Published In

ASSETS '24: Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility
October 2024, 1475 pages
ISBN: 9798400706776
DOI: 10.1145/3663548

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. Accessibility
      2. Assistive technology
      3. Blind users
      4. Computer Interaction
      5. Large language models (LLMs)
      6. Uniform interaction

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

Conference

ASSETS '24

Acceptance Rates

Overall Acceptance Rate: 436 of 1,556 submissions, 28%
