Self-guided Spatial Composition as an Additional Layer of Information to Enhance Accessibility of Images for Blind Users

Published: 19 June 2024

Abstract

The spatial composition of an image or photograph can convey information beyond its depicted content. However, the usual approaches to making images accessible to blind people focus mainly on describing an image's content, without addressing other aspects such as spatial composition, colors, background, or known faces. As a result, much of the information present in the image but not included in the description is lost to a blind user. This work explores the combination of image captioning and object detection techniques with the final goal of making images more accessible to blind users. The approach is twofold: (1) state-of-the-art image captioning and object detection algorithms are combined so that blind users can visualize the spatial composition of a given image; and (2) blind users guide the exploration of the images themselves, so they can gather all the information in a personalized manner and form their own interpretation. We implemented a preliminary prototype based on requirements gathered from blind users and performed an evaluation that yielded promising results. Participants were reasonably satisfied with the usability of the prototype, and in several cases they obtained more information during their self-guided exploration of the images than from the general original description. However, some issues detected in the evaluation, and functionalities that could not be implemented, will be addressed in future work.
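The abstract describes combining object detection output with spatial composition so that blind users can explore where objects sit within an image. The paper's own implementation is not reproduced here; the following is a minimal, illustrative Python sketch of one way such spatial labels could be derived, assuming bounding boxes are already available from an off-the-shelf detector. The function names `spatial_label` and `describe_objects`, and the 3x3 grid granularity, are hypothetical choices for this sketch, not details taken from the paper.

```python
# Illustrative sketch: turn bounding boxes into coarse spatial labels
# by dividing the image into a 3x3 grid and locating each box's center.

def spatial_label(box, img_width, img_height):
    """Map a bounding box (x0, y0, x1, y1) to a label such as 'top left'."""
    x0, y0, x1, y1 = box
    cx = (x0 + x1) / 2  # horizontal center of the box
    cy = (y0 + y1) / 2  # vertical center of the box
    col = ["left", "center", "right"][min(int(3 * cx / img_width), 2)]
    row = ["top", "middle", "bottom"][min(int(3 * cy / img_height), 2)]
    return f"{row} {col}"

def describe_objects(detections, img_width, img_height):
    """Produce one spoken-style phrase per detected object.

    `detections` is a list of (label, box) pairs, e.g. the thresholded
    output of an object detector.
    """
    return [
        f"{label} in the {spatial_label(box, img_width, img_height)}"
        for label, box in detections
    ]

# Example: two hypothetical detections in a 900x600 image.
detections = [("dog", (50, 400, 250, 580)), ("person", (400, 100, 520, 550))]
for phrase in describe_objects(detections, 900, 600):
    print(phrase)
```

Phrases like these could then be read by a screen reader alongside the overall caption, letting the user query individual regions during self-guided exploration.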


Published In

Interacción '24: Proceedings of the XXIV International Conference on Human Computer Interaction, June 2024, 155 pages.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. Accessible Images
        2. Blind Users
        3. Image Description
        4. Object Recognition
        5. Spatial Composition

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Funding Sources

• Agencia Estatal de Investigación, Spain

        Conference

        INTERACCION 2024

        Acceptance Rates

        Overall Acceptance Rate 109 of 163 submissions, 67%
