"Person, Shoes, Tree. Is the Person Naked?" What People with Vision Impairments Want in Image Descriptions

Authors: Abigale Stangl, Meredith Ringel Morris, Danna Gurari

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

Pages 1 - 13

https://doi.org/10.1145/3313831.3376404

Published: 23 April 2020

Abstract

Access to digital images is important to people who are blind or have low vision (BLV). Many contemporary image description efforts do not take into account this population's nuanced image description preferences. In this paper, we present a qualitative study that provides insight into 28 BLV people's experiences with descriptions of digital images from news websites, social networking sites/platforms, eCommerce websites, employment websites, online dating websites/platforms, productivity applications, and e-publications. Our findings reveal how image description preferences vary based on the source where digital images are encountered and the surrounding context. We provide recommendations for the development of next-generation image description technologies inspired by our empirical analysis.

Supplemental Material

We include our semi-structured and contextual inquiry procedures as supplementary materials.


Index Terms

1. Human-centered computing
  1. Accessibility
    1. Empirical studies in accessibility

Published In

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

April 2020

10688 pages

ISBN:9781450367080

DOI:10.1145/3313831

General Chairs:
Regina Bernhaupt (Eindhoven University of Technology, Netherlands)
Florian 'Floyd' Mueller (Monash University, Australia)
David Verweij (Newcastle University, UK)
Josh Andres (RMIT, Australia)

Program Chairs:
Joanna McGrenere (University of British Columbia, Canada)
Andy Cockburn (University of Canterbury, New Zealand)
Ignacio Avellino (University of Maryland Baltimore County, USA)
Alix Goguey (Grenoble Alpes University, France)
Pernille Bjørn (University of Copenhagen, Denmark)
Shengdong (Shen) Zhao (National University of Singapore, Singapore)
Briane Paul Samson (Future University Hakodate, Japan & De La Salle University, Philippines)
Rafal Kocielnik (University of Washington, USA)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Microsoft

Conference

CHI '20

Sponsor:

SIGCHI

CHI '20: CHI Conference on Human Factors in Computing Systems

April 25 - 30, 2020

Honolulu, HI, USA

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Article Metrics

Total Citations: 92
Total Downloads: 1,542
Downloads (Last 12 months): 315
Downloads (Last 6 weeks): 27

Reflects downloads up to 17 Feb 2025

Cited By

Olivetti J, Melo P. (2024). Through the Eyes of Instagram: Analyzing Image Content utilizing Meta's Automatic Alt-Text. Proceedings of the 30th Brazilian Symposium on Multimedia and the Web (WebMedia 2024), 275-282. Online publication date: 14-Oct-2024. https://doi.org/10.5753/webmedia.2024.241695

Lyu Y, Cai J, Dosono B, Yadav D, Carroll J. (2024). "I Upload... All Types of Different Things to Say the World of Blindness Is More Than What They Think It Is": A Study of Blind TikTokers' Identity Work from a Flourishing Perspective. Proceedings of the ACM on Human-Computer Interaction 8:CSCW2, 1-24. Online publication date: 8-Nov-2024. https://dl.acm.org/doi/10.1145/3687013

Xu A, Cai M, Hou D, Chang R, Guo A. (2024). ImageExplorer Deployment: Understanding Text-Based and Touch-Based Image Exploration in the Wild. Proceedings of the 21st International Web for All Conference, 59-69. Online publication date: 13-May-2024. https://dl.acm.org/doi/10.1145/3677846.3677861

Alharbi R, Lor P, Herskovitz J, Schoenebeck S, Brewer R. (2024). Misfitting With AI: How Blind People Verify and Contest AI Errors. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-17. Online publication date: 27-Oct-2024. https://dl.acm.org/doi/10.1145/3663548.3675659

Gubbi Mohanbabu A, Pavel A. (2024). Context-Aware Image Descriptions for Web Accessibility. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-17. Online publication date: 27-Oct-2024. https://dl.acm.org/doi/10.1145/3663548.3675658

Natalie R, Chang R, Sheshadri S, Guo A, Hara K. (2024). Audio Description Customization. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-19. Online publication date: 27-Oct-2024. https://dl.acm.org/doi/10.1145/3663548.3675617

Chang R, Liu Y, Guo A. (2024). WorldScribe: Towards Context-Aware Live Visual Descriptions. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-18. Online publication date: 13-Oct-2024. https://dl.acm.org/doi/10.1145/3654777.3676375

Xu S, Chen C, Liu Z, Jin X, Yuan L, Yan Y, Qu H. (2024). Memory Reviver: Supporting Photo-Collection Reminiscence for People with Visual Impairment via a Proactive Chatbot. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-17. Online publication date: 13-Oct-2024. https://dl.acm.org/doi/10.1145/3654777.3676336

Seixas Pereira L, Guerreiro J, Rodrigues A, Guerreiro T, Duarte C. (2024). From Automation to User Empowerment: Investigating the Role of a Semi-automatic Tool in Social Media Accessibility. ACM Transactions on Accessible Computing 17:3, 1-25. Online publication date: 27-Sep-2024. https://dl.acm.org/doi/10.1145/3647643

Yin A, Sogani R, Oewel B, Phan K, Park J, Yeo M, Yazzolino L, Arcos K, Abdolrahmani A, Blank E, Gilbert M, Branham S. (2024). “Malicious” Pictorials: How Alt Text Matters to Screen Reader Users' Experience of Image-Dense Media. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1262-1274. Online publication date: 1-Jul-2024. https://dl.acm.org/doi/10.1145/3643834.3660747
