DOI: 10.1145/3313831.3376404
research-article

"Person, Shoes, Tree. Is the Person Naked?" What People with Vision Impairments Want in Image Descriptions

Published: 23 April 2020

Abstract

Access to digital images is important to people who are blind or have low vision (BLV). Many contemporary image description efforts do not take into account this population's nuanced image description preferences. In this paper, we present a qualitative study that provides insight into 28 BLV people's experiences with descriptions of digital images from news websites, social networking sites/platforms, eCommerce websites, employment websites, online dating websites/platforms, productivity applications, and e-publications. Our findings reveal how image description preferences vary based on the source where digital images are encountered and the surrounding context. We provide recommendations for the development of next-generation image description technologies inspired by our empirical analysis.

Supplemental Material

PDF File
We include our semi-structured and contextual inquiry procedures as supplementary materials.

    Published In

    CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
    April 2020
    10688 pages
    ISBN:9781450367080
    DOI:10.1145/3313831
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. accessibility
    2. alt text
    3. image captions
    4. visual impairment

    Qualifiers

    • Research-article

    Funding Sources

    • Microsoft

    Conference

    CHI '20
    Acceptance Rates

    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 315
    • Downloads (Last 6 weeks): 27
    Reflects downloads up to 17 Feb 2025

    Citations

    Cited By

    • (2024) Through the Eyes of Instagram: Analyzing Image Content utilizing Meta's Automatic Alt-Text. Proceedings of the 30th Brazilian Symposium on Multimedia and the Web (WebMedia 2024), 275-282. DOI: 10.5753/webmedia.2024.241695. Online publication date: 14-Oct-2024.
    • (2024) "I Upload... All Types of Different Things to Say the World of Blindness Is More Than What They Think It Is": A Study of Blind TikTokers' Identity Work from a Flourishing Perspective. Proceedings of the ACM on Human-Computer Interaction 8:CSCW2, 1-24. DOI: 10.1145/3687013. Online publication date: 8-Nov-2024.
    • (2024) ImageExplorer Deployment: Understanding Text-Based and Touch-Based Image Exploration in the Wild. Proceedings of the 21st International Web for All Conference, 59-69. DOI: 10.1145/3677846.3677861. Online publication date: 13-May-2024.
    • (2024) Misfitting With AI: How Blind People Verify and Contest AI Errors. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-17. DOI: 10.1145/3663548.3675659. Online publication date: 27-Oct-2024.
    • (2024) Context-Aware Image Descriptions for Web Accessibility. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-17. DOI: 10.1145/3663548.3675658. Online publication date: 27-Oct-2024.
    • (2024) Audio Description Customization. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-19. DOI: 10.1145/3663548.3675617. Online publication date: 27-Oct-2024.
    • (2024) WorldScribe: Towards Context-Aware Live Visual Descriptions. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-18. DOI: 10.1145/3654777.3676375. Online publication date: 13-Oct-2024.
    • (2024) Memory Reviver: Supporting Photo-Collection Reminiscence for People with Visual Impairment via a Proactive Chatbot. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-17. DOI: 10.1145/3654777.3676336. Online publication date: 13-Oct-2024.
    • (2024) From Automation to User Empowerment: Investigating the Role of a Semi-automatic Tool in Social Media Accessibility. ACM Transactions on Accessible Computing 17:3, 1-25. DOI: 10.1145/3647643. Online publication date: 27-Sep-2024.
    • (2024) “Malicious” Pictorials: How Alt Text Matters to Screen Reader Users' Experience of Image-Dense Media. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1262-1274. DOI: 10.1145/3643834.3660747. Online publication date: 1-Jul-2024.
